---
language:
- en
license: mit
pipeline_tag: text-generation
library_name: transformers
tags:
- text-diffusion
- discrete-diffusion
- pytorch
- mdlm
- seed-diffusion
- generative-ai
model_index:
- name: diffusionGPT
  results: []
custom_pipelines:
  text-diffusion:
    impl: pipeline.TextDiffusionPipeline
    pt:
    - AutoModelForMaskedLM
---
# diffusionGPT

[**GitHub Repository**](https://github.com/JorgeVanco/diffusionGPT) | [**Model License: MIT**](https://opensource.org/licenses/MIT)

DiffusionGPT is a **discrete diffusion language model** fine-tuned for conversational AI, following the MDLM (Masked Diffusion Language Model) formulation. Unlike autoregressive models (such as GPT-4 or Llama), which predict text one token at a time from left to right, DiffusionGPT generates text through an iterative denoising process.

This approach allows for parallel decoding, flexible text infilling, and "Seed Diffusion" editing.
## Key Features

* **Parallel Decoding:** Generates and refines tokens simultaneously across the sequence.
* **Seed Diffusion Editing:** Implements the editing logic described in [arXiv:2508.02193](https://arxiv.org/pdf/2508.02193) to refine existing text while maintaining context.
* **Semi-Autoregressive Generation:** Supports block-wise generation for long-form content, combining the strengths of diffusion with the length scaling of autoregression.
* **Custom Pipeline:** Ships with a `TextDiffusionPipeline` that handles ancestral sampling and confidence-based unmasking automatically.

---
## Quickstart

The custom pipeline is defined in the `pipeline.py` file of the repository. With `trust_remote_code=True`, Hugging Face downloads it automatically; otherwise, make sure the file is in your local directory.
### 1. Basic Chat Completion

```python
from transformers import pipeline

# Load the custom diffusion pipeline; trust_remote_code=True fetches pipeline.py from the Hub.
pipe = pipeline(
    "text-diffusion",
    model="JorgeVanco/diffusionGPT",
    trust_remote_code=True,
)

# Format the conversation with the tokenizer's chat template.
messages = [{"role": "user", "content": "Explain diffusion models in simple terms."}]
prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Generate using standard diffusion
result = pipe(prompt, num_steps=50)
print(result["decoded_texts"][0])
```
### 2. Streaming Intermediate Denoising

Watch the model "think" as it refines the text from masks to a final response.

```python
# Reuses `pipe` and `prompt` from the previous example.
for partial_text in pipe.stream_generation(prompt, num_steps=32):
    # The ANSI escape sequence clears the terminal for an animation effect.
    print(f"\033[H\033[J{partial_text}")
```
### 3. Block-wise (Semi-Autoregressive) Generation

For longer responses that exceed the standard sequence length:

```python
# Reuses `pipe` and `prompt` from the first example.
response = pipe.stream_semi_autoregressive_generate(
    input_text=prompt,
    block_size=64,
    max_length=256,
    num_steps=32,
)

# Each yielded item is an intermediate state of the generation.
for step in response:
    print(step)
```
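
To make the block-wise idea concrete, here is a minimal sketch of how a sequence could be split into blocks that are denoised one after another. It is not taken from `pipeline.py`: the helper `block_spans` is hypothetical, and it assumes that `max_length` counts the prompt tokens as well as the generated ones.

```python
def block_spans(prompt_len: int, block_size: int, max_length: int) -> list[tuple[int, int]]:
    """Split the positions after the prompt into consecutive [start, end) blocks.

    Hypothetical helper: assumes max_length is the total sequence length,
    prompt included. The last block is truncated to fit.
    """
    spans = []
    start = prompt_len
    while start < max_length:
        end = min(start + block_size, max_length)
        spans.append((start, end))
        start = end
    return spans


# A 20-token prompt with the settings from the example above.
spans = block_spans(prompt_len=20, block_size=64, max_length=256)
print(spans)
```

In a semi-autoregressive scheme, each block would be denoised with the diffusion sampler while all earlier blocks stay fixed as context, so generation proceeds left to right across blocks but in parallel within a block.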
## Technical Details

### Model Architecture

The backbone is a Transformer encoder (`AutoModelForMaskedLM`) configured for discrete diffusion.

- **Training Objective:** Multi-step corruption and reconstruction (MDLM formulation).
- **Corruption Strategy:** Uses a `DiscreteDiffusionCollator`, which applies random masking and optional "Insertion Corruption" using a `<|delete|>` token.
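
The random-masking part of the corruption can be illustrated with a short sketch. This is not the `DiscreteDiffusionCollator` itself: the function `corrupt`, the `MASK` string, and the choice to mask each token independently with probability `t` (which corresponds to a linear noise schedule) are assumptions made for illustration, and insertion corruption with `<|delete|>` is not modeled.

```python
import random

MASK = "<mask>"  # hypothetical mask token string, for illustration only


def corrupt(tokens, t, rng):
    """Mask each token independently with probability t, where 0 <= t <= 1.

    t = 0 leaves the sequence untouched; t = 1 masks every token.
    """
    return [MASK if rng.random() < t else tok for tok in tokens]


rng = random.Random(0)
tokens = "diffusion models denoise text step by step".split()
for t in (0.0, 0.5, 1.0):
    print(t, corrupt(tokens, t, rng))
```

During training, the model sees the corrupted sequence and is asked to reconstruct the original tokens at the masked positions.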
### Sampling Parameters

When calling `pipe()`, you can tune generation with:

- `num_steps`: More steps generally give higher quality at the cost of slower inference.
- `use_confidence`: When `True`, the model uses confidence-based (top-k) unmasking instead of random unmasking.
- `allow_edits`: Enables Seed Diffusion logic to refine previously "visible" tokens (leave at `True` for better generation).
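
As an illustration of what confidence-based unmasking means, the sketch below reveals the `k` masked positions whose predicted token has the highest probability and leaves the rest masked for later steps. The function `unmask_top_k` and its data layout are hypothetical and simplified; the actual logic lives in `TextDiffusionPipeline` and may differ.

```python
MASK = None  # placeholder for a position that is still masked


def unmask_top_k(sequence, predictions, k):
    """Reveal the k masked positions with the most confident predictions.

    sequence:    list of tokens, with MASK at undecided positions
    predictions: dict mapping each masked position to (token, confidence)
    """
    masked = [i for i, tok in enumerate(sequence) if tok is MASK]
    # Rank masked positions by the model's confidence, highest first.
    chosen = sorted(masked, key=lambda i: predictions[i][1], reverse=True)[:k]
    out = list(sequence)
    for i in chosen:
        out[i] = predictions[i][0]
    return out


seq = ["The", MASK, MASK, "on", MASK, "mat"]
preds = {1: ("cat", 0.91), 2: ("sat", 0.55), 4: ("the", 0.97)}
step1 = unmask_top_k(seq, preds, k=2)
print(step1)
```

Here positions 4 and 1 are revealed first because their confidences (0.97 and 0.91) are the highest; position 2 stays masked and would be predicted again in the next step with more context available.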
## Training Setup

The model was trained using the `DiffusionTrainer` class provided in the [source repository](https://github.com/JorgeVanco/diffusionGPT).

### Hardware & Config:

- **Optimizer:** AdamW with a linear learning-rate schedule.
- **Loss:** Time-weighted Cross-Entropy (MDLM).
- **Curriculum:** Includes a `SeedDiffusionCurriculumCallback` that introduces corruption stages gradually to improve model robustness.
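
To show what "time-weighted" means here, the following sketch computes the loss for a single sequence. It assumes a linear noise schedule, `alpha_t = 1 - t`, for which the MDLM weight `-alpha_t' / (1 - alpha_t)` reduces to `1 / t`; the schedule actually used by `DiffusionTrainer` is not stated in this card, and the function name `mdlm_loss` is hypothetical.

```python
import math


def mdlm_loss(true_token_probs, is_masked, t):
    """Cross-entropy over masked positions, weighted by 1/t.

    true_token_probs: model probability of the correct token at each position
    is_masked:        whether each position was masked in the corrupted input
    t:                diffusion time in (0, 1]; assumes alpha_t = 1 - t
    """
    if not 0.0 < t <= 1.0:
        raise ValueError("t must be in (0, 1]")
    # Only masked positions contribute; visible tokens carry no loss.
    nll = sum(-math.log(p) for p, m in zip(true_token_probs, is_masked) if m)
    return nll / t


loss = mdlm_loss([0.9, 0.5, 0.25], [False, True, True], t=0.5)
print(round(loss, 4))
```

The `1 / t` factor upweights lightly corrupted inputs (small `t`), where only a few tokens are masked and each one contributes more to the objective.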
### Example Training Command:

```bash
uv run train.py \
    --num_hidden_layers 12 \
    --hidden_size 768 \
    --num_diffusion_steps 32 \
    --max_seq_length 128 \
    --target_param_data_ratio 20
```
## ⚠️ Limitations & Bias

- **Factual Accuracy:** Like all LLMs, this model can hallucinate. It is not optimized for factual retrieval.
- **Coherence:** The model works best for short-to-medium chat; long-range coherence is still under development through the semi-autoregressive block method.
- **Special Tokens:** The model relies on specific tokens such as `<|im_start|>` and `<|im_end|>` for chat structure.
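
For orientation, the sketch below shows how these tokens typically delimit turns in a ChatML-style prompt. It assumes the model's template follows the common ChatML layout; the authoritative template is the one shipped with the tokenizer, so prefer `apply_chat_template` in real code. The helper `to_chatml` is hypothetical.

```python
def to_chatml(messages, add_generation_prompt=True):
    """Render messages in a ChatML-style layout (assumed, not read from the tokenizer)."""
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
        for m in messages
    ]
    if add_generation_prompt:
        # Open an assistant turn for the model to fill in.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


chat_prompt = to_chatml([{"role": "user", "content": "Hi"}])
print(chat_prompt)
```

If prompts are built by hand and these markers are missing or malformed, the model is likely to produce lower-quality responses, since it was fine-tuned on text structured this way.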
## Citation & Acknowledgments

This implementation is inspired by recent research on discrete diffusion for language:

- **MDLM:** [Simple and Effective Masked Diffusion Language Models](https://s-sahoo.com/mdlm/)
- **Seed Diffusion:** [Seed Diffusion: A Large-Scale Diffusion Language Model with High-Speed Inference](https://seed.bytedance.com/en/seed_diffusion)
## License

This model and its associated code are released under the **MIT License**.