Tags: Text Generation, Transformers, Safetensors, llada, feature-extraction, diffusion, fast-inference, d3llm, conversational, custom_code
Instructions to use d3LLM/d3LLM_LLaDA with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use d3LLM/d3LLM_LLaDA with Transformers:
    # Use a pipeline as a high-level helper
    from transformers import pipeline

    pipe = pipeline("text-generation", model="d3LLM/d3LLM_LLaDA", trust_remote_code=True)
    messages = [
        {"role": "user", "content": "Who are you?"},
    ]
    pipe(messages)

    # Load model directly
    from transformers import AutoModel

    model = AutoModel.from_pretrained("d3LLM/d3LLM_LLaDA", trust_remote_code=True, dtype="auto")

- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use d3LLM/d3LLM_LLaDA with vLLM:
Install from pip and serve model
    # Install vLLM from pip:
    pip install vllm

    # Start the vLLM server:
    vllm serve "d3LLM/d3LLM_LLaDA"

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:8000/v1/chat/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "d3LLM/d3LLM_LLaDA",
            "messages": [
                {
                    "role": "user",
                    "content": "What is the capital of France?"
                }
            ]
        }'

Use Docker
docker model run hf.co/d3LLM/d3LLM_LLaDA
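The curl call to the vLLM server above can also be made from Python. The following is a minimal sketch using only the standard library; it assumes the server is running locally on port 8000 as in the commands above. The helper names `build_chat_request` and `send_chat_request` are hypothetical and are not part of vLLM or this model's code.

```python
import json
import urllib.request


def build_chat_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    """Build a POST request for an OpenAI-compatible /v1/chat/completions endpoint.

    Hypothetical helper: it mirrors the JSON payload used in the curl example.
    """
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request: urllib.request.Request, timeout: float = 60.0) -> dict:
    """Send the request and decode the JSON response (requires a running server)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage, assuming the vLLM server from the commands above is listening locally:
# request = build_chat_request(
#     "http://localhost:8000", "d3LLM/d3LLM_LLaDA", "What is the capital of France?"
# )
# print(send_chat_request(request))
```

Building the request separately from sending it makes it possible to inspect the payload without a running server.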
- SGLang
How to use d3LLM/d3LLM_LLaDA with SGLang:
Install from pip and serve model
    # Install SGLang from pip:
    pip install sglang

    # Start the SGLang server:
    python3 -m sglang.launch_server \
        --model-path "d3LLM/d3LLM_LLaDA" \
        --host 0.0.0.0 \
        --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/chat/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "d3LLM/d3LLM_LLaDA",
            "messages": [
                {
                    "role": "user",
                    "content": "What is the capital of France?"
                }
            ]
        }'

Use Docker images
    docker run --gpus all \
        --shm-size 32g \
        -p 30000:30000 \
        -v ~/.cache/huggingface:/root/.cache/huggingface \
        --env "HF_TOKEN=<secret>" \
        --ipc=host \
        lmsysorg/sglang:latest \
        python3 -m sglang.launch_server \
            --model-path "d3LLM/d3LLM_LLaDA" \
            --host 0.0.0.0 \
            --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/chat/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "d3LLM/d3LLM_LLaDA",
            "messages": [
                {
                    "role": "user",
                    "content": "What is the capital of France?"
                }
            ]
        }'

- Docker Model Runner
How to use d3LLM/d3LLM_LLaDA with Docker Model Runner:
docker model run hf.co/d3LLM/d3LLM_LLaDA
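The vLLM and SGLang servers above are both called through an OpenAI-compatible chat completions endpoint, so the reply text can be read from the same place in either response. The sketch below assumes the response follows the usual OpenAI chat completion layout (`choices[0].message.content`). The function name `extract_reply` is hypothetical, and the sample body is illustrative only, not real model output.

```python
import json


def extract_reply(response_body: str) -> str:
    """Return the assistant text from an OpenAI-style chat completion response.

    Hypothetical helper: assumes the reply sits at choices[0].message.content.
    """
    data = json.loads(response_body)
    choices = data.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    message = choices[0].get("message") or {}
    content = message.get("content")
    if content is None:
        raise ValueError("first choice has no message content")
    return content


# Illustrative response body (made up for this sketch, not real model output).
sample_body = json.dumps(
    {
        "model": "d3LLM/d3LLM_LLaDA",
        "choices": [
            {
                "index": 0,
                "message": {"role": "assistant", "content": "Paris."},
                "finish_reason": "stop",
            }
        ],
    }
)

print(extract_reply(sample_body))
```

Raising an explicit error on a missing `choices` entry makes server-side failures, which typically return an error object instead, easier to spot than a bare `KeyError`.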
Commit History
Improve model card: add paper link, citation, license, and library_name (#1) fe1e0a5 verified
Update README.md 283ff04 verified
Update README.md 661003b verified
Update README.md 3045886 verified
Update README.md 70730fe verified
Update README.md 5beaf72 verified
Update README.md 098382b verified
Create README.md 5c882e1 verified
Upload tokenizer.json with huggingface_hub 3993cf9 verified (committed by Chien)
Upload special_tokens_map.json with huggingface_hub 3c74dd0 verified (committed by Chien)
Upload tokenizer_config.json with huggingface_hub 13c9bdb verified (committed by Chien)
Upload model.safetensors.index.json with huggingface_hub 7f962f2 verified (committed by Chien)
Upload model-00004-of-00004.safetensors with huggingface_hub 6dfaea3 verified (committed by Chien)
Upload model-00003-of-00004.safetensors with huggingface_hub f659299 verified (committed by Chien)
Upload model-00002-of-00004.safetensors with huggingface_hub fdee2b0 verified (committed by Chien)
Upload model-00001-of-00004.safetensors with huggingface_hub 7e8e28c verified (committed by Chien)
Upload generation_config.json with huggingface_hub d4ed13e verified (committed by Chien)
Upload config.json with huggingface_hub 95f4efd verified (committed by Chien)
initial commit 1f82ba6 verified (committed by Chien)