Add SGLang serving instructions
#11
by MickJ - opened
README.md
CHANGED
@@ -9,6 +9,8 @@ tags:
 - cosmos
 - cosmos3
 - vllm-omni
+- sglang
+- sglang-diffusion
 - diffusers
 - text-to-image
 - image-generation
@@ -381,6 +383,41 @@ print("Saved image to /tmp/cosmos3_t2i.png")
 
 
 
+### SGLang
+
+SGLang-Diffusion can serve `nvidia/Cosmos3-Super-Text2Image` through the OpenAI-compatible image generation endpoint. Install SGLang from source with diffusion dependencies, then start the server:
+
+```bash
+git clone https://github.com/sgl-project/sglang.git
+cd sglang
+pip install -e "python[diffusion]"
+pip install "cosmos-guardrail==0.3.1"
+
+sglang serve \
+  --model-path nvidia/Cosmos3-Super-Text2Image \
+  --num-gpus 4
+```
+
+Example text-to-image request:
+
+```bash
+curl -sS -X POST http://localhost:30000/v1/images/generations \
+  -H "Content-Type: application/json" \
+  -d '{
+    "prompt": "A warehouse robot folds a blue cloth on a clean workbench.",
+    "size": "1280x720",
+    "n": 1,
+    "num_inference_steps": 35,
+    "guidance_scale": 6.0,
+    "flow_shift": 10.0,
+    "seed": 0,
+    "extra_args": {
+      "use_resolution_template": false,
+      "guardrails": true
+    }
+  }'
+```
+
 ### Diffusers
 
 Cosmos3 is fully supported within the popular HuggingFace Diffusers package. This integration makes it a supported inference backend, allowing developers to easily incorporate Cosmos3's capabilities - such as text-to-image generation - into their pipelines using the Cosmos3OmniPipeline class, as demonstrated by the provided code examples (see examples for other modalities on the HuggingFace Cosmos3 page).
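
The curl request added in the SGLang section could also be issued from Python. The following is a minimal sketch using only the standard library. The request body mirrors the curl example field for field. The helper names `build_payload` and `generate_image` are hypothetical. The response handling assumes the server returns an OpenAI-style body with the image in `data[0].b64_json`, which the README change does not confirm, so the actual response should be checked before relying on it.

```python
import base64
import json
import urllib.request

# Endpoint taken from the curl example; adjust host and port to your server.
SGLANG_URL = "http://localhost:30000/v1/images/generations"


def build_payload(prompt, size="1280x720", seed=0):
    """Return the same JSON body as the curl example, with a custom prompt."""
    return {
        "prompt": prompt,
        "size": size,
        "n": 1,
        "num_inference_steps": 35,
        "guidance_scale": 6.0,
        "flow_shift": 10.0,
        "seed": seed,
        "extra_args": {
            "use_resolution_template": False,
            "guardrails": True,
        },
    }


def generate_image(prompt, out_path, url=SGLANG_URL, timeout=600):
    """POST the request and write the decoded image bytes to out_path."""
    body = json.dumps(build_payload(prompt)).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        result = json.load(response)

    # Assumption: OpenAI-style response shape, image under data[0].b64_json.
    item = result["data"][0]
    if not item.get("b64_json"):
        raise RuntimeError("Response has no b64_json field; inspect the raw result.")
    with open(out_path, "wb") as image_file:
        image_file.write(base64.b64decode(item["b64_json"]))
    return out_path
```

With the server from the SGLang section running, calling `generate_image("A warehouse robot folds a blue cloth on a clean workbench.", "cosmos3_t2i.png")` would send the same request as the curl command. The long default timeout is a guess, since generation time depends on hardware and step count.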