nvidia/Cosmos3-Super-Text2Image

image-generation

Add SGLang serving instructions

#11

by MickJ - opened about 12 hours ago

base: refs/heads/main ← from: refs/pr/11

README.md +37 -0

@@ -9,6 +9,8 @@ tags:
   - cosmos
   - cosmos3
   - vllm-omni
   - diffusers
   - text-to-image
   - image-generation
@@ -381,6 +383,41 @@ print("Saved image to /tmp/cosmos3_t2i.png")
 ![example_image](assets/example_image.png)
 ### Diffusers
 Cosmos3 is fully supported within the popular HuggingFace Diffusers package. This integration makes it a supported inference backend, allowing developers to easily incorporate Cosmos3's capabilities - such as text-to-image generation - into their pipelines using the Cosmos3OmniPipeline class, as demonstrated by the provided code examples (see examples for other modalities on the HuggingFace Cosmos3 page).

   - cosmos
   - cosmos3
   - vllm-omni
+  - sglang
+  - sglang-diffusion
   - diffusers
   - text-to-image
   - image-generation
 ![example_image](assets/example_image.png)
+### SGLang
+SGLang-Diffusion can serve `nvidia/Cosmos3-Super-Text2Image` through the OpenAI-compatible image generation endpoint. Install SGLang from source with diffusion dependencies, then start the server:
+```bash
+git clone https://github.com/sgl-project/sglang.git
+cd sglang
+pip install -e "python[diffusion]"
+pip install "cosmos-guardrail==0.3.1"
+sglang serve \
+  --model-path nvidia/Cosmos3-Super-Text2Image \
+  --num-gpus 4
+```
+Example text-to-image request:
+```bash
+curl -sS -X POST http://localhost:30000/v1/images/generations \
+  -H "Content-Type: application/json" \
+  -d '{
+    "prompt": "A warehouse robot folds a blue cloth on a clean workbench.",
+    "size": "1280x720",
+    "n": 1,
+    "num_inference_steps": 35,
+    "guidance_scale": 6.0,
+    "flow_shift": 10.0,
+    "seed": 0,
+    "extra_args": {
+      "use_resolution_template": false,
+      "guardrails": true
+    }
+  }'
+```
 ### Diffusers
 Cosmos3 is fully supported within the popular HuggingFace Diffusers package. This integration makes it a supported inference backend, allowing developers to easily incorporate Cosmos3's capabilities - such as text-to-image generation - into their pipelines using the Cosmos3OmniPipeline class, as demonstrated by the provided code examples (see examples for other modalities on the HuggingFace Cosmos3 page).
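
The curl request in the added SGLang section could also be sent from Python. The following is a minimal sketch using only the standard library, not part of the PR. It reuses the URL and request body from the curl example. The helper names (`build_payload`, `decode_first_image`, `generate`) are hypothetical, and the response shape is an assumption: the code expects the OpenAI-style `{"data": [{"b64_json": ...}]}` layout, which the PR does not confirm. If the server returns a `url` field instead, the decoding step would need to change.

```python
import base64
import json
import urllib.request

# Address taken from the curl example in the SGLang section.
SERVER_URL = "http://localhost:30000/v1/images/generations"


def build_payload(prompt: str, seed: int = 0) -> dict:
    """Build the same JSON body as the curl example (hypothetical helper)."""
    return {
        "prompt": prompt,
        "size": "1280x720",
        "n": 1,
        "num_inference_steps": 35,
        "guidance_scale": 6.0,
        "flow_shift": 10.0,
        "seed": seed,
        "extra_args": {"use_resolution_template": False, "guardrails": True},
    }


def decode_first_image(response: dict) -> bytes:
    """Decode the first image, ASSUMING an OpenAI-style b64_json response."""
    return base64.b64decode(response["data"][0]["b64_json"])


def generate(prompt: str, out_path: str, url: str = SERVER_URL) -> None:
    """POST the request and write the decoded image bytes to out_path."""
    body = json.dumps(build_payload(prompt)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as resp:
        payload = json.load(resp)
    with open(out_path, "wb") as f:
        f.write(decode_first_image(payload))


if __name__ == "__main__":
    # Requires a running `sglang serve` instance, as started in the PR's example.
    generate(
        "A warehouse robot folds a blue cloth on a clean workbench.",
        "/tmp/cosmos3_t2i_sglang.png",  # hypothetical output path
    )
```

Keeping payload construction and response decoding in separate functions makes it possible to check both without a server running.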