KU-DFI/TelecomGPT-R1

wbhVince829 committed 17 days ago

Commit 05e5499

1 Parent(s): f4c4ef8

update hardware and mention the 9B and 0.1B coming soon.

Files changed (1)

README.md CHANGED

@@ -96,9 +96,11 @@ vllm serve KU-DFI/TelecomGPT-R1 \
 (Scale `--tensor-parallel-size`, `--max-model-len`, and `--gpu-memory-utilization` up as needed for multi-GPU nodes or higher-throughput serving.)
-**Hardware**: TelecomGPT-R1 (27B, bf16) fits on a single H100 80GB or MI300X with the default settings above; multi-GPU nodes allow longer contexts and larger batches behind an operator firewall.
+**Hardware**: Following the official [Qwen3.5-27B](https://huggingface.co/Qwen/Qwen3.5-27B) deployment guidance, TelecomGPT-R1 (27B, bf16) runs on a single **A100 80GB** (or equivalent **H100 80GB** / **MI300X**) with the default settings above. Multi-GPU nodes allow longer contexts and larger batches behind an operator firewall.
+**Smaller TelecomGPT-R1 variants — coming soon.** A **9B** checkpoint trained with the same recipe is ready for release, and a **0.1B** edge-scale variant is in preparation. Together the family covers the full telecom-deployment spectrum: from data-center GPU at 27B, to single-card workstation / consumer 24–48 GB GPU at 9B, all the way down to edge / device-side inference at 0.1B.
 ---
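
Note on the hardware sizing in this change: the single-card claims can be sanity-checked with rough weight-memory arithmetic. The sketch below is illustrative only. It assumes the nominal parameter counts implied by the variant names (27B, 9B, 0.1B) and 2 bytes per parameter for bf16, and it counts weights alone; KV cache, activations, and runtime overhead come on top and depend on `--max-model-len` and batch size. The helper name `weight_memory_gb` is hypothetical and not part of the repository.

```python
BYTES_PER_PARAM_BF16 = 2  # bfloat16 stores each weight in 16 bits


def weight_memory_gb(num_params: float,
                     bytes_per_param: int = BYTES_PER_PARAM_BF16) -> float:
    """Approximate memory for the model weights alone, in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9


# Nominal parameter counts inferred from the variant names (assumption).
variants = {"27B": 27e9, "9B": 9e9, "0.1B": 0.1e9}

for name, params in variants.items():
    # Weights only: KV cache and activation memory are not included.
    print(f"{name}: ~{weight_memory_gb(params):.1f} GB of weights in bf16")
```

Under these assumptions the 27B weights take roughly 54 GB, which leaves headroom on an 80 GB card for KV cache, and the 9B weights take roughly 18 GB, consistent with the 24–48 GB range named in the README text.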