wbhVince829 committed
Commit 05e5499 · 1 Parent(s): f4c4ef8

update hardware and mention the 9B and 0.1B coming soon.

Files changed (1)
  1. README.md +3 -1
README.md CHANGED
@@ -96,9 +96,11 @@ vllm serve KU-DFI/TelecomGPT-R1 \
 
 (Scale `--tensor-parallel-size`, `--max-model-len`, and `--gpu-memory-utilization` up as needed for multi-GPU nodes or higher-throughput serving.)
 
-**Hardware**: TelecomGPT-R1 (27B, bf16) fits on a single H100 80GB or MI300X with the default settings above; multi-GPU nodes allow longer contexts and larger batches behind an operator firewall.
+**Hardware**: Following the official [Qwen3.5-27B](https://huggingface.co/Qwen/Qwen3.5-27B) deployment guidance, TelecomGPT-R1 (27B, bf16) runs on a single **A100 80GB** (or equivalent **H100 80GB** / **MI300X**) with the default settings above. Multi-GPU nodes allow longer contexts and larger batches behind an operator firewall.
 
 
+**Smaller TelecomGPT-R1 variants — coming soon.** A **9B** checkpoint trained with the same recipe is ready for release, and a **0.1B** edge-scale variant is in preparation. Together the family covers the full telecom-deployment spectrum: from data-center GPU at 27B, to single-card workstation / consumer 24–48 GB GPU at 9B, all the way down to edge / device-side inference at 0.1B.
+
 ---
 
 
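The hardware tiers named in the changed lines can be sanity-checked with weight-only arithmetic. The sketch below is not from the README. It assumes the nominal parameter counts (27B, 9B, 0.1B) are exact and that all three variants are stored in bf16 (the README states bf16 only for the 27B model). The helper name `weight_gib` is hypothetical. The estimate ignores the KV cache, activations, and runtime overhead, so it is a lower bound on GPU memory, not a sizing guide.

```python
# Weight-only memory estimate for the TelecomGPT-R1 family.
# Assumptions (not from the README): nominal parameter counts are exact,
# and every variant is stored in bf16.
BYTES_PER_PARAM_BF16 = 2  # bf16 uses 16 bits per parameter


def weight_gib(params_billion: float, bytes_per_param: int = BYTES_PER_PARAM_BF16) -> float:
    """Return the approximate size of the model weights in GiB."""
    return params_billion * 1e9 * bytes_per_param / 2**30


if __name__ == "__main__":
    # Sizes taken from the commit text; KV cache and overhead are excluded.
    for name, size in [("27B", 27.0), ("9B", 9.0), ("0.1B", 0.1)]:
        print(f"{name}: ~{weight_gib(size):.1f} GiB of bf16 weights")
```

Under these assumptions the 27B weights take roughly 50 GiB, which is consistent with the single 80GB card named in the diff and leaves the remainder for the KV cache that `--max-model-len` and `--gpu-memory-utilization` govern.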