Commit 4d4a2b4 · Ill-Ness committed · verified · 1 Parent(s): 6473ed5

Update README.md

Files changed (1)
  1. README.md +150 -0
README.md CHANGED
@@ -1,3 +1,153 @@
  ---
+ library_name: transformers
  license: apache-2.0
+ license_link: LICENSE
+ pipeline_tag: text-generation
+ base_model:
+ - Qwen/Qwen3.5-2B
+ tags:
+ - verus
+ - coding
+ - reasoning
+ - r1
+ language:
+ - en
  ---
+
+ # Verus-r1
+
+ [![License: Apache 2.0](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](LICENSE)
+ [![Model Size](https://img.shields.io/badge/Parameters-2B-brightgreen)]()
+ [![Context](https://img.shields.io/badge/Context-262K%20tokens-orange)]()
+ [![HF Transformers](https://img.shields.io/badge/Transformers-%E2%89%A54.52-red)](https://github.com/huggingface/transformers)
+
+ > [!Note]
+ > This repository contains model weights and configuration files for **Verus-r1** in the Hugging Face Transformers format.
+ >
+ > Compatible with Hugging Face Transformers, vLLM, SGLang, and other major inference frameworks.
+ >
+ > Built for **coding**, **reasoning**, **debugging**, and concise general assistance.
+
+ ## Verus-r1 Highlights
+
+ - **Coding-Focused**: Writes, fixes, explains, and reviews code.
+ - **Reasoning-Oriented**: Works through multi-step problems step by step.
+ - **Long Context**: Accepts up to 262,144 tokens, enough for large prompts, whole files, and long conversations.
+ - **Instruction Following**: Responds in the format and style requested.
+ - **Efficient**: A compact 2B model for local or hosted inference.
+
+ ## Model Overview
+
+ | Property | Value |
+ |---|---|
+ | Parameters | ~2B |
+ | Context Length | **262,144 tokens** |
+ | Architecture | Qwen3.5 |
+ | Chat Format | ChatML (`<\|im_start\|>` / `<\|im_end\|>`) |
+ | Dtype | bfloat16 |
+ | License | Apache 2.0 |
+
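To make the ChatML row in the table concrete, here is a minimal sketch of how a message list maps onto ChatML text. `render_chatml` is a hypothetical helper written for illustration and is not part of Transformers. In practice `tokenizer.apply_chat_template` is the source of truth, and the model's actual template may add details (for example reasoning tags or a default system prompt) that this sketch omits.

```python
def render_chatml(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts as ChatML text (illustrative only)."""
    parts = []
    for message in messages:
        # Each turn is wrapped as <|im_start|>{role}\n{content}<|im_end|>\n
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # An open assistant turn tells the model it should respond next
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


if __name__ == "__main__":
    demo = [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Reverse a string in Python."},
    ]
    print(render_chatml(demo))
```

Comparing this output with `tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)` is a quick way to see what the real template adds.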
+ ## Quickstart
+
+ ### Installation
+
+ ```bash
+ pip install "transformers>=4.52.0" accelerate torch
+ ```
+
+ ### Code Generation
+
+ ```python
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+ import torch
+
+ MODEL_ID = "8F-ai/Verus-r1"
+
+ tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
+ model = AutoModelForCausalLM.from_pretrained(
+     MODEL_ID,
+     torch_dtype=torch.bfloat16,
+     device_map="auto",
+ )
+ model.eval()
+
+ messages = [
+     {
+         "role": "system",
+         "content": "You are Verus-r1, a reasoning coding assistant made by 8F-ai. You think through problems carefully before responding."
+     },
+     {
+         "role": "user",
+         "content": "Write a Python async context manager that manages a PostgreSQL connection pool using asyncpg."
+     }
+ ]
+
+ # Render the conversation with the model's chat template, then tokenize it
+ text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
+ inputs = tokenizer(text, return_tensors="pt").to(model.device)
+
+ with torch.inference_mode():
+     # do_sample=True is set explicitly so that temperature and top_p take effect
+     generated_ids = model.generate(
+         **inputs, max_new_tokens=2048, do_sample=True, temperature=0.6, top_p=0.95
+     )
+
+ # Decode only the newly generated tokens, skipping the prompt
+ output = tokenizer.decode(generated_ids[0][len(inputs.input_ids[0]):], skip_special_tokens=True)
+ print(output)
+ ```
+
+ ### Quantized Inference (4-bit NF4, ~2 GB VRAM)
+
+ ```python
+ # Requires the bitsandbytes package: pip install bitsandbytes
+ from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
+ import torch
+
+ quantization_config = BitsAndBytesConfig(
+     load_in_4bit=True,
+     bnb_4bit_compute_dtype=torch.bfloat16,
+     bnb_4bit_use_double_quant=True,
+     bnb_4bit_quant_type="nf4",
+ )
+
+ tokenizer = AutoTokenizer.from_pretrained("8F-ai/Verus-r1")
+ model = AutoModelForCausalLM.from_pretrained(
+     "8F-ai/Verus-r1",
+     quantization_config=quantization_config,
+     device_map="auto",
+ )
+ ```
+
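The ~2 GB figure in the heading can be sanity-checked with back-of-envelope arithmetic. The sketch below estimates weight memory only, assuming exactly 2 billion parameters (the table says ~2B) and that every weight is stored at the stated width. It ignores quantization constants, layers that stay unquantized, activations, and the KV cache, all of which add to the real footprint. `weight_memory_gib` is a hypothetical helper, not a library function.

```python
def weight_memory_gib(num_params, bits_per_param):
    """Approximate memory for the weights alone, in GiB."""
    return num_params * bits_per_param / 8 / 2**30


NUM_PARAMS = 2_000_000_000  # assumption: the model card only says "~2B"

bf16_gib = weight_memory_gib(NUM_PARAMS, 16)
nf4_gib = weight_memory_gib(NUM_PARAMS, 4)

print(f"bf16 weights:  {bf16_gib:.2f} GiB")  # 3.73 GiB
print(f"4-bit weights: {nf4_gib:.2f} GiB")   # 0.93 GiB
```

Under these assumptions the 4-bit weights come to just under 1 GiB, so the ~2 GB heading plausibly leaves headroom for the overheads listed above. Actual usage depends on prompt length and batch size.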
+ ## Intended Use Cases
+
+ | Use Case | Example |
+ |---|---|
+ | **Code Generation** | Write functions, classes, and scripts |
+ | **Debugging** | Fix bugs from code or error messages |
+ | **Code Review** | Explain code and suggest improvements |
+ | **Reasoning** | Break down multi-step problems |
+ | **Long Context** | Work with long prompts and files |
+ | **General Q&A** | Answer clearly and concisely |
+
+ ## Limitations
+
+ - **English-Primary**: Fine-tuning was conducted predominantly on English-language code and documentation.
+
+ ## Citation
+
+ ```bibtex
+ @misc{verusr12026,
+   title        = {Verus-r1: A Reasoning-Focused Coding Language Model with 262K Context},
+   author       = {8F-ai},
+   year         = {2026},
+   howpublished = {\url{https://huggingface.co/8F-ai/Verus-r1}},
+   note         = {Apache 2.0 License}
+ }
+ ```
+
+ ## License
+
+ Verus-r1 is released under the **Apache License 2.0**. See [LICENSE](LICENSE) for full terms.
+
+ Derived from [Qwen/Qwen3.5-2B](https://huggingface.co/Qwen/Qwen3.5-2B) (Apache 2.0).
+
+ ---
+
+ <div align="center">
+   <sub>Built by the 8F-ai Team</sub>
+ </div>