Update README: Canonical DeepBrainz Model Card

Browse files

Files changed (1) hide show

README.md +93 -6

README.md CHANGED Viewed

@@ -2,14 +2,101 @@
 license: apache-2.0
 language:
   - en
 tags:
-  - deepbrainz
-  - reasoning
-  - 4b
   - qwen3
 ---
 # DeepBrainz-R1-4B-16K
-**DeepBrainz-R1-4B-16K** is a 4B parameter reasoning model trained by DeepBrainz AI.
-- **Context:** 16,384
-- **Architecture:** Qwen3-4B (Hybrid Sharding Reconstruction)

 license: apache-2.0
 language:
   - en
+pipeline_tag: text-generation
 tags:
   - qwen3
+  - reasoning
+  - long-context
+  - distillation
+  - math
+  - enterprise
+  - research
+base_model: Qwen/Qwen3-4B
 ---
 # DeepBrainz-R1-4B-16K
+**DeepBrainz-R1-4B-16K** is a high-performance reasoning model in the **DeepBrainz-R series**, designed for structured problem-solving, analysis, and enterprise research workflows.
+It is distilled from the **Qwen3-32B** teacher model into a compact **4B** architecture using **Online Policy Distillation (OPD)**, emphasizing reasoning quality and instruction robustness over a **16K context window**.
+---
+## Model Highlights
+- **4B Parameters**: Optimized balance of performance and inference cost.
+- **16K Context Length**: Capable of processing medium-to-long documents and reasoning chains.
+- **Distilled Precision**: Trained via NeMo-RL OPD from a **Qwen3-32B** teacher.
+- **Architecture**: Standard Qwen3 (Dense), optimized for modern GPU inference.
+---
+## Intended Use
+- **Complex Reasoning**: Multi-step math, logic puzzles, and code analysis.
+- **Agentic Workflows**: Reliable planning and tool use within 16K context.
+- **Research**: Investigating distillation scaling laws (32B $\to$ 4B).
+- **Efficient Deployment**: Fits easily on consumer GPUs and edge servers.
+*Note: This model is optimized for reasoning tasks. For general conversational chit-chat, we recommend applying a specific instruction template.*
+---
+## Usage
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer
+import torch
+model_id = "DeepBrainz/DeepBrainz-R1-4B-16K"
+tokenizer = AutoTokenizer.from_pretrained(model_id)
+model = AutoModelForCausalLM.from_pretrained(
+    model_id,
+    torch_dtype="bfloat16",
+    device_map="auto"
+)
+# Example: Math Reasoning
+prompt = "Solve step by step: If 3x + 7 = 22, what is x?"
+inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+outputs = model.generate(
+    **inputs,
+    max_new_tokens=512,
+    temperature=0.6,
+    top_p=0.95,
+    do_sample=True
+)
+print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+```
+---
+## Training Summary
+The model was produced using a **multi-stage optimization process** involving large-scale supervision and iterative refinement to improve reasoning quality and robustness.
+- **Teacher**: Qwen3-32B (Dense)
+- **Student**: Qwen3-4B
+- **Method**: Online Policy Distillation (OPD)
+- **Context**: 16,384 tokens
+---
+## Limitations
+Performance depends on task complexity and inference configuration. While significantly stronger than smaller models, it may still hallucinate on obscure facts compared to 30B+ models.
+---
+## License
+Apache 2.0
+---
+## About DeepBrainz
+DeepBrainz builds reasoning-first AI systems focused on efficiency, structure, and real-world problem-solving.