3amthoughts committed
Commit f8a6b8a · verified · 1 parent: ada37ef

Update README.md

Files changed (1): README.md (+104 -10)

README.md CHANGED
@@ -1,20 +1,114 @@
  ---
  tags:
  - gguf
- - llama.cpp
  - unsloth

  ---

- # DeepLink-R1-GGUF : GGUF

- This model was finetuned and converted to GGUF format using [Unsloth](https://github.com/unslothai/unsloth).

- **Example usage**:
- - For text only LLMs: `llama-cli -hf 3amthoughts/DeepLink-R1-GGUF --jinja`
- - For multimodal models: `llama-mtmd-cli -hf 3amthoughts/DeepLink-R1-GGUF --jinja`

- ## Available Model files:
- - `deepseek-r1-distill-qwen-7b.Q4_K_M.gguf`
- This was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth)
- [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
---
base_model: deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
library_name: transformers
tags:
- reasoning
- chain-of-thought
- deepseek
- qwen
- gguf
- bnb
- 4bit
- unsloth
- chatml
- agent
- code
- thinking
- distilled
license: apache-2.0
---

# 🌌 DeepLink-R1

**DeepLink-R1** is a highly specialized, reasoning-focused large language model designed to act as a **"Logical Architect."** Built on top of the **`deepseek-ai/DeepSeek-R1-Distill-Qwen-7B`** architecture, this model doesn't just provide answers: it constructs transparent, mathematically rigorous blueprints of thought.

It is trained to "think" before it speaks using `<think>` tags, exposing its internal logical deduction process before delivering a final, refined response.

Created by **3amthoughts**.

### ⚡ Model Highlights
* **Architecture:** 7B parameters (base: `deepseek-ai/DeepSeek-R1-Distill-Qwen-7B`)
* **Format:** Available in BF16/FP16 (Transformers) and GGUF (Q4_K_M, for local execution via llama.cpp/Ollama)
* **Capabilities:** Deep logical reasoning, mathematical rigor, complex problem-solving, and transparent thought generation
* **Training:** Fine-tuned using QLoRA via Unsloth on a high-fidelity mixture of reasoning datasets

### 🎯 Key Capabilities
* **Transparent Reasoning:** Trained to output its internal thought process inside `<think>` tags, reducing hallucinations and unexplained logical leaps.
* **High-Tier Logic:** Excels at math, coding, and multi-step deduction by breaking complex problems down into manageable architectural blueprints.
* **Lightweight & Fast:** Distilled into a 7B-parameter footprint, making strong reasoning runnable on consumer hardware (MacBooks, RTX 3090/4090s).
 
---

### 🛠️ How it Works (Prompting & Output)
To get the model to trigger its reasoning engine, use the standard ChatML template and prompt it to think step by step.

**System Prompt:**
> "You are a logical architect. Think step-by-step."

**Example Interaction**

**User:**
> "If I have a 5-liter jug and a 3-liter jug, how can I measure exactly 4 liters of water?"

**DeepLink-R1 Output:**
```text
<think>
To measure exactly 4 liters using a 5-liter jug (J5) and a 3-liter jug (J3):
1. Fill J5 to the top (J5 = 5L, J3 = 0L).
2. Pour water from J5 into J3 until J3 is full. This takes 3 liters. (J5 = 2L, J3 = 3L).
3. Empty J3 completely. (J5 = 2L, J3 = 0L).
4. Pour the remaining 2 liters from J5 into J3. (J5 = 0L, J3 = 2L).
5. Fill J5 to the top again. (J5 = 5L, J3 = 2L).
6. Pour water from J5 into J3 until J3 is full. Since J3 already has 2 liters, it only takes 1 liter to fill it. (J5 = 4L, J3 = 3L).
Now, J5 contains exactly 4 liters.
</think>
To measure exactly 4 liters, follow these steps:
1. Fill the 5-liter jug completely.
2. Pour water from the 5-liter jug into the 3-liter jug until it is full, leaving exactly 2 liters in the 5-liter jug.
3. Empty the 3-liter jug.
4. Pour the 2 liters from the 5-liter jug into the empty 3-liter jug.
5. Refill the 5-liter jug completely.
6. Carefully pour water from the 5-liter jug into the 3-liter jug until the 3-liter jug is full. Since it already contains 2 liters, it will only take 1 liter to fill.
You now have exactly 4 liters of water remaining in the 5-liter jug.
```
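For downstream use, the `<think>` block and the final answer in completions like the one above can be separated with a small helper. This is an illustrative sketch; the function name and regex are assumptions, not part of any official tooling:

```python
import re

def split_reasoning(text: str) -> tuple[str, str]:
    """Split a completion into (reasoning, final_answer).

    Assumes at most one <think>...</think> block; if the model
    emitted none, the reasoning part is returned empty.
    """
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if not match:
        return "", text.strip()
    return match.group(1).strip(), text[match.end():].strip()

thought, answer = split_reasoning("<think>2 + 2 = 4</think>The answer is 4.")
print(answer)  # The answer is 4.
```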

### 💻 Prompt Format (ChatML)
DeepLink-R1 strictly uses the ChatML prompt format.
```text
<|im_start|>system
You are a logical architect. Think step-by-step.<|im_end|>
<|im_start|>user
How many 'r's are in the word strawberry?<|im_end|>
<|im_start|>assistant
<think>
...
</think>
...<|im_end|>
```
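For runtimes that take a raw prompt string instead of structured chat messages, the template above can be rendered by hand. A minimal sketch (the helper below is illustrative, not part of the model's tooling):

```python
def build_chatml_prompt(system: str, user: str) -> str:
    """Render a single-turn ChatML prompt, ending at the assistant
    turn so the model generates the <think> block and answer."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )

prompt = build_chatml_prompt(
    "You are a logical architect. Think step-by-step.",
    "How many 'r's are in the word strawberry?",
)
print(prompt)
```

With `transformers`, `tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)` produces the same string from the tokenizer's bundled template.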

### 🚀 Usage
**Using `transformers` (Python):**
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "3amthoughts/DeepLink-R1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a logical architect. Think step-by-step."},
    {"role": "user", "content": "How many 'r's are in the word strawberry?"},
]

inputs = tokenizer.apply_chat_template(
    messages, return_tensors="pt", add_generation_prompt=True
).to(model.device)
# temperature only takes effect when sampling is enabled
outputs = model.generate(inputs, max_new_tokens=1024, do_sample=True, temperature=0.6)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```