3amthoughts committed · Commit 1398fdd · verified · 1 parent: fdfa028

Update README.md (README.md: +67 −45)
# 🌌 DeepLink-R1

<div align="center">
  <img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/ai2.png" width="200" alt="DeepLink-R1 Logo Concept"/>
</div>

DeepLink-R1 is a reasoning-focused large language model built on the Qwen2.5-7B architecture and distilled from DeepSeek-R1. Engineered to embody the persona of a "Logical Architect," the model doesn't just provide answers: it constructs transparent, mathematically rigorous blueprints of thought.

By utilizing the `<think>` tag, DeepLink-R1 exposes its internal reasoning process before delivering its final, refined response.
## 🔗 Quick Links

- **Primary Model (BF16/FP16):** `3amthoughts/DeepLink-R1`
- **Quantized Model (GGUF):** `3amthoughts/DeepLink-R1-GGUF`

## 🧠 The "Logical Architect" Persona

DeepLink-R1 is designed for complex problem-solving, coding, and mathematical reasoning. When prompted, the model outputs a structured thought process enclosed in `<think> ... </think>` tags, allowing users to follow the logical steps taken to arrive at the conclusion.
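As an illustration, the reasoning block can be separated from the final answer with a small helper. This is a minimal sketch that assumes the single `<think> ... </think>` block convention described above; the function name is ours, not part of the model's tooling:

```python
import re

def split_reasoning(response: str) -> tuple[str, str]:
    """Split a model response into (reasoning, final_answer).

    Assumes the reasoning is wrapped in one <think> ... </think> block;
    returns an empty reasoning string if no such block is present.
    """
    match = re.search(r"<think>(.*?)</think>", response, flags=re.DOTALL)
    if match is None:
        return "", response.strip()
    reasoning = match.group(1).strip()
    final_answer = response[match.end():].strip()
    return reasoning, final_answer

demo = "<think>\nCount the r's: st-r-awbe-r-r-y -> 3.\n</think>\nThere are 3 'r's."
thought, answer = split_reasoning(demo)
```

This keeps the visible answer clean while preserving the chain of thought for inspection or logging.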
## 💻 Usage & Inference

DeepLink-R1 uses the ChatML prompt format.
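For reference, a ChatML prompt assembled by hand looks like the following. This is a sketch of the standard ChatML layout (which `tokenizer.apply_chat_template` produces automatically); the helper name is illustrative:

```python
def build_chatml_prompt(system: str, user: str) -> str:
    """Assemble a ChatML prompt by hand. The trailing <|im_start|>assistant
    line cues the model to begin its (possibly <think>-prefixed) reply."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

prompt = build_chatml_prompt("You are a logical architect.",
                             "Solve this math problem...")
```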
### Option 1: Using `transformers` (Python)

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "3amthoughts/DeepLink-R1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a logical architect. Think step-by-step."},
    {"role": "user", "content": "How many 'r's are in the word strawberry?"},
]

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

outputs = model.generate(**inputs, max_new_tokens=1024)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
### Option 2: Using `llama.cpp` or Ollama (GGUF)

For local, CPU-friendly, or low-VRAM inference, use the GGUF version.

```bash
# Example using llama.cpp
./main -m DeepLink-R1.Q4_K_M.gguf -n 1024 -p "<|im_start|>system\nYou are a logical architect.<|im_end|>\n<|im_start|>user\nSolve this math problem...<|im_end|>\n<|im_start|>assistant\n<think>\n"
```
## 🏗️ Training Methodology: The Forge

DeepLink-R1 was trained using Unsloth for 2x faster, memory-efficient fine-tuning within the constraints of a single Tesla T4 (16 GB VRAM) GPU.

### Hardware & Framework Optimizations

- **Framework:** Unsloth & Hugging Face `trl`
- **Hardware:** 1x NVIDIA Tesla T4 (16 GB)
- **Memory Management:**
  - Loaded in 4-bit quantization via `bitsandbytes`.
  - Enabled Unsloth's optimized gradient checkpointing.
  - Dynamic max sequence length (2048–4096) to maintain stability during specific training phases.
### LoRA Configuration

We utilized Low-Rank Adaptation (LoRA) to efficiently update the model's weights:

- **Target Modules:** all linear layers (`q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`)
- **Rank (r):** 16
- **Alpha:** 16
- **Dropout:** 0
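The weight update these settings describe can be sketched numerically. This is a toy NumPy illustration of a rank-16, alpha-16 LoRA delta; the 64×64 weight shape is an assumption for demonstration, not an actual Qwen2.5 layer size:

```python
import numpy as np

rng = np.random.default_rng(0)

d_out, d_in, r, alpha = 64, 64, 16, 16
W = rng.standard_normal((d_out, d_in))      # frozen base weight
A = rng.standard_normal((r, d_in)) * 0.01   # trainable low-rank factor
B = np.zeros((d_out, r))                    # zero-initialized, so the delta starts at 0

delta = (alpha / r) * (B @ A)               # LoRA update: scale * B @ A
W_adapted = W + delta

# With B = 0, the adapted weight equals the base weight at initialization.
assert np.allclose(W_adapted, W)

# After training, the delta can never exceed rank r.
B_trained = rng.standard_normal((d_out, r))
delta_trained = (alpha / r) * (B_trained @ A)
assert np.linalg.matrix_rank(delta_trained) <= r
```

Only `A` and `B` are trained, which is why a 7B model fits in 16 GB: the base weights stay frozen in 4-bit while the adapters remain small.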
### Hyperparameters

- **Optimizer:** AdamW 8-bit
- **Learning Rate:** 2e-4
- **Global Batch Size:** 8 (1 per device × 8 gradient accumulation steps)
- **Training Steps:** 350

> **Note:** When the runtime restarted mid-run, training resumed from an uploaded adapter state, so no progress was lost.
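The effective batch size arithmetic above can be checked with a small simulation. This is a pure-Python sketch (a toy scalar model, not the actual `trl` training loop): accumulating 8 gradients from micro-batches of 1 and averaging matches one full step over a batch of 8.

```python
# Toy model: scalar weight w, squared-error loss 0.5*(w*x - y)^2 per sample.
# Its gradient with respect to w is (w*x - y)*x.
def grad(w, x, y):
    return (w * x - y) * x

w = 0.5
data = [(1.0, 2.0), (2.0, 1.0), (0.5, 0.0), (1.5, 3.0),
        (2.5, 1.0), (0.2, 0.4), (1.1, 2.2), (3.0, 0.5)]

# Full-batch gradient: mean over all 8 samples at once.
full_batch_grad = sum(grad(w, x, y) for x, y in data) / len(data)

# Gradient accumulation: 8 micro-batches of size 1, summed then averaged.
accumulated = 0.0
for x, y in data:          # micro-batch size 1 (fits in T4 memory)
    accumulated += grad(w, x, y)
accumulated /= 8           # 8 gradient accumulation steps

assert abs(full_batch_grad - accumulated) < 1e-12
```

This equivalence is what lets a per-device batch of 1 behave like a global batch of 8 on a single 16 GB GPU.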
## 📚 Dataset Engineering: The Knowledge

To forge the "Logical Architect," we engineered a high-fidelity mixture by streaming and combining three reasoning datasets. All data was strictly aligned to the ChatML template to ensure seamless integration.

- **ServiceNow-AI/R1-Distill-SFT:** provided the foundational reasoning logic and structured thought generation.
- **open-r1/Mixture-of-Thoughts:** introduced highly diverse cognitive patterns and problem-solving approaches.
- **bespokelabs/Bespoke-Stratos-17k:** applied for high-tier refinement, mathematical rigor, and complex multi-step logic.
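The streaming-and-combining step can be sketched as follows. This is a hedged illustration with stand-in records, not the actual training recipe: the `messages` column layout and the round-robin 1:1:1 interleaving are assumptions for demonstration.

```python
from itertools import cycle

def interleave(*streams):
    """Round-robin over several (possibly streamed) example iterators."""
    iterators = [iter(s) for s in streams]
    for it in cycle(iterators):
        try:
            yield next(it)
        except StopIteration:
            return  # stop at the shortest stream, keeping the mix balanced

def to_chatml(example):
    """Render a list of {role, content} messages to a ChatML training string."""
    return "".join(
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
        for m in example["messages"]
    )

# Stand-ins for the three streamed datasets named above.
sft = [{"messages": [{"role": "user", "content": "2+2?"},
                     {"role": "assistant", "content": "<think>2+2=4</think>4"}]}]
mot = [{"messages": [{"role": "user", "content": "Name a prime."},
                     {"role": "assistant", "content": "<think>2 is prime</think>2"}]}]
stratos = [{"messages": [{"role": "user", "content": "Negate True."},
                         {"role": "assistant", "content": "<think>not True</think>False"}]}]

mixture = [to_chatml(ex) for ex in interleave(sft, mot, stratos)]
```

Mapping every source through one `to_chatml`-style renderer is what "strictly aligned to the ChatML template" amounts to in practice.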
## 🏆 The Result

DeepLink-R1 demonstrates efficient distillation in practice: with precise dataset curation, ChatML alignment, and aggressive memory optimization (Unsloth + 4-bit LoRA), a 7B-parameter model can achieve strong logical depth on highly accessible hardware.