3amthoughts/DeepLink-R1

@@ -1,50 +1,72 @@
----
----
-license: apache-2.0
-base_model: unsloth/DeepSeek-R1-Distill-Qwen-7B-unsloth-bnb-4bit
-tags:
-- reasoning
-- chain-of-thought
-- cot
-- qwen2.5
-- unsloth
-- logic
-- mathematical-reasoning
----
-# 🌌 3amthoughts/DeepLink-R1: The Pinnacle of Logical Architecture
-**DeepLink-R1** represents a quantum leap in distilled reasoning capabilities. Built upon the formidable Qwen2.5 7B framework and infused with the sophisticated logic of DeepSeek-R1 via advanced LoRA fine-tuning, this model is engineered for those who demand absolute structural integrity in every response.
-## 🧠 The Persona: The Master Logical Architect
-DeepLink-R1 does not merely process data; it architects truth. It is designed to be the ultimate intellectual companion for complex problem-solving.
-### **Core Directives:**
-*   **Unrivaled Analytical Depth**: Every query is met with an exhaustive breakdown of its constituent parts.
-*   **Total Transparency**: The `<think>` process is not just a feature; it is a testament to the model's rigorous internal verification.
-*   **Mathematical Supremacy**: Built to excel where others falter—in the realms of complex calculus, discrete mathematics, and algorithmic theory.
-*   **Architectural Precision**: Responses are structured with the elegance of a blueprint, ensuring no logical stone is left unturned.
-## 🚀 Elite Features
-- **Next-Gen Reasoning**: Distilled from the world's most capable reasoning models.
-- **Optimized context**: Full 4096-token context window for handling massive multi-step problems.
-- **Unsloth Powered**: Training and inference optimized for maximum speed and efficiency.
-- **Perfected Format**: Native ChatML support for seamless integration into modern AI workflows.
-## 🛠️ Deployment
-```python
-from unsloth import FastLanguageModel
-import torch
-model, tokenizer = FastLanguageModel.from_pretrained(
-    model_name = "3amthoughts/DeepLink-R1",
-    max_seq_length = 4096,
-    load_in_4bit = True,
-)
-FastLanguageModel.for_inference(model)
-# Experience the future of thought
----
-*Developed with precision by 3amthoughts.*

+# 🌌 DeepLink-R1
+<div align="center">
+<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/ai2.png" width="200" alt="DeepLink-R1 Logo Concept"/>
+</div>
+DeepLink-R1 is a reasoning-focused large language model built on the Qwen2.5-7B architecture and distilled from DeepSeek-R1. It adopts the persona of a "Logical Architect": rather than giving only a final answer, it first lays out its reasoning step by step.
+The model writes that reasoning inside `<think>` tags before delivering its final response.
+## 🔗 Quick Links
+- **Primary Model (BF16/FP16)**: 3amthoughts/DeepLink-R1
+- **Quantized Model (GGUF)**: 3amthoughts/DeepLink-R1-GGUF
+## 🧠 The "Logical Architect" Persona
+DeepLink-R1 is designed for complex problem-solving, coding, and mathematical reasoning. When prompted, the model will output a structured thought process enclosed in `<think> ... </think>` tags, allowing users to follow the logical steps taken to arrive at the conclusion.
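Because the reasoning and the final answer arrive in one string, downstream code usually needs to separate them. The following is a minimal sketch, assuming the completion contains a single well-formed `<think> ... </think>` block; the helper name `split_reasoning` and the sample string are illustrative and not part of the model's tooling.

```python
import re

# Matches the first <think> ... </think> block, including newlines inside it.
THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(completion: str) -> tuple[str, str]:
    """Return (reasoning, answer); reasoning is "" when no block is found."""
    match = THINK_BLOCK.search(completion)
    if match is None:
        return "", completion.strip()
    reasoning = match.group(1).strip()
    # Everything after the closing tag is treated as the final answer.
    answer = completion[match.end():].strip()
    return reasoning, answer


# Hypothetical completion, written by hand for illustration.
sample = (
    "<think>\ns-t-r-a-w-b-e-r-r-y: 'r' appears at positions 3, 8 and 9.\n</think>\n"
    "There are 3 'r's."
)
reasoning, answer = split_reasoning(sample)
print(answer)
```

If the prompt itself already ends with an opening `<think>` tag, the generated text may contain only the closing tag, and the pattern would need to be adjusted for that case.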
+## 💻 Usage & Inference
+DeepLink-R1 uses the ChatML prompt format.
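For readers unfamiliar with ChatML, the sketch below shows how a message list maps onto the `<|im_start|>` / `<|im_end|>` markers. It is a hand-written illustration: the function name `to_chatml` is hypothetical, and the tokenizer's own chat template remains the authoritative source for the exact string the model expects.

```python
def to_chatml(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts as a ChatML prompt string."""
    parts = []
    for message in messages:
        # Each turn is wrapped as <|im_start|>ROLE\nCONTENT<|im_end|>\n
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Leave an open assistant turn for the model to complete.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


prompt = to_chatml(
    [
        {"role": "system", "content": "You are a logical architect."},
        {"role": "user", "content": "Solve this math problem..."},
    ]
)
print(prompt)
```

In practice `tokenizer.apply_chat_template` should be preferred, since a model's template may add details (such as a default system prompt) that this sketch omits.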
+### Option 1: Using transformers (Python)
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer
+import torch
+
+model_id = "3amthoughts/DeepLink-R1"
+tokenizer = AutoTokenizer.from_pretrained(model_id)
+model = AutoModelForCausalLM.from_pretrained(
+    model_id,
+    torch_dtype=torch.bfloat16,
+    device_map="auto",
+)
+
+messages = [
+    {"role": "system", "content": "You are a logical architect. Think step-by-step."},
+    {"role": "user", "content": "How many 'r's are in the word strawberry?"},
+]
+# Render the conversation with the tokenizer's chat template (ChatML).
+text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
+# Send inputs to wherever device_map placed the model instead of hard-coding "cuda".
+inputs = tokenizer(text, return_tensors="pt").to(model.device)
+outputs = model.generate(**inputs, max_new_tokens=1024)
+# The decoded text contains the prompt followed by the generated reply.
+print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+```
+### Option 2: Using llama.cpp or Ollama (GGUF)
+For local, CPU-friendly, or low-VRAM inference, use the GGUF version.
+```bash
+# Example using llama.cpp.
+# Replace the model path with the .gguf file downloaded from 3amthoughts/DeepLink-R1-GGUF.
+# -e tells llama.cpp to interpret the \n escape sequences in the prompt.
+./main -m path/to/model.gguf -e -n 1024 -p "<|im_start|>system\nYou are a logical architect.<|im_end|>\n<|im_start|>user\nSolve this math problem...<|im_end|>\n<|im_start|>assistant\n<think>\n"
+```
+## 🏗️ Training Methodology: The Forge
+DeepLink-R1 was fine-tuned with Unsloth, which advertises roughly 2x faster and more memory-efficient fine-tuning. This allowed training to fit on a single NVIDIA Tesla T4 GPU (16GB VRAM).
+### Hardware & Framework Optimizations
+- **Framework**: Unsloth & Hugging Face trl
+- **Hardware**: 1x NVIDIA Tesla T4 (16GB)
+- **Memory Management**:
+  - Loaded in 4-bit quantization via bitsandbytes.
+  - Enabled Unsloth's optimized gradient checkpointing.
+  - Dynamic max sequence length (2048 to 4096 tokens) to maintain stability during specific training phases.
+### LoRA Configuration
+We used Low-Rank Adaptation (LoRA), which trains small low-rank adapter matrices instead of updating the full model weights:
+- **Target Modules**: All linear layers (q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj)
+- **Rank (r)**: 16
+- **Alpha**: 16
+- **Dropout**: 0
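To give a sense of scale for the configuration above, the sketch below estimates how many adapter parameters a rank-16 LoRA adds across the seven target modules. The layer dimensions are assumptions taken from the publicly documented Qwen2.5-7B configuration (hidden size 3584, MLP size 18944, 28 layers, 4 key/value heads of dimension 128); they are not stated in this model card, so treat the result as an estimate.

```python
# Assumed Qwen2.5-7B dimensions (not stated in this model card).
HIDDEN = 3584
INTERMEDIATE = 18944
NUM_LAYERS = 28
KV_DIM = 4 * 128  # grouped-query attention: 4 KV heads x head dim 128

# Values from the LoRA configuration listed above.
RANK = 16
ALPHA = 16

# (input features, output features) of each targeted linear layer.
SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """LoRA adds A (rank x d_in) and B (d_out x rank) per adapted layer."""
    return rank * (d_in + d_out)


per_layer = sum(lora_params(d_in, d_out, RANK) for d_in, d_out in SHAPES.values())
total = per_layer * NUM_LAYERS
scaling = ALPHA / RANK  # standard LoRA scaling factor applied to the update

print(f"trainable adapter parameters: {total:,}")
print(f"LoRA scaling (alpha / r): {scaling}")
```

Under these assumed dimensions the adapters come to about 40 million parameters, well under 1% of a 7B-parameter model, and alpha equal to the rank gives a scaling factor of 1.0.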
+### Hyperparameters
+- **Optimizer**: AdamW 8-bit
+- **Learning Rate**: 2e-4
+- **Global Batch Size**: 8 (1 per device × 8 gradient accumulation steps)
+- **Training Steps**: 350
+
+Note: Training was interrupted by a runtime restart and resumed from a previously uploaded adapter state, so no progress was lost.
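The batch-size and step figures above imply a fairly small training budget, which the following arithmetic sketch makes explicit. The token figures are upper bounds only: they assume every sequence is padded or packed to the maximum length, which the model card does not state.

```python
# Values from the hyperparameter list above.
PER_DEVICE_BATCH = 1
GRAD_ACCUM_STEPS = 8
NUM_DEVICES = 1
TRAINING_STEPS = 350

# One optimizer step consumes this many sequences.
global_batch = PER_DEVICE_BATCH * GRAD_ACCUM_STEPS * NUM_DEVICES
sequences_seen = global_batch * TRAINING_STEPS

# Upper bounds on tokens processed, assuming every sequence is full length.
max_tokens_at_2048 = sequences_seen * 2048
max_tokens_at_4096 = sequences_seen * 4096

print(f"global batch size: {global_batch}")
print(f"sequences seen: {sequences_seen}")
print(f"token upper bound: {max_tokens_at_2048:,} to {max_tokens_at_4096:,}")
```

So the run processed 2,800 sequences in total, or at most roughly 5.7 to 11.5 million tokens depending on the sequence length in effect.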
+## 📚 Dataset Engineering: The Knowledge
+To train the "Logical Architect," we streamed and combined three reasoning datasets. All data was converted to the ChatML template so that every example shares the same prompt format.
+- **ServiceNow-AI/R1-Distill-SFT**: Provided the foundational reasoning logic and structured thought generation.
+- **open-r1/Mixture-of-Thoughts**: Introduced diverse reasoning patterns and problem-solving approaches.
+- **bespokelabs/Bespoke-Stratos-17k**: Used for refinement, mathematical rigor, and complex multi-step logic.
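The model card does not say how the three streams were mixed or in what proportions. As one possible reading, the sketch below interleaves several streams round-robin without loading any of them fully into memory. The function name `round_robin` and the stand-in lists are hypothetical; with real data, the `interleave_datasets` function from the Hugging Face `datasets` library serves a similar purpose on streamed datasets.

```python
def round_robin(*streams):
    """Yield one item from each stream in turn until all are exhausted."""
    iterators = [iter(stream) for stream in streams]
    while iterators:
        still_active = []
        for iterator in iterators:
            try:
                yield next(iterator)
            except StopIteration:
                # This stream is finished; drop it from later rounds.
                continue
            still_active.append(iterator)
        iterators = still_active


# Stand-in records; real streams would yield ChatML-formatted examples.
stream_a = ["a1", "a2", "a3"]
stream_b = ["b1"]
stream_c = ["c1", "c2"]

mixed = list(round_robin(stream_a, stream_b, stream_c))
print(mixed)
```

Round-robin gives each source equal weight until it runs out; a weighted sampling scheme would be needed to favor one dataset over another.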
+## 🏆 The Result
+DeepLink-R1 shows that, with careful dataset curation, ChatML alignment, and aggressive memory optimization (Unsloth + 4-bit LoRA), a 7B-parameter model can be fine-tuned for step-by-step reasoning on a single 16GB GPU.