---
language: en
license: apache-2.0
tags:
  - code
  - coding-agent
  - instruction-tuned
  - hermit-code
  - qwen2.5-coder
  - text-generation
  - python
  - javascript
  - rust
  - go
pipeline_tag: text-generation
base_model:
  - Qwen/Qwen2.5-Coder-7B-Instruct
---

The official coding model for the Hermit AI Agent

---

## 📑 Table of Contents

- [Quick Start](#-quick-start)
- [Capabilities](#-capabilities)
- [Model Details](#-model-details)
- [Usage](#-usage)
  - [Transformers](#transformers)
  - [vLLM (Production)](#vllm-recommended-for-production)
  - [Inference API](#inference-api)
- [Interactive Examples](#-interactive-examples)
- [Benchmarks](#-benchmarks)
- [Acknowledgments](#-acknowledgments)

---

## 🚀 Quick Start

Get up and running in **a few lines of code**:

```python
from transformers import pipeline

# Initialize the model
pipe = pipeline("text-generation", model="Soloman2002/hermit-code-7b")

# Start coding
chat = [{"role": "user", "content": "Write a Python function to reverse a linked list"}]
response = pipe(chat, max_new_tokens=512)
print(response[0]["generated_text"][-1]["content"])
```

> 💡 **Tip:** For focused, repeatable code generation, sample with `temperature=0.2` and `top_p=0.95`. For fully deterministic output, disable sampling with `do_sample=False`.

---

## 🎯 Capabilities

### ✨ What Makes Hermit Code Special?

- **🌐 Multi-Language Mastery**: Works across 6+ programming languages, including Python, JavaScript, Rust, and Go
- **📦 Project-Scale Context**: Reads large multi-file inputs within a 128K-token context window
- **🔍 Debugging Expert**: Identifies bugs, explains why they happen, and fixes them
- **🎓 Educational**: Explains complex concepts with clear, step-by-step reasoning
- **⚙️ Production-Ready**: Runs with Transformers for research and with vLLM for deployment

---

## 🧠 Model Details

| Property | Specification | Notes |
|:---|:---|:---|
| **Base Model** | [Qwen/Qwen2.5-Coder-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct) | State-of-the-art code foundation |
| **Architecture** | Qwen2.5 Dense Transformer | Optimized for code understanding |
| **Parameters** | 7.61B (6.53B non-embedding) | Efficient size-to-performance ratio |
| **Layers** | 28 | Deep representation learning |
| **Attention** | GQA: 28 query heads, 4 KV heads | Fast inference with grouped queries |
| **Context Length** | 131,072 tokens | 128K-token window; the lines of code this covers depend on tokenization |
| **License** | Apache 2.0 | Commercial use permitted |
| **Format** | Safetensors (BF16) | Safe, efficient serialization |
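
To make the attention row concrete, here is a rough back-of-the-envelope estimate of KV-cache memory using the layer count, KV-head count, and BF16 format from the table. The per-head dimension of 128 is an assumption (it is not listed in the table), so treat the numbers as a sketch rather than a measured figure.

```python
# Values taken from the Model Details table
NUM_LAYERS = 28
NUM_Q_HEADS = 28
NUM_KV_HEADS = 4
BYTES_PER_VALUE = 2        # BF16 stores each value in 2 bytes
CONTEXT_LENGTH = 131_072

# Assumption (not in the table): each attention head has dimension 128
HEAD_DIM = 128


def kv_cache_bytes(num_tokens: int, num_kv_heads: int = NUM_KV_HEADS) -> int:
    """Bytes needed to cache keys and values for one sequence of `num_tokens`."""
    # Factor of 2 accounts for storing both a key and a value per head per layer
    per_token = 2 * NUM_LAYERS * num_kv_heads * HEAD_DIM * BYTES_PER_VALUE
    return per_token * num_tokens


gqa_bytes = kv_cache_bytes(CONTEXT_LENGTH)
# Hypothetical comparison: one KV head per query head (no grouping)
mha_bytes = kv_cache_bytes(CONTEXT_LENGTH, num_kv_heads=NUM_Q_HEADS)

print(f"per token:                 {kv_cache_bytes(1) / 1024:.0f} KiB")
print(f"full context, 4 KV heads:  {gqa_bytes / 1024**3:.1f} GiB")
print(f"full context, 28 KV heads: {mha_bytes / 1024**3:.1f} GiB")
```

Under these assumptions the cache costs 56 KiB per token, about 7 GiB for one full-length sequence with 4 KV heads, and about 49 GiB if every query head had its own KV head. Sharing KV heads therefore cuts cache memory by a factor of 7 here.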

---

## 💻 Usage

### Transformers

Perfect for **prototyping** and **local development**:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load model and tokenizer
model_id = "Soloman2002/hermit-code-7b"
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Prepare chat messages
messages = [
    {"role": "system", "content": "You are Hermit Code, an expert coding assistant."},
    {"role": "user", "content": "Write a Rust function that checks if a string is a palindrome."}
]

# Apply chat template
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

# Generate
inputs = tokenizer([text], return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    temperature=0.2,
    do_sample=True
)

# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(
    outputs[0][inputs.input_ids.shape[1]:],
    skip_special_tokens=True
)
print(response)
```

---

### vLLM (Recommended for Production)

For **high-throughput** serving and API deployment:

**1. Install & Launch:**

```bash
pip install vllm
vllm serve "Soloman2002/hermit-code-7b" --dtype bfloat16
```

**2. Query via API:**

```bash
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Soloman2002/hermit-code-7b",
    "messages": [
      {"role": "system", "content": "You are Hermit Code, a coding assistant."},
      {"role": "user", "content": "Explain closures in JavaScript with examples."}
    ],
    "temperature": 0.2,
    "max_tokens": 512
  }'
```

> 🚀 **vLLM Benefits:** Continuous batching and PagedAttention, which can give throughput many times higher than plain Transformers generation under concurrent load. The actual gain depends on hardware, batch size, and sequence lengths.

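
The same request as the `curl` call above can be sent from Python. This is a minimal sketch using only the standard library. It assumes the server started by `vllm serve` is reachable at `http://localhost:8000`, and the helper names `build_chat_payload` and `chat` are illustrative, not part of vLLM.

```python
import json
import urllib.request

# Assumption: the vLLM server from the launch step is listening on this address
BASE_URL = "http://localhost:8000/v1"
MODEL_ID = "Soloman2002/hermit-code-7b"


def build_chat_payload(prompt: str, temperature: float = 0.2, max_tokens: int = 512) -> dict:
    """Build the JSON body for POST /v1/chat/completions."""
    return {
        "model": MODEL_ID,
        "messages": [
            {"role": "system", "content": "You are Hermit Code, a coding assistant."},
            {"role": "user", "content": prompt},
        ],
        "temperature": temperature,
        "max_tokens": max_tokens,
    }


def chat(prompt: str, timeout: float = 60.0) -> str:
    """Send one chat request and return the assistant's reply text."""
    request = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(build_chat_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # OpenAI-style responses carry the reply under choices[0].message.content
    return body["choices"][0]["message"]["content"]


# Example call (requires the server to be running):
# print(chat("Explain closures in JavaScript with examples."))
```

Keeping payload construction separate from the network call makes the request body easy to inspect or log before anything is sent.
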
---

### Inference API

Use **Hugging Face's hosted infrastructure** for instant access:

```python
from huggingface_hub import InferenceClient

client = InferenceClient(token="hf_YOUR_TOKEN")

response = client.chat_completion(
    model="Soloman2002/hermit-code-7b",
    messages=[
        {"role": "user", "content": "Write a Go function to merge two sorted arrays"}
    ],
    max_tokens=512,
    temperature=0.2,
    stream=False
)
print(response.choices[0].message.content)
```

---

## 🎨 Interactive Examples

Examples of Hermit Code in action, one per language:

### 🐍 Python: Quick Sort Implementation

```python
def quick_sort(arr: list[int]) -> list[int]:
    """
    Sorts an array using the quicksort algorithm.

    Time Complexity: O(n log n) average, O(n²) worst case
    Space Complexity: O(n) auxiliary, because each call builds new lists
    (recursion depth is O(log n) on average)
    """
    if len(arr) <= 1:
        return arr

    pivot = arr[len(arr) // 2]
    left = [x for x in arr if x < pivot]
    middle = [x for x in arr if x == pivot]
    right = [x for x in arr if x > pivot]

    return quick_sort(left) + middle + quick_sort(right)


# Example usage
if __name__ == "__main__":
    data = [3, 6, 8, 10, 1, 2, 1]
    print(f"Original: {data}")
    print(f"Sorted: {quick_sort(data)}")
```

**Key Features Demonstrated:**

- ✅ Type hints for better code clarity
- ✅ Docstrings with complexity analysis
- ✅ Recursive divide-and-conquer approach
- ✅ Idiomatic Python list comprehensions
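
The list-comprehension version above allocates new lists at every level of recursion, so its extra memory grows with the input size. If O(log n) auxiliary space matters, an in-place variant is one option. The following is a minimal sketch, not model output; `quick_sort_in_place` and its inner helpers are illustrative names.

```python
def quick_sort_in_place(arr: list[int]) -> None:
    """Sort `arr` in place using quicksort with a Lomuto partition."""

    def partition(lo: int, hi: int) -> int:
        # Use the middle element as pivot and park it at the end of the range
        mid = (lo + hi) // 2
        arr[mid], arr[hi] = arr[hi], arr[mid]
        pivot = arr[hi]
        store = lo
        for i in range(lo, hi):
            if arr[i] < pivot:
                arr[i], arr[store] = arr[store], arr[i]
                store += 1
        # Move the pivot to its final sorted position
        arr[store], arr[hi] = arr[hi], arr[store]
        return store

    def sort(lo: int, hi: int) -> None:
        while lo < hi:
            p = partition(lo, hi)
            # Recurse into the smaller side and loop on the larger one,
            # which bounds the recursion depth at O(log n)
            if p - lo < hi - p:
                sort(lo, p - 1)
                lo = p + 1
            else:
                sort(p + 1, hi)
                hi = p - 1

    sort(0, len(arr) - 1)


data = [3, 6, 8, 10, 1, 2, 1]
quick_sort_in_place(data)
print(data)  # [1, 1, 2, 3, 6, 8, 10]
```

The trade-off is readability: the in-place version mutates its argument and is longer, while the list-comprehension version returns a new list and is easier to follow.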

### 🦀 Rust: Palindrome Checker

```rust
/// Checks if a string is a palindrome, ignoring non-alphanumeric characters
/// and case differences.
///
/// # Examples
/// ```
/// assert!(is_palindrome("A man, a plan, a canal: Panama"));
/// assert!(!is_palindrome("Hello, World!"));
/// ```
fn is_palindrome(s: &str) -> bool {
    // Keep only letters and digits, lowercased, so comparison ignores
    // punctuation, spacing, and case
    let chars: Vec<char> = s
        .chars()
        .filter(|c| c.is_alphanumeric())
        .map(|c| c.to_ascii_lowercase())
        .collect();

    let len = chars.len();
    for i in 0..len / 2 {
        if chars[i] != chars[len - 1 - i] {
            return false;
        }
    }
    true
}

fn main() {
    let test_cases = vec![
        "racecar",
        "A man, a plan, a canal: Panama",
        "Hello, World!",
    ];

    for case in test_cases {
        println!("'{}' -> {}", case, is_palindrome(case));
    }
}
```

**Key Features Demonstrated:**

- ✅ Memory-safe string processing
- ✅ Functional iterator chains
- ✅ Comprehensive documentation
- ✅ Efficient two-pointer comparison

### 🐹 Go: Merge Sorted Arrays

```go
package main

import "fmt"

// mergeSorted combines two sorted integer slices into a single sorted slice.
// It runs in O(n + m) time where n and m are the lengths of the inputs.
func mergeSorted(a, b []int) []int {
	result := make([]int, 0, len(a)+len(b))
	i, j := 0, 0

	// Merge while both slices have elements
	for i < len(a) && j < len(b) {
		if a[i] < b[j] {
			result = append(result, a[i])
			i++
		} else {
			result = append(result, b[j])
			j++
		}
	}

	// Append remaining elements
	result = append(result, a[i:]...)
	result = append(result, b[j:]...)

	return result
}

func main() {
	a := []int{1, 3, 5, 7}
	b := []int{2, 4, 6, 8}
	merged := mergeSorted(a, b)
	fmt.Printf("Merged: %v\n", merged) // [1 2 3 4 5 6 7 8]
}
```

**Key Features Demonstrated:**

- ✅ Result slice pre-allocated to full capacity, so `append` never reallocates
- ✅ Two-pointer merge in linear time
- ✅ Pure function with no error paths to handle
- ✅ Clean, readable control flow

### ⚡ JavaScript: Closure Example

```javascript
/**
 * Creates a counter with private state using closures.
 * Demonstrates lexical scoping and data encapsulation.
 */
function createCounter(initialValue = 0) {
  let count = initialValue; // Private variable

  return {
    increment() {
      count += 1;
      return count;
    },
    decrement() {
      count -= 1;
      return count;
    },
    getValue() {
      return count;
    },
    reset() {
      count = initialValue;
      return count;
    }
  };
}

// Usage
const counter = createCounter(10);
console.log(counter.increment()); // 11
console.log(counter.increment()); // 12
console.log(counter.getValue()); // 12
console.log(counter.reset()); // 10
```

**Key Features Demonstrated:**

- ✅ True private state via closures
- ✅ Clean object interface
- ✅ ES6 method shorthand syntax
- ✅ Default parameter values

---

## 📊 Benchmarks

| Benchmark | Score | Status | Comparison to Base |
|:---|:---:|:---:|:---|
| **HumanEval (Python)** | *TBD* | 🔄 Pending | vs Qwen2.5-Coder-7B |
| **HumanEval (Multi-Lang)** | *TBD* | 🔄 Pending | vs Qwen2.5-Coder-7B |
| **MBPP (Python)** | *TBD* | 🔄 Pending | vs Qwen2.5-Coder-7B |
| **DS-1000 (Data Science)** | *TBD* | 🔄 Pending | vs Qwen2.5-Coder-7B |
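
Scores on HumanEval and MBPP are commonly reported as pass@k. Assuming that metric is used for the pending results above (the card does not say), this sketch shows the standard unbiased estimator: generate `n` samples per problem, count the `c` that pass the tests, and estimate the chance that at least one of `k` drawn samples passes. The function name `pass_at_k` is illustrative.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem.

    n: total samples generated, c: samples that passed, k: budget being scored.
    """
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # If fewer than k samples failed, every draw of k contains a passing one
    if n - c < k:
        return 1.0
    # 1 minus the probability that all k drawn samples are failures
    return 1.0 - comb(n - c, k) / comb(n, k)


# Example: 10 samples, 3 of which pass
print(f"pass@1 = {pass_at_k(10, 3, 1):.3f}")
print(f"pass@5 = {pass_at_k(10, 3, 5):.3f}")
```

A benchmark score is then the mean of this value over all problems.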

> 📈 **Coming Soon:** Comprehensive evaluation results, compared against the Qwen2.5-Coder-7B-Instruct baseline. Hermit Code adds fine-tuning for agentic coding workflows on top of that base.

---

## 🤝 Acknowledgments

| Contribution | Team / Resource | Link |
|:---|:---|:---|
| **🏗️ Base Model** | Qwen Team | [Qwen2.5-Coder-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct) |
| **🤖 Hermit AI Agent** | Hermit Team | [GitHub: Soloman2002](https://github.com/Soloman2002) |
| **📦 Infrastructure** | Hugging Face, vLLM Project | [Transformers](https://github.com/huggingface/transformers) & [vLLM](https://github.com/vllm-project/vllm) |

---

## 🌟 Star Us on GitHub!

If you find Hermit Code useful, please consider starring the repository and sharing it with your network.

_Built with ❤️ for the coding community by the Hermit Team_