---
language: en
license: apache-2.0
tags:
- code
- coding-agent
- instruction-tuned
- hermit-code
- qwen2.5-coder
- text-generation
- python
- javascript
- rust
- go
pipeline_tag: text-generation
base_model:
- Qwen/Qwen2.5-Coder-7B-Instruct
---
---
## 📋 Table of Contents
- [Quick Start](#-quick-start)
- [Capabilities](#-capabilities)
- [Model Details](#-model-details)
- [Usage](#-usage)
  - [Transformers](#transformers)
  - [vLLM (Production)](#vllm-recommended-for-production)
  - [Inference API](#inference-api)
- [Interactive Examples](#-interactive-examples)
- [Benchmarks](#-benchmarks)
- [Acknowledgments](#-acknowledgments)
---
## 🚀 Quick Start
Get up and running in **a few lines of code**:
```python
from transformers import pipeline
# Initialize the model
pipe = pipeline("text-generation", model="Soloman2002/hermit-code-7b")
# Start coding
chat = [{"role": "user", "content": "Write a Python function to reverse a linked list"}]
response = pipe(chat, max_new_tokens=512)
print(response[0]["generated_text"][-1]["content"])
```
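> 💡 **Tip:** For focused, repeatable code generation, use `temperature=0.2` and `top_p=0.95`. Sampling is still random at these settings; pass `do_sample=False` if you need fully deterministic (greedy) output.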
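
To make the effect of those two settings concrete, here is a small pure-Python sketch of temperature scaling followed by top-p (nucleus) filtering. The logits are made-up values for four candidate tokens, and the helper names are illustrative rather than part of any library; real inference stacks apply the same idea to the full vocabulary.

```python
import math


def apply_temperature(logits, temperature):
    """Softmax over logits divided by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p):
    """Keep the most likely tokens until their mass reaches top_p, then renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    filtered = [0.0] * len(probs)
    for i in kept:
        filtered[i] = probs[i] / mass
    return filtered


# Hypothetical logits for four candidate next tokens
logits = [2.0, 1.0, 0.5, -1.0]
for temperature in (1.0, 0.2):
    probs = top_p_filter(apply_temperature(logits, temperature), top_p=0.95)
    print(temperature, [round(p, 3) for p in probs])
```

At `temperature=1.0` three of the four tokens survive the filter, while at `temperature=0.2` the top token alone already exceeds the 0.95 threshold, which is why low temperatures behave almost greedily.
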
---
## 🎯 Capabilities
| 💻 **Languages** | 🏗️ **Code Gen** | 📖 **Explain** | 🐛 **Debug** | ⚡ **Refactor** | 🧠 **Context** |
|:---:|:---:|:---:|:---:|:---:|:---:|
| Python | Functions | Breakdowns | Bug Finding | Performance | 128K Tokens |
| JavaScript / TypeScript | Classes | Documentation | Fixing | Cleanup | Multi-file |
| Go | Scripts | Architecture | Optimization | Restructuring | Projects |
| Rust | Full Projects | Best Practices | Analysis | Modernization | Understanding |
| C++ | | | | | |
| Java | | | | | |
### ✨ What Makes Hermit Code Special?
- **🌐 Multi-Language Mastery**: Native fluency in 6+ programming languages
- **📦 Project-Scale Context**: Understand entire codebases with 128K token context
- **🐛 Debugging Expert**: Identifies bugs, explains why they happen, and fixes them
- **🎓 Educational**: Explains complex concepts with clear, step-by-step reasoning
- **⚙️ Production-Ready**: Optimized for both research and deployment via vLLM
---
## 🔧 Model Details
| Property | Specification | Notes |
|:---|:---|:---|
| **Base Model** | [Qwen/Qwen2.5-Coder-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct) | State-of-the-art code foundation |
| **Architecture** | Qwen2.5 Dense Transformer | Optimized for code understanding |
| **Parameters** | 7.61B (6.53B non-embedding) | Efficient size-to-performance ratio |
| **Layers** | 28 | Deep representation learning |
| **Attention** | GQA: 28 Q heads, 4 KV heads | Fast inference with grouped queries |
| **Context Length** | 131,072 tokens | Roughly 10K lines of code (at ~10-13 tokens per line) |
| **License** | Apache 2.0 | Commercial use permitted |
| **Format** | Safetensors (BF16) | Safe, efficient serialization |
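
As a rough worked example of what these numbers mean for memory, the sketch below estimates the BF16 weight footprint and the KV-cache size implied by the table. The head dimension of 128 is an assumption not listed above, and the result ignores activations and framework overhead, so treat it as a back-of-the-envelope figure.

```python
# Figures taken from the table above
PARAMS = 7.61e9            # total parameters
NUM_LAYERS = 28
NUM_KV_HEADS = 4           # grouped-query attention
CONTEXT_LENGTH = 131_072   # tokens
BYTES_PER_VALUE = 2        # BF16

# Assumption (not stated in the table): each attention head has dimension 128
HEAD_DIM = 128

GIB = 2**30


def weight_bytes(params=PARAMS, bytes_per_value=BYTES_PER_VALUE):
    """Memory needed to hold the weights alone."""
    return int(params * bytes_per_value)


def kv_cache_bytes(num_tokens, num_kv_heads=NUM_KV_HEADS):
    """KV-cache size: keys and values, per layer, per KV head, per token."""
    return 2 * NUM_LAYERS * num_kv_heads * HEAD_DIM * BYTES_PER_VALUE * num_tokens


print(f"weights:                  {weight_bytes() / GIB:.1f} GiB")
print(f"KV cache per token:       {kv_cache_bytes(1) / 1024:.0f} KiB")
print(f"KV cache at full context: {kv_cache_bytes(CONTEXT_LENGTH) / GIB:.1f} GiB")
print(f"same, with 28 KV heads:   {kv_cache_bytes(CONTEXT_LENGTH, 28) / GIB:.1f} GiB")
```

The last line shows why the 4 KV heads matter: without grouped queries, the cache for a full-length context would be seven times larger under the same assumptions.
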
---
## 💻 Usage
### Transformers
Perfect for **prototyping** and **local development**:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load model and tokenizer
model_id = "Soloman2002/hermit-code-7b"
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Prepare chat messages
messages = [
    {"role": "system", "content": "You are Hermit Code, an expert coding assistant."},
    {"role": "user", "content": "Write a Rust function that checks if a string is a palindrome."},
]

# Apply chat template
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)

# Generate
inputs = tokenizer([text], return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    temperature=0.2,
    do_sample=True,
)

# Decode only the newly generated tokens (skip the prompt)
response = tokenizer.decode(
    outputs[0][inputs.input_ids.shape[1]:],
    skip_special_tokens=True,
)
print(response)
```
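
To show what the chat-template step produces, here is a simplified sketch of the ChatML layout that Qwen2.5-based models use. `render_chatml` is a hypothetical helper written for illustration only; the tokenizer's real template also covers cases such as a default system prompt and tool calls, so keep using `apply_chat_template` in practice.

```python
def render_chatml(messages, add_generation_prompt=True):
    """Approximate the ChatML prompt layout for a list of chat messages."""
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
        for m in messages
    ]
    if add_generation_prompt:
        # Leave the assistant turn open so the model writes the reply
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You are Hermit Code, an expert coding assistant."},
    {"role": "user", "content": "Write a Rust function that checks if a string is a palindrome."},
]
print(render_chatml(messages))
```

Printing the rendered string is a quick way to check that the system and user turns are in the expected order before generation.
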
---
### vLLM (Recommended for Production)
For **high-throughput** serving and API deployment:
**1. Install & Launch:**
```bash
pip install vllm
vllm serve "Soloman2002/hermit-code-7b" --dtype bfloat16
```
**2. Query via API:**
```bash
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Soloman2002/hermit-code-7b",
    "messages": [
      {"role": "system", "content": "You are Hermit Code, a coding assistant."},
      {"role": "user", "content": "Explain closures in JavaScript with examples."}
    ],
    "temperature": 0.2,
    "max_tokens": 512
  }'
```
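> 🚀 **vLLM Benefits:** Continuous batching and PagedAttention, which can deliver up to 10-20x higher throughput than standard Transformers generation under concurrent load. Actual gains depend on batch size, sequence lengths, and hardware.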
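
The same request can be sent from Python using only the standard library. This is a minimal sketch that assumes the server from step 1 is listening on `localhost:8000`; `build_chat_request` and `ask` are illustrative helper names, not part of vLLM.

```python
import json
import urllib.request

# Assumed values: endpoint and model name match the curl example above
API_URL = "http://localhost:8000/v1/chat/completions"
MODEL_ID = "Soloman2002/hermit-code-7b"


def build_chat_request(prompt, temperature=0.2, max_tokens=512):
    """Build the POST request for an OpenAI-style chat completion."""
    payload = {
        "model": MODEL_ID,
        "messages": [
            {"role": "system", "content": "You are Hermit Code, a coding assistant."},
            {"role": "user", "content": prompt},
        ],
        "temperature": temperature,
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt):
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(build_chat_request(prompt)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

With the server running, `print(ask("Explain closures in JavaScript with examples."))` prints the reply. Splitting request construction from sending keeps the payload easy to inspect without a live server.
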
---
### Inference API
Use **Hugging Face's hosted infrastructure** for instant access:
```python
from huggingface_hub import InferenceClient

client = InferenceClient(token="hf_YOUR_TOKEN")
response = client.chat_completion(
    model="Soloman2002/hermit-code-7b",
    messages=[
        {"role": "user", "content": "Write a Go function to merge two sorted arrays"}
    ],
    max_tokens=512,
    temperature=0.2,
    stream=False,
)
print(response.choices[0].message.content)
```
---
## 🎨 Interactive Examples
The examples below show Hermit Code in action, one language at a time:

### 🐍 Python: Quick Sort Implementation
```python
def quick_sort(arr: list[int]) -> list[int]:
    """
    Sorts a list using the quicksort algorithm and returns the result.

    Time Complexity: O(n log n) average, O(n²) worst case
    Space Complexity: O(n) auxiliary, since each level builds new lists
    """
    if len(arr) <= 1:
        return arr
    pivot = arr[len(arr) // 2]
    left = [x for x in arr if x < pivot]
    middle = [x for x in arr if x == pivot]
    right = [x for x in arr if x > pivot]
    return quick_sort(left) + middle + quick_sort(right)


# Example usage
if __name__ == "__main__":
    data = [3, 6, 8, 10, 1, 2, 1]
    print(f"Original: {data}")
    print(f"Sorted: {quick_sort(data)}")
```
**Key Features Demonstrated:**
- ✅ Type hints for better code clarity
- ✅ Docstrings with complexity analysis
- ✅ Recursive divide-and-conquer approach
- ✅ Idiomatic Python list comprehensions
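
The list-comprehension version above builds new lists at every level of recursion. For comparison, here is a sketch of an in-place variant that only needs the recursion stack as extra space. It is an alternative technique (Lomuto partitioning, recursing into the smaller side first), not output from the model, and the function names are illustrative.

```python
def partition(arr, lo, hi):
    """Lomuto partition around the middle element; returns the pivot's final index."""
    mid = (lo + hi) // 2
    arr[mid], arr[hi] = arr[hi], arr[mid]  # move the pivot to the end
    pivot = arr[hi]
    store = lo
    for i in range(lo, hi):
        if arr[i] < pivot:
            arr[i], arr[store] = arr[store], arr[i]
            store += 1
    arr[store], arr[hi] = arr[hi], arr[store]  # put the pivot in its final slot
    return store


def quick_sort_in_place(arr, lo=0, hi=None):
    """Sort arr[lo..hi] in place and return arr for convenience."""
    if hi is None:
        hi = len(arr) - 1
    while lo < hi:
        p = partition(arr, lo, hi)
        # Recurse into the smaller side and loop on the larger one,
        # which keeps the recursion depth at O(log n)
        if p - lo < hi - p:
            quick_sort_in_place(arr, lo, p - 1)
            lo = p + 1
        else:
            quick_sort_in_place(arr, p + 1, hi)
            hi = p - 1
    return arr


data = [3, 6, 8, 10, 1, 2, 1]
print(quick_sort_in_place(data))
```

The trade-off is readability: the in-place version mutates its input and is harder to follow, while the worst-case running time remains O(n²) for both.
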

### 🦀 Rust: Palindrome Checker
```rust
/// Checks if a string is a palindrome, ignoring non-alphanumeric characters
/// and case differences.
///
/// # Examples
/// ```
/// assert!(is_palindrome("A man, a plan, a canal: Panama"));
/// assert!(!is_palindrome("Hello, World!"));
/// ```
fn is_palindrome(s: &str) -> bool {
    let chars: Vec<char> = s
        .chars()
        .filter(|c| c.is_alphanumeric())
        .map(|c| c.to_ascii_lowercase())
        .collect();
    let len = chars.len();
    // Compare characters from both ends, moving toward the middle
    for i in 0..len / 2 {
        if chars[i] != chars[len - 1 - i] {
            return false;
        }
    }
    true
}

fn main() {
    let test_cases = vec![
        "racecar",
        "A man, a plan, a canal: Panama",
        "Hello, World!",
    ];
    for case in test_cases {
        println!("'{}' -> {}", case, is_palindrome(case));
    }
}
```
**Key Features Demonstrated:**
- ✅ Memory-safe string processing
- ✅ Functional iterator chains
- ✅ Comprehensive documentation
- ✅ Efficient two-pointer comparison

### 🐹 Go: Merge Sorted Arrays
```go
package main

import "fmt"

// mergeSorted combines two sorted integer slices into a single sorted slice.
// It runs in O(n + m) time where n and m are the lengths of the inputs.
func mergeSorted(a, b []int) []int {
    result := make([]int, 0, len(a)+len(b))
    i, j := 0, 0
    // Merge while both arrays have elements
    for i < len(a) && j < len(b) {
        if a[i] < b[j] {
            result = append(result, a[i])
            i++
        } else {
            result = append(result, b[j])
            j++
        }
    }
    // Append remaining elements
    result = append(result, a[i:]...)
    result = append(result, b[j:]...)
    return result
}

func main() {
    a := []int{1, 3, 5, 7}
    b := []int{2, 4, 6, 8}
    merged := mergeSorted(a, b)
    fmt.Printf("Merged: %v\n", merged) // [1 2 3 4 5 6 7 8]
}
```
**Key Features Demonstrated:**
- ✅ Pre-allocated capacity, so `append` never has to reallocate
- ✅ Two-pointer technique for a single O(n + m) pass
- ✅ Plain slice-in, slice-out signature with no error paths
- ✅ Clean, readable control flow

### ⚡ JavaScript: Closure Example
```javascript
/**
 * Creates a counter with private state using closures.
 * Demonstrates lexical scoping and data encapsulation.
 */
function createCounter(initialValue = 0) {
  let count = initialValue; // Private variable
  return {
    increment() {
      count += 1;
      return count;
    },
    decrement() {
      count -= 1;
      return count;
    },
    getValue() {
      return count;
    },
    reset() {
      count = initialValue;
      return count;
    }
  };
}

// Usage
const counter = createCounter(10);
console.log(counter.increment()); // 11
console.log(counter.increment()); // 12
console.log(counter.getValue()); // 12
console.log(counter.reset()); // 10
```
**Key Features Demonstrated:**
- ✅ True private state via closures
- ✅ Clean object interface
- ✅ ES6 method shorthand syntax
- ✅ Default parameter values
---
## 📊 Benchmarks
| Benchmark | Score | Status | Comparison to Base |
|:---|:---:|:---:|:---|
| **HumanEval (Python)** | *TBD* | 🔄 Pending | vs Qwen2.5-Coder-7B |
| **HumanEval (Multi-Lang)** | *TBD* | 🔄 Pending | vs Qwen2.5-Coder-7B |
| **MBPP (Python)** | *TBD* | 🔄 Pending | vs Qwen2.5-Coder-7B |
| **DS-1000 (Data Science)** | *TBD* | 🔄 Pending | vs Qwen2.5-Coder-7B |
> 📌 **Coming Soon:** Comprehensive evaluation results, reported against the Qwen2.5-Coder-7B-Instruct baseline, to measure the effect of the additional fine-tuning for agentic coding workflows.
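
Benchmarks such as HumanEval and MBPP are commonly reported as pass@k. The sketch below implements the standard unbiased pass@k estimator, on the assumption that the pending scores will use that metric. The per-problem counts in the example are hypothetical and are not results for this model.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k for one problem: n samples generated, c of them correct."""
    if n - c < k:
        return 1.0  # every size-k subset contains at least one correct sample
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results, k):
    """Average pass@k over problems; results is a list of (n, c) pairs."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


# Hypothetical counts for three problems, 10 samples each (not real results)
example = [(10, 3), (10, 0), (10, 10)]
print(f"pass@1 = {mean_pass_at_k(example, 1):.3f}")
print(f"pass@5 = {mean_pass_at_k(example, 5):.3f}")
```

Generating more samples than `k` per problem and applying this estimator gives a lower-variance score than drawing exactly `k` samples.
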
---
## 🤝 Acknowledgments
| Contribution | Team / Resource | Link |
|:---|:---|:---|
| **🏗️ Base Model** | Qwen Team | [Qwen2.5-Coder-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct) |
| **🤖 Hermit AI Agent** | Hermit Team | [GitHub: Soloman2002](https://github.com/Soloman2002) |
| **📦 Infrastructure** | Hugging Face and the vLLM project | [Transformers](https://github.com/huggingface/transformers) & [vLLM](https://github.com/vllm-project/vllm) |
---
## 🌟 Star Us on GitHub!
If you find Hermit Code useful, please consider starring the repository and sharing it with your network.

Built with ❤️ for the coding community by the Hermit Team