Professionalize model card: structured overview, usage examples, training details, limitations, citation
README.md
CHANGED
---
language:
- en
- he
license: apache-2.0
library_name: transformers
pipeline_tag: text-generation
base_model: unsloth/gemma-4-E4B-it
datasets:
- BrainboxAI/code-training-il
- nvidia/OpenCodeInstruct
- bleugreen/typescript-instruct
tags:
- code
- python
- typescript
- coding-assistant
- gguf
- llama.cpp
- ollama
- unsloth
- gemma4
- qlora
- text-generation
- on-device
- private-first
pretty_name: Code-IL E4B (Local Coding Assistant)
model-index:
- name: code-il-E4B
  results: []
---

# Code-IL E4B

**A 4B-parameter coding assistant for Python and TypeScript — runs entirely on-device, no code ever leaves your machine.**

[Model](https://huggingface.co/BrainboxAI/code-il-E4B) · [Dataset](https://huggingface.co/datasets/BrainboxAI/code-training-il) · [Safetensors](https://huggingface.co/BrainboxAI/code-il-E4B-safetensors) · [License: Apache-2.0](https://www.apache.org/licenses/LICENSE-2.0)

---

## Model overview

`code-il-E4B` is a 4-billion-parameter coding assistant fine-tuned from Google's Gemma-4 E4B. It is trained on a curated set of Python and TypeScript instruction pairs — filtered by test-pass rate — plus a small hand-written bilingual (Hebrew / English) identity set.

The entire model is 4 GB in GGUF Q4_K_M form. It runs on:

- A modern laptop CPU (slower but functional)
- Any consumer GPU with 6 GB+ VRAM
- Apple Silicon via llama.cpp Metal

No API. No telemetry. No data leaving the developer's machine.

## Why this exists

Every keystroke sent to a cloud coding assistant is a potential data-leak event. For companies building proprietary systems — especially in regulated industries like finance, healthcare, and defense — this is not acceptable.

`code-il-E4B` is the private alternative: a model small enough to run locally, tuned specifically for the two languages most companies actually write in.

It is not competing with Claude Sonnet or GPT-4o on raw capability. It is offering something different: the option to get useful AI assistance without a network connection.

## Intended use

**Primary use cases:**

- Local code completion and review in regulated environments
- On-prem deployment for companies with strict data-residency rules
- Pair-programming for developers with unreliable internet
- Integration into internal developer tooling that cannot call external APIs
- Hebrew-speaking developer onboarding (the model responds in Hebrew on request)

**Out-of-scope uses:**

- Replacement for frontier models on complex architecture tasks
- Production code generation without human review
- Languages other than Python / TypeScript (coverage is minimal)
- Fine-tuning tasks requiring more than 4B parameters of capacity

## How to use

### Ollama

```bash
ollama pull hf.co/BrainboxAI/code-il-E4B:Q4_K_M
ollama run hf.co/BrainboxAI/code-il-E4B:Q4_K_M
```

### llama.cpp

```bash
./llama-cli -m code-il-E4B.Q4_K_M.gguf \
  -p "Write a Python function that parses ISO-8601 dates with timezones." \
  --temp 0.2 --top-p 0.95 -n 1024
```

### Python (transformers)

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("BrainboxAI/code-il-E4B-safetensors")
model = AutoModelForCausalLM.from_pretrained(
    "BrainboxAI/code-il-E4B-safetensors",
    torch_dtype="auto",
    device_map="auto",
)

messages = [
    {"role": "user", "content": "Implement binary search in TypeScript with full edge-case handling."},
]
# Move inputs to the model's device; enable sampling so temperature/top_p take effect.
inputs = tokenizer.apply_chat_template(
    messages, return_tensors="pt", add_generation_prompt=True
).to(model.device)
outputs = model.generate(inputs, do_sample=True, max_new_tokens=1024, temperature=0.2, top_p=0.95)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

### Recommended generation parameters

| Parameter | Value | Rationale |
|-----------|-------|-----------|
| `temperature` | 0.2 | Low creativity for deterministic code |
| `top_p` | 0.95 | Slightly higher than our legal model, allowing some idiom variety |
| `max_new_tokens` | 1024 | Enough for most function-level completions |
| `repetition_penalty` | 1.0 | Penalizing repetition hurts code structure |

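These defaults can be baked in so that `ollama run` picks them up automatically. A minimal Modelfile sketch (the `FROM` line assumes the GGUF tag shown above; point it at a local `.gguf` path if you downloaded the file directly):

```
FROM hf.co/BrainboxAI/code-il-E4B:Q4_K_M
PARAMETER temperature 0.2
PARAMETER top_p 0.95
PARAMETER num_predict 1024
PARAMETER repeat_penalty 1.0
```

Build and run it with `ollama create code-il -f Modelfile && ollama run code-il`.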
## Training details

| Attribute | Value |
|-----------|-------|
| **Base model** | [unsloth/gemma-4-E4B-it](https://huggingface.co/unsloth/gemma-4-E4B-it) |
| **Method** | QLoRA (4-bit quantization during training) |
| **LoRA rank (r)** | 64 |
| **LoRA alpha** | 128 |
| **Training data size** | 40,000 curated examples |
| **Train / validation split** | 95% / 5%, seed 3407 |
| **Hardware** | NVIDIA RTX 5090 (RunPod) |
| **Framework** | Unsloth Studio |

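The seeded 95/5 split is easy to reproduce. A minimal sketch using only the Python standard library (the actual training script is not published, so this is illustrative, not the exact procedure):

```python
import random

def train_eval_split(examples, eval_frac=0.05, seed=3407):
    """Deterministically shuffle a list of examples and split off an eval set."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # fixed seed -> reproducible split
    n_eval = int(len(shuffled) * eval_frac)
    return shuffled[n_eval:], shuffled[:n_eval]  # (train, eval)

train, eval_set = train_eval_split(range(40_330))
```

The same seed always yields the same partition, so evaluation numbers are comparable across runs.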
### Dataset composition (40,330 examples)

| Source | Count | Content |
|--------|-------|---------|
| [OpenCodeInstruct (NVIDIA)](https://huggingface.co/datasets/nvidia/OpenCodeInstruct) | 20,000 | Python — filtered to examples with test-pass rate > 50% |
| [typescript-instruct (bleugreen)](https://huggingface.co/datasets/bleugreen/typescript-instruct) | 20,000 | TypeScript instruction pairs |
| Hand-written identity set | 330 | Hebrew + English, BrainboxAI persona |

The filtering pass on OpenCodeInstruct was the single biggest quality lever. Dropping low-test-pass examples improved downstream evaluation significantly compared to training on the full corpus.

See the [dataset card](https://huggingface.co/datasets/BrainboxAI/code-training-il) for full details.

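The pass-rate filter described above amounts to a one-line predicate over the corpus. A minimal sketch (the `pass_rate` field name is hypothetical; the actual OpenCodeInstruct column name may differ):

```python
def filter_by_pass_rate(examples, threshold=0.5):
    """Keep only examples whose unit-test pass rate exceeds `threshold`.

    `pass_rate` is an illustrative field name, not necessarily the
    dataset's actual column.
    """
    return [ex for ex in examples if ex.get("pass_rate", 0.0) > threshold]

corpus = [
    {"instruction": "Reverse a string", "pass_rate": 0.9},
    {"instruction": "Parse dates", "pass_rate": 0.3},
]
kept = filter_by_pass_rate(corpus)
```

With a Hugging Face `Dataset` object the same predicate would be passed to `.filter(...)` instead of a list comprehension.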
## Evaluation

Internal evaluation on structured coding tasks:

| Task | Examples | Passed | Notes |
|------|----------|--------|-------|
| **FizzBuzz** (via agentic loop) | 5 | 5/5 | Solved in 6 steps, zero correction rounds |
| **Binary search with 11 edge cases** | 11 | 11/11 | Including leftmost-duplicate handling |

Formal HumanEval / MBPP benchmarks have not yet been run publicly. Evaluation work is ongoing.

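For context, the leftmost-duplicate behavior checked in the binary-search task corresponds to the standard `bisect_left`-style search. A reference implementation of that behavior (ours, for illustration; not the model's actual output):

```python
def binary_search_leftmost(xs, target):
    """Return the index of the FIRST occurrence of target in sorted xs, or -1."""
    lo, hi = 0, len(xs)
    while lo < hi:
        mid = (lo + hi) // 2
        if xs[mid] < target:
            lo = mid + 1      # target is strictly to the right
        else:
            hi = mid          # keep scanning left on ties
    return lo if lo < len(xs) and xs[lo] == target else -1
```

Edge cases of interest include the empty list, a missing value between elements, and runs of duplicates where the first index must be returned.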
## Limitations

- **Small model.** 4B parameters is not frontier capability. Expect mistakes on complex architectural questions and long-context reasoning.
- **Two languages.** Strong on Python and TypeScript; weak on other languages.
- **No tool use out of the box.** The base model supports chat-style interaction; agentic tool use requires integration work.
- **Training cutoff.** Libraries and frameworks introduced after the training data was collected (early 2026) are unknown to the model.
- **Hallucination risk.** Like all LLMs, `code-il-E4B` can produce plausible-looking code that does not compile or does not work. Always test.

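One cheap guardrail before running generated Python is a syntax check. A minimal sketch using the stdlib `ast` module (this catches syntax errors only, not logic bugs, so it complements rather than replaces testing):

```python
import ast

def is_valid_python(source: str) -> bool:
    """Return True if `source` parses as Python.

    Only a syntax check: code that parses can still be wrong at runtime.
    """
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False

print(is_valid_python("def f(x):\n    return x + 1"))  # parses fine
print(is_valid_python("def f(x) return x"))            # missing colon
```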
## Formats available

- [**GGUF Q4_K_M** (~4 GB)](https://huggingface.co/BrainboxAI/code-il-E4B) — for Ollama, llama.cpp, LM Studio
- [**Safetensors 16-bit**](https://huggingface.co/BrainboxAI/code-il-E4B-safetensors) — for further fine-tuning, HF transformers

## License

Apache 2.0. Use commercially, modify, and redistribute with attribution.

## Citation

```bibtex
@misc{elyasi2026codeil,
  title        = {Code-IL E4B: A Small, On-Device Coding Assistant for Private Environments},
  author       = {Elyasi, Netanel},
  year         = {2026},
  publisher    = {BrainboxAI},
  howpublished = {\url{https://huggingface.co/BrainboxAI/code-il-E4B}},
  note         = {Fine-tuned from unsloth/gemma-4-E4B-it}
}
```

## Author

Built by [**Netanel Elyasi**](https://huggingface.co/BrainboxAI), founder of [BrainboxAI](https://brainboxai.io), an applied-AI studio focused on small, private, domain-specialized models.

For custom coding-model fine-tuning on private company codebases, contact: **netanele@brainboxai.io**.

---

*Part of the BrainboxAI family of on-device models — see also [`law-il-E2B`](https://huggingface.co/BrainboxAI/law-il-E2B) (legal) and [`cyber-analyst-4B`](https://huggingface.co/BrainboxAI/cyber-analyst-4B) (security).*