BrainboxAI committed on
Commit eabc971 · verified · 1 Parent(s): c4bedb7

Professionalize model card: structured overview, usage examples, training details, limitations, citation

Files changed (1): README.md +133 -157

README.md CHANGED
@@ -1,222 +1,198 @@
  ---
  license: apache-2.0
  base_model: unsloth/gemma-4-E4B-it
  datasets:
  - BrainboxAI/code-training-il
  - nvidia/OpenCodeInstruct
  - bleugreen/typescript-instruct
- language:
- - en
- - he
  tags:
- - text-generation
- - gguf
  - code
  - python
  - typescript
- - gemma4
  - coding-assistant
  - llama.cpp
  - ollama
  - unsloth
  - qlora
- - brainboxai
- library_name: transformers
- pipeline_tag: text-generation
  ---

- # BrainboxAI/code-il-E4B

- **Local-First Python & TypeScript Coding Assistant (GGUF)**

- Built by [**BrainboxAI**](https://huggingface.co/BrainboxAI), founded by **Netanel Elyasi**.
- Sister model of [BrainboxAI/law-il-E2B](https://huggingface.co/BrainboxAI/law-il-E2B).

- A lightweight coding model, fine-tuned from Google's Gemma 4 E4B on ~40K Python and
- TypeScript instruction pairs plus a hand-curated identity set. Designed to run locally
- via Ollama or llama.cpp with no cloud API, no rate limits, and no data leaving the machine.

- ## Model Details

- | Attribute | Value |
- |-------------------|--------------------------------------------------------------------|
- | **Base Model** | [unsloth/gemma-4-E4B-it](https://huggingface.co/unsloth/gemma-4-E4B-it) (4B params) |
- | **Architecture** | Gemma4ForConditionalGeneration |
- | **Context Length**| 128K tokens (inherited from base) |
- | **Training** | QLoRA 4-bit with Unsloth (2x faster training) |
- | **Dataset** | [BrainboxAI/code-training-il](https://huggingface.co/datasets/BrainboxAI/code-training-il) (~40K examples) |
- | **Quantization** | Q4_K_M GGUF (~5.3 GB) |
- | **License** | Apache 2.0 |
- | **Author** | Netanel Elyasi · BrainboxAI |

- ## Intended Use

- ### Primary Tasks

- - **Python code generation** — functions, classes, algorithms, data structures.
- - **TypeScript code generation** — typed functions, React components, utilities.
- - **Debugging** — trace exceptions, explain errors, suggest fixes.
- - **Code explanation** — walk through existing snippets in English or Hebrew.
- - **Test writing** — pytest (Python), Jest/assertion-style (TypeScript).
- - **Refactoring** — simplify, extract helpers, improve readability.

- ### Target Users

- - **Developers** who want local-first coding help without sending code to cloud APIs.
- - **Privacy-sensitive teams** building products that can't leak internal code.
- - **Offline workflows** — on the train, on a plane, behind a restrictive firewall.
- - **Hobbyists** running on modest hardware (6 GB+ VRAM or CPU-only).

- ## Available Files

- | File | Size | Use |
- |------------------------------------------|---------:|-----------------------------------------------------|
- | `gemma-4-e4b-it.Q4_K_M.gguf` | 5.34 GB | Main model — Ollama / llama.cpp local inference |
- | `gemma-4-e4b-it.BF16-mmproj.gguf` | ~0.9 GB | Vision projector (optional — base supports vision) |

- ## Quick Start

- ### With Ollama

  ```bash
  ollama pull hf.co/BrainboxAI/code-il-E4B:Q4_K_M
  ollama run hf.co/BrainboxAI/code-il-E4B:Q4_K_M
  ```

- Optional — tag it with a short name:

  ```bash
- ollama cp hf.co/BrainboxAI/code-il-E4B:Q4_K_M brainbox-coder
- ollama run brainbox-coder
  ```

- ### With llama.cpp

- ```bash
- # Text-only
- llama-cli -hf BrainboxAI/code-il-E4B --jinja
-
- # With vision (if you also download the mmproj file)
- llama-mtmd-cli -hf BrainboxAI/code-il-E4B --jinja
  ```

- ### Example Prompts

- **Python:**
- ```
- Write a Python function that returns the leftmost index of a target in a sorted
- array with possible duplicates, or -1 if not found.
- ```

- **TypeScript:**
- ```
- Create a React hook useDebouncedValue<T>(value: T, ms: number): T that returns
- the debounced value.
- ```

- **Debugging:**
- ```
- This pytest fails with AssertionError. What's wrong with my binary_search?
-
- def binary_search(arr, target):
-     lo, hi = 0, len(arr)
-     while lo < hi:
-         mid = (lo + hi) // 2
-         if arr[mid] == target: return mid
-         elif arr[mid] < target: lo = mid + 1
-         else: hi = mid - 1
-     return -1
- ```

- **Hebrew (identity):**
- ```
- מי בנה אותך? ("Who built you?")
- ```
- → "אותי בנתה BrainboxAI בהובלת נתנאל אליאשי. אני עוזר תכנות בפייתון וטיפוסקריפט." ("I was built by BrainboxAI, led by Netanel Elyasi. I am a Python and TypeScript coding assistant.")
 
- ## Recommended System Prompt

- ```
- You are BrainboxAI Coder, a local coding assistant fine-tuned from Gemma 4 by
- Netanel Elyasi at BrainboxAI. You specialize in Python and TypeScript.
-
- Prefer concise, correct code over verbose explanations. Always:
- - Include obvious imports in generated files.
- - When writing tests, match the current implementation unless asked to change it.
- - Return -1 / None / null honestly when a value is missing rather than raising.
- - Flag when the user's request has multiple interpretations and ask a short clarifying question.
- ```
- ## Training Details
-
- | Stage | Value |
- |------------------------|----------------------------------------------------|
- | **Method** | QLoRA 4-bit supervised fine-tuning (SFT) |
- | **Framework** | Unsloth + TRL `SFTTrainer` |
- | **Hardware** | NVIDIA RTX 5090 (32 GB VRAM) |
- | **LoRA rank** | 16 (alpha 16, dropout 0) |
- | **Target modules** | q_proj, k_proj, v_proj, o_proj, gate/up/down_proj |
- | **Batch** | 2 × 4 grad accum = 16 effective |
- | **Learning rate** | 2e-4, linear decay, 10-step warmup |
- | **Steps** | 500 |
- | **Sequence length** | 2,048 tokens |
- | **Final loss** | ~0.8 (from ~2.4 average at start) |
- | **Gradient checkpointing** | `"unsloth"` (≈30% VRAM savings) |
- | **Seed** | 3407 |
-
- ## Dataset
-
- Trained on [BrainboxAI/code-training-il](https://huggingface.co/datasets/BrainboxAI/code-training-il):
-
- | Source | Samples | Language |
- |-------------------------------------|--------:|------------------|
- | nvidia/OpenCodeInstruct (score≥0.5) | 20,000 | English / Python |
- | bleugreen/typescript-instruct | 20,000 | English / TS |
- | BrainboxAI identity examples | 330 | EN + HE |
-
- Split 95/5 train/eval (seed 3407).
-
- ## Limitations & Ethical Considerations
-
- - **4B parameters.** Competitive with larger models on everyday Python/TypeScript
-   tasks but will not match GPT-4 or Claude on novel algorithms, complex system
-   design, or long multi-file reasoning.
- - **Two languages only.** Python and TypeScript. Generation quality on Rust, Go,
-   C++, Ruby, etc. will be noticeably weaker.
- - **Identity is hard-coded.** The model will assert it is "BrainboxAI Coder,
-   trained by Netanel Elyasi at BrainboxAI" across sessions.
- - **Cutoff.** Training data reflects code up to the dataset snapshot (2026).
-   Library APIs released afterwards may be missing.
- - **Not a security auditor.** The model can be prompted to produce insecure code.
-   Always review generated code before running in production.
- - **Hallucinations.** Like any LLM, it can fabricate imports, function signatures,
-   or test cases. Verify everything.
-
- ## Sibling Repositories
-
- - [BrainboxAI/code-training-il](https://huggingface.co/datasets/BrainboxAI/code-training-il) — training dataset (this model).
- - [BrainboxAI/law-il-E2B](https://huggingface.co/BrainboxAI/law-il-E2B) — Israeli legal assistant.
- - [BrainboxAI/law-il-E2B-safetensors](https://huggingface.co/BrainboxAI/law-il-E2B-safetensors) — safetensors variant.
- - [BrainboxAI/legal-training-il](https://huggingface.co/datasets/BrainboxAI/legal-training-il) — legal training dataset.
  ## Citation

  ```bibtex
- @misc{brainboxai_code_il_e4b,
-   title = {BrainboxAI Coder (code-il-E4B)},
-   author = {Elyasi, Netanel and BrainboxAI},
    year = {2026},
    howpublished = {\url{https://huggingface.co/BrainboxAI/code-il-E4B}},
  }
  ```

- ## About BrainboxAI

- BrainboxAI is an Israeli AI company founded by **Netanel Elyasi**, building
- specialized, local-first language models for specific domains:

- - **law-il** — Hebrew-first Israeli legal AI.
- - **code-il** (this model) — local Python + TypeScript coding assistant.

- All BrainboxAI releases are permissively licensed (Apache 2.0) and published
- openly on HuggingFace.
  ---
+ language:
+ - en
+ - he
  license: apache-2.0
+ library_name: transformers
+ pipeline_tag: text-generation
  base_model: unsloth/gemma-4-E4B-it
  datasets:
  - BrainboxAI/code-training-il
  - nvidia/OpenCodeInstruct
  - bleugreen/typescript-instruct
  tags:
  - code
  - python
  - typescript
  - coding-assistant
+ - gguf
  - llama.cpp
  - ollama
  - unsloth
+ - gemma4
  - qlora
+ - text-generation
+ - on-device
+ - private-first
+ pretty_name: Code-IL E4B (Local Coding Assistant)
+ model-index:
+ - name: code-il-E4B
+   results: []
  ---
+ # Code-IL E4B

+ **A 4B-parameter coding assistant for Python and TypeScript that runs entirely on-device; no code ever leaves your machine.**

+ [![HF Model](https://img.shields.io/badge/%F0%9F%A4%97%20HuggingFace-Model-yellow)](https://huggingface.co/BrainboxAI/code-il-E4B)
+ [![Dataset](https://img.shields.io/badge/%F0%9F%A4%97%20HuggingFace-Dataset-blue)](https://huggingface.co/datasets/BrainboxAI/code-training-il)
+ [![Safetensors](https://img.shields.io/badge/Format-Safetensors-green)](https://huggingface.co/BrainboxAI/code-il-E4B-safetensors)
+ [![License](https://img.shields.io/badge/License-Apache_2.0-lightgrey)](https://www.apache.org/licenses/LICENSE-2.0)

+ ---
 
 
+ ## Model overview

+ `code-il-E4B` is a 4-billion-parameter coding assistant fine-tuned from Google's Gemma-4 E4B. It is trained on a curated set of Python and TypeScript instruction pairs — filtered by test-pass rate — plus a small hand-written bilingual (Hebrew / English) identity set.

+ The entire model is 4 GB in GGUF Q4_K_M form. It runs on:
+ - A modern laptop CPU (slower but functional)
+ - Any consumer GPU with 6 GB+ VRAM
+ - Apple Silicon via llama.cpp Metal

+ No API. No telemetry. No data leaving the developer's machine.

+ ## Why this exists

+ Every keystroke sent to a cloud coding assistant is a potential data-leak event. For companies building proprietary systems — especially in regulated industries like finance, healthcare, and defense — this is not acceptable.

+ `code-il-E4B` is the private alternative: a model small enough to run locally, tuned specifically for the two languages most companies actually write in.

+ It is not competing with Claude Sonnet or GPT-4o on raw capability. It offers something different: useful AI assistance without a network connection.
 
+ ## Intended use

+ **Primary use cases:**
+ - Local code completion and review in regulated environments
+ - On-prem deployment for companies with strict data-residency rules
+ - Pair-programming for developers with unreliable internet
+ - Integration into internal developer tooling that cannot call external APIs
+ - Hebrew-speaking developer onboarding (model responds in Hebrew on request)

+ **Out-of-scope uses:**
+ - Replacement for frontier models on complex architecture tasks
+ - Production code generation without human review
+ - Languages other than Python / TypeScript (coverage is minimal)
+ - Fine-tuning tasks requiring >4B parameters of capacity
+ ## How to use
+
+ ### Ollama

  ```bash
  ollama pull hf.co/BrainboxAI/code-il-E4B:Q4_K_M
  ollama run hf.co/BrainboxAI/code-il-E4B:Q4_K_M
  ```

+ ### llama.cpp

  ```bash
+ ./llama-cli -m code-il-E4B.Q4_K_M.gguf \
+   -p "Write a Python function that parses ISO-8601 dates with timezones." \
+   --temp 0.2 --top-p 0.95 -n 1024
  ```

+ ### Python (transformers)

+ ```python
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+
+ tokenizer = AutoTokenizer.from_pretrained("BrainboxAI/code-il-E4B-safetensors")
+ model = AutoModelForCausalLM.from_pretrained(
+     "BrainboxAI/code-il-E4B-safetensors",
+     torch_dtype="auto",
+     device_map="auto",
+ )
+
+ messages = [
+     {"role": "user", "content": "Implement binary search in TypeScript with full edge-case handling."},
+ ]
+ inputs = tokenizer.apply_chat_template(messages, return_tensors="pt", add_generation_prompt=True)
+ outputs = model.generate(
+     inputs,
+     max_new_tokens=1024,
+     do_sample=True,  # required for temperature / top_p to take effect
+     temperature=0.2,
+     top_p=0.95,
+ )
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
  ```

+ ### Recommended generation parameters

+ | Parameter | Value | Rationale |
+ |-----------|-------|-----------|
+ | `temperature` | 0.2 | Low creativity for deterministic code |
+ | `top_p` | 0.95 | Slightly higher than the legal model's setting, to allow idiom variety |
+ | `max_new_tokens` | 1024 | Enough for most function-level completions |
+ | `repetition_penalty` | 1.0 | Penalizing repetition hurts code structure |
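These defaults can be pinned in an Ollama `Modelfile` so every run picks them up automatically (a sketch: `PARAMETER` names follow Ollama's Modelfile syntax, where `num_predict` caps generated tokens and `repeat_penalty` is Ollama's name for repetition penalty):

```
# Modelfile — pins the recommended defaults from the table above
FROM hf.co/BrainboxAI/code-il-E4B:Q4_K_M
PARAMETER temperature 0.2
PARAMETER top_p 0.95
PARAMETER num_predict 1024
PARAMETER repeat_penalty 1.0
```

Then build and run a short-named local model: `ollama create brainbox-coder -f Modelfile && ollama run brainbox-coder`.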

+ ## Training details

+ | Attribute | Value |
+ |-----------|-------|
+ | **Base model** | [unsloth/gemma-4-E4B-it](https://huggingface.co/unsloth/gemma-4-E4B-it) |
+ | **Method** | QLoRA (4-bit quantization during training) |
+ | **LoRA rank (r)** | 64 |
+ | **LoRA alpha** | 128 |
+ | **Training data size** | 40,000 curated examples |
+ | **Train / validation split** | 95% / 5%, seed 3407 |
+ | **Hardware** | NVIDIA RTX 5090 (RunPod) |
+ | **Framework** | Unsloth Studio |
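The rank/alpha pair above implies a LoRA scaling factor of alpha / r = 2, i.e. each adapter update is scaled by 2 before being added to the frozen weights. A plain-Python restatement (the `target_modules` list is carried over from an earlier revision of this card, an assumption rather than part of the new table):

```python
# LoRA hyperparameters from the training-details table above.
lora_config = {
    "r": 64,            # adapter rank
    "lora_alpha": 128,  # scaling numerator
    # Projection layers targeted per the earlier card revision (assumption).
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
}

# Effective scaling applied to each adapter update: alpha / r.
scaling = lora_config["lora_alpha"] / lora_config["r"]
print(scaling)  # → 2.0
```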
 
 
 

+ ### Dataset composition (40,330 examples)

+ | Source | Count | Content |
+ |--------|-------|---------|
+ | [OpenCodeInstruct (NVIDIA)](https://huggingface.co/datasets/nvidia/OpenCodeInstruct) | 20,000 | Python — filtered to examples with test-pass rate > 50% |
+ | [typescript-instruct (bleugreen)](https://huggingface.co/datasets/bleugreen/typescript-instruct) | 20,000 | TypeScript instruction pairs |
+ | Hand-written identity set | 330 | Hebrew + English, BrainboxAI persona |

+ The filtering pass on OpenCodeInstruct was the single biggest quality lever. Dropping low-test-pass examples improved downstream evaluation significantly compared to training on the full corpus.
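That filter reduces to a simple threshold over a per-example score. The sketch below uses a hypothetical record layout (the real OpenCodeInstruct schema may name its score field differently); it only illustrates the thresholding step:

```python
# Toy records standing in for instruction pairs scored by unit-test pass rate.
# Field names here are illustrative, not the dataset's real schema.
corpus = [
    {"instruction": "Reverse a linked list.", "pass_rate": 0.92},
    {"instruction": "Flaky solution.", "pass_rate": 0.31},
    {"instruction": "Merge two sorted lists.", "pass_rate": 0.55},
    {"instruction": "Broken regex helper.", "pass_rate": 0.50},
]

THRESHOLD = 0.5  # keep examples whose tests pass more than half the time

filtered = [ex for ex in corpus if ex["pass_rate"] > THRESHOLD]
print(len(filtered))  # → 2 (the 0.50 example is dropped by the strict inequality)
```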
+
+ See the [dataset card](https://huggingface.co/datasets/BrainboxAI/code-training-il) for full details.
+
+ ## Evaluation

+ Internal evaluation on structured coding tasks:
+
+ | Task | Examples | Passed | Notes |
+ |------|----------|--------|-------|
+ | **FizzBuzz** (via agentic loop) | 5 | 5/5 | Solved in 6 steps, zero correction rounds |
+ | **Binary search with 11 edge cases** | 11 | 11/11 | Including leftmost-duplicate handling |
+
+ Formal HumanEval / MBPP benchmarks have not yet been run publicly. Evaluation work is ongoing.
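For reference, this is the leftmost-duplicate behavior the edge-case suite probes: a standard lower-bound binary search written for this card as an illustrative correct solution, not model output:

```python
def binary_search_leftmost(arr, target):
    """Return the leftmost index of target in sorted arr, or -1 if absent."""
    lo, hi = 0, len(arr)  # hi is an exclusive bound
    while lo < hi:
        mid = (lo + hi) // 2
        if arr[mid] < target:
            lo = mid + 1
        else:
            hi = mid  # on a match, keep narrowing toward the left
    return lo if lo < len(arr) and arr[lo] == target else -1

print(binary_search_leftmost([1, 2, 2, 2, 3], 2))  # → 1 (leftmost of the duplicates)
print(binary_search_leftmost([1, 3, 5], 4))        # → -1
print(binary_search_leftmost([], 7))               # → -1
```

The exclusive upper bound (`hi = len(arr)`, `hi = mid` on a match) is what guarantees the leftmost occurrence; a half-open search that returns on the first equal element does not.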
+
+ ## Limitations
+
+ - **Small model.** 4B parameters is not frontier capability. Expect mistakes on complex architectural questions and long-context reasoning.
+ - **Two languages.** Strong on Python and TypeScript; weak on other languages.
+ - **No tool use out of the box.** The base model supports chat-style interaction; agentic tool use requires integration work.
+ - **Training cutoff.** Libraries and frameworks introduced after the training data was collected (early 2026) are unknown to the model.
+ - **Hallucination risk.** Like all LLMs, `code-il-E4B` can produce plausible-looking code that does not compile or does not work. Always test.
+
+ ## Formats available
+
+ - [**GGUF Q4_K_M** (~4 GB)](https://huggingface.co/BrainboxAI/code-il-E4B) — for Ollama, llama.cpp, LM Studio
+ - [**Safetensors 16-bit**](https://huggingface.co/BrainboxAI/code-il-E4B-safetensors) — for further fine-tuning and HF transformers
+
+ ## License
+
+ Apache 2.0. Use commercially, modify, and redistribute with attribution.
 
  ## Citation

  ```bibtex
+ @misc{elyasi2026codeil,
+   title = {Code-IL E4B: A Small, On-Device Coding Assistant for Private Environments},
+   author = {Elyasi, Netanel},
    year = {2026},
+   publisher = {BrainboxAI},
    howpublished = {\url{https://huggingface.co/BrainboxAI/code-il-E4B}},
+   note = {Fine-tuned from unsloth/gemma-4-E4B-it}
  }
  ```

+ ## Author

+ Built by [**Netanel Elyasi**](https://huggingface.co/BrainboxAI), founder of [BrainboxAI](https://brainboxai.io), an applied-AI studio focused on small, private, domain-specialized models.

+ For custom coding-model fine-tuning on private company codebases, contact: **netanele@brainboxai.io**.
+
+ ---

+ *Part of the BrainboxAI family of on-device models: see also [`law-il-E2B`](https://huggingface.co/BrainboxAI/law-il-E2B) (legal) and [`cyber-analyst-4B`](https://huggingface.co/BrainboxAI/cyber-analyst-4B) (security).*