File size: 5,598 Bytes
95c1d5f f8adfb4 95c1d5f f8adfb4 95c1d5f f8adfb4 95c1d5f f8adfb4 95c1d5f f8adfb4 95c1d5f f8adfb4 95c1d5f b07b0f2 95c1d5f f8adfb4 b07b0f2 f8adfb4 b07b0f2 f8adfb4 b07b0f2 f8adfb4 b07b0f2 f8adfb4 b07b0f2 f8adfb4 b07b0f2 f8adfb4 b07b0f2 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 | ---
language:
- ja
- en
license: apache-2.0
base_model: google/gemma-4-31b-it
tags:
- gemma4
- code
- agent
- japanese
- qlora
- react
- mcp
- claude-code
datasets:
- custom
pipeline_tag: text-generation
---
# gemma4-31b-ja-agent-coder
**Japanese-enhanced agentic coding model** — Fine-tuned gemma4-31b-it for autonomous coding agents with Japanese language support.
## Highlights
- **Agentic behavior**: ReAct reasoning, multi-step tool calling, self-correction
- **Japanese coding**: Code generation, review, debugging in Japanese
- **Claude Code compatible**: Designed as a local subagent for Claude Code via MCP
- **Function calling**: Native Ollama/OpenAI tool use format
- **Zero API cost**: Runs locally on 20GB+ VRAM
## Benchmark Results
Evaluated on 12 task categories across agentic coding capabilities. Each criterion is scored 0-1, averaged per category (scale 0-10).
| Category | Base (gemma4-31b-it) | Fine-tuned (v2) | Delta |
|----------|:---:|:---:|:---:|
| ReAct Tool Call | 10.0 | **10.0** | — |
| Function Calling | 8.0 | **10.0** | +2.0 |
| Multi-step ReAct | 8.0 | **10.0** | +2.0 |
| JP Code Gen (API) | 10.0 | **10.0** | — |
| JP Code Gen (Algorithm) | 10.0 | **10.0** | — |
| JP Code Gen (Database) | 9.0 | **10.0** | +1.0 |
| JP Debug (TypeError) | 10.0 | **10.0** | — |
| JP Debug (KeyError) | 10.0 | **10.0** | — |
| JP Code Review | 8.0 | **10.0** | +2.0 |
| JP Git Strategy | 10.0 | **10.0** | — |
| JP Self-correction | 10.0 | **10.0** | — |
| JP Documentation | 10.0 | **10.0** | — |
| **Overall** | **9.4** | **10.0** | **+0.6** |
### Key Improvements
- **Function Calling**: Clean `<tool_call>` JSON format output (base model adds extra explanation)
- **Multi-step ReAct**: Structured JSON reasoning with proper Thought/Action/Observation flow
- **Code Review**: Parameterized query suggestions for SQL injection fixes
- **Database CRUD**: Complete Create/Read/Update/Delete coverage
### Inference Test Results (v2 adapter)
| Test | Input | Result |
|------|-------|--------|
| ReAct | "Read src/main.py using read_file tool" | Correct JSON with thought + action |
| JP Code Gen | "FastAPIでヘルスチェックエンドポイントを作成" | Clean Python with `/healthz` endpoint |
| JP Debug | "TypeError: 'NoneType' is not subscriptable の原因と修正" | Japanese explanation + fix code |
| Function Calling | "Use read_file to read README.md" | Clean `<tool_call>` JSON format |
## Training Details
| Parameter | Value |
|-----------|-------|
| Base model | google/gemma-4-31b-it |
| Method | QLoRA (4-bit NF4) |
| LoRA rank | 16 |
| LoRA alpha | 32 |
| Target modules | q/k/v/o_proj, gate/up/down_proj |
| Trainable params | 133M / 31B (0.43%) |
| Training data | 1,546 custom samples (v2) |
| Epochs | 2 (3rd epoch interrupted, checkpoint-388 used) |
| Learning rate | 1.5e-4 (cosine) |
| Final loss | 0.98 |
| Token accuracy | 96.8% |
| Training time | ~1.5 hours |
| Hardware | NVIDIA RTX PRO 6000 (96GB VRAM) |
## Training Data Categories
| Category | Samples | Description |
|----------|---------|-------------|
| ReAct Tool Calling | ~120 | Single/chained tool calls |
| Multi-step Agentic Trajectory | ~100 | Plan→Tool→Observe→Correct→Answer loops |
| Self-correction | ~40 | Error recovery patterns |
| Function Calling | ~50 | Ollama native tool format |
| Japanese Code Generation | ~200 | JP instruction → Python/TS code |
| Japanese Code Review | ~100 | Security, refactoring, best practices |
| Japanese Error Explanation | ~80 | Error → JP diagnosis + fix |
| Japanese Comprehension | ~50 | Reading, reasoning, summarization |
| Debugging & Troubleshooting | ~100 | Error analysis → root cause → fix |
| Git & CI/CD | ~80 | Branch strategy, PR, GitHub Actions |
| Project Planning | ~80 | Requirements → task decomposition |
| Technical Documentation | ~80 | README, API docs, specs |
| Algorithms & Data Structures | ~200 | Binary search, DP, graph, sorting |
| Web Frameworks | ~200 | FastAPI, Django, React, Next.js |
| Database Operations | ~150 | SQLAlchemy, PostgreSQL, Redis |
| Testing & DevOps | ~150 | pytest, Docker, K8s, Terraform |
## Use with Ollama
```bash
# After GGUF conversion
ollama create gemma4-ja-agent-coder -f Modelfile
ollama run gemma4-ja-agent-coder
```
## Use with helix-agents (Claude Code MCP)
Reduce Claude Code API token consumption by delegating routine tasks to this local model.
```json
{
"mcpServers": {
"helix-agents": {
"command": "uv",
"args": ["run", "--directory", "/path/to/helix-agent", "python", "server.py"]
}
}
}
```
## Use with transformers
```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel
import torch
bnb = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_quant_type="nf4",
bnb_4bit_compute_dtype=torch.bfloat16)
base = AutoModelForCausalLM.from_pretrained("google/gemma-4-31b-it",
quantization_config=bnb, device_map="auto")
model = PeftModel.from_pretrained(base, "Tsunamayo7/gemma4-31b-ja-agent-coder")
tokenizer = AutoTokenizer.from_pretrained("Tsunamayo7/gemma4-31b-ja-agent-coder")
```
> **Note**: Gemma4 uses `Gemma4ClippableLinear` which requires a PEFT monkey-patch. See [this gist](https://gist.github.com/) for the workaround.
## License
Apache 2.0 (same as base model)
## Author
[tsunamayo7](https://github.com/tsunamayo7) — Builder of [helix-agents](https://github.com/tsunamayo7/helix-agents), a local LLM delegation framework for Claude Code.
|