	---
	library_name: transformers
	license: mit
	base_model: Qwen/Qwen3-4B-Instruct-2507
	tags:
	- code
	- agent
	- tool-calling
	- distillation
	- qwen3
	- ms-swift
	- codebase-analysis
	language:
	- en
	pipeline_tag: text-generation
	---

	<div align="center">
	<img src="assets/locotrainer.png" width="55%" alt="LocoTrainer" />
	</div>

	<br>

	<div align="center">

	[![PyPI](https://img.shields.io/badge/PyPI-3775A9?style=for-the-badge&logo=pypi&logoColor=white)](https://pypi.org/project/locotrainer/)
	[![MODEL](https://img.shields.io/badge/Model-FFB300?style=for-the-badge&logo=huggingface&logoColor=white)](https://huggingface.co/LocoreMind/LocoTrainer-4B)
	[![GGUF](https://img.shields.io/badge/GGUF-FF6F00?style=for-the-badge&logo=huggingface&logoColor=white)](https://huggingface.co/LocoreMind/LocoTrainer-4B-GGUF)
	[![Colab](https://img.shields.io/badge/Colab-F9AB00?style=for-the-badge&logo=googlecolab&logoColor=white)](https://colab.research.google.com/github/LocoreMind/LocoTrainer/blob/main/LocoTrainer_4B.ipynb)
	[![GitHub](https://img.shields.io/badge/GitHub-181717?style=for-the-badge&logo=github&logoColor=white)](https://github.com/LocoreMind/LocoTrainer)

	</div>

	## Introduction

	LocoTrainer-4B is a 4B-parameter MS-SWIFT domain expert agent trained via knowledge distillation from Qwen3-Coder-Next. Unlike general-purpose code agents, it combines multi-turn tool-calling with deep MS-SWIFT framework knowledge — enabling it to analyze codebases and generate comprehensive markdown reports without a separate reasoning model.

	## Demo

	<div align="center">
	<img src="assets/demo.gif" width="90%" alt="LocoTrainer Demo" />
	</div>

	LocoTrainer analyzing the MS-SWIFT codebase with the LocoTrainer-4B model served via vLLM.

	| | LocoTrainer-4B |
	|:--|:--|
	| Base Model | [Qwen3-4B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507) |
	| Teacher Model | Qwen3-Coder-Next |
	| Training Method | Full-parameter SFT (distillation) |
	| Training Data | 361,830 samples (agent trajectory + MS-SWIFT knowledge + project paths) |
	| Max Sequence Length | 32,768 tokens |
	| Training Hardware | 8x NVIDIA H100 80GB |
	| Training Time | ~25 hours |
	| Framework | MS-SWIFT |

	## Key Features

	- MS-SWIFT Domain Expert: Trained on MS-SWIFT documentation, CLI parameters, and project structure paths — answers framework questions accurately
	- Tool-Calling Agent: Generates structured `<tool_call>` JSON for Read, Grep, Glob, Bash, and Write tools
	- End-to-End Reports: From a single question to a complete, well-structured markdown analysis report
	- Long Context: Trained with a 32K-token (32,768) maximum sequence length, which covers about 90% of long-context analysis scenarios
	- Local Deployment: A GGUF quantized version is available for local inference with zero API cost
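
To make the tool-calling feature concrete, here is a minimal sketch of how a caller might extract `<tool_call>` blocks from the model's text output. It assumes each block wraps a JSON object with `name` and `arguments` keys (the convention used by Qwen-style chat templates). The sample output and the `pattern`/`path` argument names are hypothetical, written by hand for illustration rather than taken from the model.

```python
import json
import re

# Matches one <tool_call>...</tool_call> block; DOTALL lets the JSON span lines.
TOOL_CALL_RE = re.compile(r"<tool_call>\s*(.*?)\s*</tool_call>", re.DOTALL)


def parse_tool_calls(text):
    """Return a list of (name, arguments) tuples found in model output."""
    calls = []
    for match in TOOL_CALL_RE.finditer(text):
        try:
            payload = json.loads(match.group(1))
        except json.JSONDecodeError:
            continue  # skip malformed blocks instead of crashing the caller
        if not isinstance(payload, dict):
            continue  # a tool call must be a JSON object
        calls.append((payload.get("name"), payload.get("arguments", {})))
    return calls


# Hypothetical model output (assumed schema, not captured from a real run).
sample_output = (
    "I'll search for the LoRA defaults first.\n"
    "<tool_call>\n"
    '{"name": "Grep", "arguments": {"pattern": "lora_rank", '
    '"path": "/Users/developer/workspace/ms-swift"}}\n'
    "</tool_call>"
)

calls = parse_tool_calls(sample_output)
print(calls)
```

Skipping malformed blocks keeps a single bad generation from aborting a multi-turn run; a stricter caller could instead feed the parse error back to the model.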

	## Quick Start

	```python
	from transformers import AutoModelForCausalLM, AutoTokenizer

	model_name = "LocoreMind/LocoTrainer-4B"

	# Load the tokenizer and model; "auto" picks the dtype and device placement.
	tokenizer = AutoTokenizer.from_pretrained(model_name)
	model = AutoModelForCausalLM.from_pretrained(
	    model_name,
	    torch_dtype="auto",
	    device_map="auto",
	)

	# The system prompt sets the agent constraints; the user turn asks the question
	# and names the codebase to analyze and the report path.
	messages = [
	    {
	        "role": "system",
	        "content": "You are Claude Code, Anthropic's official CLI for Claude.\n\nYou are an interactive agent that helps users with software engineering tasks.\n\nCRITICAL CONSTRAINTS:\n1. ALWAYS use absolute file paths in tool calls.\n2. EFFICIENCY: Use multiple tool calls to explore the codebase.\n3. OUTPUT: Save your findings as a well-structured markdown document.\n\nENV: Working directory is /Users/developer/workspace (macOS, zsh).",
	    },
	    {
	        "role": "user",
	        "content": "What are the default LoRA settings in ms-swift?\n\nAnalyze the codebase at /Users/developer/workspace/ms-swift and save your findings as a well-structured markdown document to /Users/developer/workspace/output/output.md.",
	    },
	]

	# Render the conversation with the model's chat template.
	text = tokenizer.apply_chat_template(
	    messages,
	    tokenize=False,
	    add_generation_prompt=True,
	)
	model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

	# Generate a single assistant turn.
	generated_ids = model.generate(
	    **model_inputs,
	    max_new_tokens=1024,
	)
	# Drop the prompt tokens so only the newly generated tokens are decoded.
	output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()

	content = tokenizer.decode(output_ids, skip_special_tokens=True)
	print(content)
	```

	## LocoTrainer Framework

	LocoTrainer-4B is designed to run inside the LocoTrainer agent framework, which handles the full agent loop — tool execution, multi-turn conversation, and report generation.

	```bash
	pip install locotrainer

	locotrainer run -q "What are the default LoRA settings in ms-swift?"
	# → output/output.md
	```

	For full setup and usage, refer to the [GitHub repository](https://github.com/LocoreMind/LocoTrainer).
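
For readers curious what such an agent loop involves, the following is a rough sketch, not the LocoTrainer implementation. It alternates between a `generate` callable and local tool execution until the model stops emitting `<tool_call>` blocks. The `run_agent` and `make_stub` names, the `file_path` and `pattern` argument keys, and the use of a `tool` role for results are all assumptions made for illustration. A stub stands in for the model so the example runs without weights.

```python
import glob
import json
import re
import tempfile
from pathlib import Path

TOOL_CALL_RE = re.compile(r"<tool_call>\s*(.*?)\s*</tool_call>", re.DOTALL)

# Hypothetical tool implementations; the argument names are assumptions.
TOOLS = {
    "Read": lambda args: Path(args["file_path"]).read_text(),
    "Glob": lambda args: "\n".join(sorted(glob.glob(args["pattern"], recursive=True))),
}


def run_agent(generate, messages, max_turns=8):
    """Alternate between the model and the tools until no tool call is emitted."""
    for _ in range(max_turns):
        reply = generate(messages)
        messages.append({"role": "assistant", "content": reply})
        raw_calls = TOOL_CALL_RE.findall(reply)
        if not raw_calls:
            return reply  # no tool requested: treat the reply as final
        for raw in raw_calls:
            try:
                call = json.loads(raw)
                result = TOOLS[call["name"]](call["arguments"])
            except Exception as exc:  # report failures back to the model
                result = f"Error: {exc!r}"
            messages.append({"role": "tool", "content": str(result)})
    return None  # turn budget exhausted without a final answer


def make_stub(path):
    """Build a fake model: it requests one file read, then summarizes the result."""
    def generate(messages):
        if messages[-1]["role"] == "tool":
            return "Report: " + messages[-1]["content"]
        call = {"name": "Read", "arguments": {"file_path": str(path)}}
        return "<tool_call>\n" + json.dumps(call) + "\n</tool_call>"
    return generate


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "notes.txt"
    target.write_text("hello from the stub file")
    history = [{"role": "user", "content": "Summarize notes.txt"}]
    final = run_agent(make_stub(target), history)

print(final)
```

Capping the loop with `max_turns` is a simple guard against a model that never stops calling tools. Replacing the stub with a function that wraps `model.generate` would connect this loop to the actual model.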

	## Training Details

	| Parameter | Value |
	|:----------|:------|
	| Base model | Qwen3-4B-Instruct-2507 |
	| Teacher model | Qwen3-Coder-Next |
	| Method | Full-parameter SFT |
	| Training data | 361,830 samples |
	| Data composition | Agent trajectory + MS-SWIFT knowledge + project structure paths |
	| Hardware | 8x NVIDIA H100 80GB |
	| DeepSpeed | ZeRO-2 |
	| Precision | BF16 |
	| Epochs | 1 |
	| Max sequence length | 32,768 tokens |
	| Attention | Flash Attention 2 |
	| Kernel optimization | Liger Kernel |
	| Learning rate | 1e-5, warmup ratio 0.05 |
	| Batch size | 1/GPU, gradient accumulation 4 (effective batch 32) |
	| Template | qwen3_nothinking |
	| Framework | MS-SWIFT |
	| Training time | ~25 hours |
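
The effective batch size in the table follows from the other rows, and the same numbers give a rough estimate of the optimizer step count. The sketch below works through that arithmetic. It assumes every sample is its own sequence (no packing), that all 361,830 samples are used for training, and that warmup steps are rounded up. None of these assumptions is confirmed by the table, so the step counts are estimates only.

```python
import math

# Values taken from the table above.
num_gpus = 8
per_gpu_batch = 1
grad_accum = 4
num_samples = 361_830
warmup_ratio = 0.05

# Samples consumed per optimizer step across all GPUs.
effective_batch = num_gpus * per_gpu_batch * grad_accum

# Estimated optimizer steps for one epoch (assumes no sequence packing).
steps_per_epoch = math.ceil(num_samples / effective_batch)

# Estimated warmup length; the exact rounding depends on the trainer.
warmup_steps = math.ceil(steps_per_epoch * warmup_ratio)

print(effective_batch, steps_per_epoch, warmup_steps)
# → 32 11308 566
```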

	## Known Limitations

	- Specialized for MS-SWIFT; performance on unrelated codebases is untested
	- 4B parameters — complex multi-hop reasoning may require a larger model
	- MS-SWIFT project structure knowledge reflects the training data snapshot; may drift as the framework evolves

	## License

	MIT

	## Acknowledgments

	- [Qwen Team](https://huggingface.co/Qwen) for the Qwen3-4B-Instruct-2507 base model
	- [MS-SWIFT](https://github.com/modelscope/ms-swift) for the training framework and the codebase this model specializes in
	- [llama.cpp](https://github.com/ggerganov/llama.cpp) for efficient local inference
	- [Anthropic](https://www.anthropic.com/) for the Claude Code agent loop design that inspired this work