YAML Metadata Warning:empty or missing yaml metadata in repo card
Check out the documentation for more information.
MiniMax M2.7 β Fixed Chat Template
A community-fixed Jinja2 chat template for MiniMax M2.7 (all quant formats β GGUF, MLX, etc.) that resolves tool calling failures, thinking block corruption, and improves multi-turn agent reliability.
Drop-in replacement for the default tokenizer_config.json chat template.
Why This Fix?
The stock MiniMax M2.7 chat template has several issues that cause tool calling to silently break, especially in agentic coding workflows (Kilo Code, Continue, OpenClaw, etc.). These failures don't produce errors β the model just generates malformed output that the harness can't parse.
This fixed template resolves all known issues while maintaining full backward compatibility with the original MiniMax M2.7 special token format.
Changes from Stock Template
Critical Fixes
| Issue | Stock Behavior | Fixed Behavior |
|---|---|---|
Unclosed <think> before tool calls |
Model reasoning leaks into <minimax:tool_call> XML, corrupting structured output |
Auto-detects unclosed <think> tags and inserts </think> before the first tool call |
| No tool calling instructions | System prompt only shows format example, no behavioral rules | Injects strict rules: close thinking first, reasoning before not after tool calls, include all required params, preserve user strings verbatim |
| String-form tool arguments | Assumes arguments is always a dict β breaks with TypeError when model outputs JSON string |
New render_tool_arguments macro parses string-form JSON into proper key-value parameters |
Only </think> recognized |
</thinking> variant is silently ignored, leaving reasoning content in the response body |
Both </think> and </thinking> are recognized as valid close tags |
Quality of Life Improvements
| Feature | Description |
|---|---|
| Think toggles | <|think_on|> / <|think_off|> in system, user, or assistant messages to control thinking per-message |
| Developer role | Supports developer role as first message (same as system but distinct for frameworks that use it) |
| Historical reasoning hidden | Previous turns' <think> blocks are stripped from context by default, saving tokens. Only the current generation's thinking is shown. Set preserve_thinking=true to show all. |
| LM Studio compatible | Uses is iterable instead of is sequence for compatibility with LM Studio's limited Jinja runtime |
| Better error messages | Clearer exceptions for unexpected argument types and role ordering |
Preserved from Original
- MiniMax special tokens:
]~!b[,]~b],[e~[(BOS, role start, role end) <minimax:tool_call>/</minimax:tool_call>tool call delimiters<invoke>/<parameter>XML structure for tool calls<response>wrapping for tool results with multi-response grouping- Default model identity string
current_dateandcurrent_locationsystem message fieldsensure_ascii=Falsefor non-ASCII safe JSON rendering
Installation
Option 1: Replace in tokenizer_config.json
Copy the contents of minimax_m2.7_fixed_template.jinja into the chat_template field of your model's tokenizer_config.json.
Option 2: LM Studio
Go to My Models β model settings β Prompt Template and paste the template contents.
Option 3: llama.cpp / llama-server
llama-server -m your-model.gguf --chat-template-file minimax_m2.7_fixed_template.jinja
Option 4: ik_llama.cpp
ik_llama_server -m your-model.gguf --chat-template-file minimax_m2.7_fixed_template.jinja
Tested With
- MiniMax M2.7 GGUF Q8 (Unsloth) via llama.cpp / ik_llama.cpp on M3 Ultra 512GB β 21 t/s generation, clean tool calling
- Kilo Code (VS Code) β repeated code edits with no tool call corruption
- LM Studio β template loads and renders without errors
- Hermes agent framework β multi-turn agentic workflows stable
Credits
Template fixes by Hunterx, based on the community fix pattern established by:
- fakezeta's merged Qwen 3.6 template
- allanchan339's vLLM Qwen chat template fix
- froggeric's Qwen Fixed Chat Templates
Related
License
Same license as the original MiniMax M2.7 model.