Model Overview
- Model Architecture: Kimi-K2.6
- Input: Text, Image
- Output: Text
- Supported Hardware Microarchitecture: AMD MI350/MI355
- ROCm: 7.2.1
- PyTorch: 2.10
- Transformers: 5.5.4
- Operating System(s): Linux
- Inference Engine: SGLang/vLLM
- Model Optimizer: AMD-Quark (v0.11.1)
- Quantized layers: experts, shared_experts
- Weight quantization: OCP MXFP4, Static
- Activation quantization: OCP MXFP4, Dynamic
- Calibration Dataset: Pile
This model was built from the Kimi-K2.6 model by applying AMD-Quark for MXFP4 quantization.
Model Quantization
The model was quantized from a BF16-decompressed version of moonshotai/Kimi-K2.6 using AMD-Quark. The original checkpoint uses native INT4 (compressed-tensors) quantization, so it was first decompressed to BF16 before MXFP4 quantization was applied. Both weights and activations of the expert and shared-expert layers are quantized to MXFP4; the layers listed in exclude_layers below are left unquantized.
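To make the MXFP4 format concrete, here is a minimal sketch of block-wise fake quantization along the lines of the OCP Microscaling specification: 32 elements share one power-of-two (E8M0) scale, and each element is stored as FP4 E2M1. It is illustrative only and is not AMD-Quark's implementation. The function name `quantize_block` is hypothetical, and ties are rounded toward the smaller magnitude for simplicity rather than to nearest-even.

```python
import math

# Magnitudes representable by FP4 E2M1 (the sign is handled separately).
E2M1_VALUES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
E2M1_EMAX = 2      # exponent of the largest E2M1 binade (4.0 == 2**2)
BLOCK_SIZE = 32    # number of elements sharing one scale in MXFP4


def quantize_block(block):
    """Fake-quantize one block; return (shared scale, dequantized values)."""
    amax = max(abs(v) for v in block)
    if amax == 0.0:
        return 1.0, [0.0] * len(block)
    # frexp gives amax = m * 2**e with 0.5 <= m < 1, so floor(log2(amax)) == e - 1.
    _, e = math.frexp(amax)
    # Power-of-two scale that aligns the block maximum with the top E2M1 binade.
    scale = 2.0 ** (e - 1 - E2M1_EMAX)
    out = []
    for v in block:
        mag = min(abs(v) / scale, E2M1_VALUES[-1])        # saturate at 6.0
        q = min(E2M1_VALUES, key=lambda c: abs(c - mag))  # nearest code
        out.append(math.copysign(q, v) * scale)
    return scale, out


block = [1.0, -3.0, 0.3, 7.0] + [0.0] * (BLOCK_SIZE - 4)
scale, dequantized = quantize_block(block)
print(scale, dequantized[:4])  # 1.0 [1.0, -3.0, 0.5, 6.0]
```

With one 8-bit scale per 32 four-bit elements, the format costs 4.25 bits per value. "Static" weight quantization means the scales are computed once offline, while "Dynamic" activation quantization computes them at inference time.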
Quantization scripts:
cd Quark/examples/torch/language_modeling/llm_ptq/
exclude_layers="*self_attn* *mlp.gate *mlp.gate.linear *lm_head *mlp.gate_proj *mlp.up_proj *mlp.down_proj *mm_projector* *vision_tower*"
python quantize_quark.py \
--model_dir /path/to/Kimi-K2.6-bf16 \
--quant_scheme mxfp4 \
--exclude_layers $exclude_layers \
--output_dir amd/Kimi-K2.6-MXFP4 \
--model_export hf_format \
--file2file_quantization
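The exclude_layers patterns above are shell-style globs. As a rough sketch of which modules they leave for quantization, the snippet below matches them against a few module names using Python's `fnmatch`. The module names are hypothetical examples, not taken from the checkpoint, and AMD-Quark's own matching logic may differ in detail.

```python
from fnmatch import fnmatchcase

# Patterns copied from the exclude_layers variable in the quantization script.
EXCLUDE = (
    "*self_attn* *mlp.gate *mlp.gate.linear *lm_head *mlp.gate_proj "
    "*mlp.up_proj *mlp.down_proj *mm_projector* *vision_tower*"
).split()


def is_excluded(name):
    """Return True if any exclude pattern matches the module name."""
    return any(fnmatchcase(name, pattern) for pattern in EXCLUDE)


# Hypothetical module names, for illustration only.
names = [
    "model.layers.0.self_attn.o_proj",           # attention
    "model.layers.0.mlp.gate_proj",              # dense MLP
    "model.layers.1.mlp.gate",                   # MoE router
    "model.layers.1.mlp.experts.0.gate_proj",    # routed expert
    "model.layers.1.mlp.shared_experts.up_proj", # shared expert
    "lm_head",
]
for name in names:
    status = "excluded" if is_excluded(name) else "quantized"
    print(f"{name}: {status}")
```

Under these assumptions only the routed-expert and shared-expert projections remain quantized, which agrees with the "Quantized layers" entry in the overview.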
Deployment
Use with vLLM
This model can be deployed efficiently using the vLLM backend.
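As one possible starting point, the sketch below uses vLLM's offline Python API. The engine arguments simply mirror those in the evaluation command further down this card (tensor_parallel_size=4, trust_remote_code=True); they are an assumption about a workable configuration, not a tuned or officially recommended one. The helper name `generate` is hypothetical.

```python
# Engine arguments mirror the lm_eval command used for evaluation in this card.
ENGINE_KWARGS = {
    "model": "amd/Kimi-K2.6-MXFP4",
    "tensor_parallel_size": 4,
    "trust_remote_code": True,
}


def generate(prompts, max_tokens=128):
    """Run offline generation; requires vLLM and suitable GPUs."""
    # Imported lazily because vLLM is a heavy, hardware-specific dependency.
    from vllm import LLM, SamplingParams

    llm = LLM(**ENGINE_KWARGS)
    params = SamplingParams(temperature=0.0, max_tokens=max_tokens)
    outputs = llm.generate(prompts, params)
    # Each RequestOutput holds one or more completions; take the first.
    return [out.outputs[0].text for out in outputs]
```

On a machine with vLLM installed, calling `generate(["What is 12 * 7?"])` would load the model and return a list with one completion string.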
Evaluation
The model was evaluated on the GSM8K benchmark using the vLLM framework.
Accuracy
| Benchmark | Kimi-K2.6 | Kimi-K2.6-MXFP4 (this model) | Recovery |
|---|---|---|---|
| GSM8K (flexible-extract) | 0.9393 | 0.9318 | 99.2% |
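The Recovery column is the quantized score expressed as a percentage of the baseline score. A quick check using the two values from the table:

```python
baseline = 0.9393   # Kimi-K2.6, GSM8K flexible-extract
quantized = 0.9318  # Kimi-K2.6-MXFP4, GSM8K flexible-extract

recovery = quantized / baseline * 100
print(f"{recovery:.1f}%")  # 99.2%
```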
Reproduction
The GSM8K results were obtained with vLLM using the Docker image rocm/vllm-dev:nightly_main_20260417, which ships with vLLM (version 0.19.1rc1.dev369+gb1dc87a09) pre-installed. lm-eval and amd-quark were compiled and installed from source.
lm_eval \
--model vllm \
--model_args pretrained=amd/Kimi-K2.6-MXFP4,trust_remote_code=True,tensor_parallel_size=4 \
--tasks gsm8k \
--batch_size auto
License
Modifications Copyright (c) 2026 Advanced Micro Devices, Inc. All rights reserved.