MiniMax-M3-FP8-dynamic

Model Overview

This model is an FP8 dynamic quantized version of MiniMaxAI/MiniMax-M3.

  • Base model: MiniMaxAI/MiniMax-M3
  • Optimization: FP8 dynamic quantization
  • Format: safetensors / compressed-tensors
  • Validated runtime: vLLM OpenAI-compatible server
  • Tested hardware: AMD MI350, tensor parallel size 8

MiniMax-M3 is a native multimodal MoE model. The original model card describes it as a ~428B parameter model with ~23B activated parameters and 1M context support.

License

This quantized checkpoint follows the license terms of the base model, MiniMaxAI/MiniMax-M3. The Hugging Face model-card metadata uses license: other because the MiniMax community license is not one of the Hub's enumerated license identifiers.

Model Optimizations

This checkpoint uses FP8 dynamic quantization to reduce memory and disk requirements while preserving model quality. Validation below compares this quantized checkpoint against the BF16 MiniMaxAI/MiniMax-M3 baseline.

Evaluation

The model was evaluated against BF16 MiniMaxAI/MiniMax-M3. Scores are averaged across seeds.

Benchmark MiniMaxAI/MiniMax-M3 EmbeddedLLM/MiniMax-M3-FP8-dynamic Recovery (%)
GSM8k Platinum 95.81 95.92 100.12
IfEval 80.65 79.42 98.47
AIME 2025 20.83 19.17 92.00
GPQA diamond 77.78 77.95 100.22
Math 500 81.20 79.93 98.44
Lcb Codegeneration V6 37.14 35.62 95.90
MMLU Pro Chat 79.85 79.62 99.72

Evaluation Setup

  • Standard seeds: 42, 1234, 4158
  • AIME 2025 seeds: 42, 1234, 4158, 5322, 1356, 9843, 3344, 5678
  • GSM8K Platinum cap: max_gen_toks=64000
  • IFEval, AIME, GPQA, Math 500, MMLU Pro Chat cap: max_gen_toks=4096
  • LiveCodeBench v6 cap: max_gen_toks=2048
  • MiniMax thinking mode: disabled
  • Runners: lm-eval harness and lighteval through LiteLLM endpoint mode
Downloads last month
-
Safetensors
Model size
427B params
Tensor type
BF16
·
F8_E4M3
·
F32
·
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for EmbeddedLLM/MiniMax-M3-FP8-dynamic

Quantized
(35)
this model