moondream/md3p-int4


MD3 Preview - Int4 Quantized (MLX)

Pre-quantized version of Moondream 3 Preview for MLX inference.

Quantization Details

MoE Experts: int4 affine quantization (bits=4, group_size=64)
Other weights: bf16 (unchanged)
Memory savings: ~60% reduction in MoE weight memory
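
As a rough illustration of what "int4 affine quantization (bits=4, group_size=64)" means, here is a minimal pure-Python sketch of group-wise affine quantization. It follows the general scheme described in the MLX documentation for affine quantization: each group of 64 weights gets its own scale and bias, and each weight is stored as a 4-bit code. The function names (`quantize_group`, `dequantize_row`, and so on) are hypothetical and exist only for this example. The sketch does not reproduce how MLX packs codes into integers or how this checkpoint was actually produced.

```python
GROUP_SIZE = 64            # from the card: group_size=64
BITS = 4                   # from the card: bits=4
LEVELS = (1 << BITS) - 1   # 15, the largest 4-bit code


def quantize_group(weights):
    """Map one group of floats to 4-bit codes plus a (scale, bias) pair."""
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / LEVELS
    if scale == 0.0:
        # Constant group: any non-zero scale works, every code becomes 0.
        scale = 1.0
    codes = [min(LEVELS, max(0, round((w - lo) / scale))) for w in weights]
    return codes, scale, lo


def dequantize_group(codes, scale, bias):
    """Reconstruct approximate floats: w ~ code * scale + bias."""
    return [c * scale + bias for c in codes]


def quantize_row(row):
    """Quantize a row whose length is a multiple of GROUP_SIZE, group by group."""
    if len(row) % GROUP_SIZE != 0:
        raise ValueError("row length must be a multiple of GROUP_SIZE")
    return [
        quantize_group(row[i:i + GROUP_SIZE])
        for i in range(0, len(row), GROUP_SIZE)
    ]


def dequantize_row(groups):
    """Concatenate the dequantized groups back into one row."""
    restored = []
    for codes, scale, bias in groups:
        restored.extend(dequantize_group(codes, scale, bias))
    return restored


if __name__ == "__main__":
    # Hypothetical toy weights, not taken from the model.
    row = [i / 100.0 for i in range(-64, 64)]
    groups = quantize_row(row)
    restored = dequantize_row(groups)
    worst = max(abs(a - b) for a, b in zip(row, restored))
    print(f"groups: {len(groups)}, worst absolute error: {worst:.6f}")
```

Within each group the reconstruction error is bounded by half of that group's scale, which is why a smaller group size trades extra scale and bias storage for lower error.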

Source

Quantized from moondream/moondream3-preview

