Models: 26,794

Active filters: 8-bit

GadflyII/GLM-4.7-Flash-NVFP4

Text Generation • 18B • Updated 6 days ago • 172k • 38

openai/gpt-oss-120b

Text Generation • 120B • Updated Aug 26, 2025 • 2.99M • 4.38k

openai/gpt-oss-20b

Text Generation • 22B • Updated Aug 26, 2025 • 6.65M • 4.24k

mlx-community/GLM-4.7-Flash-8bit

Text Generation • 30B • Updated about 20 hours ago • 4.88k • 15

mlx-community/Qwen3-TTS-12Hz-0.6B-CustomVoice-8bit

Text-to-Speech • 0.5B • Updated about 15 hours ago • 1.14k • 8

microsoft/bitnet-b1.58-2B-4T

Text Generation • 0.8B • Updated Dec 17, 2025 • 5.95k • 1.26k

MultiverseComputingCAI/HyperNova-60B

Text Generation • 60B • Updated 18 days ago • 1.45k • 48

mlx-community/GLM-4.7-Flash-8bit-gs32

Text Generation • 30B • Updated about 20 hours ago • 437 • 5

openai/gpt-oss-safeguard-20b

Text Generation • 22B • Updated 12 days ago • 12.8k • 182

AlicanKiraz0/Mihenk-LLM-14B-Turkish-Financial-Model-mlx-8Bit

15B • Updated 10 days ago • 29 • 7

NVFP4/Qwen3-Coder-30B-A3B-Instruct-FP4

Text Generation • 16B • Updated Aug 5, 2025 • 3.82k • 6

Salyut1/GLM-4.7-NVFP4

Text Generation • 177B • Updated Dec 23, 2025 • 4.97k • 10

nvidia/DeepSeek-V3.2-NVFP4

Text Generation • 394B • Updated 5 days ago • 1.07k • 3

LiquidAI/LFM2.5-1.2B-Thinking-MLX-8bit

Text Generation • 0.3B • Updated 10 days ago • 179 • 3

lmstudio-community/GLM-4.7-Flash-MLX-8bit

Text Generation • 30B • Updated 4 days ago • 292k • 3

mlx-community/Qwen3-TTS-12Hz-1.7B-VoiceDesign-8bit

Text-to-Speech • 0.8B • Updated about 14 hours ago • 754 • 3

ragraph-ai/stable-cypher-instruct-3b

Text Generation • 3B • Updated Jun 12, 2025 • 361 • 31

MaziyarPanahi/Qwen2.5-1.5B-Instruct-GGUF

Text Generation • 2B • Updated Sep 18, 2024 • 144k • 9

tiiuae/Falcon-E-3B-Instruct

Text Generation • 0.9B • Updated Oct 7, 2025 • 291 • 36

MaziyarPanahi/Qwen3-1.7B-GGUF

Text Generation • 2B • Updated Apr 28, 2025 • 220k • 6

drwlf/medgemma-4b-it-abliterated

Text Generation • Updated Jul 21, 2025 • 17 • 6

nvidia/Qwen3-30B-A3B-NVFP4

Text Generation • 16B • Updated Sep 10, 2025 • 32.5k • 21

nvidia/Qwen3-8B-NVFP4

Text Generation • 5B • Updated Sep 9, 2025 • 6.45k • 12

GY2233/Qwen2.5-32B-NVFP4A16

Text Generation • 19B • Updated Sep 16, 2025 • 2

FabioSarracino/VibeVoice-Large-Q8

Text-to-Audio • 9B • Updated Oct 1, 2025 • 2.71k • 78

mlx-community/DeepSeek-OCR-8bit

Image-Text-to-Text • 1B • Updated Oct 27, 2025 • 1.37k • 30

ig1/Qwen3-VL-30B-A3B-Instruct-NVFP4

Image-Text-to-Text • 18B • Updated 15 days ago • 2.2k • 5

Firworks/NVIDIA-Nemotron-3-Nano-30B-A3B-nvfp4

18B • Updated 20 days ago • 2.08k • 7

mlx-community/GLM-4.7-8bit

Text Generation • 353B • Updated Dec 23, 2025 • 1.19k • 4

Tengyunw/MiniMax-M2.1-NVFP4

Text Generation • 115B • Updated 19 days ago • 187 • 6