Qwen 2.5 0.5B Instruct - Mobile INT4 (GGUF)

Alibaba's Qwen 2.5 0.5B Instruct, the smallest capable general-purpose model. Incredibly fast on phones.

Property Value
Base Qwen/Qwen2.5-0.5B-Instruct
Parameters 494 million
Quantization INT4 GGUF
Size ~398 MB
License Apache 2.0

Performance

  • ~45 tok/s on Samsung S20 FE CPU (fastest in our collection!)
  • ~0.7 GB memory footprint
  • Fits on ANY modern smartphone
  • ~94% quality retention

Use Cases

  • Code generation on mobile IDEs
  • Quick text classification / extraction
  • Embedded assistants in apps
  • Ultra-low-latency responses (<50ms per token)
  • Batch processing at massive scale

Quick Start

huggingface-cli download dispatchAI/Qwen2.5-0.5B-Instruct-mobile-int4 --local-dir ./models
./build/bin/main -m ./models/model.gguf -p "Explain quantum computing simply." -n 128 -t 4
Downloads last month
855
GGUF
Model size
0.6B params
Architecture
qwen2
Hardware compatibility
Log In to add your hardware

We're not able to determine the quantization variants.

Inference Providers NEW
This model isn't deployed by any Inference Provider. ๐Ÿ™‹ Ask for provider support

Spaces using dispatchAI/Qwen2.5-0.5B-Instruct-mobile-int4 3

Collections including dispatchAI/Qwen2.5-0.5B-Instruct-mobile-int4