---
language:
- en
license: apache-2.0
tags:
- mobile
- edge-ai
- vision-language
- multimodal
- quantized
- gguf
pipeline_tag: image-text-to-text
---

# MiniCPM-V 2.6 - Mobile Vision-Language Model (GGUF)

**OpenBMB's MiniCPM-V 2.6** is a vision-language model that takes images and text as input and generates text. This repository provides a GGUF build compressed for mobile deployment.

| Property | Value |
|----------|-------|
| **Base** | openbmb/MiniCPM-V-2_6 |
| **Parameters** | ~2.8 billion |
| **Size** | ~1.4 GB (GGUF) |
| **Format** | GGUF (llama.cpp) |
| **License** | Apache 2.0 |
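
As a rough sanity check, the size and parameter count in the table imply an average number of bits stored per weight. The snippet below is a minimal sketch, assuming decimal gigabytes (1 GB = 10^9 bytes) and that the file is dominated by weights; it ignores metadata and any separate vision projector. The `bits_per_weight` helper is a hypothetical name used only for illustration.

```bash
# Figures taken from the table above; both are approximate.
PARAMS=2.8e9      # parameter count
SIZE_BYTES=1.4e9  # file size, assuming 1 GB = 10^9 bytes

# Average bits per weight = file size in bits / parameter count.
bits_per_weight() {
  awk -v p="$1" -v s="$2" 'BEGIN { printf "%.1f\n", s * 8 / p }'
}

bits_per_weight "$PARAMS" "$SIZE_BYTES"   # prints 4.0
```

A result near 4 bits per weight would be consistent with a 4-bit quantization, though the table's figures are too coarse to identify a specific scheme.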

## Why This Model?

Run vision-language inference entirely on a phone: image understanding, visual question answering (VQA), and visual chatbots, all on-device.

## Performance

- ~18 tokens/s on a Samsung Galaxy S20 FE (CPU)
- ~2.1 GB peak memory use
- ~93% quality retention vs. the base model
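
Given the ~2.1 GB peak memory figure, it can help to check free memory before loading the model. The following is a minimal sketch, assuming a Linux or Android shell (for example Termux) where `/proc/meminfo` is readable; the function names and the threshold with no safety margin are illustrative assumptions, not part of this repository.

```bash
#!/usr/bin/env bash
# Required free memory in kB: ~2.1 GB peak (2.1 * 1024 * 1024 kB), no safety margin.
REQUIRED_KB=2202009

# Print the MemAvailable value (kB) from a /proc/meminfo-style file.
mem_available_kb() {
  awk '/^MemAvailable:/ { print $2 }' "${1:-/proc/meminfo}"
}

# Succeed when available memory covers the model's reported peak usage.
has_enough_memory() {
  local avail
  avail="$(mem_available_kb "${1:-/proc/meminfo}")"
  [ -n "$avail" ] && [ "$avail" -ge "$REQUIRED_KB" ]
}

if has_enough_memory; then
  echo "OK: enough free memory to load the model"
else
  echo "WARNING: less than ~2.1 GB available; loading may fail or swap"
fi
```

Both functions accept an optional path argument, so the check can be exercised against a saved copy of `/proc/meminfo` from the target device.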

## Use Cases

- Visual Q&A on mobile devices
- Image captioning from camera photos
- Document understanding (scan + analyze)
- Multimodal chatbots
- Accessibility features (describe images)

## Quick Start

```bash
huggingface-cli download dispatchAI/MiniCPM-V-4.6-mobile --local-dir ./models
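# Image input in llama.cpp goes through the multimodal CLI plus a vision projector; mmproj.gguf is an assumed filename.
./build/bin/llama-mtmd-cli -m ./models/model.gguf --mmproj ./models/mmproj.gguf --image photo.jpg -p "Describe this image"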
```
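
After downloading, a quick way to catch a truncated or mislabeled file is to check for the GGUF magic bytes: a GGUF file begins with the four ASCII characters `GGUF`. This is a minimal sketch; the `is_gguf` helper is a hypothetical name, and the default path simply mirrors the one used in the commands above.

```bash
#!/usr/bin/env bash
# Succeed if the file exists and starts with the 4-byte GGUF magic ("GGUF").
is_gguf() {
  [ -f "$1" ] && [ "$(head -c 4 "$1" 2>/dev/null)" = "GGUF" ]
}

MODEL="${1:-./models/model.gguf}"
if is_gguf "$MODEL"; then
  echo "$MODEL looks like a GGUF file"
else
  echo "$MODEL is missing or is not a GGUF file" >&2
fi
```

This only inspects the header, so it does not verify that the download is complete or that the tensors are intact.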