---
language:
- en
license: apache-2.0
tags:
- mobile
- edge-ai
- vision-language
- multimodal
- quantized
- gguf
pipeline_tag: image-text-to-text
---
# MiniCPM-V 2.6 - Mobile Vision-Language Model (GGUF)
**OpenBMB's MiniCPM-V 2.6** is a vision-language model that takes images and text as input and generates text. This build is quantized to GGUF for mobile deployment.
| Property | Value |
|----------|-------|
| **Base** | openbmb/MiniCPM-V-2_6 |
| **Parameters** | ~2.8 billion |
| **Size** | ~1.4 GB (GGUF) |
| **Format** | GGUF (llama.cpp) |
| **License** | Apache 2.0 |
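
As a rough sanity check on the table above, the file size and parameter count imply an average number of bits stored per weight. The sketch below assumes 1 GB = 10^9 bytes and treats both table figures as exact, which they are not; it ignores metadata and any tensors kept at higher precision.

```bash
# Approximate figures taken from the table above
params_billion=2.8
size_gb=1.4

# bits per weight = (size in bytes * 8) / parameter count
# The 10^9 factors in GB and "billion" cancel, so they are omitted.
bits_per_weight=$(awk -v s="$size_gb" -v p="$params_billion" \
  'BEGIN { printf "%.1f", (s * 8) / p }')

echo "approx. bits per weight: $bits_per_weight"
```

With these inputs the result is about 4 bits per weight, which is in the range typical of 4-bit GGUF quantization.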
## Why This Model?
Run multimodal (vision + language) inference directly on a phone: image understanding, visual question answering (VQA), and visual chatbots, all on-device.
## Performance
- ~18 tokens/s on a Samsung Galaxy S20 FE (CPU)
- ~2.1 GB peak memory use
- ~93% quality retention vs. the base model
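
To put the throughput figure in practical terms, the sketch below estimates how long a reply would take to generate. It assumes the ~18 tokens/s figure is the text generation rate and uses a hypothetical reply length of 128 tokens; image encoding and prompt processing time are not included.

```bash
tokens_per_second=18   # approximate figure from the list above
response_tokens=128    # hypothetical reply length, chosen for illustration

# generation time = tokens to produce / tokens per second
seconds=$(awk -v n="$response_tokens" -v r="$tokens_per_second" \
  'BEGIN { printf "%.1f", n / r }')

echo "approx. generation time: ${seconds} s"
```

Under these assumptions a 128-token answer takes roughly 7 seconds.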
## Use Cases
- Visual Q&A on mobile devices
- Image captioning from camera photos
- Document understanding (scan + analyze)
- Multimodal chatbots
- Accessibility features (describe images)
## Quick Start
```bash
huggingface-cli download dispatchAI/MiniCPM-V-4.6-mobile --local-dir ./models
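# Image input needs llama.cpp's multimodal CLI and a projector file; mmproj.gguf is an assumed filename
./build/bin/llama-mtmd-cli -m ./models/model.gguf --mmproj ./models/mmproj.gguf --image photo.jpg -p "Describe this image"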
```
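
Before running inference, it can help to confirm that the download produced a real GGUF file rather than an error page or a partial file. GGUF files begin with the 4-byte magic `GGUF`, so a minimal check might look like the sketch below. The helper name `is_gguf` is hypothetical, and `./models/model.gguf` is simply the path used in the command above.

```bash
#!/usr/bin/env bash

# Return 0 if the file exists and starts with the 4-byte magic "GGUF", else 1.
is_gguf() {
  local path="$1"
  [ -f "$path" ] || return 1
  # Read only the first 4 bytes; strip NULs so non-GGUF binaries do not trigger shell warnings
  local magic
  magic=$(head -c 4 "$path" | tr -d '\0')
  [ "$magic" = "GGUF" ]
}

if is_gguf ./models/model.gguf; then
  echo "model.gguf looks like a GGUF file"
else
  echo "model.gguf is missing or not a GGUF file" >&2
fi
```

This only inspects the header; it does not validate the tensors or confirm that the file is complete.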