---
language:
- en
license: apache-2.0
tags:
- mobile
- edge-ai
- vision-language
- multimodal
- quantized
- gguf
pipeline_tag: image-text-to-text
---

# MiniCPM-V 2.6 - Mobile Vision-Language Model (GGUF)

**OpenBMB's MiniCPM-V 2.6**, a vision-language model that can SEE and THINK. Compressed for mobile deployment.

| Property | Value |
|----------|-------|
| **Base** | openbmb/MiniCPM-V-2_6 |
| **Parameters** | ~2.8 billion |
| **Size** | ~1.4 GB (GGUF) |
| **Format** | GGUF (llama.cpp) |
| **License** | Apache 2.0 |

## Why This Model?

Run multimodal AI (vision + language) on a phone. Image understanding, VQA, visual chatbots - all on-device.

## Performance

- ~18 tok/s on Samsung S20 FE CPU
- ~2.1 GB peak memory use
- ~93% quality retention vs base model

## Use Cases

- Visual Q&A on mobile devices
- Image captioning from camera photos
- Document understanding (scan + analyze)
- Multimodal chatbots
- Accessibility features (describe images)

## Quick Start

```bash
huggingface-cli download dispatchAI/MiniCPM-V-4.6-mobile --local-dir ./models
./build/bin/main -m ./models/model.gguf -p "Describe this image" --image photo.jpg
```
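
If you run the Quick Start command often, a small wrapper that checks its inputs first can save some debugging. The sketch below is illustrative only and is not part of this repository: the function name `describe_image` and the `LLAMA_BIN` and `MODEL` environment variables are hypothetical, and the defaults and flags are copied unchanged from the Quick Start command, so adjust them if your llama.cpp build uses a different binary name or options.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not shipped with the model: validates inputs, then runs
# the same command shown in Quick Start.
# LLAMA_BIN and MODEL are assumed variable names used only for this sketch.

describe_image() {
  local image="$1"
  local prompt="${2:-Describe this image}"
  local bin="${LLAMA_BIN:-./build/bin/main}"
  local model="${MODEL:-./models/model.gguf}"

  # Fail early with a clear message if any required file is missing.
  if [ ! -x "$bin" ]; then
    echo "error: llama.cpp binary not found or not executable: $bin" >&2
    return 1
  fi
  if [ ! -f "$model" ]; then
    echo "error: model file not found: $model" >&2
    return 1
  fi
  if [ ! -f "$image" ]; then
    echo "error: image not found: $image" >&2
    return 1
  fi

  # Same flags as the Quick Start command.
  "$bin" -m "$model" -p "$prompt" --image "$image"
}
```

After sourcing the script, `describe_image photo.jpg` uses the default prompt, and `describe_image photo.jpg "What text is in this scan?"` passes a custom one.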