Difference between this and patch-32
#2
by sachin - opened
As the title suggests, what are the main differences between the different patch-X models? This one seems to be only 5 MB smaller.
Same question.
- Smaller patches (e.g. 16x16) can capture image detail more finely.
- Larger patches (e.g. 32x32) lose some local detail information, so performance may be slightly worse on tasks that require high detail resolution.
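To put rough numbers on this, here is a back-of-the-envelope sketch. It assumes a 224x224 input and a ViT-B hidden width of 768 for both checkpoints, and that the only patch-size-dependent weights are the patch-embedding projection (without bias) and the position embeddings; `vit_patch_stats` is a hypothetical helper written for this post, not part of any library.

```python
def vit_patch_stats(image_size: int, patch_size: int, width: int, channels: int = 3) -> dict:
    """Count patches and patch-size-dependent parameters for a ViT-style encoder.

    Assumptions (not taken from the thread): square input, no bias in the
    patch projection, one extra position for the class token.
    """
    patches_per_side = image_size // patch_size
    num_patches = patches_per_side ** 2
    # Each patch (channels * patch_size * patch_size values) is projected to `width`.
    patch_embed_params = channels * patch_size * patch_size * width
    # One learned position vector per patch, plus one for the class token.
    pos_embed_params = (num_patches + 1) * width
    return {
        "num_patches": num_patches,
        "patch_embed_params": patch_embed_params,
        "pos_embed_params": pos_embed_params,
        "total": patch_embed_params + pos_embed_params,
    }


p16 = vit_patch_stats(image_size=224, patch_size=16, width=768)
p32 = vit_patch_stats(image_size=224, patch_size=32, width=768)

print(p16["num_patches"])  # 196 patches, each covering 16x16 pixels
print(p32["num_patches"])  # 49 patches, each covering 32x32 pixels

# Difference in patch-size-dependent parameters, and its size at 4 bytes per weight.
param_diff = p32["total"] - p16["total"]
print(param_diff, param_diff * 4 / 1e6)  # 1656576 parameters, about 6.6 MB in float32
```

Under these assumptions the patch-16 model sees four times as many tokens per image (196 vs 49), which is where the finer detail and the extra compute come from, while the transformer layers themselves are the same size. The small gap in file size would then come almost entirely from the larger patch projection in the patch-32 model.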