How to use malusama/M2-Encoder-0.4B with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("zero-shot-image-classification", model="malusama/M2-Encoder-0.4B", trust_remote_code=True)
pipe(
    "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/hub/parrots.png",
    candidate_labels=["animals", "humans", "landscape"],
)

# Load model directly
from transformers import AutoModel

model = AutoModel.from_pretrained("malusama/M2-Encoder-0.4B", trust_remote_code=True, dtype="auto")
from .utils import (
    inception_normalize,
    MinMaxResize,
)
from torchvision import transforms
from .randaug import RandAugment


def pixelbert_transform(size=800):
    # Scale the longer-side bound with size, keeping the 1333:800 ratio.
    longer = int((1333 / 800) * size)
    return transforms.Compose(
        [
            MinMaxResize(shorter=size, longer=longer),
            transforms.ToTensor(),
            inception_normalize,
        ]
    )


def pixelbert_transform_randaug(size=800):
    # Same pipeline as pixelbert_transform, plus RandAugment.
    longer = int((1333 / 800) * size)
    trs = transforms.Compose(
        [
            MinMaxResize(shorter=size, longer=longer),
            transforms.ToTensor(),
            inception_normalize,
        ]
    )
    # Inserted at index 0, so RandAugment(2, 9) runs before resizing and tensor conversion.
    trs.transforms.insert(0, RandAugment(2, 9))
    return trs
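The sizing arithmetic in both transforms can be checked in isolation. The sketch below reproduces the `longer` computation exactly as written above, and adds a hypothetical `min_max_target_size` helper showing one plausible reading of the `shorter`/`longer` arguments. `MinMaxResize` is defined in `.utils` and is not shown here, so the helper is an assumption about its behaviour (scale the shorter side to `shorter`, then shrink if the longer side exceeds `longer`), not the module's implementation.

```python
def longer_side(size: int) -> int:
    """Longer-side bound used by the transforms: int((1333 / 800) * size)."""
    return int((1333 / 800) * size)


def min_max_target_size(width: int, height: int, shorter: int, longer: int):
    """Hypothetical sizing rule (an assumption, not taken from .utils).

    Scales the image so its shorter side equals `shorter`; if the longer
    side then exceeds `longer`, scales down again so it fits.
    Returns the rounded (width, height).
    """
    scale = shorter / min(width, height)
    new_w, new_h = width * scale, height * scale
    if max(new_w, new_h) > longer:
        scale = longer / max(new_w, new_h)
        new_w, new_h = new_w * scale, new_h * scale
    return int(new_w + 0.5), int(new_h + 0.5)


if __name__ == "__main__":
    for size in (224, 384):
        print(size, longer_side(size))  # 224 -> 373, 384 -> 639
    # A 640x480 image with size=224 keeps its aspect ratio under the bound.
    print(min_max_target_size(640, 480, 224, longer_side(224)))  # (299, 224)
```

Because `int()` truncates, the bound is rounded down: `size=224` gives 373 rather than 373.24. Any further rounding the real `MinMaxResize` applies (for example to a patch-size multiple) is not modelled here.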