---
license: cc-by-nc-4.0
language:
- en
base_model:
- facebook/metaclip-2-worldwide-s16
pipeline_tag: image-classification
library_name: transformers
tags:
- text-generation-inference
- gender-identifier
---

![1](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/ZGDIt2wcXHzJkXnIVy3Bw.png)

# **MetaCLIP-2-Gender-Identifier**

> **MetaCLIP-2-Gender-Identifier** is an image classification model fine-tuned from the vision-language encoder **[facebook/metaclip-2-worldwide-s16](https://huggingface.co/facebook/metaclip-2-worldwide-s16)** for a single-label classification task.
> It predicts the gender of a person from an image using the **MetaClip2ForImageClassification** architecture.

>[!note]
MetaCLIP 2: A Worldwide Scaling Recipe: https://huggingface.co/papers/2507.22062

```
Classification Report:
              precision    recall  f1-score   support

      female     0.9815    0.9631    0.9722      1600
        male     0.9638    0.9819    0.9728      1600

    accuracy                         0.9725      3200
   macro avg     0.9727    0.9725    0.9725      3200
weighted avg     0.9727    0.9725    0.9725      3200
```

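The figures in the classification report can be cross-checked against each other. The sketch below is not part of the original evaluation: it assumes the per-class true-positive counts can be recovered by rounding recall × support, and the helper name `metrics_from_counts` is hypothetical. It then recomputes precision, recall, F1, and accuracy from those inferred counts so they can be compared with the reported values.

```python
# Values copied from the classification report.
SUPPORT = {"female": 1600, "male": 1600}
RECALL = {"female": 0.9631, "male": 0.9819}

# Assumption: true-positive counts are inferred by rounding recall * support.
tp_female = round(RECALL["female"] * SUPPORT["female"])
tp_male = round(RECALL["male"] * SUPPORT["male"])

# False negatives are the remaining samples of each class.
fn_female = SUPPORT["female"] - tp_female
fn_male = SUPPORT["male"] - tp_male


def metrics_from_counts(tp, fn, fp):
    """Return (precision, recall, f1) for one class from raw counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# With two classes, one class's false negatives are the other's false positives.
female = metrics_from_counts(tp_female, fn_female, fp=fn_male)
male = metrics_from_counts(tp_male, fn_male, fp=fn_female)
accuracy = (tp_female + tp_male) / sum(SUPPORT.values())

for name, (p, r, f1) in (("female", female), ("male", male)):
    print(f"{name:>6}  precision={p:.4f}  recall={r:.4f}  f1={f1:.4f}")
print(f"accuracy={accuracy:.4f}")
```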
| |  |
| |
|
| | --- |
| |
|
The model categorizes images into two gender classes:

* **Class 0:** "female"
* **Class 1:** "male"

# **Run with Transformers**

```python
!pip install -q transformers torch pillow gradio
```

```python
import gradio as gr
import torch
from transformers import AutoImageProcessor, AutoModelForImageClassification
from PIL import Image

# Model name from Hugging Face Hub
model_name = "prithivMLmods/MetaCLIP-2-Gender-Identifier"

# Load processor and model
processor = AutoImageProcessor.from_pretrained(model_name)
model = AutoModelForImageClassification.from_pretrained(model_name)
model.eval()

# Define labels
LABELS = {
    0: "female",
    1: "male"
}

def gender_classification(image):
    """Predict the gender class of a person from an image."""
    # Gradio passes a NumPy array; convert it to an RGB PIL image
    image = Image.fromarray(image).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")

    # Inference only, so gradients are not needed
    with torch.no_grad():
        outputs = model(**inputs)
        logits = outputs.logits
        probs = torch.nn.functional.softmax(logits, dim=1).squeeze().tolist()

    # Map each class index to its label and rounded probability
    predictions = {LABELS[i]: round(probs[i], 3) for i in range(len(probs))}
    return predictions

# Build Gradio interface
iface = gr.Interface(
    fn=gender_classification,
    inputs=gr.Image(type="numpy", label="Upload Image"),
    outputs=gr.Label(label="Predicted Gender"),
    title="MetaCLIP-2-Gender-Identifier",
    description="Upload an image to predict the person's gender."
)

# Launch app
if __name__ == "__main__":
    iface.launch()
```

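The post-processing step in the demo (softmax over the two class logits, then mapping class indices to labels) can be illustrated without downloading the model. This is a minimal sketch in plain Python: the function names `logits_to_predictions` and `top_label` and the example logits are hypothetical, chosen for illustration. In the real pipeline the logits come from `model(**inputs).logits`.

```python
import math

# Label mapping as listed in this model card.
LABELS = {0: "female", 1: "male"}


def logits_to_predictions(logits):
    """Convert raw class logits to a {label: probability} dict via softmax."""
    # Subtracting the largest logit avoids overflow and leaves the result unchanged.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return {LABELS[i]: e / total for i, e in enumerate(exps)}


def top_label(predictions):
    """Return the label with the highest probability."""
    return max(predictions, key=predictions.get)


# Hypothetical logits, for illustration only.
example = logits_to_predictions([2.0, -1.0])
print(top_label(example), example)
```

The probabilities always sum to 1, so with two classes a score near 0.5 for either label indicates that the model is uncertain about that image.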
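# **Sample Inference:**

![Screenshot 2025-11-14 at 19-58-22 MetaCLIP-2-Gender-Identifier](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/GBIwyHl5x8PsLuCW1uMPS.png)
![Screenshot 2025-11-14 at 20-00-53 MetaCLIP-2-Gender-Identifier](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/4x9QGnuY5JSB1wo5U-cpB.png)
![Screenshot 2025-11-14 at 20-01-06 MetaCLIP-2-Gender-Identifier](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/LNEkvOvmCmtzt10xIkYfO.png)
![Screenshot 2025-11-14 at 19-59-48 MetaCLIP-2-Gender-Identifier](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/PrE9krvl7yrjhm7EzO95I.png)
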
# **Intended Use:**

The **MetaCLIP-2-Gender-Identifier** model is designed to classify images into gender categories.
Potential use cases include:

* **Demographic Analysis:** Supporting research and business insights into gender-based distribution.
* **Health and Fitness Applications:** Assisting in gender-specific analytics and recommendations.
* **Security and Access Control:** Supporting gender-based identity verification systems.
* **Retail and Marketing:** Enabling improved personalization and customer segmentation.
* **Forensics and Surveillance:** Assisting in identity estimation for investigative purposes.