SinaLab/ImageEval2025Task2TrainDataset
Viewer • Updated • 2.72k • 3
How to use SinaLab/Qwen-2.5-VL-7B-Instruct-Image-Captioning with PEFT:
from peft import PeftModel
from transformers import AutoModelForCausalLM
base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-VL-7B-Instruct")
model = PeftModel.from_pretrained(base_model, "SinaLab/Qwen-2.5-VL-7B-Instruct-Image-Captioning")This model is a LoRA fine-tuned version of Qwen/Qwen2.5-VL-7B-Instruct for generating Arabic captions for images.
This model was developed as part of the Arabic Image Captioning Shared Task 2025. It generates natural Arabic captions for images with focus on historical and cultural content related to Palestinian heritage.
please refer to the training dataset for more details.
from transformers import Qwen2VLForConditionalGeneration, AutoProcessor
from peft import PeftModel
import torch
from PIL import Image
# Load base model and processor
base_model = Qwen2VLForConditionalGeneration.from_pretrained("Qwen/Qwen2.5-VL-7B-Instruct")
processor = AutoProcessor.from_pretrained("Qwen/Qwen2.5-VL-7B-Instruct")
# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "your-username/arabic-image-captioning-qwen2.5vl")
# Process image and generate caption
image = Image.open("your_image.jpg")
prompt = "اكتب وصفاً مختصراً لهذه الصورة باللغة العربية"
inputs = processor(images=image, text=prompt, return_tensors="pt")
with torch.no_grad():
outputs = model.generate(**inputs, max_new_tokens=128)
caption = processor.decode(outputs[0], skip_special_tokens=True)
print(caption)
For questions or support:
Base model
Qwen/Qwen2.5-VL-7B-Instruct