I tried doing the same, but it is showing irrelevant captions. I have also used the well-known BLIP model for my project, and it is working great. I also want to mention that your work is simply awesome. Hoping these issues get fixed soon with further refinements.
AgentVikram changed discussion title from "It needs refinement" to "Refinement needed!!"