snippet error
When I test on an image using the UI on the model card page, it gives this error:
Can't load tokenizer using from_pretrained, please update its configuration: <class 'transformers.models.vision_encoder_decoder.configuration_vision_encoder_decoder.VisionEncoderDecoderConfig'>
same here
Thanks for raising the issue, @fcakyon. Currently the model can't be loaded from the UI :( You will have to load it using the transformers library.
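For anyone landing here, a minimal sketch of what loading it through the transformers library might look like. The error above reports a `VisionEncoderDecoderConfig`, so the sketch loads the checkpoint with `VisionEncoderDecoderModel`. Two things are assumptions, not confirmed in this thread: that the tokenizer can be taken from the `gpt2` checkpoint (the error suggests it cannot be loaded from the model repo itself), and that the repo ships an image preprocessor config that `ViTImageProcessor` can read. The helper name `generate_caption` and its defaults are hypothetical; please check the model card for the author's exact steps.

```python
MODEL_ID = "bipin/image-caption-generator"
# Assumption: the decoder uses a GPT-2 tokenizer. Verify against the model card.
TOKENIZER_ID = "gpt2"


def generate_caption(image_path, model_id=MODEL_ID, tokenizer_id=TOKENIZER_ID,
                     max_length=32):
    """Return a caption for the image at image_path (hypothetical helper)."""
    # Heavy dependencies are imported lazily so the module imports quickly.
    from PIL import Image
    from transformers import (
        AutoTokenizer,
        ViTImageProcessor,
        VisionEncoderDecoderModel,
    )

    # Load the encoder-decoder weights directly instead of via a pipeline.
    model = VisionEncoderDecoderModel.from_pretrained(model_id)
    # Assumption: the repo contains a preprocessor config for the ViT encoder.
    image_processor = ViTImageProcessor.from_pretrained(model_id)
    # The tokenizer comes from a separate checkpoint (see assumption above).
    tokenizer = AutoTokenizer.from_pretrained(tokenizer_id)

    # Convert to RGB so grayscale or RGBA inputs do not break preprocessing.
    image = Image.open(image_path).convert("RGB")
    pixel_values = image_processor(images=image, return_tensors="pt").pixel_values

    # Generate token ids and decode them back into text.
    output_ids = model.generate(pixel_values, max_length=max_length)
    captions = tokenizer.batch_decode(output_ids, skip_special_tokens=True)
    return captions[0].strip()
```

Calling `generate_caption("photo.jpg")` with a local image path (the filename here is just a placeholder) should return a single caption string; the first call downloads the weights.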
Hey @itsyogesh, I've updated the model card to include all the required steps for running inference, along with links to the training procedure.
Thanks everyone for your patience. I'm closing this issue for now. If you face any issues with using the model in code, please feel free to re-open this or raise another issue :)