Visual Document Retrieval
Demo for multimodal embedding models
Detect objects in images or videos
Generate personalized images preserving your face identity
Swap faces between two images
Edit images by adding or removing concepts with text prompts
Generate captions for music audio
Chat with an AI assistant using text and images
Create a custom story with characters and plot
BLIP2 (cutting-edge image captioning) in 🤗 transformers
Ask questions about any image