mistralai/Voxtral-Mini-4B-Realtime-2602
Automatic Speech Recognition • 4B • 1.26M • 860
Upgraded to v1.0!
https://huggingface.co/papers/2501.03006
View and submit LLM evaluations
Gaze detection using Moondream
Audio Conditioned LipSync with Latent Diffusion Models
Describe what you want, AI writes the FFMPEG command