mWhisper-Flamingo for Multilingual Audio-Visual Noise-Robust Speech Recognition Paper • 2502.01547 • Published Feb 3, 2025
Omni-R1: Do You Really Need Audio to Fine-Tune Your Audio LLM? Paper • 2505.09439 • Published May 14, 2025 • 10 • 2
Whisper-Flamingo: Integrating Visual Features into Whisper for Audio-Visual Speech Recognition and Translation Paper • 2406.10082 • Published Jun 14, 2024 • 1
ibm-granite/granite-speech-3.3-8b Automatic Speech Recognition • 9B • Updated Aug 19, 2025 • 116k • 154
saurabhati/DASS_small_AudioSet_47.2 Audio Classification • 29.9M • Updated Mar 31, 2025 • 7 • 1
voidful/wav2vec2-xlsr-multilingual-56 Automatic Speech Recognition • 0.3B • Updated Mar 18, 2023 • 7.64k • 33