Eric Bezzam PRO
AI & ML interests
speech, audio, imaging
Recent Activity
liked
a model
about 8 hours ago
nvidia/music-flamingo-2601-hf
updated
a model
1 day ago
bezzam/VibeVoice-ASR-7B
upvoted
a collection
4 days ago
OWSM: Fully Open Speech Recognition and Translation Models
Organizations
Omnilingual ASR (1,600+ Languages)
https://ai.meta.com/blog/omnilingual-asr-advancing-automatic-speech-recognition/
Omnilingual ASR Media Transcription
Running on A100 • 232
Transcribe audio/video to text in multiple languages
facebook/omnilingual-asr-corpus
Viewer • Updated • 548k • 4.17k • 187
facebook/omniASR-CTC-300M
Automatic Speech Recognition • Updated • 7
facebook/omniASR-CTC-1B
Automatic Speech Recognition • Updated • 2
Speech recognition datasets
DigiCam (CelebA)
Models for DigiCam trained on the CelebA 26K dataset.
VibeVoice
Neural codecs
Multimodel audio
Text-to-speech datasets
DiffuserCam Mirflickr
Models for the paper "A modular and robust physics-based approach for lensless image reconstruction"