How to use OpenVoiceOS/parakeet-ctc-0.6b-coreml with NeMo:
import nemo.collections.asr as nemo_asr
asr_model = nemo_asr.models.ASRModel.from_pretrained("OpenVoiceOS/parakeet-ctc-0.6b-coreml")
transcriptions = asr_model.transcribe(["file.wav"])

CoreML conversion of nvidia/parakeet-ctc-0.6b.

| Property | Value |
|---|---|
| Architecture | CTC |
| Language | English |
| Sample rate | 16000 Hz |
| Max audio | 15.0 s |
| Vocab size | 1024 |
| Framework | NVIDIA NeMo → CoreML (coremltools) |
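
The sample rate and maximum audio length together imply a fixed input window of 16000 × 15.0 = 240000 samples. The sketch below shows one way to cut longer audio into windows of that size and zero-pad the last one. It is an illustration only: `split_into_windows` is a hypothetical helper, zero-padding is an assumption, and the plugin may already handle long audio internally.

```python
SAMPLE_RATE_HZ = 16_000  # from the table above
MAX_AUDIO_S = 15.0       # from the table above
MAX_SAMPLES = int(SAMPLE_RATE_HZ * MAX_AUDIO_S)  # 240000 samples per window


def split_into_windows(samples, window=MAX_SAMPLES):
    """Split mono samples into fixed-size windows, zero-padding the last one.

    Hypothetical helper; zero-padding is an assumption, not documented behavior.
    """
    windows = []
    for start in range(0, len(samples), window):
        chunk = list(samples[start:start + window])
        # Pad the final, shorter chunk with silence up to the window size.
        chunk.extend([0.0] * (window - len(chunk)))
        windows.append(chunk)
    return windows
```

Each returned window has exactly `MAX_SAMPLES` entries, so a 20-second clip yields two windows, the second mostly silence.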

| File | Component | Best compute |
|---|---|---|
| parakeet_mel_encoder.mlpackage | mel_encoder | ANE / GPU |
| parakeet_ctc_decoder.mlpackage | ctc_decoder | ANE / GPU |
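
Because the architecture is CTC, turning per-frame decoder scores into token IDs usually comes down to greedy decoding: take the best class in each frame, collapse consecutive repeats, and drop the blank symbol. The following is a minimal sketch in plain Python under stated assumptions: `ctc_greedy_decode` is a hypothetical name, the output layout of `parakeet_ctc_decoder.mlpackage` is not documented here, and the blank index is passed in explicitly rather than assumed. The plugin presumably performs an equivalent step itself.

```python
def ctc_greedy_decode(frame_scores, blank_id):
    """Greedy CTC decoding over a list of per-frame score lists.

    frame_scores: list of frames, each a list of scores (one per class).
    blank_id: index of the CTC blank class (an input, not assumed here).
    Returns the decoded list of token IDs.
    """
    # Pick the highest-scoring class in each frame.
    best = [max(range(len(frame)), key=frame.__getitem__) for frame in frame_scores]

    tokens = []
    previous = None
    for index in best:
        # Keep a token only when it starts a new run and is not the blank.
        if index != previous and index != blank_id:
            tokens.append(index)
        previous = index
    return tokens
```

The blank symbol is what allows a token to be emitted twice in a row: a repeated ID separated by a blank frame decodes to two tokens, while an unbroken run decodes to one.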
pip install ovos-stt-plugin-coreml

from ovos_stt_plugin_coreml import CoremlSTT
from ovos_plugin_manager.utils.audio import AudioFile

stt = CoremlSTT(config={"metadata": "metadata.json"})
with AudioFile("speech.wav") as f:
    audio = f.read()
print(stt.execute(audio))
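
Before handing a file such as `speech.wav` to the plugin, it can help to confirm that it matches the model's stated 16000 Hz sample rate and 15.0 s limit. The sketch below uses only Python's standard `wave` module. `describe_wav` is a hypothetical helper, and whether the plugin resamples or truncates audio on its own is not stated here, so treat this as an optional sanity check.

```python
import wave

EXPECTED_RATE_HZ = 16_000  # sample rate stated for this model
MAX_AUDIO_S = 15.0         # maximum audio length stated for this model


def describe_wav(path):
    """Report whether a WAV file matches the model's stated input format."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        frames = wav.getnframes()
        channels = wav.getnchannels()
    duration_s = frames / rate
    return {
        "rate_ok": rate == EXPECTED_RATE_HZ,
        "mono": channels == 1,
        "duration_s": duration_s,
        "fits_window": duration_s <= MAX_AUDIO_S,
    }
```

A file that fails `rate_ok` or `mono` would likely need resampling or downmixing first, and one that fails `fits_window` is longer than the stated maximum.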

Base model: nvidia/parakeet-ctc-0.6b