DeepFilterNet2: Towards Real-Time Speech Enhancement on Embedded Devices for Full-Band Audio
Paper: arXiv:2205.05474
MLX-compatible weights for DeepFilterNet2, a real-time speech enhancement model that suppresses background noise from audio.
This is a direct conversion of the original PyTorch weights to safetensors format for use with MLX on Apple Silicon.
The conversion to safetensors is done via the included convert_deepfilternet.py script. No fine-tuning or quantization was applied; the weights are converted directly from the original checkpoint.
| File | Description |
|---|---|
| config.json | Model architecture configuration |
| model.safetensors | Pre-converted weights (~8.9 MB, float32) |
| convert_deepfilternet.py | Conversion script (PyTorch -> MLX safetensors) |
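
To check a downloaded weights file against the size and dtype listed above, the safetensors header can be read with the Python standard library alone. This is a minimal sketch, not part of this repository: the helper names `read_safetensors_header` and `summarize` are hypothetical, and it relies only on the public safetensors layout (an 8-byte little-endian header length followed by a JSON header).

```python
import json
import struct


def read_safetensors_header(path):
    """Return the JSON header of a safetensors file as a dict."""
    with open(path, "rb") as f:
        # The first 8 bytes hold the header length as a little-endian uint64.
        (header_len,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(header_len))


def summarize(path):
    """Count tensors, parameters, and payload bytes, and list the dtypes."""
    header = read_safetensors_header(path)
    header.pop("__metadata__", None)  # optional free-form metadata entry
    n_params = 0
    n_bytes = 0
    dtypes = set()
    for info in header.values():
        count = 1
        for dim in info["shape"]:
            count *= dim
        n_params += count
        start, end = info["data_offsets"]
        n_bytes += end - start
        dtypes.add(info["dtype"])
    return {
        "tensors": len(header),
        "parameters": n_params,
        "bytes": n_bytes,
        "dtypes": sorted(dtypes),
    }
```

Calling `summarize("model.safetensors")` on the file from this repository should report a single float32 dtype and a payload close to the size in the table, if the file matches the description.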
| Parameter | Value |
|---|---|
| Sample rate | 48 kHz |
| FFT size | 960 |
| Hop size | 480 |
| ERB bands | 32 |
| DF bins | 96 |
| DF order | 5 |
| Embedding hidden dim | 256 |
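
The table values imply a few derived quantities that are useful when feeding audio to the model. The sketch below works them out under two assumptions that the table does not state: the FFT size equals the analysis window length with a one-sided spectrum, and the DF bins are the lowest-frequency bins. The function name `derived_stft_quantities` is hypothetical.

```python
# Values copied from the parameter table above.
SAMPLE_RATE_HZ = 48_000
FFT_SIZE = 960
HOP_SIZE = 480
DF_BINS = 96


def derived_stft_quantities(sample_rate_hz, fft_size, hop_size, df_bins):
    """Derive frame timing and frequency resolution from the STFT settings."""
    bin_width_hz = sample_rate_hz / fft_size
    return {
        "window_ms": 1000 * fft_size / sample_rate_hz,
        "hop_ms": 1000 * hop_size / sample_rate_hz,
        "overlap": 1 - hop_size / fft_size,
        # One-sided spectrum of a real signal (assumption).
        "freq_bins": fft_size // 2 + 1,
        "bin_width_hz": bin_width_hz,
        # Upper edge of the deep-filtering range, assuming the lowest bins.
        "df_span_hz": df_bins * bin_width_hz,
    }


q = derived_stft_quantities(SAMPLE_RATE_HZ, FFT_SIZE, HOP_SIZE, DF_BINS)
for name, value in q.items():
    print(f"{name}: {value}")
```

Under these assumptions each frame is a 20 ms window advanced in 10 ms steps (50% overlap), the spectrum has 481 bins spaced 50 Hz apart, and the 96 DF bins span roughly the lowest 4.8 kHz.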
Swift usage:

import MLXAudioSTS
let model = try await DeepFilterNetModel.fromPretrained("iky1e/DeepFilterNet2-MLX")
let enhanced = try model.enhance(audioArray)
Python usage:

from mlx_audio.sts.models.deepfilternet import DeepFilterNetModel
model = DeepFilterNetModel.from_pretrained("iky1e/DeepFilterNet2-MLX")
enhanced = model.enhance("noisy.wav")
To convert from the original PyTorch checkpoint:

python convert_deepfilternet.py \
--input /path/to/DeepFilterNet2 \
--output ./DeepFilterNet2-MLX \
--name DeepFilterNet2
Citation:

@inproceedings{schroeter2022deepfilternet2,
  title = {{DeepFilterNet2}: Towards Real-Time Speech Enhancement on Embedded Devices for Full-Band Audio},
  author = {Schr{\"o}ter, Hendrik and Escalante-B., Alberto N. and Rosenkranz, Tobias and Maier, Andreas},
  booktitle = {17th International Workshop on Acoustic Signal Enhancement (IWAENC 2022)},
  year = {2022},
}