Commit 4af1e8c by danielr-ceva · verified · 1 Parent(s): 49d9463

Update README.md

  1. README.md +35 -15
README.md CHANGED
@@ -5,39 +5,53 @@ tags:
  - speech_enhancement
  - noise_suppression
  - real_time
  ---


  # DPDFNet: Boosting DeepFilterNet2 via Dual-Path RNN

- DPDFNet is a family of causal, single-channel speech enhancement models for real-time noise suppression in challenging everyday environments. It extends the DeepFilterNet2 enhancement framework by inserting Dual-Path RNN (DPRNN) blocks into the encoder, strengthening long-range temporal and cross-band modeling while preserving a compact, streaming-friendly design.

- This repository provides four TensorFlow Lite (TFLite) models optimized for mobile and edge deployment:

  * `baseline.tflite`
  * `dpdfnet2.tflite`
  * `dpdfnet4.tflite`
  * `dpdfnet8.tflite`

  ---

  ## Key Features

- * Causal and low-latency: Designed for streaming use cases such as telephony, conferencing, and embedded devices.
- * Dual-Path RNN integration: Improves temporal context and frequency-domain interactions for more robust enhancement in difficult noise conditions.
  * Scalable family: Choose baseline or dpdfnet2/4/8 to balance quality vs. compute.
- * Edge deployment focus: Demonstrated on Ceva NeuPro Nano NPUs in the accompanying work.

  ---

  ## Model Variants and Footprint

  | Model | Params [M] | MACs [G] | TFLite Size [MB] |
- | --------- | ---------: | -------: | ---------------: |
- | Baseline | 2.31 | 0.36 | 8.5 |
- | DPDFNet-2 | 2.49 | 1.35 | 10.7 |
- | DPDFNet-4 | 2.84 | 2.36 | 12.9 |
- | DPDFNet-8 | 3.54 | 4.37 | 17.2 |

  ---

@@ -49,8 +63,12 @@ Deployment targets: Mobile devices, embedded NPUs, and edge platforms.

  Input and Output:

- * Input: 16 kHz mono noisy speech waveform
- * Output: 16 kHz mono enhanced speech waveform

  Typical applications:

@@ -63,7 +81,9 @@ Typical applications:

  ## Inference

- This repo includes a reference script for running the TFLite models on WAV files using streaming-style, frame-by-frame inference: `run_tflite.py`.

  ### Setup

@@ -78,7 +98,7 @@ pip install tflite-runtime

  By default, the script loads models from:

- * `./<model_name>.tflite`

  Create the folder and place the `.tflite` files there (or edit `TFLITE_DIR` in the script to match your layout).

@@ -90,7 +110,7 @@ The script processes `*.wav` files non-recursively and writes enhanced outputs a
  python run_tflite.py --noisy_dir /path/to/noisy_wavs --enhanced_dir /path/to/out --model_name dpdfnet8
  ```

- Available `--model_name` options: `baseline`, `dpdfnet2`, `dpdfnet4`, `dpdfnet8`.

  ---

 
  - speech_enhancement
  - noise_suppression
  - real_time
+ - fullband
  ---


  # DPDFNet: Boosting DeepFilterNet2 via Dual-Path RNN

+ DPDFNet is a family of causal, single-channel speech enhancement models for real-time noise suppression in challenging everyday environments. It extends the DeepFilterNet2 enhancement framework by inserting Dual-Path RNN (DPRNN) blocks into the encoder, strengthening long-range temporal and cross-band modeling while preserving a compact, streaming-friendly design.

+ This repository provides TensorFlow Lite (TFLite) models optimized for mobile and edge deployment:

+ **16kHz models**
  * `baseline.tflite`
  * `dpdfnet2.tflite`
  * `dpdfnet4.tflite`
  * `dpdfnet8.tflite`

+ **48kHz model**
+ * `dpdfnet2_48khz_hr.tflite`
+
  ---

  ## Key Features

+ * Causal and low-latency: Designed for streaming use cases such as telephony, conferencing, and embedded devices.
+ * Dual-Path RNN integration: Improves temporal context and frequency-domain interactions for more robust enhancement in difficult noise conditions.
  * Scalable family: Choose baseline or dpdfnet2/4/8 to balance quality vs. compute.
+ * Edge deployment focus: Demonstrated on Ceva NeuPro Nano NPUs in the accompanying work.
+ * Fullband option: A dedicated 48kHz model is provided for fullband enhancement.

  ---

  ## Model Variants and Footprint

+ ### 16kHz models
+
  | Model | Params [M] | MACs [G] | TFLite Size [MB] |
+ | --------- | :--------: | :------: | :--------------: |
+ | Baseline | 2.31 | 0.36 | 8.5 |
+ | DPDFNet-2 | 2.49 | 1.35 | 10.7 |
+ | DPDFNet-4 | 2.84 | 2.36 | 12.9 |
+ | DPDFNet-8 | 3.54 | 4.37 | 17.2 |
+
+ ### 48kHz model
+
+ | Model | Params [M] | MACs [G] | TFLite Size [MB] |
+ | ------------ | :--------: | :------: | :--------------: |
+ | DPDFNet-2 HR | 2.58 | 2.42 | 11.6 |

  ---

 

  Input and Output:

+ * **16kHz models**
+   * Input: 16kHz mono noisy speech waveform
+   * Output: 16kHz mono enhanced speech waveform
+ * **48kHz model**
+   * Input: 48kHz mono noisy speech waveform
+   * Output: 48kHz mono enhanced speech waveform

  Typical applications:

 

  ## Inference

+ This repo includes an inference script for running the TFLite models on WAV files using streaming-style, frame-by-frame inference: `run_tflite.py`.
+
+ > **Note:** When using `dpdfnet2_48khz_hr`, the inference script automatically switches to the 48kHz processing pipeline.

  ### Setup

 

  By default, the script loads models from:

+ * `./<model_name>.tflite`

  Create the folder and place the `.tflite` files there (or edit `TFLITE_DIR` in the script to match your layout).

 
  python run_tflite.py --noisy_dir /path/to/noisy_wavs --enhanced_dir /path/to/out --model_name dpdfnet8
  ```

+ Available `--model_name` options: `baseline`, `dpdfnet2`, `dpdfnet4`, `dpdfnet8`, `dpdfnet2_48khz_hr`.
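
The model names above each imply a file name and an expected sample rate. As a minimal sketch of that mapping (not taken from `run_tflite.py`): the table `MODEL_SAMPLE_RATES` and the helpers `resolve_model` and `check_wav` are hypothetical, and `TFLITE_DIR` is assumed to be the current directory, matching the `./<model_name>.tflite` default quoted in this README.

```python
import wave
from pathlib import Path
from typing import Tuple

# Assumption: models sit next to the script, as in "./<model_name>.tflite".
TFLITE_DIR = Path(".")

# Expected sample rate per variant, as listed in the README's
# "Input and Output" section (hypothetical lookup table).
MODEL_SAMPLE_RATES = {
    "baseline": 16_000,
    "dpdfnet2": 16_000,
    "dpdfnet4": 16_000,
    "dpdfnet8": 16_000,
    "dpdfnet2_48khz_hr": 48_000,
}


def resolve_model(model_name: str, tflite_dir: Path = TFLITE_DIR) -> Tuple[Path, int]:
    """Return (model file path, expected sample rate) for a --model_name value."""
    if model_name not in MODEL_SAMPLE_RATES:
        options = ", ".join(sorted(MODEL_SAMPLE_RATES))
        raise ValueError(f"Unknown model {model_name!r}; expected one of: {options}")
    return tflite_dir / f"{model_name}.tflite", MODEL_SAMPLE_RATES[model_name]


def check_wav(wav_path: Path, expected_rate: int) -> None:
    """Raise ValueError unless the WAV file is mono at the expected sample rate."""
    with wave.open(str(wav_path), "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
    if channels != 1 or rate != expected_rate:
        raise ValueError(
            f"{wav_path}: got {channels} channel(s) at {rate} Hz, "
            f"expected mono at {expected_rate} Hz"
        )
```

For example, `resolve_model("dpdfnet2_48khz_hr")` returns the path `dpdfnet2_48khz_hr.tflite` together with `48000`, and a caller could run `check_wav` on each input file before inference. Whether the actual script resamples or rejects mismatched audio is not stated here, so treat the strict check as one possible design.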

  ---