BiseNet: Optimized for Qualcomm Devices
BiSeNet (Bilateral Segmentation Network) is an architecture designed for real-time semantic segmentation. It balances spatial resolution against receptive field by pairing a Spatial Path, which preserves high-resolution features, with a Context Path, which provides a sufficiently large receptive field.
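To make the resolution trade-off concrete, the sketch below estimates the feature-map sizes each path would produce for a 720x960 input. It assumes the downsampling factors described in the original BiSeNet paper (Spatial Path at 1/8 resolution, Context Path at 1/16 and 1/32) and stride-2 stages that round odd sizes up. The exported model in this repository may differ in detail, and `feature_map_size` is an illustrative helper, not part of any library.

```python
import math

# Assumed downsampling factors, taken from the original BiSeNet paper
# rather than read from this repository's exported model.
SPATIAL_PATH_STRIDE = 8
CONTEXT_PATH_STRIDES = (16, 32)


def feature_map_size(height, width, stride):
    """Size after repeated stride-2 stages, rounding odd sizes up at each stage."""
    stages = int(math.log2(stride))
    for _ in range(stages):
        height = math.ceil(height / 2)
        width = math.ceil(width / 2)
    return height, width


input_hw = (720, 960)
spatial = feature_map_size(*input_hw, SPATIAL_PATH_STRIDE)
context = [feature_map_size(*input_hw, s) for s in CONTEXT_PATH_STRIDES]

print("Spatial Path:", spatial)
print("Context Path:", context)
```

Under these assumptions the Spatial Path keeps a 90x120 map, while the Context Path shrinks to 45x60 and 23x30, which is where the larger receptive field comes from.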
This model is based on the implementation of BiseNet found here. This repository contains pre-exported model files optimized for Qualcomm® devices. You can use the Qualcomm® AI Hub Models library to export the model with custom configurations. More details on model performance across various devices can be found here.
Qualcomm AI Hub Models uses Qualcomm AI Hub Workbench to compile, profile, and evaluate this model. Sign up to run these models on a hosted Qualcomm® device.
Getting Started
There are two ways to deploy this model on your device:
Option 1: Download Pre-Exported Models
Below are pre-exported model assets ready for deployment.
| Runtime | Precision | Chipset | SDK Versions | Download |
|---|---|---|---|---|
| ONNX | float | Universal | QAIRT 2.37, ONNX Runtime 1.23.0 | Download |
| ONNX | w8a8 | Universal | QAIRT 2.37, ONNX Runtime 1.23.0 | Download |
| QNN_DLC | float | Universal | QAIRT 2.42 | Download |
| QNN_DLC | w8a8 | Universal | QAIRT 2.42 | Download |
| TFLITE | float | Universal | QAIRT 2.42, TFLite 2.17.0 | Download |
| TFLITE | w8a8 | Universal | QAIRT 2.42, TFLite 2.17.0 | Download |
For more device-specific assets and performance metrics, visit BiseNet on Qualcomm® AI Hub.
Option 2: Export with Custom Configurations
Use the Qualcomm® AI Hub Models Python library to compile and export the model with your own:
- Custom weights (e.g., fine-tuned checkpoints)
- Custom input shapes
- Target device and runtime configurations
This option is ideal if you need to customize the model beyond the default configuration provided here.
See our repository for BiseNet on GitHub for usage instructions.
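As a rough illustration of what an export invocation might look like, the sketch below only assembles a command line and prints it; nothing is executed. The module path `qai_hub_models.models.bisenet.export`, the `--device` and `--target-runtime` flags, and the device name are assumptions based on the library's usual per-model layout, so check the repository's usage instructions for the authoritative form.

```python
import shlex
import sys


def build_export_command(device, target_runtime):
    """Assemble the export invocation as an argument list (nothing is executed)."""
    return [
        sys.executable, "-m",
        "qai_hub_models.models.bisenet.export",  # assumed module path
        "--device", device,                      # assumed flag
        "--target-runtime", target_runtime,      # assumed flag
    ]


# Hypothetical device name and runtime, chosen only for illustration.
cmd = build_export_command("Samsung Galaxy S24", "tflite")
print(shlex.join(cmd[1:]))
```

Running the resulting command (for example with `subprocess.run(cmd, check=True)`) would additionally require the `qai-hub-models` package to be installed and a configured Qualcomm® AI Hub account.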
Model Details
Model Type: Semantic segmentation
Model Stats:
- Model checkpoint: best_dice_loss_miou_0.655.pth
- Inference latency: Real-time
- Input resolution: 720x960
- Number of parameters: 12.0M
- Model size (float): 45.7 MB
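The parameter count and the float model size appear consistent with each other. The short calculation below shows this, assuming that "float" means 32-bit weights, that "MB" is meant as MiB (1024² bytes), and that w8a8 stores weights in 8 bits. It counts weights only and ignores file-format overhead, so treat the results as estimates.

```python
PARAMS = 12.0e6            # "Number of parameters: 12.0M" (a rounded figure)
BYTES_PER_MIB = 1024 ** 2


def weight_size_mib(num_params, bits_per_weight):
    """Storage needed for the weights alone, in MiB."""
    return num_params * bits_per_weight / 8 / BYTES_PER_MIB


float_mib = weight_size_mib(PARAMS, 32)  # assumes 32-bit floats
int8_mib = weight_size_mib(PARAMS, 8)    # assumes 8-bit weights for w8a8

print(f"float32 weights: {float_mib:.1f} MiB")
print(f"int8 weights:    {int8_mib:.1f} MiB")
```

The float estimate lands at roughly 45.8 MiB, close to the 45.7 MB listed above; the small gap is plausibly due to the parameter count being rounded to 12.0M.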
Performance Summary
| Model | Runtime | Precision | Chipset | Inference Time | Peak Memory Range | Primary Compute Unit |
|---|---|---|---|---|---|---|
| BiseNet | ONNX | float | Snapdragon® X Elite | 31.468 ms | 66 - 66 MB | NPU |
| BiseNet | ONNX | float | Snapdragon® 8 Gen 3 Mobile | 26.195 ms | 73 - 270 MB | NPU |
| BiseNet | ONNX | float | Qualcomm® QCS8550 (Proxy) | 32.87 ms | 63 - 86 MB | NPU |
| BiseNet | ONNX | float | Qualcomm® QCS9075 | 51.221 ms | 8 - 11 MB | NPU |
| BiseNet | ONNX | float | Snapdragon® 8 Elite For Galaxy Mobile | 19.854 ms | 71 - 211 MB | NPU |
| BiseNet | ONNX | float | Snapdragon® 8 Elite Gen 5 Mobile | 15.173 ms | 56 - 204 MB | NPU |
| BiseNet | ONNX | w8a8 | Snapdragon® X Elite | 8.703 ms | 19 - 19 MB | NPU |
| BiseNet | ONNX | w8a8 | Snapdragon® 8 Gen 3 Mobile | 5.962 ms | 18 - 210 MB | NPU |
| BiseNet | ONNX | w8a8 | Qualcomm® QCS6490 | 236.082 ms | 223 - 236 MB | CPU |
| BiseNet | ONNX | w8a8 | Qualcomm® QCS8550 (Proxy) | 8.607 ms | 16 - 45 MB | NPU |
| BiseNet | ONNX | w8a8 | Qualcomm® QCS9075 | 10.345 ms | 18 - 21 MB | NPU |
| BiseNet | ONNX | w8a8 | Qualcomm® QCM6690 | 232.957 ms | 132 - 139 MB | CPU |
| BiseNet | ONNX | w8a8 | Snapdragon® 8 Elite For Galaxy Mobile | 4.798 ms | 17 - 164 MB | NPU |
| BiseNet | ONNX | w8a8 | Snapdragon® 7 Gen 4 Mobile | 204.645 ms | 212 - 219 MB | CPU |
| BiseNet | ONNX | w8a8 | Snapdragon® 8 Elite Gen 5 Mobile | 3.743 ms | 0 - 151 MB | NPU |
| BiseNet | QNN_DLC | float | Snapdragon® X Elite | 28.927 ms | 8 - 8 MB | NPU |
| BiseNet | QNN_DLC | float | Snapdragon® 8 Gen 3 Mobile | 20.22 ms | 8 - 285 MB | NPU |
| BiseNet | QNN_DLC | float | Qualcomm® QCS8275 (Proxy) | 107.749 ms | 2 - 194 MB | NPU |
| BiseNet | QNN_DLC | float | Qualcomm® QCS8550 (Proxy) | 28.517 ms | 8 - 10 MB | NPU |
| BiseNet | QNN_DLC | float | Qualcomm® SA8775P | 38.769 ms | 1 - 188 MB | NPU |
| BiseNet | QNN_DLC | float | Qualcomm® QCS9075 | 55.43 ms | 8 - 49 MB | NPU |
| BiseNet | QNN_DLC | float | Qualcomm® QCS8450 (Proxy) | 59.857 ms | 8 - 277 MB | NPU |
| BiseNet | QNN_DLC | float | Qualcomm® SA7255P | 107.749 ms | 2 - 194 MB | NPU |
| BiseNet | QNN_DLC | float | Qualcomm® SA8295P | 44.137 ms | 0 - 213 MB | NPU |
| BiseNet | QNN_DLC | float | Snapdragon® 8 Elite For Galaxy Mobile | 15.012 ms | 8 - 262 MB | NPU |
| BiseNet | QNN_DLC | float | Snapdragon® 8 Elite Gen 5 Mobile | 11.918 ms | 6 - 284 MB | NPU |
| BiseNet | QNN_DLC | w8a8 | Snapdragon® X Elite | 10.122 ms | 2 - 2 MB | NPU |
| BiseNet | QNN_DLC | w8a8 | Snapdragon® 8 Gen 3 Mobile | 6.747 ms | 2 - 233 MB | NPU |
| BiseNet | QNN_DLC | w8a8 | Qualcomm® QCS6490 | 40.586 ms | 2 - 14 MB | NPU |
| BiseNet | QNN_DLC | w8a8 | Qualcomm® QCS8275 (Proxy) | 20.155 ms | 2 - 182 MB | NPU |
| BiseNet | QNN_DLC | w8a8 | Qualcomm® QCS8550 (Proxy) | 9.474 ms | 2 - 4 MB | NPU |
| BiseNet | QNN_DLC | w8a8 | Qualcomm® SA8775P | 10.25 ms | 2 - 183 MB | NPU |
| BiseNet | QNN_DLC | w8a8 | Qualcomm® QCS9075 | 13.068 ms | 2 - 14 MB | NPU |
| BiseNet | QNN_DLC | w8a8 | Qualcomm® QCM6690 | 90.291 ms | 2 - 206 MB | NPU |
| BiseNet | QNN_DLC | w8a8 | Qualcomm® QCS8450 (Proxy) | 16.163 ms | 2 - 231 MB | NPU |
| BiseNet | QNN_DLC | w8a8 | Qualcomm® SA7255P | 20.155 ms | 2 - 182 MB | NPU |
| BiseNet | QNN_DLC | w8a8 | Qualcomm® SA8295P | 12.588 ms | 2 - 185 MB | NPU |
| BiseNet | QNN_DLC | w8a8 | Snapdragon® 8 Elite For Galaxy Mobile | 5.175 ms | 2 - 192 MB | NPU |
| BiseNet | QNN_DLC | w8a8 | Snapdragon® 7 Gen 4 Mobile | 13.403 ms | 2 - 198 MB | NPU |
| BiseNet | QNN_DLC | w8a8 | Snapdragon® 8 Elite Gen 5 Mobile | 4.278 ms | 2 - 193 MB | NPU |
| BiseNet | TFLITE | float | Snapdragon® 8 Gen 3 Mobile | 20.372 ms | 31 - 310 MB | NPU |
| BiseNet | TFLITE | float | Qualcomm® QCS8275 (Proxy) | 105.404 ms | 32 - 247 MB | NPU |
| BiseNet | TFLITE | float | Qualcomm® QCS8550 (Proxy) | 27.811 ms | 32 - 34 MB | NPU |
| BiseNet | TFLITE | float | Qualcomm® SA8775P | 37.795 ms | 32 - 246 MB | NPU |
| BiseNet | TFLITE | float | Qualcomm® QCS9075 | 54.488 ms | 0 - 66 MB | NPU |
| BiseNet | TFLITE | float | Qualcomm® QCS8450 (Proxy) | 59.939 ms | 32 - 307 MB | NPU |
| BiseNet | TFLITE | float | Qualcomm® SA7255P | 105.404 ms | 32 - 247 MB | NPU |
| BiseNet | TFLITE | float | Qualcomm® SA8295P | 44.229 ms | 23 - 237 MB | NPU |
| BiseNet | TFLITE | float | Snapdragon® 8 Elite For Galaxy Mobile | 15.177 ms | 30 - 288 MB | NPU |
| BiseNet | TFLITE | float | Snapdragon® 8 Elite Gen 5 Mobile | 11.908 ms | 30 - 309 MB | NPU |
| BiseNet | TFLITE | w8a8 | Snapdragon® 8 Gen 3 Mobile | 8.658 ms | 7 - 240 MB | NPU |
| BiseNet | TFLITE | w8a8 | Qualcomm® QCS6490 | 47.105 ms | 6 - 30 MB | NPU |
| BiseNet | TFLITE | w8a8 | Qualcomm® QCS8275 (Proxy) | 20.753 ms | 0 - 182 MB | NPU |
| BiseNet | TFLITE | w8a8 | Qualcomm® QCS8550 (Proxy) | 12.154 ms | 0 - 110 MB | NPU |
| BiseNet | TFLITE | w8a8 | Qualcomm® SA8775P | 12.707 ms | 0 - 183 MB | NPU |
| BiseNet | TFLITE | w8a8 | Qualcomm® QCS9075 | 13.12 ms | 8 - 32 MB | NPU |
| BiseNet | TFLITE | w8a8 | Qualcomm® QCM6690 | 101.835 ms | 6 - 209 MB | NPU |
| BiseNet | TFLITE | w8a8 | Qualcomm® QCS8450 (Proxy) | 16.289 ms | 8 - 238 MB | NPU |
| BiseNet | TFLITE | w8a8 | Qualcomm® SA7255P | 20.753 ms | 0 - 182 MB | NPU |
| BiseNet | TFLITE | w8a8 | Qualcomm® SA8295P | 15.152 ms | 8 - 194 MB | NPU |
| BiseNet | TFLITE | w8a8 | Snapdragon® 8 Elite For Galaxy Mobile | 6.588 ms | 6 - 200 MB | NPU |
| BiseNet | TFLITE | w8a8 | Snapdragon® 7 Gen 4 Mobile | 15.957 ms | 0 - 199 MB | NPU |
| BiseNet | TFLITE | w8a8 | Snapdragon® 8 Elite Gen 5 Mobile | 5.515 ms | 6 - 199 MB | NPU |
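To read the latencies in more familiar terms, the sketch below converts a few QNN_DLC rows from the table above into frames per second and into a float-to-w8a8 speedup. The frame rates are upper bounds for single-image inference only; they ignore pre-processing, post-processing, and data transfer, and the dictionary layout is just an illustrative choice.

```python
# Latencies in milliseconds, copied from the QNN_DLC rows of the table above.
LATENCY_MS = {
    "Snapdragon 8 Elite Gen 5 Mobile": {"float": 11.918, "w8a8": 4.278},
    "Snapdragon 8 Gen 3 Mobile": {"float": 20.22, "w8a8": 6.747},
    "Qualcomm QCS9075": {"float": 55.43, "w8a8": 13.068},
}


def frames_per_second(latency_ms):
    """Upper bound on throughput if inference were the only cost per frame."""
    return 1000.0 / latency_ms


speedups = {}
for chipset, ms in LATENCY_MS.items():
    speedups[chipset] = ms["float"] / ms["w8a8"]
    print(
        f"{chipset}: "
        f"float {frames_per_second(ms['float']):.1f} FPS, "
        f"w8a8 {frames_per_second(ms['w8a8']):.1f} FPS, "
        f"speedup {speedups[chipset]:.2f}x"
    )
```

On these three rows, quantizing to w8a8 cuts latency by a factor of roughly 2.8 to 4.2, with the largest gain on the QCS9075.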
License
- The license for the original implementation of BiseNet can be found here.
References
- BiSeNet: Bilateral Segmentation Network for Real-time Semantic Segmentation
- Source Model Implementation
Community
- Join our AI Hub Slack community to collaborate, post questions and learn more about on-device AI.
- For questions or feedback, please reach out to us.
