| --- |
| library_name: coreml |
| pipeline_tag: image-to-image |
| tags: |
| - super-resolution |
| - apple-silicon |
| - neural-engine |
| - ane |
| - coreml |
| - real-time |
| - video-upscaling |
| - macos |
| license: apache-2.0 |
| datasets: |
| - eugenesiow/Div2k |
| metrics: |
| - psnr |
| - ssim |
| model-index: |
| - name: PiperSR-2x |
|   results: |
|   - task: |
|       type: image-super-resolution |
|       name: Image Super-Resolution |
|     dataset: |
|       type: Set5 |
|       name: Set5 |
|     metrics: |
|     - type: psnr |
|       value: 37.54 |
|       name: PSNR |
|   - task: |
|       type: image-super-resolution |
|       name: Image Super-Resolution |
|     dataset: |
|       type: Set14 |
|       name: Set14 |
|     metrics: |
|     - type: psnr |
|       value: 33.21 |
|       name: PSNR |
|   - task: |
|       type: image-super-resolution |
|       name: Image Super-Resolution |
|     dataset: |
|       type: BSD100 |
|       name: BSD100 |
|     metrics: |
|     - type: psnr |
|       value: 31.98 |
|       name: PSNR |
|   - task: |
|       type: image-super-resolution |
|       name: Image Super-Resolution |
|     dataset: |
|       type: Urban100 |
|       name: Urban100 |
|     metrics: |
|     - type: psnr |
|       value: 31.38 |
|       name: PSNR |
| --- |
| |
| # PiperSR-2x: ANE-Native Super Resolution for Apple Silicon |
|
|
| Real-time 2× AI upscaling on Apple's Neural Engine (ANE): 44.4 FPS at 720p output on an M2 Max from a 928 KB model, with every op running natively on the ANE and no CPU/GPU fallback. |
|
|
| Not a converted PyTorch model, but an architecture designed from ANE hardware measurements. Every dimension, operation, and data type is dictated by Neural Engine characteristics. |
|
|
| ## Key Results |
|
|
| | Model | Params | Set5 | Set14 | BSD100 | Urban100 | |
| |-------|--------|------|-------|--------|----------| |
| | Bicubic | - | 33.66 | 30.24 | 29.56 | 26.88 | |
| | FSRCNN | 13K | 37.05 | 32.66 | 31.53 | 29.88 | |
| | **PiperSR** | **453K** | **37.54** | **33.21** | **31.98** | **31.38** | |
| | SAFMN | 228K | 38.00 | ~33.7 | ~32.2 | - | |
|
|
| Beats FSRCNN across all benchmarks. Within 0.46 dB of SAFMN on Set5, which is below the perceptual threshold for most content. |
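The margins quoted above can be recomputed directly from the table. The sketch below only copies the table's PSNR values (in dB) and subtracts them; the dictionary layout and the `margin` helper are illustrative names, not part of any PiperSR tooling.

```python
# PSNR values (dB) copied from the Key Results table.
PSNR = {
    "Bicubic": {"Set5": 33.66, "Set14": 30.24, "BSD100": 29.56, "Urban100": 26.88},
    "FSRCNN": {"Set5": 37.05, "Set14": 32.66, "BSD100": 31.53, "Urban100": 29.88},
    "PiperSR": {"Set5": 37.54, "Set14": 33.21, "BSD100": 31.98, "Urban100": 31.38},
}


def margin(model, baseline):
    """Per-benchmark PSNR gain of `model` over `baseline`, in dB."""
    return {
        bench: round(PSNR[model][bench] - PSNR[baseline][bench], 2)
        for bench in PSNR[model]
    }


gain_vs_fsrcnn = margin("PiperSR", "FSRCNN")
gain_vs_bicubic = margin("PiperSR", "Bicubic")

# SAFMN's Set5 score (38.00 dB) is the only exact SAFMN value in the table.
safmn_gap_set5 = round(38.00 - PSNR["PiperSR"]["Set5"], 2)
```

The largest gain over FSRCNN is on Urban100, while the other three benchmarks sit around half a decibel.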
|
|
| ## Performance |
|
|
| | Configuration | FPS | Hardware | Notes | |
| |--------------|-----|----------|-------| |
| | Full-frame 640×360 → 1280×720 | 44.4 | M2 Max | ANE predict 20.8 ms | |
| | 128×128 tiles (static weights) | 125.6 | M2 | Baked weights, 2.82× vs dynamic | |
| | 128×128 tiles (dynamic weights) | 44.5 | M2 | CoreML default | |
|
|
| Real-time 2× upscaling at 30+ FPS on any Mac with Apple Silicon. The ANE sits idle during video playback; PiperSR puts it to work. |
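As a rough check on the figures in the performance table, the arithmetic below converts FPS to per-frame time. It assumes the 44.4 FPS number is end-to-end and that the 20.8 ms ANE predict time is one part of that frame time; the card does not state this explicitly, so treat the overhead split as an estimate.

```python
def frame_time_ms(fps):
    """Time budget per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


# Full-frame 640x360 -> 1280x720 on M2 Max (values from the table).
full_frame_ms = frame_time_ms(44.4)   # about 22.5 ms per frame
non_ane_ms = full_frame_ms - 20.8     # assumed non-ANE share, about 1.7 ms

# Headroom against a 30 FPS real-time target.
headroom_ms = frame_time_ms(30.0) - full_frame_ms  # about 10.8 ms

# Static (baked) vs dynamic weights on 128x128 tiles.
static_speedup = 125.6 / 44.5         # about 2.82x, matching the table note
```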
|
|
| ## Architecture |
|
|
| 453K-parameter network: 6 residual blocks at 64 channels with BatchNorm and SiLU activations, upscaling via PixelShuffle. |
|
|
| ``` |
| Input (128×128×3 FP16) |
| → Head: Conv 3×3 (3 → 64) |
| → Body: 6× ResBlock [Conv 3×3 → BatchNorm → SiLU → Conv 3×3 → BatchNorm → Residual Add] |
| → Tail: Conv 3×3 (64 → 12) → PixelShuffle(2) |
| Output (256×256×3) |
| ``` |
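The 453K parameter figure can be sanity-checked from the layer list above. This is a back-of-the-envelope count, assuming every convolution carries a bias and each BatchNorm contributes two learnable values per channel (scale and shift); neither detail is confirmed by the card.

```python
def conv_params(c_in, c_out, k=3, bias=True):
    """Weights (plus optional biases) in a k x k convolution."""
    return c_in * c_out * k * k + (c_out if bias else 0)


def bn_params(channels):
    """Learnable scale and shift per channel (assumed; running stats excluded)."""
    return 2 * channels


CHANNELS, BLOCKS, SCALE = 64, 6, 2

head = conv_params(3, CHANNELS)
# Each residual block has two Conv 3x3 + BatchNorm pairs.
body = BLOCKS * 2 * (conv_params(CHANNELS, CHANNELS) + bn_params(CHANNELS))
# The tail emits 3 * SCALE^2 = 12 channels for PixelShuffle(2).
tail = conv_params(CHANNELS, 3 * SCALE * SCALE)

total = head + body + tail  # 453,388 under these assumptions
```

Under these assumptions the count lands at roughly 453K, consistent with the stated size; dropping the conv biases changes the total by less than a thousand parameters.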
|
|
| The model compiles to 5 MIL ops: `conv`, `add`, `silu`, `pixel_shuffle`, `const` (BatchNorm is fused into the convolutions at inference). All are verified ANE-native. |
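For readers unfamiliar with `pixel_shuffle`, here is a minimal pure-Python sketch of what the op does to the tail's 12-channel output. It uses the PyTorch-style channel ordering, which is assumed here to match the compiled model; nested lists stand in for tensors to keep the example dependency-free.

```python
def pixel_shuffle(x, r):
    """Rearrange a [C*r*r][H][W] nested list into [C][H*r][W*r].

    Output pixel (c, y*r + i, w*r + j) is read from input channel
    c*r*r + i*r + j at position (y, w).
    """
    c_in, h, w = len(x), len(x[0]), len(x[0][0])
    if c_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    c_out = c_in // (r * r)
    out = [[[0.0] * (w * r) for _ in range(h * r)] for _ in range(c_out)]
    for c in range(c_out):
        for i in range(r):
            for j in range(r):
                src = x[c * r * r + i * r + j]
                for y in range(h):
                    for col in range(w):
                        out[c][y * r + i][col * r + j] = src[y][col]
    return out


# A 12-channel 1x1 input becomes a 3-channel 2x2 output.
tiny = [[[float(ch)]] for ch in range(12)]
shuffled = pixel_shuffle(tiny, 2)
```

Each group of four input channels supplies the four sub-pixels of one output channel, which is how 64 → 12 channels at 128×128 turns into 3 channels at 256×256.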
|
|
| ### Why ANE-native matters |
|
|
| Off-the-shelf super resolution models (SPAN, Real-ESRGAN) were designed for CUDA GPUs and converted to CoreML after the fact. They waste the ANE: |
|
|
| - **Misaligned channels** (48 instead of 64) waste 25%+ of each ANE tile |
| - **Monolithic full-frame** tensors serialize the ANE's parallel compute lanes |
| - **Silent CPU fallback** from unsupported ops can increase latency 5-10× |
| - **No batched tiles** means 60× dispatch overhead |
|
|
| PiperSR addresses every one of these by designing around ANE constraints. |
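The channel-alignment point can be made concrete with a little arithmetic. The sketch below assumes the ANE processes channels in groups of 64, which is inferred from the "48 instead of 64" wording above rather than from published hardware documentation; `lane_width` and the function name are illustrative.

```python
import math


def lane_utilization(channels, lane_width=64):
    """Fraction of allocated channel lanes doing useful work.

    Assumes channels are padded up to the next multiple of `lane_width`.
    """
    padded = math.ceil(channels / lane_width) * lane_width
    return channels / padded


waste_48 = 1.0 - lane_utilization(48)  # 0.25: a quarter of the lanes are padding
waste_64 = 1.0 - lane_utilization(64)  # 0.0: fully aligned
```

Under this model a 48-channel layer pays for 64 channels of compute, which is where the 25% figure comes from.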
|
|
| ## Model Variants |
|
|
| | File | Use Case | Input → Output | |
| |------|----------|----------------| |
| | `PiperSR_2x.mlpackage` | Static images (128px tiles) | 128×128 → 256×256 | |
| | `PiperSR_2x_video_720p.mlpackage` | Video (full-frame, BN-fused) | 640×360 → 1280×720 | |
| | `PiperSR_2x_256.mlpackage` | Static images (256px tiles) | 256×256 → 512×512 | |
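The tiled variants take fixed-size inputs, so a larger image has to be split into tiles first. The card does not describe how tiles are laid out, overlapped, or blended, so the following is only one plausible layout: a regular grid whose last row and column are shifted inward to stay inside the image. `tile_origins` is a hypothetical helper, not part of PiperSR.

```python
import math


def tile_origins(width, height, tile=128):
    """Top-left corners of fixed-size tiles covering a width x height image.

    The final tile on each axis is clamped to the image edge, so it may
    overlap its neighbour instead of requiring padding.
    """
    if width < tile or height < tile:
        raise ValueError("pad the image to at least one tile first")

    def axis(length):
        count = math.ceil(length / tile)
        return [min(i * tile, length - tile) for i in range(count)]

    return [(x, y) for y in axis(height) for x in axis(width)]


origins_128 = tile_origins(640, 360, tile=128)  # 5 x 3 grid
origins_256 = tile_origins(640, 360, tile=256)  # 3 x 2 grid
```

Each low-resolution origin `(x, y)` maps to `(2 * x, 2 * y)` in the upscaled output, so the 2× tiles can be pasted back at doubled coordinates.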
|
|
| ## Usage |
|
|
| ### With ToolPiper (recommended) |
|
|
| PiperSR is integrated into [ToolPiper](https://modelpiper.com), a local macOS AI toolkit. Install ToolPiper, enable the MediaPiper browser extension, and every 720p video on the web is upscaled to 1440p in real time. |
|
|
| ```bash |
| # Via MCP tool |
| mcp__toolpiper__image_upscale image=/path/to/image.png |
| |
| # Via REST API |
| curl -X POST http://127.0.0.1:9998/v1/images/upscale \ |
| -F "image=@input.png" \ |
| -o upscaled.png |
| ``` |
|
|
| ### With CoreML (Swift) |
|
|
| ```swift |
| import CoreML |
| |
| let config = MLModelConfiguration() |
| config.computeUnits = .cpuAndNeuralEngine // Not .all: .all is 23.6% slower for this model |
| |
| let model = try PiperSR_2x(configuration: config) |
| let input = try PiperSR_2xInput(x: pixelBuffer) |
| let output = try model.prediction(input: input) |
| // output.var_185 contains the 2× upscaled image |
| ``` |
|
|
| > **Important:** Use `.cpuAndNeuralEngine`, not `.all`. CoreML's `.all` silently misroutes pure-ANE ops onto the GPU, causing a 23.6% slowdown for this model. |
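To check the compute-unit difference on your own hardware, a small timing harness is enough. The helper below is a generic sketch that times any zero-argument callable; the function names, warm-up count, and run count are arbitrary choices, and nothing here is specific to CoreML.

```python
import statistics
import time


def median_latency_ms(fn, warmup=5, runs=50):
    """Median wall-clock latency of `fn()` in milliseconds, after warm-up calls."""
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


def slowdown_percent(baseline_ms, other_ms):
    """How much slower `other_ms` is than `baseline_ms`, as a percentage."""
    return (other_ms / baseline_ms - 1.0) * 100.0


# Smoke run with a no-op stand-in for a real prediction call.
noop_ms = median_latency_ms(lambda: None, warmup=1, runs=3)
```

With coremltools, one way to use this would be to load the model twice, once with `compute_units=ct.ComputeUnit.CPU_AND_NE` and once with `ct.ComputeUnit.ALL`, pass `lambda: model.predict({"x": arr})` for each, and compare the two medians with `slowdown_percent`.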
|
|
| ### With coremltools (Python) |
|
|
| ```python |
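| import coremltools as ct |
| from PIL import Image |
| import numpy as np |
| |
| # Match the Swift example: ANE plus CPU rather than ComputeUnit.ALL. |
| model = ct.models.MLModel("PiperSR_2x.mlpackage", compute_units=ct.ComputeUnit.CPU_AND_NE) |
| |
| # The tile model expects one 128×128 RGB tile; resizing here is only for brevity. |
| img = Image.open("input.png").convert("RGB").resize((128, 128)) |
| arr = np.array(img).astype(np.float32) / 255.0 # HWC, scaled to [0, 1] |
| arr = np.transpose(arr, (2, 0, 1))[np.newaxis] # NCHW: (1, 3, 128, 128) |
| |
| result = model.predict({"x": arr}) |
| # Output name assumed from the Swift example (var_185); inspect the result keys if it differs. |
| out = np.clip(result["var_185"][0], 0.0, 1.0) # (3, 256, 256), values in [0, 1] |
| Image.fromarray((out.transpose(1, 2, 0) * 255).round().astype(np.uint8)).save("upscaled.png") |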
| ``` |
|
|
| ## Training |
|
|
| Trained on DIV2K (800 training images) with L1 loss and random augmentation (flips and rotations). Total training cost was about $6 on RunPod A6000 instances. The training history, which raised PSNR from 33.46 dB to 37.54 dB, is documented across 12 experiment findings. |
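The training recipe above names two ingredients, L1 loss and flip/rotation augmentation. The sketch below illustrates both on single-channel nested lists, with the key detail that the low-resolution and high-resolution patches must receive the same transform. The flip axis, rotation angles, and probabilities are assumptions; the actual training pipeline is not published in this card.

```python
import random


def hflip(img):
    """Mirror a 2D grid left to right."""
    return [row[::-1] for row in img]


def rot90(img):
    """Rotate a 2D grid 90 degrees clockwise."""
    return [list(col) for col in zip(*img[::-1])]


def augment_pair(lr, hr, rng):
    """Apply one random flip and rotation identically to an LR/HR patch pair."""
    if rng.random() < 0.5:
        lr, hr = hflip(lr), hflip(hr)
    for _ in range(rng.randrange(4)):
        lr, hr = rot90(lr), rot90(hr)
    return lr, hr


def l1_loss(pred, target):
    """Mean absolute error between two equally sized 2D grids."""
    diffs = [abs(p - t) for pr, tr in zip(pred, target) for p, t in zip(pr, tr)]
    return sum(diffs) / len(diffs)


lr_patch = [[1, 2], [3, 4]]
hr_patch = [[r * 4 + c for c in range(4)] for r in range(4)]
aug_lr, aug_hr = augment_pair(lr_patch, hr_patch, random.Random(0))
```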
|
|
| ## Technical Details |
|
|
| - **Compute units:** `.cpuAndNeuralEngine` (ANE primary, CPU for I/O only) |
| - **Precision:** Float16 |
| - **Input format:** NCHW, normalized to [0, 1] |
| - **Output format:** NCHW, [0, 1] |
| - **Model size:** 928 KB (compiled .mlmodelc) |
| - **Parameters:** 453K |
| - **ANE ops used:** conv, batch_norm (fused at inference), silu, add, pixel_shuffle, const |
| - **CPU fallback ops:** None |
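The model size and parameter count listed above are consistent with Float16 storage. The estimate below assumes "453K" means 453,000 parameters and that KB means 1,000 bytes; the remainder is attributed to file metadata, which is a guess rather than a measured breakdown.

```python
params = 453_000        # "453K" from the list above
bytes_per_weight = 2    # Float16

weight_kb = params * bytes_per_weight / 1000  # 906.0 KB of raw weights
overhead_kb = 928 - weight_kb                 # about 22 KB of other file content (assumed)
```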
|
|
| ## License |
|
|
| Apache 2.0 |
|
|
| ## Citation |
|
|
| ```bibtex |
| @software{pipersr2025, |
| title={PiperSR: ANE-Native Super Resolution for Apple Silicon}, |
| author={ModelPiper}, |
| year={2025}, |
| url={https://huggingface.co/ModelPiper/PiperSR-2x} |
| } |
| ``` |
|
|