DepthPro CoreML (1024x1024 High-Resolution)
This repository contains the High-Resolution (1024x1024) version of the DepthPro model, optimized for CoreML.
DepthPro is a state-of-the-art monocular depth estimation model that produces sharp, metric-scale depth maps. This 1024 px version is intended for high-quality 3D exports where edge precision and fine-detail preservation are critical.
Key Features
- High Fidelity: Captures thin structures (threads, instruments, hair) with superior accuracy compared to the 512px version.
- Symmetric 3D Rendering Optimized: Well suited for symmetric shifting in VR/AR to minimize visual discomfort.
- visionOS Ready: Compatible with Apple Vision Pro (optimized for GPU/CPU).
Performance & Requirements
| Metric | Specification |
|---|---|
| Input Resolution | 1024 x 1024 pixels |
| Compute Units | GPU + CPU (Recommended for stability) |
| Average Latency | ~7.5s per frame (on M2 Ultra/M3 Max) |
| Target Use Case | Offline Video Conversion / High-Quality Spatial Video Export |
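To put the latency figure in context, here is a rough back-of-the-envelope estimate of offline conversion time. It is only a sketch: the function name and parameters are hypothetical, and it assumes every frame is processed sequentially at the ~7.5 s average from the table, ignoring decoding, encoding, and I/O.

```swift
import Foundation

/// Rough wall-clock estimate for converting a clip frame by frame.
/// `secondsPerFrame` defaults to the ~7.5 s average quoted in the table.
func estimatedConversionSeconds(clipSeconds: Double,
                                framesPerSecond: Double,
                                secondsPerFrame: Double = 7.5) -> Double {
    // Round up so a partial trailing frame still counts as one inference.
    let frameCount = (clipSeconds * framesPerSecond).rounded(.up)
    return frameCount * secondsPerFrame
}

// A 10-second clip at 30 fps is 300 frames: 300 * 7.5 s = 2250 s.
let seconds = estimatedConversionSeconds(clipSeconds: 10, framesPerSecond: 30)
print(seconds / 60) // 37.5 (minutes)
```

Under these assumptions even short clips take tens of minutes, which is why the target use case is offline export rather than real-time playback.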
To keep inference stable at this resolution, this model is configured to use the GPU/CPU path rather than the Apple Neural Engine (ANE), which avoids running into memory limits.
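In Core ML, compute units are selected through an `MLModelConfiguration` passed in when a model is loaded. A minimal sketch of requesting the CPU + GPU path is shown here; the helper name `makeCPUAndGPUConfiguration` is hypothetical and not part of this repository.

```swift
import CoreML

/// Builds a configuration that restricts Core ML to the CPU and GPU,
/// so the Neural Engine is never selected for this model.
func makeCPUAndGPUConfiguration() -> MLModelConfiguration {
    let configuration = MLModelConfiguration()
    configuration.computeUnits = .cpuAndGPU
    return configuration
}
```

The returned configuration would then be passed to `MLModel(contentsOf:configuration:)` for each component.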
Repository Contents
The repository contains the following core components:
- DepthPro_transform.mlpackage: Image preprocessing.
- DepthPro_encoder.mlpackage: Feature extraction (ViT-Large).
- DepthPro_decoder.mlpackage: Multiresolution fusion.
- DepthPro_depth.mlpackage: Final depth output and high-res feature generation.
Usage with Swift Transformers
You can download and cache this model dynamically using swift-transformers:
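```swift
import Hub  // Hub module from the swift-transformers package

// HubApi and snapshot(from:) as in recent swift-transformers releases; check your version.
let hub = HubApi()
let modelDir = try await hub.snapshot(from: "aarondevstack/DepthPro-1024x1024-coreml")
// Load models from the downloaded directory
```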
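Loading from the downloaded directory could look roughly like the sketch below. It assumes the four `.mlpackage` files sit at the root of the snapshot directory, and the helper names `loadComponent` and `loadDepthProComponents` are hypothetical. The input and output feature names of each component are not documented here, so wiring the stages together is left out.

```swift
import CoreML
import Foundation

// Component names taken from the repository contents list.
let componentNames = ["DepthPro_transform", "DepthPro_encoder",
                      "DepthPro_decoder", "DepthPro_depth"]

/// Compiles one `.mlpackage` found in `directory` and loads it on CPU + GPU.
func loadComponent(named name: String, in directory: URL) async throws -> MLModel {
    // Assumption: the package lives at the root of the snapshot directory.
    let packageURL = directory.appendingPathComponent("\(name).mlpackage")
    // Core ML loads compiled models (.mlmodelc), so compile the package first.
    let compiledURL = try await MLModel.compileModel(at: packageURL)
    let configuration = MLModelConfiguration()
    configuration.computeUnits = .cpuAndGPU
    return try MLModel(contentsOf: compiledURL, configuration: configuration)
}

/// Loads all four components and returns them keyed by name.
func loadDepthProComponents(from modelDir: URL) async throws -> [String: MLModel] {
    var models: [String: MLModel] = [:]
    for name in componentNames {
        models[name] = try await loadComponent(named: name, in: modelDir)
    }
    return models
}
```

`MLModel.compileModel(at:)` writes the compiled model to a temporary location, so an app that loads these models repeatedly would probably want to move the compiled output somewhere permanent and reuse it.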