DepthPro CoreML (1024x1024 High-Resolution)

This repository contains the High-Resolution (1024x1024) version of the DepthPro model, optimized for CoreML.

DepthPro is a state-of-the-art monocular depth estimation model that produces sharp, metric-scale depth maps. This 1024px version is intended for high-quality 3D exports where edge precision and fine-detail preservation are critical.

🚀 Key Features

  • High Fidelity: Captures thin structures (threads, instruments, hair) more accurately than the 512px version.
  • Optimized for Symmetric 3D Rendering: Well suited to symmetric shifting in VR/AR, which helps minimize visual discomfort.
  • visionOS Ready: Fully compatible with Apple Vision Pro (optimized for GPU/CPU).
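
As a rough illustration of what symmetric shifting could look like, the sketch below converts one metric depth value into a pair of equal and opposite horizontal offsets, one per eye. It assumes a simple pinhole stereo model; `StereoShift`, `symmetricShift`, and the focal length and baseline values are hypothetical and not part of this repository.

```swift
/// Horizontal offsets (in pixels) applied to one source pixel when
/// synthesizing the left-eye and right-eye views.
struct StereoShift {
    let left: Double
    let right: Double
}

/// Converts a metric depth value into a symmetric pair of horizontal shifts.
/// All parameters are illustrative assumptions:
/// - depthMeters: depth of the pixel in meters, as read from the depth map.
/// - focalLengthPixels: focal length of the virtual camera in pixels.
/// - baselineMeters: distance between the two virtual eyes in meters.
func symmetricShift(depthMeters: Double,
                    focalLengthPixels: Double,
                    baselineMeters: Double) -> StereoShift {
    // Pinhole stereo relation: disparity = focal length * baseline / depth.
    // The depth is clamped to avoid dividing by zero.
    let disparity = focalLengthPixels * baselineMeters / max(depthMeters, 1e-6)
    // Split the disparity evenly so neither eye carries the full displacement.
    return StereoShift(left: disparity / 2, right: -disparity / 2)
}

// Hypothetical values: a point 2 m away, 1000 px focal length, 64 mm baseline.
let shift = symmetricShift(depthMeters: 2.0,
                           focalLengthPixels: 1000,
                           baselineMeters: 0.064)
print(shift.left, shift.right)
```

Because both views move by half the disparity, occlusion gaps are split between the two eyes instead of concentrating in one, which is one plausible reason the symmetric approach is more comfortable to view.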

📊 Performance & Requirements

| Metric | Specification |
|---|---|
| Input Resolution | 1024 x 1024 pixels |
| Compute Units | GPU + CPU (recommended for stability) |
| Average Latency | ~7.5 s per frame (on M2 Ultra / M3 Max) |
| Target Use Case | Offline Video Conversion / High-Quality Spatial Video Export |
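
To put the latency figure in context, here is a small back-of-the-envelope estimate of total conversion time. The function name and the example clip (10 seconds at 30 fps) are hypothetical; only the ~7.5 s per frame figure comes from the table above, and real throughput will vary by hardware.

```swift
/// Estimated wall-clock time, in seconds, to run depth inference
/// on every frame of a clip, one frame at a time.
func estimatedConversionSeconds(clipSeconds: Double,
                                framesPerSecond: Double,
                                secondsPerFrame: Double = 7.5) -> Double {
    // Round up so a partial trailing frame still counts as one inference.
    let frameCount = (clipSeconds * framesPerSecond).rounded(.up)
    return frameCount * secondsPerFrame
}

// Hypothetical clip: 10 seconds at 30 fps, i.e. 300 frames.
let total = estimatedConversionSeconds(clipSeconds: 10, framesPerSecond: 30)
print("Estimated conversion time: \(total / 60) minutes")
```

Under these assumptions, 300 frames at 7.5 s each come to 2250 s, or 37.5 minutes, which is why the model targets offline export rather than real-time use.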

To keep inference stable at this resolution, the model is configured to run on the GPU/CPU path rather than the Apple Neural Engine (ANE), which avoids the ANE's memory limits.
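
If you load the models yourself, a minimal sketch of requesting the same GPU/CPU path through Core ML's standard `MLModelConfiguration` might look like this. The variable name is illustrative; pass the configuration to whichever loading call you use.

```swift
import CoreML

// Restrict Core ML to the CPU and GPU so the Neural Engine is never selected.
let configuration = MLModelConfiguration()
configuration.computeUnits = .cpuAndGPU
```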

📦 Repository Contents

The repository contains the following core components:

  1. DepthPro_transform.mlpackage: Image preprocessing.
  2. DepthPro_encoder.mlpackage: Feature extraction (ViT-Large).
  3. DepthPro_decoder.mlpackage: Multiresolution fusion.
  4. DepthPro_depth.mlpackage: Final depth output and high-res feature generation.

🛠 Usage with Swift Transformers

You can download and cache this model dynamically using swift-transformers:

import Hub

let hub = HubApi()
let modelDir = try await hub.snapshot(from: "aarondevstack/DepthPro-1024x1024-coreml")
// Load the .mlpackage files from modelDir
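
The loading step is left open above. One possible sketch, assuming the four `.mlpackage` files sit at the top level of the downloaded directory, compiles each package and loads it with the GPU/CPU configuration. `stageNames` and `loadStages` are hypothetical helpers, not part of this repository or of swift-transformers.

```swift
import CoreML
import Foundation

/// File names of the four pipeline stages, as listed under Repository Contents.
let stageNames = ["DepthPro_transform", "DepthPro_encoder",
                  "DepthPro_decoder", "DepthPro_depth"]

/// Compiles and loads each .mlpackage found in `directory`.
/// Assumes the packages are stored at the top level of that directory.
func loadStages(from directory: URL) async throws -> [String: MLModel] {
    let configuration = MLModelConfiguration()
    configuration.computeUnits = .cpuAndGPU  // GPU + CPU path, as recommended

    var stages: [String: MLModel] = [:]
    for name in stageNames {
        let packageURL = directory.appendingPathComponent("\(name).mlpackage")
        // An .mlpackage must be compiled to .mlmodelc before it can be loaded.
        let compiledURL = try await MLModel.compileModel(at: packageURL)
        stages[name] = try MLModel(contentsOf: compiledURL,
                                   configuration: configuration)
    }
    return stages
}
```

With the snapshot directory in hand, this would be called as `let stages = try await loadStages(from: modelDir)`. The input and output feature names of each stage are not documented here, so inspect each model's `modelDescription` before wiring the stages together.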