Embedl SAM3 (Quantized)

SAM3 Benchmark: Baseline vs Embedl Deploy

Optimized version of facebook/sam3 for edge deployment.

Mixed-precision INT8/FP16 quantization with hardware-aware optimizations, ready for NVIDIA Jetson AGX Orin and other TensorRT-capable platforms.

Highlights

  • Format: ONNX with external weights (embedl_sam3_quant.onnx + .onnx.data)
  • Precision: INT8 with sensitive layers kept in FP16
  • Runtime: TensorRT (FP16 + INT8 mode)
  • Target hardware: NVIDIA Jetson AGX Orin, desktop/server GPUs with TensorRT

Quick Start

1. Download the model

hf download embedl/sam3 embedl_sam3_quant.onnx embedl_sam3_quant.onnx.data infer_trt.py --local-dir .

2. Build the TensorRT engine

WARNING: Validated with TensorRT 10.1 and 10.3 only. Latest versions of TensorRT produce incorrect segmentation masks for this model.

/usr/src/tensorrt/bin/trtexec --onnx=embedl_sam3_quant.onnx \
        --fp16 --int8 \
        --builderOptimizationLevel=5 \
        --memPoolSize=workspace:4294967296 \
        --timingCacheFile=embedl_sam3_timing_cache.bin \
        --saveEngine=embedl_sam3_quant.engine

3. Run inference

See infer_trt.py for a complete example that runs text-prompted video segmentation, measures latency, and saves an output video with mask overlays.

python3 -m venv venv --system-site-packages # Use system TensorRT
source venv/bin/activate
pip install opencv-python transformers av
python infer_trt.py
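Before running the full video pipeline, it can be useful to confirm the engine deserializes correctly and to inspect its I/O tensors. A minimal sketch using the TensorRT 10.x Python API (the engine filename is taken from the trtexec command above; this requires a machine with TensorRT and the built engine, and infer_trt.py remains the authoritative example):

```python
import tensorrt as trt

# Deserialize the engine produced by trtexec.
logger = trt.Logger(trt.Logger.WARNING)
with open("embedl_sam3_quant.engine", "rb") as f:
    engine = trt.Runtime(logger).deserialize_cuda_engine(f.read())

# Print each I/O tensor's name, direction, shape, and dtype,
# which tells you what buffers inference code must allocate.
for i in range(engine.num_io_tensors):
    name = engine.get_tensor_name(i)
    print(name,
          engine.get_tensor_mode(name),   # INPUT or OUTPUT
          engine.get_tensor_shape(name),
          engine.get_tensor_dtype(name))
```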

Files

| File | Description |
|---|---|
| embedl_sam3_quant.onnx | ONNX model graph |
| embedl_sam3_quant.onnx.data | External weights (~3.1 GB) |
| infer_trt.py | TensorRT inference example |

Performance

The input resolution is reduced from the model's default to 924×924, which enables TensorRT layer fusions that are not possible at the original size. All benchmarks below use this resolution.
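For reference, producing a 924×924 network input from an arbitrary frame can be sketched in plain NumPy. Note that the interpolation mode (nearest neighbour here) and the [0, 1] float16 normalization are assumptions for illustration; the exact preprocessing is defined in infer_trt.py:

```python
import numpy as np

def preprocess(frame: np.ndarray, size: int = 924) -> np.ndarray:
    """Resize an HxWx3 uint8 frame to size x size (nearest neighbour)
    and return a 1x3xHxW float16 tensor scaled to [0, 1].
    Normalization constants are illustrative, not authoritative."""
    h, w = frame.shape[:2]
    ys = (np.arange(size) * h // size).clip(0, h - 1)
    xs = (np.arange(size) * w // size).clip(0, w - 1)
    resized = frame[ys][:, xs]                      # nearest-neighbour resize
    chw = resized.astype(np.float16) / np.float16(255.0)
    return chw.transpose(2, 0, 1)[None]             # HWC -> 1x3xHxW
```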

NVIDIA L4 GPU

Environment: NVIDIA L4, Driver 570.211.01, CUDA 12.8, TensorRT 10.3

Text-prompted video segmentation:

| Configuration | Latency | Speedup |
|---|---|---|
| torch.compile (FP16) | 137 ms | 1.0x |
| Embedl Deploy (this model) | 104 ms | 1.32x |

NVIDIA Jetson AGX Orin

| Configuration | Latency | Throughput | Speedup |
|---|---|---|---|
| Baseline (FP16, resized to 924) | 763 ms | 1.31 qps | 1.0x |
| Embedl Deploy (this model) | 462 ms | 2.17 qps | 1.65x |
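The speedup columns above are simply the ratio of baseline latency to optimized latency:

```python
# Speedup = baseline latency / optimized latency (figures from the tables above).
l4_speedup = 137 / 104     # torch.compile FP16 vs Embedl Deploy on L4
orin_speedup = 763 / 462   # FP16 baseline vs Embedl Deploy on AGX Orin

print(round(l4_speedup, 2))    # 1.32
print(round(orin_speedup, 2))  # 1.65
```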

Accuracy (SA-Co/Gold)

Evaluated on the SA-Co/Gold instance segmentation benchmark (Table 30 in the SAM3 paper). The quantized model retains nearly all of the FP32 accuracy, staying within about 1.8 cgF1 points on average.

Average across all subsets:

| Model | cgF1 | IL_MCC | pos_µF1 |
|---|---|---|---|
| SAM3 (paper, Table 30) | 54.1 | 0.82 | 66.1 |
| SAM3 ONNX FP32 (ours) | 55.56 | 0.823 | 67.45 |
| Embedl SAM3 INT8 (this model) | 53.77 | 0.809 | 66.36 |

Per-subset breakdown:

| Subset | cgF1 (FP32) | cgF1 (INT8) | pos_µF1 (FP32) | pos_µF1 (INT8) |
|---|---|---|---|---|
| Metaclip | 47.92 | 47.07 | 59.24 | 58.54 |
| SA-1B | 53.44 | 52.33 | 61.70 | 61.31 |
| Crowded | 60.28 | 59.09 | 67.54 | 67.25 |
| FG Food | 58.76 | 56.28 | 72.01 | 70.02 |
| Sports Equipment | 67.85 | 65.61 | 75.15 | 73.91 |
| Attributes | 55.11 | 54.12 | 73.08 | 72.57 |
| WikiCommon | 45.57 | 41.85 | 63.46 | 60.88 |
| Average | 55.56 | 53.77 | 67.45 | 66.36 |

Creating Your Own Optimized Models

Deployment-ready models can be created from any supported base model using embedl-deploy, available on PyPI. Detailed tutorials will follow.
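Assuming the PyPI package name matches the project name:

```shell
pip install embedl-deploy
```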

License

This model is a derivative of facebook/sam3.

| Component | License |
|---|---|
| Upstream (Meta SAM3) | SAM License |
| Optimized components | Embedl Models Community Licence v1.0 (no redistribution as a hosted service) |

Contact

We offer engineering support for on-prem/edge deployments and partner co-marketing opportunities.
