| --- |
| license: other |
| license_name: intel-custom |
| license_link: LICENSE |
| library_name: openvino |
| pipeline_tag: object-detection |
| tags: |
| - openvino |
| - intel |
| - yolo |
| - yolo26 |
| - object-detection |
| - coco |
| - edge-ai |
| - metro |
| - dlstreamer |
| datasets: |
| - detection-datasets/coco |
| language: |
| - en |
| --- |
| |
| # Object Detection |
|
|
| | Property | Value | |
| |---|---| |
| | **Category** | General Object Detection (80-class COCO) | |
| | **Base Model** | [YOLO26](https://docs.ultralytics.com/models/yolo26/) (Ultralytics) | |
| | **Source Framework** | PyTorch (Ultralytics) | |
| | **Supported Precisions** | FP32, FP16, INT8 (mixed-precision) | |
| | **Inference Engine** | OpenVINO | |
| | **Hardware** | CPU, GPU, NPU | |
| | **Detected Class(es)** | All 80 COCO classes | |
|
|
| --- |
|
|
| ## Overview |
|
|
| Object Detection is a Metro Analytics use case that detects and classifies objects across the full 80-class COCO taxonomy (person, vehicle, animal, everyday objects, etc.). |
| It is built on [YOLO26](https://docs.ultralytics.com/models/yolo26/), a real-time object detector, exported to OpenVINO IR and optionally quantized to INT8 for efficient inference on Intel hardware. |
| Unlike the specialized person or vehicle detectors, this model keeps all 80 classes active, making it suitable for general-purpose scene understanding. |
|
|
| Typical Metro deployments include: |
|
|
| - **Scene Understanding** -- identify and classify all objects visible in a camera feed. |
| - **Inventory Monitoring** -- detect specific items (bags, suitcases, bottles) on platforms. |
| - **Anomaly Detection** -- flag unexpected objects in restricted areas. |
| - **Multi-Class Analytics** -- gather statistics across people, vehicles, and other categories. |
|
|
| Available variants: `yolo26n`, `yolo26s`, `yolo26m`, `yolo26l`, `yolo26x`. |
| Smaller variants (`yolo26n`, `yolo26s`) are recommended for high-FPS edge deployment; larger variants improve recall for small objects. |
|
|
| --- |
|
|
| ## Prerequisites |
|
|
| - Python 3.11+ |
| - [Install OpenVINO](https://docs.openvino.ai/2026/get-started/install-openvino.html) (latest version) |
| - [Install Intel DLStreamer](https://docs.openedgeplatform.intel.com/2026.0/edge-ai-libraries/dlstreamer/get_started/install/install_guide_ubuntu.html) (latest version) |
|
|
| Create and activate a Python virtual environment before running the scripts: |
|
|
| ```bash |
| python3 -m venv .venv --system-site-packages |
| source .venv/bin/activate |
| ``` |
|
|
| > **Note:** The `--system-site-packages` flag is required so the virtual |
| > environment can access the system-installed OpenVINO and DLStreamer Python |
| > packages. |
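
To confirm that the virtual environment can actually see the system-installed packages, a quick check along the following lines may help. It is a minimal sketch using only the standard library; the module names `openvino`, `gi`, and `gstgva` are the ones imported by the samples in this README, and `check_packages` is a hypothetical helper name, not part of any Intel tooling.

```python
import importlib.util


def check_packages(names):
    """Return {module_name: bool}, True when the module can be located."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # Module names are those imported by the samples in this README.
    for name, found in check_packages(["openvino", "gi", "gstgva"]).items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

If `gstgva` is reported as missing, the `PYTHONPATH` export described in the DLStreamer section is the first thing to verify.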
|
|
| --- |
|
|
| ## Getting Started |
|
|
| ### Download and Quantize Model |
|
|
| Run the provided script to download the model, export it to OpenVINO IR, and optionally quantize it: |
|
|
| ```bash |
| chmod +x export_and_quantize.sh |
| ./export_and_quantize.sh |
| ``` |
|
|
| This exports the default **yolo26n** model in **FP16** precision. |
|
|
| #### Optional: Select a Different Variant or Precision |
|
|
| ```bash |
| ./export_and_quantize.sh yolo26n FP32 # full-precision |
| ./export_and_quantize.sh yolo26n INT8 # quantized |
| ./export_and_quantize.sh yolo26s # larger variant, default FP16 |
| ``` |
|
|
| Replace `yolo26n` with any variant (`yolo26s`, `yolo26m`, `yolo26l`, `yolo26x`). |
| The second argument selects the precision (`FP32`, `FP16`, `INT8`); the default is **FP16**. |
|
|
| The script performs the following steps: |
|
|
| 1. Installs dependencies (`openvino`, `ultralytics`; adds `nncf` for INT8). |
| 2. Downloads a sample test image (`test.jpg`) and a sample test video (`test_video.mp4`). |
| 3. Downloads the PyTorch weights and exports to OpenVINO IR. |
| 4. *(INT8 only)* Quantizes the model using NNCF post-training quantization. |
|
|
| Output files: |
|
|
| - `yolo26n_openvino_model/` -- FP32 or FP16 OpenVINO IR model directory. |
| - `yolo26n_objdet_int8.xml` / `yolo26n_objdet_int8.bin` -- INT8 quantized model *(only when `INT8` is selected)*. |
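
The naming scheme above can be captured in a small helper when a script has to locate the right IR file. This is a sketch only: the paths for `yolo26n` come from this README, while the assumption that other variants follow the same pattern (for example `yolo26s_openvino_model/yolo26s.xml`) has not been verified against the export script. `expected_model_xml` is a hypothetical name.

```python
VALID_PRECISIONS = ("FP32", "FP16", "INT8")


def expected_model_xml(variant="yolo26n", precision="FP16"):
    """Return the .xml path the export script is expected to produce.

    Assumption: every variant follows the naming documented for yolo26n.
    """
    precision = precision.upper()
    if precision not in VALID_PRECISIONS:
        raise ValueError(f"precision must be one of {VALID_PRECISIONS}")
    if precision == "INT8":
        return f"{variant}_objdet_int8.xml"
    # FP32 and FP16 exports share the same directory layout.
    return f"{variant}_openvino_model/{variant}.xml"


if __name__ == "__main__":
    print(expected_model_xml("yolo26n", "INT8"))
```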
|
|
| #### Precision / Device Compatibility |
|
|
| | Precision | CPU | GPU | NPU | |
| |---|---|---|---| |
| | FP32 | Yes | Yes | No | |
| | FP16 | Yes | Yes | Yes | |
| | INT8 | Yes | Yes | Yes | |
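
A deployment script can encode the table above and reject unsupported combinations before attempting to compile a model. The sketch below mirrors the table exactly; `SUPPORTED` and `is_supported` are hypothetical names introduced for illustration.

```python
# Precision -> devices, copied from the compatibility table in this README.
SUPPORTED = {
    "FP32": {"CPU", "GPU"},
    "FP16": {"CPU", "GPU", "NPU"},
    "INT8": {"CPU", "GPU", "NPU"},
}


def is_supported(precision, device):
    """Return True when the table lists the precision/device pair as 'Yes'."""
    return device.upper() in SUPPORTED.get(precision.upper(), set())


if __name__ == "__main__":
    print(is_supported("FP32", "NPU"))  # the only 'No' cell in the table
```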
|
|
| > **Note:** The INT8 calibration uses the bundled sample image. |
| > For production accuracy, replace it with a representative set of frames from |
| > the target deployment site. |
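
One possible way to calibrate on site-specific frames is sketched below. It is not a reproduction of what `export_and_quantize.sh` does internally: the directory of frames, the helper names, and the preprocessing (a plain resize that mirrors the inference sample in this README) are all assumptions, and the bundled script may use different NNCF options. The calls `nncf.Dataset`, `nncf.quantize`, and `ov.save_model` are the standard NNCF and OpenVINO entry points for post-training quantization.

```python
from pathlib import Path

INPUT_SIZE = 640  # same input size as the inference sample in this README


def list_frames(frame_dir, exts=(".jpg", ".jpeg", ".png")):
    """Return calibration image paths in a stable, sorted order."""
    return sorted(p for p in Path(frame_dir).iterdir() if p.suffix.lower() in exts)


def preprocess(path):
    """Assumed preprocessing: resize, BGR to RGB, scale to [0, 1], NCHW."""
    import cv2
    import numpy as np

    image = cv2.imread(str(path))
    if image is None:
        raise FileNotFoundError(path)
    blob = cv2.resize(image, (INPUT_SIZE, INPUT_SIZE))
    blob = cv2.cvtColor(blob, cv2.COLOR_BGR2RGB).astype(np.float32) / 255.0
    return blob.transpose(2, 0, 1)[np.newaxis, ...]


def quantize_with_frames(fp_model_xml, frame_dir, out_xml):
    """Quantize a floating-point IR using frames from frame_dir (hypothetical helper)."""
    import nncf
    import openvino as ov

    model = ov.Core().read_model(fp_model_xml)
    # nncf.Dataset applies preprocess() to each path when samples are drawn.
    dataset = nncf.Dataset(list_frames(frame_dir), preprocess)
    quantized = nncf.quantize(model, dataset)
    ov.save_model(quantized, out_xml)
```

A call such as `quantize_with_frames("yolo26n_openvino_model/yolo26n.xml", "site_frames/", "yolo26n_site_int8.xml")` would then produce a site-calibrated model; the directory and output names here are placeholders.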
|
|
| ### OpenVINO Sample |
|
|
| The sample below runs YOLO26 inference over all 80 COCO classes, prints each detection with its class name and confidence, and writes the annotated image to `output_openvino.jpg`. |
| YOLO26 is end-to-end (NMS-free), so no manual non-maximum suppression is needed. |
| To run on GPU or NPU, change the device name passed to `core.compile_model`. |
|
|
| ```python |
| import cv2 |
| import numpy as np |
| import openvino as ov |
| |
| COCO_NAMES = [ |
| "person","bicycle","car","motorcycle","airplane","bus","train","truck", |
| "boat","traffic light","fire hydrant","stop sign","parking meter","bench", |
| "bird","cat","dog","horse","sheep","cow","elephant","bear","zebra", |
| "giraffe","backpack","umbrella","handbag","tie","suitcase","frisbee", |
| "skis","snowboard","sports ball","kite","baseball bat","baseball glove", |
| "skateboard","surfboard","tennis racket","bottle","wine glass","cup", |
| "fork","knife","spoon","bowl","banana","apple","sandwich","orange", |
| "broccoli","carrot","hot dog","pizza","donut","cake","chair","couch", |
| "potted plant","bed","dining table","toilet","tv","laptop","mouse", |
| "remote","keyboard","cell phone","microwave","oven","toaster","sink", |
| "refrigerator","book","clock","vase","scissors","teddy bear","hair drier", |
| "toothbrush", |
| ] |
| CONF_THRESHOLD = 0.4 |
| INPUT_SIZE = 640 |
| |
| core = ov.Core() |
| model = core.read_model("yolo26n_openvino_model/yolo26n.xml") |
| |
| # Change device to "GPU" or "NPU" to run on integrated GPU or NPU. |
| compiled = core.compile_model(model, "CPU") |
| |
| image = cv2.imread("test.jpg") |
| h0, w0 = image.shape[:2] |
| |
| blob = cv2.resize(image, (INPUT_SIZE, INPUT_SIZE)) |
| blob = cv2.cvtColor(blob, cv2.COLOR_BGR2RGB).astype(np.float32) / 255.0 |
| blob = blob.transpose(2, 0, 1)[np.newaxis, ...] # NCHW |
| |
| # YOLO26 end-to-end output: [1, 300, 6] = [x1, y1, x2, y2, confidence, class_id] |
| output = compiled([blob])[compiled.output(0)][0] |
| mask = output[:, 4] >= CONF_THRESHOLD |
| dets = output[mask] |
| |
| sx, sy = w0 / INPUT_SIZE, h0 / INPUT_SIZE |
| print(f"Total detections: {len(dets)}") |
| |
| colors = np.random.RandomState(42).randint(0, 255, (80, 3)).tolist() |
| for det in dets: |
|     # Scale box corners from the 640x640 network input back to the original image. |
|     x1 = int(det[0] * sx) |
|     y1 = int(det[1] * sy) |
|     x2 = int(det[2] * sx) |
|     y2 = int(det[3] * sy) |
|     cid = int(det[5]) |
|     conf = float(det[4]) |
|     label = f"{COCO_NAMES[cid]} {conf:.2f}" |
|     color = colors[cid] |
|     cv2.rectangle(image, (x1, y1), (x2, y2), color, 2) |
|     cv2.putText(image, label, (x1, y1 - 5), |
|                 cv2.FONT_HERSHEY_SIMPLEX, 0.6, color, 2) |
|     print(f" {label} at ({x1},{y1})-({x2},{y2})") |
| |
| cv2.imwrite("output_openvino.jpg", image) |
| ``` |
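
Because the sample stretches the image to 640x640 without letterboxing, the horizontal and vertical scale factors differ, and each box corner is mapped back with its own factor. The post-processing step can be isolated as a plain function, sketched here without NumPy so the arithmetic is easy to follow. The clamping to the image bounds is an addition that the sample above does not perform, and `postprocess` is a hypothetical helper name.

```python
def postprocess(rows, orig_w, orig_h, input_size=640, conf_threshold=0.4):
    """Filter and rescale rows of [x1, y1, x2, y2, confidence, class_id].

    Box coordinates are expected in network-input pixels (0..input_size).
    Returns a list of (class_id, confidence, (x1, y1, x2, y2)) tuples
    in original-image pixels.
    """
    sx, sy = orig_w / input_size, orig_h / input_size
    results = []
    for x1, y1, x2, y2, conf, cid in rows:
        if conf < conf_threshold:
            continue
        # Clamp to the image bounds (not done in the README sample).
        box = (
            max(0, min(orig_w, int(x1 * sx))),
            max(0, min(orig_h, int(y1 * sy))),
            max(0, min(orig_w, int(x2 * sx))),
            max(0, min(orig_h, int(y2 * sy))),
        )
        results.append((int(cid), float(conf), box))
    return results


if __name__ == "__main__":
    rows = [[0, 0, 320, 640, 0.9, 0], [10, 10, 20, 20, 0.1, 2]]
    print(postprocess(rows, orig_w=1280, orig_h=720))
```

For a 1280x720 source image the factors are 2.0 and 1.125, so the first row maps to `(0, 0, 640, 720)` and the second row is dropped by the confidence threshold.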
|
|
| **Device targets:** |
|
|
| - `"CPU"` -- default, works on all Intel platforms. |
| - `"GPU"` -- Intel integrated or discrete GPU. |
| - `"NPU"` -- Intel NPU (validate with `benchmark_app -d NPU`). |
|
|
| ### Try It on a Sample Image |
|
|
| The `export_and_quantize.sh` script downloads `test.jpg` automatically. |
| Run the OpenVINO sample above. |
| The sample reads `test.jpg`, prints each detected object to the console, and writes the annotated frame to `output_openvino.jpg`. |
|
|
| Expected console output (representative): |
|
|
| ```text |
| Total detections: 5 |
| person 0.92 at (49,396)-(236,904) |
| bus 0.92 at (0,229)-(804,744) |
| person 0.91 at (670,393)-(809,880) |
| person 0.90 at (223,403)-(345,862) |
| person 0.50 at (0,553)-(68,869) |
| ``` |
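
For multi-class analytics, the per-detection labels can be aggregated into per-class counts. The sketch below uses the representative labels from the console output above as input; `count_by_class` is a hypothetical helper name.

```python
from collections import Counter


def count_by_class(labels):
    """Count detections per class from labels formatted as '<class name> <conf>'."""
    # rsplit keeps multi-word class names such as "traffic light" intact.
    return Counter(label.rsplit(" ", 1)[0] for label in labels)


labels = ["person 0.92", "bus 0.92", "person 0.91", "person 0.90", "person 0.50"]
counts = count_by_class(labels)
print(dict(counts))  # {'person': 4, 'bus': 1}
```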
|
|
| #### Expected Output |
|
|
|  |
|
|
| ### DLStreamer Sample |
|
|
| The pipeline below runs the FP16 YOLO26 detector on the sample video via |
| `gvadetect`, overlays bounding boxes, saves the annotated result to |
| `output_dlstreamer.mp4`, and prints all detections per frame. |
|
|
| > **Notes on running this sample:** |
| > |
| > - Use the FP16 IR (`yolo26n_openvino_model/yolo26n.xml`). Class names are |
| > read automatically from the model's embedded `metadata.yaml` by |
| > DLStreamer 2026.0+ -- no external `labels-file` is required. |
| > - Export `PYTHONPATH` so the DLStreamer Python module is importable: |
| > |
| > ```bash |
| > source /opt/intel/openvino_2026/setupvars.sh |
| > source /opt/intel/dlstreamer/scripts/setup_dls_env.sh |
| > export PYTHONPATH=/opt/intel/dlstreamer/python:\ |
| > /opt/intel/dlstreamer/gstreamer/lib/python3/dist-packages:${PYTHONPATH:-} |
| > ``` |
| |
| ```python |
| import gi |
| |
| gi.require_version("Gst", "1.0") |
| gi.require_version("GstVideo", "1.0") |
| from gi.repository import Gst |
| from gstgva import VideoFrame |
| |
| Gst.init(None) |
| |
| INPUT_VIDEO = "test_video.mp4" |
| |
| # For CPU: change device=GPU to device=CPU. |
| # For NPU: change device=GPU to device=NPU (batch-size=1, nireq=4 recommended). |
| pipeline_str = ( |
| f"filesrc location={INPUT_VIDEO} ! decodebin3 ! " |
| "videoconvert ! " |
| "gvadetect model=yolo26n_openvino_model/yolo26n.xml " |
| "device=GPU " |
| "threshold=0.4 ! queue ! " |
| "gvawatermark ! videoconvert ! video/x-raw,format=I420 ! " |
| "openh264enc ! h264parse ! " |
| "mp4mux ! filesink name=sink location=output_dlstreamer.mp4" |
| ) |
| pipeline = Gst.parse_launch(pipeline_str) |
| |
|
|
| def on_buffer(pad, info): |
|     # Pad probe: print the label and top-left corner of every detected region. |
|     buf = info.get_buffer() |
|     caps = pad.get_current_caps() |
|     frame = VideoFrame(buf, caps=caps) |
|     for region in frame.regions(): |
|         print(f" {region.label()} at ({region.rect().x},{region.rect().y})", |
|               flush=True) |
|     return Gst.PadProbeReturn.OK |
| |
|
|
| sink = pipeline.get_by_name("sink") |
| sink_pad = sink.get_static_pad("sink") |
| sink_pad.add_probe(Gst.PadProbeType.BUFFER, on_buffer) |
|
|
| pipeline.set_state(Gst.State.PLAYING) |
| bus = pipeline.get_bus() |
| bus.timed_pop_filtered( |
| Gst.CLOCK_TIME_NONE, |
| Gst.MessageType.EOS | Gst.MessageType.ERROR, |
| ) |
| pipeline.set_state(Gst.State.NULL) |
| ``` |
| |
| #### Expected Output |
|
|
|  |
|
|
| **Device targets:** |
|
|
| - `device=GPU` -- default in the sample code. |
| - `device=CPU` -- change `device=GPU` to `device=CPU`. |
| - `device=NPU` -- change `device=GPU` to `device=NPU`; use `batch-size=1` and `nireq=4` for best NPU utilization. |
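
The device-specific variations above can be generated from a single function rather than edited by hand. The sketch below rebuilds the pipeline string from the sample and appends `batch-size=1 nireq=4` only for NPU, as recommended in the list; `build_pipeline` is a hypothetical helper name, and the element chain is copied from the sample rather than independently validated.

```python
def build_pipeline(video, model_xml, device="GPU", threshold=0.4,
                   output="output_dlstreamer.mp4"):
    """Return a gst-launch style pipeline string for the given device."""
    detect = f"gvadetect model={model_xml} device={device} threshold={threshold}"
    if device == "NPU":
        # NPU settings recommended in this README.
        detect += " batch-size=1 nireq=4"
    return (
        f"filesrc location={video} ! decodebin3 ! videoconvert ! "
        f"{detect} ! queue ! "
        "gvawatermark ! videoconvert ! video/x-raw,format=I420 ! "
        "openh264enc ! h264parse ! "
        f"mp4mux ! filesink name=sink location={output}"
    )


if __name__ == "__main__":
    print(build_pipeline("test_video.mp4",
                         "yolo26n_openvino_model/yolo26n.xml", device="NPU"))
```

The resulting string can be passed to `Gst.parse_launch` in place of `pipeline_str` in the sample above.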
|
|
| --- |
|
|
| ## License |
|
|
| Copyright (C) Intel Corporation. All rights reserved. |
| Licensed under the MIT License. See [LICENSE](LICENSE) for details. |
|
|
| ## References |
|
|
| - [YOLO26 Documentation](https://docs.ultralytics.com/models/yolo26/) |
| - [OpenVINO YOLO26 Notebook](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/yolov26-optimization/yolov26-object-detection.ipynb) |
| - [COCO Dataset](https://cocodataset.org/) |
| - [OpenVINO Documentation](https://docs.openvino.ai/) |
| - [NNCF Post-Training Quantization](https://docs.openvino.ai/latest/nncf_ptq_introduction.html) |
| - [Intel DLStreamer](https://docs.openedgeplatform.intel.com/2026.0/edge-ai-libraries/dlstreamer/index.html) |
|
|