license: mit
widget:
  - src: >-
      https://www.invoicesimple.com/wp-content/uploads/2018/06/Sample-Invoice-printable.png
    example_title: Invoice

Table Transformer (fine-tuned for Table Detection)

Table Transformer (DETR) model trained on PubTables-1M. It was introduced in the paper PubTables-1M: Towards Comprehensive Table Extraction From Unstructured Documents by Smock et al. and first released in this repository.

Disclaimer: The team releasing Table Transformer did not write a model card for this model, so this model card has been written by the Hugging Face team.

Model description

The Table Transformer is equivalent to DETR, a Transformer-based object detection model. Note that the authors decided to use the "normalize before" setting of DETR, which means that layernorm is applied before self- and cross-attention.
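To make the "normalize before" ordering concrete, here is a minimal pure-Python sketch of one residual sub-block. It is an illustration only, not the model's code: `layer_norm` has no learned scale or shift, and `toy_sublayer` is a hypothetical stand-in for an attention or feed-forward layer.

```python
import math

def layer_norm(x, eps=1e-5):
    """Normalize a vector to zero mean and unit variance (no learned parameters)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

def pre_norm_block(x, sublayer):
    """'Normalize before': the sublayer sees normalized input, the residual is raw x."""
    y = sublayer(layer_norm(x))
    return [a + b for a, b in zip(x, y)]

def post_norm_block(x, sublayer):
    """Alternative ordering: normalize after adding the residual."""
    y = sublayer(x)
    return layer_norm([a + b for a, b in zip(x, y)])

def toy_sublayer(x):
    # Hypothetical stand-in for self-attention, cross-attention, or an MLP.
    return [0.5 * v for v in x]

x = [1.0, 2.0, 4.0]
pre = pre_norm_block(x, toy_sublayer)
post = post_norm_block(x, toy_sublayer)
print("pre-norm: ", pre)
print("post-norm:", post)
```

With pre-norm, the residual path carries the unnormalized input straight through, so the output keeps the mean of `x`; with post-norm, the output of every block is itself normalized.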

Usage

You can use the raw model for detecting tables in documents. See the documentation for more info.
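As a rough sketch of what that usage can look like with the `transformers` library: the model id `microsoft/table-transformer-detection`, the 0.9 score threshold, and the file name `invoice.png` are assumptions made for illustration, and `format_detection` is a hypothetical helper, not part of any library.

```python
def detect_tables(image_path,
                  model_id="microsoft/table-transformer-detection",  # assumed model id
                  threshold=0.9):                                     # assumed threshold
    """Return (label, score, [x0, y0, x1, y1]) tuples for one image."""
    # Heavy imports are local so the formatting helper works without them.
    import torch
    from PIL import Image
    from transformers import AutoImageProcessor, TableTransformerForObjectDetection

    image = Image.open(image_path).convert("RGB")
    processor = AutoImageProcessor.from_pretrained(model_id)
    model = TableTransformerForObjectDetection.from_pretrained(model_id)
    model.eval()

    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)

    # target_sizes expects (height, width); PIL's image.size is (width, height).
    target_sizes = torch.tensor([image.size[::-1]])
    result = processor.post_process_object_detection(
        outputs, threshold=threshold, target_sizes=target_sizes
    )[0]
    return [
        (model.config.id2label[label.item()], score.item(), box.tolist())
        for score, label, box in zip(result["scores"], result["labels"], result["boxes"])
    ]

def format_detection(label, score, box):
    """Hypothetical helper: render one detection as a single readable line."""
    x0, y0, x1, y1 = (round(v, 1) for v in box)
    return f"{label} ({score:.3f}): [{x0}, {y0}, {x1}, {y1}]"

# Example call (downloads the checkpoint on first use):
# for detection in detect_tables("invoice.png"):
#     print(format_detection(*detection))
```

Boxes come back in pixel coordinates of the original image because `target_sizes` is passed to the post-processing step.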

Run as ONNX (CPU / NPU / GPU)

Exporting this model to ONNX and quantizing it for a Windows NPU detects tables about 14× faster than the fp32 PyTorch checkpoint on CPU, at half the model size and with mAP within 1% of the original. The ONNX export can also run on CPU or GPU.

Benchmarked on an Intel Core Ultra 7 258V (PubTables-1M validation, 1000 samples):

Model	Device	Precision	mAP	mean latency (ms)	p50 latency (ms)	Size (MB)
PyTorch	CPU	fp32	0.9887	620.9	600.3	115
ONNX	OpenVINO NPU	w8a16 (QDQ)	0.9822	44.1	41.6	58
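
The headline figures follow directly from the two rows above. A short check of the arithmetic, using only the numbers in the table (the dictionary keys are names chosen for this sketch):

```python
pytorch_cpu = {"map": 0.9887, "mean_ms": 620.9, "p50_ms": 600.3, "size_mb": 115}
onnx_npu = {"map": 0.9822, "mean_ms": 44.1, "p50_ms": 41.6, "size_mb": 58}

# Latency speedup of the NPU build over the PyTorch CPU baseline.
mean_speedup = pytorch_cpu["mean_ms"] / onnx_npu["mean_ms"]
p50_speedup = pytorch_cpu["p50_ms"] / onnx_npu["p50_ms"]

# Model size of the quantized build relative to the fp32 checkpoint.
size_ratio = onnx_npu["size_mb"] / pytorch_cpu["size_mb"]

# Relative mAP drop, in percent of the PyTorch mAP.
map_drop_pct = 100 * (pytorch_cpu["map"] - onnx_npu["map"]) / pytorch_cpu["map"]

print(f"mean speedup: {mean_speedup:.1f}x")   # about 14.1x
print(f"p50 speedup:  {p50_speedup:.1f}x")    # about 14.4x
print(f"size ratio:   {size_ratio:.2f}")      # about 0.50
print(f"mAP drop:     {map_drop_pct:.2f}%")   # about 0.66%
```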

How to convert: Export and quantize with Microsoft's WinML CLI. The NPU build is QDQ-quantized to w8a16; fp32 builds for CPU and GPU are also supported. The end-to-end build, evaluation, and a Python inference example are in examples/microsoft-table-transformer-detection.
How to run on Windows: Use Windows ML, which manages execution providers for NPU / GPU / CPU and routes ONNX inference to the right backend automatically.
How to run on other platforms: Use ONNX Runtime with the execution provider of your choice (OpenVINO, QNN, DirectML, CUDA, CPU, etc.).
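
For the ONNX Runtime route, choosing an execution provider might look like the sketch below. The preference order, the helper names `pick_providers` and `load_session`, and the file name `model.onnx` are assumptions for illustration. Provider-specific options, such as selecting the NPU device for OpenVINO, are not shown and depend on the installed ONNX Runtime build.

```python
# Assumed preference order; the strings are ONNX Runtime's provider identifiers.
PREFERRED_PROVIDERS = [
    "OpenVINOExecutionProvider",
    "QNNExecutionProvider",
    "DmlExecutionProvider",   # DirectML
    "CUDAExecutionProvider",
    "CPUExecutionProvider",
]

def pick_providers(available, preferred=PREFERRED_PROVIDERS):
    """Keep preferred providers that are installed, always ending with a CPU fallback."""
    chosen = [p for p in preferred if p in available]
    if "CPUExecutionProvider" not in chosen:
        chosen.append("CPUExecutionProvider")
    return chosen

def load_session(model_path):
    """Open an inference session on the best available provider."""
    import onnxruntime as ort  # local import: only needed when a model is loaded

    providers = pick_providers(ort.get_available_providers())
    return ort.InferenceSession(model_path, providers=providers)

# Example call, assuming an exported file named model.onnx:
# session = load_session("model.onnx")
# print(session.get_providers())                   # providers actually in use
# print([i.name for i in session.get_inputs()])    # input names set by the export
```

ONNX Runtime tries providers in list order, so keeping the CPU provider last gives a fallback when an accelerator is missing.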