license: mit
widget:
  - src: >-
      https://www.invoicesimple.com/wp-content/uploads/2018/06/Sample-Invoice-printable.png
    example_title: Invoice

Table Transformer (fine-tuned for Table Detection)

Table Transformer (DETR) model trained on PubTables-1M. It was introduced in the paper PubTables-1M: Towards Comprehensive Table Extraction From Unstructured Documents by Smock et al. and first released in this repository.

Disclaimer: The team releasing Table Transformer did not write a model card for this model, so this model card has been written by the Hugging Face team.

Model description

The Table Transformer is equivalent to DETR, a Transformer-based object detection model. Note that the authors decided to use the "normalize before" setting of DETR, which means that layernorm is applied before self- and cross-attention.
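To make the "normalize before" setting concrete, here is a minimal pure-Python sketch that contrasts it with the "normalize after" ordering. It is an illustration only: `toy_sublayer` is a hypothetical stand-in for an attention layer, and the layer norm here has no learned scale or shift.

```python
import math

def layer_norm(x, eps=1e-5):
    """Normalize a vector to zero mean and unit variance (no learned parameters)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

def pre_norm_block(x, sublayer):
    """'Normalize before': x + sublayer(LayerNorm(x))."""
    return [a + b for a, b in zip(x, sublayer(layer_norm(x)))]

def post_norm_block(x, sublayer):
    """'Normalize after': LayerNorm(x + sublayer(x))."""
    return layer_norm([a + b for a, b in zip(x, sublayer(x))])

def toy_sublayer(v):
    # Hypothetical stand-in for self- or cross-attention.
    return [0.5 * t for t in v]

x = [1.0, 2.0, 4.0]
pre = pre_norm_block(x, toy_sublayer)    # residual path carries x through unnormalized
post = post_norm_block(x, toy_sublayer)  # output is itself normalized
```

In the pre-norm form the residual stream is never normalized directly, so the input passes through the block unchanged apart from the added sublayer output.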

Usage

You can use the raw model for detecting tables in documents. See the documentation for more info.
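A minimal sketch of table detection with the Transformers library follows. The model id `microsoft/table-transformer-detection` and the 0.9 threshold are assumptions made for illustration, and `to_records` is a hypothetical helper that only formats the results. Calling `detect_tables` requires `torch`, `Pillow`, and `transformers`, and downloads the checkpoint on first use.

```python
def to_records(labels, scores, boxes):
    """Format detections as plain dicts (hypothetical helper, pure Python)."""
    return [
        {"label": l, "score": round(float(s), 3), "box": [round(float(v), 1) for v in b]}
        for l, s, b in zip(labels, scores, boxes)
    ]

def detect_tables(image_path, model_id="microsoft/table-transformer-detection", threshold=0.9):
    # Heavy imports are local so that to_records works without these packages.
    import torch
    from PIL import Image
    from transformers import AutoImageProcessor, TableTransformerForObjectDetection

    image = Image.open(image_path).convert("RGB")
    processor = AutoImageProcessor.from_pretrained(model_id)
    model = TableTransformerForObjectDetection.from_pretrained(model_id)

    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)

    # target_sizes expects (height, width); PIL's image.size is (width, height).
    target_sizes = torch.tensor([image.size[::-1]])
    result = processor.post_process_object_detection(
        outputs, threshold=threshold, target_sizes=target_sizes
    )[0]

    labels = [model.config.id2label[i.item()] for i in result["labels"]]
    return to_records(labels, result["scores"].tolist(), result["boxes"].tolist())
```

Each returned box is `[x_min, y_min, x_max, y_max]` in pixel coordinates of the input image.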

Run as ONNX (CPU / NPU / GPU)

Exporting this model to ONNX lets it detect tables about 14× faster on a Windows NPU than the original PyTorch checkpoint on CPU, at half the model size and with mAP within 1% of the original. The ONNX export can also run on CPU or GPU.

Benchmarked on an Intel Core Ultra 7 258V (PubTables-1M validation, 1000 samples):

| Model | Device | Precision | mAP | Mean latency (ms) | p50 latency (ms) | Size (MB) |
|---|---|---|---|---|---|---|
| PyTorch | CPU | fp32 | 0.9887 | 620.9 | 600.3 | 115 |
| ONNX | OpenVINO NPU | w8a16 (QDQ) | 0.9822 | 44.1 | 41.6 | 58 |
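The headline figures can be re-derived from the two benchmark rows above. The short check below uses only the reported numbers; the variable names are illustrative.

```python
# Values copied from the benchmark rows (PyTorch CPU fp32 vs. ONNX NPU w8a16).
pytorch = {"map": 0.9887, "mean_ms": 620.9, "size_mb": 115}
onnx_npu = {"map": 0.9822, "mean_ms": 44.1, "size_mb": 58}

speedup = pytorch["mean_ms"] / onnx_npu["mean_ms"]
size_ratio = onnx_npu["size_mb"] / pytorch["size_mb"]
map_drop_pct = 100 * (pytorch["map"] - onnx_npu["map"]) / pytorch["map"]

print(f"speedup: {speedup:.1f}x")                  # speedup: 14.1x
print(f"size ratio: {size_ratio:.2f}")             # size ratio: 0.50
print(f"relative mAP drop: {map_drop_pct:.2f}%")   # relative mAP drop: 0.66%
```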
  • How to convert — Export and quantize with Microsoft's WinML CLI. The NPU build is QDQ-quantized to w8a16; fp32 builds for CPU and GPU are also supported. End-to-end build, evaluation, and a Python inference example: examples/microsoft-table-transformer-detection.
  • How to run on Windows — Use Windows ML, which manages execution providers for NPU / GPU / CPU and routes ONNX inference to the right backend automatically.
  • How to run on other platforms — Use ONNX Runtime with the execution provider of your choice (OpenVINO, QNN, DirectML, CUDA, CPU, etc.).
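For the ONNX Runtime route, one way to choose an execution provider is sketched below. The preference order and the `model.onnx` path in the usage note are assumptions, not part of the published example. Creating a session requires the `onnxruntime` package, built with the providers you intend to use.

```python
# Execution provider names as registered in ONNX Runtime, in an assumed preference order.
PREFERRED = [
    "OpenVINOExecutionProvider",
    "QNNExecutionProvider",
    "DmlExecutionProvider",   # DirectML
    "CUDAExecutionProvider",
    "CPUExecutionProvider",
]

def pick_providers(available, preferred=PREFERRED):
    """Return the preferred providers that this ONNX Runtime build actually offers."""
    return [p for p in preferred if p in available]

def make_session(model_path):
    # Local import so pick_providers can be used without onnxruntime installed.
    import onnxruntime as ort

    providers = pick_providers(ort.get_available_providers())
    return ort.InferenceSession(model_path, providers=providers)
```

A call such as `make_session("model.onnx")` (hypothetical path) returns a session whose input names and shapes can be inspected with `session.get_inputs()` before calling `session.run`, since they depend on how the model was exported.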