
Table-Transformer-Detection

Model Description

Table-Transformer-Detection is a 28.8-million-parameter object detection model from Microsoft Research, fine-tuned specifically for table detection in documents.
Built on the DETR (DEtection TRansformer) architecture, it locates tables in document images such as rendered PDF pages and scanned pages.

Trained on PubTables-1M — a large-scale dataset containing nearly one million fully annotated tables from scientific articles — Table-Transformer-Detection delivers strong performance for document table extraction without requiring task-specific architectural customization.

Quickstart

Follow the instructions here. Start with 3 simple steps.

Features

  • Table detection: accurately locates tables in document images, PDFs, and scanned pages.
  • DETR-based architecture: leverages a Transformer encoder-decoder on top of a CNN backbone (ResNet) for end-to-end object detection.
  • Pre-normalization: uses the "normalize before" setting, applying LayerNorm before self- and cross-attention for improved training stability.
  • Lightweight: at 28.8M parameters (F32 weights), the model is inexpensive to deploy and run.
  • Fine-tunable: can be further fine-tuned on domain-specific document datasets for improved accuracy.
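
The pre-normalization feature above concerns only the order of operations inside each Transformer layer. The following is a minimal sketch of that ordering in plain Python, not the model's actual implementation: `toy_sublayer` is a hypothetical stand-in for attention, and `layer_norm` omits the learned scale and shift that a real LayerNorm has.

```python
import math

def layer_norm(x, eps=1e-5):
    """Normalize a vector to zero mean and unit variance (no learned scale/shift)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

def pre_norm_block(x, sublayer):
    """Pre-normalization: normalize, apply the sublayer, then add the residual."""
    return [a + b for a, b in zip(x, sublayer(layer_norm(x)))]

def post_norm_block(x, sublayer):
    """Post-normalization (the original Transformer ordering), shown for contrast."""
    return layer_norm([a + b for a, b in zip(x, sublayer(x))])

def toy_sublayer(x):
    # Hypothetical stand-in for self- or cross-attention: scale every value by 0.5.
    return [0.5 * v for v in x]

x = [1.0, 2.0, 3.0, 4.0]
pre = pre_norm_block(x, toy_sublayer)
post = post_norm_block(x, toy_sublayer)
```

In the pre-norm variant the residual path carries `x` through unnormalized, which is the property usually credited for more stable training.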

Use Cases

  • Automated document processing and digitization pipelines
  • Table extraction from academic papers and research articles
  • Invoice and financial document parsing
  • Legal and regulatory document analysis
  • Healthcare and clinical report table extraction
  • Preprocessing step for downstream table structure recognition
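
When detection feeds a downstream structure-recognition step, a common pattern is to crop each detected table with a small margin so that borders are not cut off. The sketch below assumes boxes in pixel `(x0, y0, x1, y1)` form; the helper name `padded_crop_box` and the 10-pixel default are illustrative choices, not part of this model.

```python
import math

def padded_crop_box(box, image_size, padding=10):
    """Expand a detected table box by `padding` pixels, clamped to the image bounds.

    box: (x0, y0, x1, y1) in pixels.
    image_size: (width, height) in pixels.
    """
    x0, y0, x1, y1 = box
    width, height = image_size
    return (
        max(0, math.floor(x0 - padding)),
        max(0, math.floor(y0 - padding)),
        min(width, math.ceil(x1 + padding)),
        min(height, math.ceil(y1 + padding)),
    )

# A table in the middle of a 1000x800 page, and one that nearly fills the page.
inner = padded_crop_box((300, 320, 700, 480), (1000, 800))
edge = padded_crop_box((5, 5, 995, 795), (1000, 800))
```

The resulting tuple uses the same (left, upper, right, lower) order that Pillow's `Image.crop` accepts.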

Inputs and Outputs

Input:

  • Document images (JPEG, PNG, etc.) containing one or more tables.

Output:

  • Bounding box predictions with confidence scores for each detected table in the image.
  • Class labels identifying detected objects as tables.
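
DETR-style models typically emit, for each object query, class logits (with a final "no object" class) and a box as normalized `(cx, cy, w, h)`. A rough sketch of turning such raw outputs into the scored pixel boxes described above might look like the following. The label map, the 0.9 threshold, and the sample values are assumptions for illustration; check the checkpoint's configuration for the real labels.

```python
import math

# Assumed label map for illustration only; the real one lives in the model config.
ID2LABEL = {0: "table", 1: "table rotated"}

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def postprocess(logits, boxes, image_size, threshold=0.9):
    """Convert raw DETR-style outputs into pixel-space detections.

    logits: per-query class scores; the last entry is the "no object" class.
    boxes: per-query (cx, cy, w, h), normalized to [0, 1].
    image_size: (width, height) in pixels.
    """
    width, height = image_size
    detections = []
    for query_logits, (cx, cy, w, h) in zip(logits, boxes):
        probs = softmax(query_logits)[:-1]  # drop the "no object" class
        score = max(probs)
        if score < threshold:
            continue  # low-confidence or empty query
        detections.append({
            "label": ID2LABEL[probs.index(score)],
            "score": score,
            # Center format to corner format, scaled to pixels.
            "box": (
                (cx - w / 2) * width,
                (cy - h / 2) * height,
                (cx + w / 2) * width,
                (cy + h / 2) * height,
            ),
        })
    return detections

# Two hypothetical queries: one confident table, one "no object".
raw_logits = [[8.0, 0.0, 0.0], [0.0, 0.0, 8.0]]
raw_boxes = [(0.5, 0.5, 0.4, 0.2), (0.1, 0.1, 0.1, 0.1)]
results = postprocess(raw_logits, raw_boxes, image_size=(1000, 800))
```

Raising the threshold trades recall for precision; the right value depends on the documents being processed.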

License

This repo is licensed under the Creative Commons Attribution-NonCommercial 4.0 (CC BY-NC 4.0) license, which allows use, sharing, and modification only for non-commercial purposes with proper attribution. All NPU-related models, runtimes, and code in this project are protected under this non-commercial license and cannot be used in any commercial or revenue-generating applications. Commercial licensing or enterprise usage requires a separate agreement. For inquiries, please contact dev@nexa.ai.
