LimiX-2M
LimiX-2M is a 2M-parameter tabular foundation model designed to mitigate low-rank collapse and attention bottlenecks in structured data. It was introduced in the paper LimiX-2M: Mitigating Low-Rank Collapse and Attention Bottlenecks in Tabular Foundation Models.
- GitHub Repository: https://github.com/limix-ldm-ai/LimiX
- Project Page: https://www.limix.ai/
1. Model Introduction
LimiX is a new class of tabular AI model designed to overcome one of modern machine learning's longest-standing bottlenecks: structured data. With only 2M parameters, LimiX-2M is reported to set a new state of the art across classification, regression, and missing-value imputation, surpassing XGBoost, CatBoost, AutoGluon, and TabPFN. Its lightweight design, which needs no training on the target dataset, makes advanced tabular modeling accessible on ordinary hardware while preserving full transparency and offline deployability.
Key Features
- Unified Tabular Reasoning: Designed end to end for multi-task tabular intelligence, so a single model handles classification, regression, and imputation without task-specific fine-tuning.
- Training-Free, Context-Driven Inference: Operates directly through in-context learning: no training, no hyperparameters, no preprocessing pipelines.
- Lightweight & Efficient Deployment: A compact 2M-parameter architecture enables fast inference on standard CPUs and laptops.
2. Model Usage
To use LimiX-2M, you need to clone the official repository and install the dependencies listed in the Deployment section.
Classification Task Example
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split
from huggingface_hub import hf_hub_download
import numpy as np
import os, sys
# Setup environment for inference
os.environ["RANK"] = "0"
os.environ["WORLD_SIZE"] = "1"
os.environ["MASTER_ADDR"] = "127.0.0.1"
os.environ["MASTER_PORT"] = "29500"
# Assuming the LimiX repository is cloned and in the python path
# from inference.predictor import LimiXPredictor
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=42)
# Download model weights
model_file = hf_hub_download(repo_id="stableai-org/LimiX-2M", filename="LimiX-2M.ckpt", local_dir="./cache")
# Initialize predictor (Requires local inference code from GitHub)
# clf = LimiXPredictor(device='cuda', model_path=model_file, inference_config='config/cls_default_retrieval.json')
# prediction = clf.predict(X_train, y_train, X_test)
# print("roc_auc_score:", roc_auc_score(y_test, prediction[:, 1]))
# print("accuracy_score:", accuracy_score(y_test, np.argmax(prediction, axis=1)))
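The two evaluation lines in the example treat `prediction` as a matrix of class probabilities with one row per test sample and one column per class. The sketch below isolates that scoring step so it can be checked without the LimiX repository. The helper name `evaluate_binary_proba` and the toy arrays are hypothetical and only illustrate the expected shapes; they are not part of the LimiX code.

```python
import numpy as np
from sklearn.metrics import accuracy_score, roc_auc_score


def evaluate_binary_proba(y_true, proba):
    """Score an (n_samples, 2) probability matrix with ROC AUC and accuracy."""
    proba = np.asarray(proba, dtype=np.float64)
    if proba.ndim != 2 or proba.shape[1] != 2:
        raise ValueError("expected an (n_samples, 2) probability matrix")
    return {
        # Column 1 holds the probability of the positive class.
        "roc_auc_score": roc_auc_score(y_true, proba[:, 1]),
        # The predicted label is the column with the highest probability.
        "accuracy_score": accuracy_score(y_true, np.argmax(proba, axis=1)),
    }


# Toy stand-in for the predictor output (assumed shape: n_test x n_classes).
y_true = np.array([0, 1, 1, 0])
proba = np.array([[0.9, 0.1], [0.2, 0.8], [0.4, 0.6], [0.3, 0.7]])
scores = evaluate_binary_proba(y_true, proba)
print(scores)
```

With a real predictor, the same helper would be called as `evaluate_binary_proba(y_test, prediction)`.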
3. Model Architecture & Pretraining
LimiX adopts a 12-block transformer architecture with axis-wise attention over features and samples. LimiX-2M specifically uses the RaBEL framework, which expands each scalar into compact, localized RBF features to improve conditioning and the effective rank of shallow layers.
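To make the idea of expanding a scalar into localized RBF features concrete, here is a minimal sketch using Gaussian radial basis functions with evenly spaced centers. This is an illustration of the general technique only, not the RaBEL implementation: the function name `rbf_expand`, the number of centers, the range, and the bandwidth rule are all assumptions.

```python
import numpy as np


def rbf_expand(x, num_centers=8, low=-3.0, high=3.0):
    """Expand each scalar in `x` into `num_centers` Gaussian RBF activations.

    Centers are evenly spaced on [low, high]. The bandwidth equals the spacing
    between neighbouring centers, so each scalar mostly activates nearby centers.
    (All of these choices are illustrative assumptions.)
    """
    x = np.asarray(x, dtype=np.float64)
    centers = np.linspace(low, high, num_centers)
    width = centers[1] - centers[0]
    # Broadcast (..., 1) against (num_centers,) to get (..., num_centers).
    diff = x[..., None] - centers
    return np.exp(-0.5 * (diff / width) ** 2)


# A toy standardized table: 5 rows, 3 numeric features.
rng = np.random.default_rng(0)
table = rng.standard_normal((5, 3))
features = rbf_expand(table)
print(features.shape)  # (5, 3, 8)
```

Each cell becomes a short vector whose largest entry sits at the center closest to the cell's value, so nearby values get similar but distinguishable encodings.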
The model is pretrained through Context-Conditional Masked Modeling (CCMM). By masking table cells and conditioning predictions on context rows, the model learns a wide range of conditional dependencies, allowing it to adapt to new datasets without task-specific training.
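The masking step described above can be sketched on a toy table as follows. This only shows how a training example could be assembled (context rows, query rows with hidden cells, and the hidden values as targets); it is not the authors' pretraining code. The function name `mask_cells`, the mask rate, the NaN placeholder, and the choice to leave context rows fully observed are assumptions.

```python
import numpy as np


def mask_cells(table, mask_rate=0.15, seed=0):
    """Hide a random subset of cells; return (masked_table, mask, targets)."""
    rng = np.random.default_rng(seed)
    table = np.asarray(table, dtype=np.float64)
    mask = rng.random(table.shape) < mask_rate  # True where a cell is hidden
    masked = np.where(mask, np.nan, table)      # NaN marks a hidden cell
    targets = table[mask]                       # values the model must recover
    return masked, mask, targets


rng = np.random.default_rng(1)
table = rng.standard_normal((10, 4))

# Assumed split: the first rows act as context, the rest as query rows.
context_rows, query_rows = table[:6], table[6:]
masked_query, mask, targets = mask_cells(query_rows, mask_rate=0.3)

# A model would receive (context_rows, masked_query) and be trained to
# predict `targets` at the positions where `mask` is True.
print(masked_query.shape, int(mask.sum()), targets.shape)
```

At inference time the same interface covers the downstream tasks: hiding the label column of the query rows corresponds to classification or regression, and hiding arbitrary cells corresponds to imputation.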
4. Deployment
For manual deployment, use Python 3.12.7 (the interpreter itself cannot be installed with pip) and install the following dependencies:
pip install torch==2.7.1 torchvision==0.22.1 torchaudio==2.7.1
pip install scikit-learn einops huggingface-hub matplotlib networkx numpy pandas scipy tqdm typing_extensions xgboost kditransform hyperopt
Note: Flash Attention 2 is recommended for optimal performance.
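After installation, a quick check of the environment can catch missing or mismatched packages before loading the model. The sketch below uses only the standard library; the helper name `check_environment` is hypothetical, and the package list is copied from the install commands in this section (unpinned packages are only checked for presence).

```python
from importlib.metadata import PackageNotFoundError, version

# Pins taken from the install commands in this section; None means "any version".
REQUIRED = {
    "torch": "2.7.1",
    "torchvision": "0.22.1",
    "torchaudio": "2.7.1",
    "scikit-learn": None,
    "einops": None,
    "huggingface-hub": None,
    "numpy": None,
    "pandas": None,
    "scipy": None,
}


def check_environment(required):
    """Return a list of human-readable problems; an empty list means all good."""
    problems = []
    for name, pinned in required.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            problems.append(f"{name}: not installed")
            continue
        # Ignore local build suffixes (the part after '+') when comparing pins.
        if pinned is not None and installed.split("+")[0] != pinned:
            problems.append(f"{name}: found {installed}, expected {pinned}")
    return problems


for line in check_environment(REQUIRED) or ["all listed packages found"]:
    print(line)
```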
5. License
- Code License: The repository code is licensed under the Apache-2.0 License.
- Model Weight License: The use of LimiX model weights is subject to a separate Model License:
  - Fully open for academic research.
  - Commercial use requires official authorization from StableAI.
6. Citation
If you use LimiX-2M in your research, please cite the following:
@article{zhang2025limix,
title={Limix: Unleashing structured-data modeling capability for generalist intelligence},
author={Zhang, Xingxuan and Ren, Gang and Yu, Han and Yuan, Hao and Wang, Hui and Li, Jiansheng and Wu, Jiayun and Mo, Lang and Mao, Li and Hao, Mingchao and others},
journal={arXiv preprint arXiv:2509.03505},
year={2025}
}
@article{limix2m2026,
title={LimiX-2M: Mitigating Low-Rank Collapse and Attention Bottlenecks in Tabular Foundation Models},
author={Zhang, Xingxuan and others},
journal={arXiv preprint arXiv:2606.04485},
year={2026}
}