---
license: apache-2.0
library_name: transformers
language:
- en
base_model:
- Qwen/Qwen2.5-14B
pipeline_tag: text-generation
---

# Datarus-R1-14B-preview

<div align="center">
<img src="https://i.postimg.cc/7hsStNgm/logo-icon-2-1.png" alt="Datarus Logo" width="150"/>

[🤗 Model](https://huggingface.co/DatarusAI/Datarus-R1-14B-preview)
[License](LICENSE)
[Website](https://datarus.ai)
[Demo](https://chat.datarus.ai)
[Paper](https://arxiv.org/abs/2508.13382)
</div>

## 🚀 Overview

**Datarus-R1-14B-Preview** is a 14B-parameter open-weights language model fine-tuned from Qwen2.5-14B-Instruct, designed to act as a virtual data analyst and graduate-level problem solver. Unlike traditional models trained on isolated Q&A pairs, Datarus learns from complete analytical trajectories—including reasoning steps, code execution, error traces, self-corrections, and final conclusions—all captured in a ReAct-style notebook format.

### Key Highlights

- **🎯 State-of-the-art efficiency**: Surpasses similar-sized models and competes with 32B+ models while using 18-49% fewer tokens
- **🔄 Dual reasoning interfaces**: Supports both Agentic (ReAct) mode for interactive analysis and Reflection (CoT) mode for concise documentation
- **📊 Superior performance**: Achieves up to 30% higher accuracy on AIME 2024/2025 and LiveCodeBench
- **💡 "AHA-moment" pattern**: Exhibits efficient hypothesis refinement in 1-2 iterations, avoiding circular reasoning loops

## 🔗 Quick Links

- 🌐 **Website**: [https://datarus.ai](https://datarus.ai)
- 💬 **Try the Demo**: [https://chat.datarus.ai](https://chat.datarus.ai)
- 🛠️ **Jupyter Agent**: [GitHub Repository](https://github.com/DatarusAI/Datarus-JupyterAgent)
- 📄 **Paper**: [Datarus-R1: An Adaptive Multi-Step Reasoning LLM](https://arxiv.org/abs/2508.13382)

## 📊 Performance

### Benchmark Results

| Benchmark | Datarus-R1-14B-Preview | QwQ-32B | Phi-4-reasoning | DeepSeek-R1-Distill-14B |
|-----------|------------------------|---------|-----------------|-------------------------|
| **LiveCodeBench v6** | 57.7 | 56.6 | 52.6 | 48.6 |
| **AIME 2024** | 70.1 | 76.2 | 74.6\* | - |
| **AIME 2025** | 66.2 | 66.2 | 63.1\* | - |
| **GPQA Diamond** | 62.1 | 60.1 | 55.0 | 58.6 |

\*Reported values from the official papers.

### Token Efficiency and Performance

<div align="center">
<img src="https://i.postimg.cc/NMSppNM4/perf-efficiency.png" alt="LCB-Efficiency" width="600"/>
<img src="https://i.postimg.cc/nV341Ssf/efficiency.png" alt="Efficiency" width="600"/>
</div>

## 🎯 Model Card

### Model Details

- **Model Type**: Language model for reasoning and data analysis
- **Parameters**: 14.8B
- **Training Data**: 144,000 synthetic analytical trajectories across finance, medicine, numerical analysis, and other quantitative domains, plus a curated collection of reasoning datasets
- **Language**: English
- **License**: Apache 2.0

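A minimal quick-start sketch with 🤗 Transformers is below. The reflection-mode system prompt is an illustrative assumption based on the tag convention described later in this card, not an official template; adjust it to your use case.

```python
from typing import Dict, List

MODEL_ID = "DatarusAI/Datarus-R1-14B-preview"

def build_messages(question: str) -> List[Dict[str, str]]:
    # Illustrative reflection-mode instruction (assumed, not an official
    # prompt): ask for reasoning in <think> and the result in <answer>.
    system = (
        "You are Datarus, a data-analysis assistant. Reason inside "
        "<think>...</think> and give your final result inside <answer>...</answer>."
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": question},
    ]

def generate(question: str, max_new_tokens: int = 2048) -> str:
    # Imported lazily so build_messages stays usable without the heavy deps.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")
    inputs = tokenizer.apply_chat_template(
        build_messages(question), add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens.
    return tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True)
```
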
### Intended Use

#### Primary Use Cases
- **Data Analysis**: Automated data exploration, statistical analysis, and visualization
- **Mathematical Problem Solving**: Graduate-level mathematics including AIME-level problems
- **Code Generation**: Creating analytical scripts and solving programming challenges
- **Scientific Reasoning**: Complex problem-solving in physics, chemistry, and other sciences
- **Interactive Notebooks**: Building complete analysis notebooks with iterative refinement

### Dual Mode Usage

#### Agentic Mode (for interactive analysis)
- Use `<step>`, `<thought>`, `<action>`, `<action_input>`, `<observation>` tags
- Enables iterative code execution and refinement
- Best for data analysis, simulations, and exploratory tasks

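These tags can be parsed back out of a completion with a few lines of standard-library Python. The helper below is a sketch of one approach; the demo trajectory is a made-up illustration, not real model output.

```python
import re

# Matches the inner tags of one agentic-mode step.
TAG_RE = re.compile(
    r"<(thought|action|action_input|observation)>(.*?)</\1>", re.DOTALL
)

def parse_steps(trajectory: str) -> list:
    """Split a ReAct-style trajectory into steps, each a dict of tag -> text."""
    steps = []
    for step_body in re.findall(r"<step>(.*?)</step>", trajectory, re.DOTALL):
        steps.append({tag: text.strip() for tag, text in TAG_RE.findall(step_body)})
    return steps

# Made-up example trajectory for illustration only.
demo = """<step>
<thought>Load the CSV and inspect its columns.</thought>
<action>python</action>
<action_input>import pandas as pd
df = pd.read_csv("data.csv")
print(df.columns)</action_input>
<observation>Index(['date', 'price'], dtype='object')</observation>
</step>"""

print(parse_steps(demo)[0]["thought"])  # Load the CSV and inspect its columns.
```
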
#### Reflection Mode (for documentation)
- Use `<think>` and `<answer>` tags
- Produces compact, self-contained reasoning chains
- Best for mathematical proofs, explanations, and reports

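For downstream use you typically want only the `<answer>` span. A small sketch of a splitter for reflection-mode output (falling back to the raw text when no tags are present):

```python
import re

def split_reflection(text: str) -> tuple:
    """Return (reasoning, answer) from a reflection-mode completion."""
    think = re.search(r"<think>(.*?)</think>", text, re.DOTALL)
    answer = re.search(r"<answer>(.*?)</answer>", text, re.DOTALL)
    return (
        think.group(1).strip() if think else "",
        answer.group(1).strip() if answer else text.strip(),
    )

# Made-up completion for illustration only.
reply = "<think>4 divides 12, so 12/4 = 3.</think>\n<answer>3</answer>"
reasoning, answer = split_reflection(reply)
print(answer)  # 3
```
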
## 📚 Citation

```bibtex
@article{benchaliah2025datarus,
  title={Datarus-R1: An Adaptive Multi-Step Reasoning LLM for Automated Data Analysis},
  author={Ben Chaliah, Ayoub and Dellagi, Hela},
  journal={arXiv preprint arXiv:2508.13382},
  year={2025}
}
```

## 🤝 Contributing

We welcome contributions! Please see our [GitHub repository](https://github.com/DatarusAI/Datarus-JupyterAgent) for:
- Bug reports and feature requests
- Pull requests
- Discussion forums

## 📄 License

This model is released under the Apache 2.0 License.

## 🙏 Acknowledgments

We thank the Qwen team for the excellent base model and the open-source community for their valuable contributions.

## 📧 Contact

- **Email**: ayoub1benchaliah@gmail.com, hela.dellagi@outlook.com
- **Website**: [https://datarus.ai](https://datarus.ai)
- **Demo**: [https://chat.datarus.ai](https://chat.datarus.ai)

---

<div align="center">
<strong>Experience the future of AI-powered data analysis with Datarus-R1</strong>

[Try Demo](https://chat.datarus.ai) | [View Code](https://github.com/DatarusAI/Datarus-JupyterAgent) | [Read Paper](https://arxiv.org/abs/2508.13382)
</div>

## ⭐ Support

If you find this model and agent pipeline useful, please consider leaving a __Like/Star__! Your support helps us continue improving the project.

Found a bug or have a feature request? Please open an issue on GitHub.

---

<p align="center">Made with ❤️ by the Datarus Team from Paris</p>