Unitxt

community

https://www.unitxt.ai

AI & ML interests

IBM Research

updated a Space 2 months ago

Unitxt Metric

Evaluate AI models with unified benchmarks

updated a dataset 2 months ago

unitxt/data

Updated May 27 • 4.58k

authored 7 papers 5 months ago

AlephBERT: A Hebrew Large Pre-Trained Language Model to Start-off your Hebrew NLP Application With

Paper • 2104.04052 • Published Apr 8, 2021

Efficient Benchmarking (of Language Models)

Paper • 2308.11696 • Published Aug 22, 2023

Quality Controlled Paraphrase Generation

Paper • 2203.10940 • Published Mar 21, 2022

Lexical Generalization Improves with Larger Models and Longer Training

Paper • 2210.12673 • Published Oct 23, 2022

Benchmark Agreement Testing Done Right: A Guide for LLM Benchmark Evaluation

Paper • 2407.13696 • Published Jul 18, 2024 • 5

DOVE: A Large-Scale Multi-Dimensional Predictions Dataset Towards Meaningful LLM Evaluation

Paper • 2503.01622 • Published Mar 3, 2025

General Agent Evaluation

Paper • 2602.22953 • Published Feb 26 • 12

authored a paper over 2 years ago

Unitxt: Flexible, Shareable and Reusable Data Preparation and Evaluation for Generative AI

Paper • 2401.14019 • Published Jan 25, 2024 • 23