Sparse Query Attention (SQA): A Computationally Efficient Attention Mechanism with Query Heads Reduction
Paper: arXiv:2510.01817
Experimental models with Sparse Query Attention layers, reducing training time/cost by roughly 3-10% compared to GQA and MQA while maintaining the same level of performance. A sketch of the mechanism follows below.
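The core idea named in the title is reducing the number of query heads. Below is a minimal, illustrative PyTorch sketch of that idea under stated assumptions; it is not the authors' reference implementation, and names such as `n_q_heads` and `n_kv_heads` are hypothetical. The point it demonstrates: with fewer query heads, the attention-score tensor shrinks from `(B, n_heads, T, T)` to `(B, n_q_heads, T, T)`, which is where the compute saving relative to GQA/MQA comes from.

```python
# Minimal sketch of a Sparse Query Attention (SQA) layer (illustrative, not the official code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseQueryAttention(nn.Module):
    """Attention with fewer query heads (n_q_heads) than the model's full head
    count (n_heads). Keys/values use n_kv_heads and are repeated GQA-style to
    match the reduced query-head count."""

    def __init__(self, d_model=512, n_heads=8, n_q_heads=4, n_kv_heads=4):
        super().__init__()
        assert n_q_heads % n_kv_heads == 0
        self.head_dim = d_model // n_heads  # per-head width kept as in standard MHA
        self.n_q_heads = n_q_heads
        self.n_kv_heads = n_kv_heads
        self.q_proj = nn.Linear(d_model, n_q_heads * self.head_dim)
        self.k_proj = nn.Linear(d_model, n_kv_heads * self.head_dim)
        self.v_proj = nn.Linear(d_model, n_kv_heads * self.head_dim)
        self.o_proj = nn.Linear(n_q_heads * self.head_dim, d_model)

    def forward(self, x, is_causal=True):
        B, T, _ = x.shape
        # Project and split into heads: note the reduced number of query heads.
        q = self.q_proj(x).view(B, T, self.n_q_heads, self.head_dim).transpose(1, 2)
        k = self.k_proj(x).view(B, T, self.n_kv_heads, self.head_dim).transpose(1, 2)
        v = self.v_proj(x).view(B, T, self.n_kv_heads, self.head_dim).transpose(1, 2)
        # Repeat K/V heads so they align with the (reduced) query heads.
        rep = self.n_q_heads // self.n_kv_heads
        k = k.repeat_interleave(rep, dim=1)
        v = v.repeat_interleave(rep, dim=1)
        # Score tensor is (B, n_q_heads, T, T) instead of (B, n_heads, T, T).
        out = F.scaled_dot_product_attention(q, k, v, is_causal=is_causal)
        out = out.transpose(1, 2).reshape(B, T, -1)
        return self.o_proj(out)

# Example usage (assumed shapes):
# layer = SparseQueryAttention(d_model=512, n_heads=8, n_q_heads=4, n_kv_heads=4)
# y = layer(torch.randn(2, 128, 512))  # -> (2, 128, 512)
```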