GQLA: Group-Query Latent Attention for Hardware-Adaptive Large Language Model Decoding Paper • 2605.15250 • Published 18 days ago • 13
HISA: Efficient Hierarchical Indexing for Fine-Grained Sparse Attention Paper • 2603.28458 • Published Mar 30 • 44
HISA: Efficient Hierarchical Indexing for Fine-Grained Sparse Attention Paper • 2603.28458 • Published Mar 30 • 44