LCM-Lab/nsa_llama

Text Generation
Transformers
Safetensors
English
llama
conversational
text-generation-inference

Elastic Attention: Test-time Adaptive Sparsity Ratios for Efficient Transformers

Safetensors
Model size: 8B params
Tensor type: BF16
Collection including LCM-Lab/nsa_llama

Elastic-Attention (Collection)
Elastic Attention: Test-time Adaptive Sparsity Ratios for Efficient Transformers • 17 items

Paper for LCM-Lab/nsa_llama

Elastic Attention: Test-time Adaptive Sparsity Ratios for Efficient Transformers

Paper • 2601.17367 • Published 5 days ago • 29