CompactAI-O/Shard-1

Tags: PyTorch, English, small-lm, gemma4-attention, muon, swiglu, experimental
Shard-1 / code
30.6 kB
  • 2 contributors
History: 1 commit
Crownelius
Initial release: Shard-40m-v1 (54.5M dense transformer, anneal final)
025878f verified 2 days ago
  • config.py (4.36 kB)
  • model.py (15.5 kB)
  • muon.py (7.39 kB)
  • tokenizer.py (3.39 kB)