# Enhanced Hybrid Transformer 416M
🚀 A **416,417,792-parameter** transformer with modern architectural optimizations.
## Features
- **24 layers** × **16 heads**
- **GQA-4** (Grouped Query Attention)
- **SwiGLU** activation
- **RMSNorm** normalization
- **RoPE** positional embeddings
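
The components listed above can be sketched in a few lines of dependency-free Python. This is an illustration only, not code from the original repository: the function names, the `eps` value, the RoPE base of 10000, the interleaved pairing of dimensions, and the reading of GQA-4 as 4 key/value heads shared across the 16 query heads are all assumptions.

```python
import math

def rms_norm(x, weight, eps=1e-6):
    """RMSNorm: divide x by its root mean square, then apply a learned per-dimension weight."""
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [w * v / rms for v, w in zip(x, weight)]

def silu(v):
    """SiLU (swish) activation: v * sigmoid(v)."""
    return v / (1.0 + math.exp(-v))

def swiglu(gate, up):
    """SwiGLU gating: SiLU(gate) * up, elementwise.
    The linear projections that produce `gate` and `up` are omitted here."""
    return [silu(g) * u for g, u in zip(gate, up)]

def kv_head_for(query_head, n_heads=16, n_kv_heads=4):
    """Grouped Query Attention: map a query head to the key/value head it shares.
    Assumes 16 query heads and 4 KV heads, i.e. groups of 4."""
    group_size = n_heads // n_kv_heads
    return query_head // group_size

def rope(x, position, base=10000.0):
    """RoPE: rotate each pair (x[i], x[i+1]) by the angle position * base**(-i/d),
    where i runs over the even indices. Assumes interleaved pairing and an even d."""
    d = len(x)
    out = []
    for i in range(0, d, 2):
        theta = position * base ** (-i / d)
        c, s = math.cos(theta), math.sin(theta)
        out += [x[i] * c - x[i + 1] * s, x[i] * s + x[i + 1] * c]
    return out
```

Under these assumptions, query heads 0 to 3 read from KV head 0, heads 4 to 7 from KV head 1, and so on, which cuts the key/value cache to a quarter of what full multi-head attention would need.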
## Contents
- `pytorch_model.bin` - Model weights
- `config.json` - Model configuration
- `tokenizer.json` - Tokenizer file
- `README.md` - This file
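
A quick way to sanity-check a download against the file list above is sketched below. The helper names are hypothetical, and nothing is assumed about the keys inside `config.json`; the file is only parsed as JSON.

```python
import json
from pathlib import Path

# File names taken from the Contents list in this README.
EXPECTED_FILES = ("pytorch_model.bin", "config.json", "tokenizer.json", "README.md")

def missing_files(model_dir):
    """Return the expected files that are absent from model_dir, in list order."""
    root = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]

def read_config(model_dir):
    """Parse config.json and return its contents as a Python object."""
    with open(Path(model_dir) / "config.json", encoding="utf-8") as f:
        return json.load(f)
```

An empty list from `missing_files` means all four files are present.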
## Usage
Load with the original repository code for full functionality.
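
Before wiring the weights into the repository's model class, it may help to inspect the checkpoint on its own. The sketch below assumes that `pytorch_model.bin` is a plain PyTorch state dict (a mapping from names to tensors) and that PyTorch is installed; the helper names and the example path are hypothetical. The total can differ from the stated parameter count if the checkpoint stores buffers or duplicates tied weights.

```python
from pathlib import Path

EXPECTED_PARAMS = 416_417_792  # figure stated at the top of this README

def load_weights(model_dir):
    """Load the raw checkpoint on CPU. Assumes a plain state dict; requires PyTorch."""
    import torch  # imported lazily so count_parameters works without PyTorch
    return torch.load(
        Path(model_dir) / "pytorch_model.bin",
        map_location="cpu",
        weights_only=True,  # refuse to unpickle arbitrary objects
    )

def count_parameters(state_dict):
    """Sum the element counts of every tensor in the state dict."""
    return sum(t.numel() for t in state_dict.values())

# Example (hypothetical path):
# state = load_weights("path/to/model")
# print(count_parameters(state), "vs expected", EXPECTED_PARAMS)
```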
---
🚀 Generated with [Claude Code](https://claude.ai/code)