# Enhanced Hybrid Transformer 416M

A **416,417,792-parameter** transformer with modern optimizations.
## Features

- **24 layers** × **16 heads**
- **GQA-4** (Grouped-Query Attention)
- **SwiGLU** activation
- **RMSNorm** normalization
- **RoPE** positional embeddings
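The grouped-query attention listed above can be illustrated with a short PyTorch sketch. The 24 layers and 16 heads come from the feature list; the hidden size and the reading of "GQA-4" as 4 key/value heads are assumptions for illustration only, and `config.json` holds the authoritative values.

```python
# Minimal grouped-query attention sketch (illustrative assumptions, see lead-in).
import torch
import torch.nn.functional as F

batch, seq_len = 2, 128
hidden_size = 1024               # hypothetical; not stated in this README
n_q_heads, n_kv_heads = 16, 4    # 16 heads from Features; "GQA-4" read as 4 KV heads
head_dim = hidden_size // n_q_heads

q = torch.randn(batch, n_q_heads, seq_len, head_dim)
k = torch.randn(batch, n_kv_heads, seq_len, head_dim)
v = torch.randn(batch, n_kv_heads, seq_len, head_dim)

# Each group of 16 // 4 = 4 query heads shares one key/value head.
group = n_q_heads // n_kv_heads
k = k.repeat_interleave(group, dim=1)   # -> (batch, 16, seq_len, head_dim)
v = v.repeat_interleave(group, dim=1)

out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
print(out.shape)  # torch.Size([2, 16, 128, 64])
```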
## Contents

- `pytorch_model.bin` - Model weights
- `config.json` - Model configuration
- `tokenizer.json` - Tokenizer file
- `README.md` - This file
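A quick way to inspect the shipped files is sketched below. It assumes `tokenizer.json` is in the Hugging Face `tokenizers` format, which this README does not explicitly guarantee.

```python
# Inspect config.json and tokenizer.json (assumes `tokenizers`-format tokenizer).
import json
from tokenizers import Tokenizer

with open("config.json") as f:
    config = json.load(f)        # layer count, head count, hidden size, etc.
print(config)

tokenizer = Tokenizer.from_file("tokenizer.json")
print(tokenizer.encode("Hello, world!").ids)
```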
## Usage

Load the checkpoint with the original repository code for full functionality. A minimal inspection sketch is shown below.
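As a rough starting point, the raw state dict can be inspected with PyTorch alone. This only loads tensors; building and running the model still requires the model class from the original repository.

```python
# Inspect the raw checkpoint without the original model class.
import torch

state_dict = torch.load("pytorch_model.bin", map_location="cpu")
for name, tensor in list(state_dict.items())[:5]:
    print(name, tuple(tensor.shape))

total_params = sum(t.numel() for t in state_dict.values())
print(f"total parameters: {total_params:,}")   # expected: 416,417,792
```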
---

Generated with [Claude Code](https://claude.ai/code)