---
license: mit
---
# ZipPlus Model Card

**A pre-trained 4-layer GRU model for neural file compression. Each compressed file contains its own adapted model, so no external model is needed to decompress.**

This is a pre-trained ByteGRU model for [Zip+](https://github.com/CompactAIOfficial/ZipPlus).

## What is this

Zip+ compresses any file into a PNG image using a neural network (GRU + range coding). Each compressed file embeds its own adapted model:

```
file.txt → [ByteGRU + Range Coding] → file.txt.zpng.png → [embedded model] → file.txt
```

**Every PNG is self-contained**: decompress even if you lose the original model file!

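To make the "GRU + range coding" pairing concrete: a range coder spends about `-log2(p)` bits on a symbol the model predicted with probability `p`, so sharper predictions mean smaller output. The sketch below is not Zip+ code. It swaps the GRU for a hypothetical adaptive byte-frequency model (`ideal_compressed_bits`) purely to show how predicted probabilities turn into compressed size.

```python
import math

def ideal_compressed_bits(data: bytes) -> float:
    """Bits an ideal entropy coder would need under an adaptive order-0 byte model."""
    counts = [1] * 256   # Laplace smoothing: every byte value starts with count 1
    total = 256
    bits = 0.0
    for b in data:
        p = counts[b] / total      # probability the model gave to the byte that occurred
        bits += -math.log2(p)      # ideal coding cost of that byte
        counts[b] += 1             # adapt after coding; a decoder can mirror this update
        total += 1
    return bits

repetitive = b"abab" * 256         # 1024 highly predictable bytes
varied = bytes(range(256)) * 4     # 1024 bytes with every value equally frequent

print(f"repetitive: {ideal_compressed_bits(repetitive) / 8:.0f} bytes")
print(f"varied:     {ideal_compressed_bits(varied) / 8:.0f} bytes")
```

The repetitive input costs far fewer bits than the varied one. A stronger predictor such as a GRU widens that gap on structured data like text.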

## Model Details

- **Architecture**: 4-layer GRU over byte embeddings
- **Embedding dim**: 64
- **Hidden dim**: 512
- **Trained on**: FineWeb-Edu (10BT sample, about 10 billion tokens of educational web text), plus adaptive per-file training
- **Entropy coding**: Range coding via Constriction
- **Output format**: PNG whose RGB pixel bytes hold the payload and the model
- **Magic header**: `ZPNG` (first 4 bytes)

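To illustrate the output format, here is a minimal sketch of how arbitrary bytes can be laid out in RGB pixels behind a `ZPNG` magic header. The 8-byte length field, the zero padding, and the function names (`pack_pixels`, `unpack_pixels`) are assumptions for illustration. Zip+'s real layout may differ, and writing the pixels to an actual PNG (for example with Pillow) is left out.

```python
import struct

MAGIC = b"ZPNG"  # from the model card: first 4 bytes of the stored data

def pack_pixels(payload: bytes) -> list[tuple[int, int, int]]:
    """Lay out MAGIC + length + payload as a flat list of (R, G, B) pixels."""
    # Assumed layout: 4-byte magic, 8-byte big-endian payload length, then the payload.
    blob = MAGIC + struct.pack(">Q", len(payload)) + payload
    blob += b"\x00" * (-len(blob) % 3)   # pad so the bytes fill whole pixels
    return [tuple(blob[i:i + 3]) for i in range(0, len(blob), 3)]

def unpack_pixels(pixels: list[tuple[int, int, int]]) -> bytes:
    """Recover the payload, checking the magic header first."""
    blob = bytes(channel for pixel in pixels for channel in pixel)
    if blob[:4] != MAGIC:
        raise ValueError("not a ZPNG stream")
    (length,) = struct.unpack(">Q", blob[4:12])
    return blob[12:12 + length]          # the length field lets us drop the padding

pixels = pack_pixels(b"hello, world")
assert unpack_pixels(pixels) == b"hello, world"
```

Storing an explicit length is one simple way to separate real data from padding. Any scheme that lets the decoder find the payload boundary would serve the same purpose.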

## Requirements

- Python 3.10+
- PyTorch (CUDA recommended)
- Constriction (`pip install constriction`)
- Pillow
- NumPy
- huggingface_hub

```bash
pip install torch constriction pillow numpy huggingface_hub
```

## Quick Start

### Compress a file (with auto-adaptation)

```bash
python inference.py compress myfile.txt -o myfile.zpng.png
```

- Automatically adapts the model to your file (50 steps)
- Embeds the adapted model in the PNG for self-contained decoding

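For a sense of how much data an embedded model represents, the parameter count can be estimated from the dimensions listed under Model Details. This is a back-of-the-envelope sketch. It assumes PyTorch's standard `nn.GRU` parameter layout (three gates, two bias vectors per layer), a 256-entry byte embedding, and a single linear output head over 256 byte values. The real ByteGRU may be structured differently.

```python
def gru_layer_params(input_dim: int, hidden_dim: int) -> int:
    """Parameters of one GRU layer in PyTorch's layout (3 gates, 2 bias vectors)."""
    return 3 * (input_dim * hidden_dim + hidden_dim * hidden_dim + 2 * hidden_dim)

# EMBED, HIDDEN, LAYERS come from Model Details; VOCAB = 256 byte values (assumed).
VOCAB, EMBED, HIDDEN, LAYERS = 256, 64, 512, 4

embedding = VOCAB * EMBED
gru = gru_layer_params(EMBED, HIDDEN) + (LAYERS - 1) * gru_layer_params(HIDDEN, HIDDEN)
head = HIDDEN * VOCAB + VOCAB            # assumed linear output layer with bias

total_params = embedding + gru + head
size_mib = total_params * 4 / 2**20      # 4 bytes per float32 weight

print(f"{total_params:,} parameters, about {size_mib:.1f} MiB as float32")
```

Under these assumptions the estimate is about 5.8 million parameters, roughly 22 MiB in float32, which is in the same range as the ~21MB embedding overhead quoted in this card.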

### Decompress

```bash
python inference.py decompress myfile.zpng.png -o restored.txt
```

Loads the model embedded in the PNG, so no external files are needed!

### Training (optional)

```bash
python train.py --grid 128 --steps 10000
```

Auto-downloads FineWeb-Edu if no corpus is specified.

## Performance

- Text files: ~5-20% of original size
- Works best on files > 10KB
- Smaller files: the embedded model (~21MB) may outweigh the compression gains

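The trade-off in the last bullet can be made concrete with a little arithmetic. The sketch below assumes the quoted ratio applies to the payload only and that the embedded model adds a fixed 21 MB on top. Both are readings of the figures above rather than measured behavior, and `output_size_mb` and `break_even_mb` are hypothetical helpers.

```python
OVERHEAD_MB = 21.0   # approximate size of the embedded model, from this card

def output_size_mb(original_mb: float, ratio: float) -> float:
    """Estimated output size: compressed payload plus the fixed model overhead."""
    return original_mb * ratio + OVERHEAD_MB

def break_even_mb(ratio: float) -> float:
    """Original size above which the output is smaller than the input."""
    # original * ratio + overhead < original  <=>  original > overhead / (1 - ratio)
    return OVERHEAD_MB / (1.0 - ratio)

for ratio in (0.05, 0.20):
    print(f"ratio {ratio:.0%}: break-even near {break_even_mb(ratio):.1f} MB")
```

Under these assumptions the break-even point lands somewhere around 22 to 26 MB of input, so the embedding overhead only pays off for inputs well beyond that size.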

## Warnings

- **Embedding adds ~21MB** to the output, which is worth it for large files
- **GPU recommended** for training and compression
- **Lossless**: verified via SHA-256 checksums

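A lossless round trip can also be checked independently by comparing SHA-256 digests of the original and restored files. The helpers below are a generic sketch using Python's `hashlib`, not Zip+'s own verification code. The file names in the usage comment match the Quick Start commands.

```python
import hashlib

def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Hex SHA-256 digest of a file, read in chunks to keep memory use flat."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def same_content(path_a: str, path_b: str) -> bool:
    """True when both files hash to the same SHA-256 digest."""
    return sha256_of(path_a) == sha256_of(path_b)

# Usage after a compress/decompress round trip:
#   same_content("myfile.txt", "restored.txt")
```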

## License

MIT. I'm not liable if this eats your thesis/pixels/anything.