---
license: mit
---
|
|
| # TMLM-Haiku-2.3 |
|
|
| **It speaks. It actually speaks. Mostly.** |
|
|
We have come so far. From the dark ages of `couldcouldoldbloodblood` to actual, coherent, structured sentences. This is TMLM-Haiku-2.3. It has 1 million parameters. It is small. It is trying its best. And unlike its ancestors, it usually succeeds.
|
|
| ## Quick Stats |
|
|
| - **Parameters:** 1,000,000 (Yes, really. 1M.) |
| - **Training Tokens:** 10 Billion |
| - **Context Window:** 2048 tokens |
| - **Vibe:** Chaotic good, but mostly good. |
|
|
| ## What Is This? |
|
|
| Haiku-2.3 is the latest evolution of the TMLM-Haiku series. It builds on Haiku-2 by adding **SPIN (Self-Play Fine-Tuning)** to the training loop. This model represents a **3x improvement** in combined performance score over the original Haiku. Coherence has jumped from 1.99 to 6.03. Relevance is no longer zero. It is a miracle. |
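SPIN trains the model to prefer real training text over its own previous generations, using a DPO-style pairwise loss. The sketch below is a minimal, pure-Python illustration of that objective; the function name, the log-probability inputs, and `beta=0.1` are illustrative assumptions, not the exact training recipe used here.

```python
import math

def spin_loss(logp_real, logp_real_ref, logp_gen, logp_gen_ref, beta=0.1):
    """One SPIN step for a (real text, self-generated text) pair.

    logp_* are total log-probabilities of the real / self-generated
    continuations under the current policy and a frozen reference copy.
    beta and all names are illustrative, not the exact recipe.
    """
    # Margin: how much more the policy (vs. reference) prefers real data.
    margin = beta * ((logp_real - logp_real_ref) - (logp_gen - logp_gen_ref))
    # -log sigmoid(margin): small when real text is preferred.
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Loss is high when the policy still prefers its own samples...
loss_bad = spin_loss(-50.0, -50.0, -35.0, -40.0)
# ...and low once it prefers the real data.
loss_good = spin_loss(-40.0, -50.0, -55.0, -40.0)
```

Each round, the previous model's outputs become the "rejected" side of the pair, so the model keeps playing against its past self.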
|
|
| ### The Journey |
|
|
| | Model | Era | Typical Output | Combined Score | |
| | :--- | :--- | :--- | :--- | |
| | **Haiku-1** | The Dark Ages | `couldcouldoldbloodbloodbodybody` | 1.62 | |
| | **Haiku-1.3** | The Pipe Character Incident | `\|fdish\|\|\|\|\|!@\|` | 1.21 | |
| | **Haiku-2** | The Awakening | `It is about **competent development**...` | 3.87 | |
| | **Haiku-2.3 (SPIN)** | **Current Era** | `The artificial intelligence is a problem...` | **4.84 ★** | |
|
|
**Sample Output:**
| > "The simple terms arrived in simulant explorers and honey are specific or forecasters. They allow the structure of their similar..." |
|
|
| ## Disclaimer |
|
|
| This is a **1 million parameter model**. |
| - It is not GPT-5. |
| - It is not GPT-2. |
| - It is a tiny neural network running on a prayer and a GPU. |
| - It might still output `chuamliamce` occasionally. If it does, just try again. It is shy. |
| - For best results, use temperature around 0.7. If you crank it to 2.0, you are on your own. |
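Why 0.7? Temperature scales the logits before softmax: values below 1 sharpen the distribution toward the model's top choices, values above 1 flatten it toward gibberish. A minimal sketch (pure Python, toy logits, no dependence on this model):

```python
import math

def softmax_with_temperature(logits, temperature=0.7):
    """Scale logits by 1/temperature, then softmax.

    T < 1 sharpens the distribution (safer for a 1M-param model);
    T > 1 flattens it back toward `chuamliamce` territory.
    """
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]  # toy next-token scores
sharp = softmax_with_temperature(logits, temperature=0.7)
flat = softmax_with_temperature(logits, temperature=2.0)
# The top token gets noticeably more probability mass at T=0.7 than at T=2.0.
```

At temperature 2.0 the tail tokens get real probability mass, which is where the shy `chuamliamce` outputs come from.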
|
|
| ## Benchmarks |
|
|
| We benchmarked Haiku-2.3 against all previous versions using a standard 7-question suite. |
|
|
| | Metric | Haiku-1 | Haiku-1.3 | Haiku-2 | **Haiku-2.3 (SPIN)** | |
| | :--- | :---: | :---: | :---: | :---: | |
| | **Fluency** | 0.50 | 1.69 | 8.35 | **8.78** | |
| | **Coherence** | 1.99 | 1.56 | 5.72 | **6.03** | |
| | **Relevance** | 1.22 | 0.00 | 0.00 | **2.25** | |
| | **Format** | 3.29 | 3.29 | 3.29 | **3.29** | |
| | **Combined** | 1.62 | 1.21 | 3.87 | **4.84** | |
|
|
| ## Related Models |
|
|
| Check out the rest of the family: |
| - [TMLM-Haiku-1](https://huggingface.co/CompactAI-O/TMLM-Haiku-1) (The ancestor) |
| - [TMLM-Haiku-1.3](https://huggingface.co/CompactAI-O/TMLM-Haiku-1.3) (The pipe character one) |
| - [TMLM-Haiku-2](https://huggingface.co/CompactAI-O/TMLM-Haiku-2) (The breakthrough) |
|
|
| ## Acknowledgments |
|
|
Built with curiosity over compute. Trained on FineWeb-Edu. SPIN-optimized. And a lot of hope.
|
|
| --- |
|
|
| **Built by [CompactAI](https://huggingface.co/CompactAI-O).** |
| *If you like tiny models that try their best, give us a follow.* |
| ``` |