Update README.md
README.md
CHANGED
|
@@ -1,3 +1,71 @@
|
|
| 1 |
-
---
|
| 2 |
-
license: mit
|
| 3 |
-
---
| 1 |
+
---
|
| 2 |
+
license: mit
|
| 3 |
+
---
|
| 4 |
+
|
| 5 |
+
|
| 6 |
+
# TMLM-Haiku-2.3
|
| 7 |
+
|
| 8 |
+
**It speaks. It actually speaks. Mostly.**
|
| 9 |
+
|
| 10 |
+
We have come so far. From the dark ages of `couldcouldoldbloodblood` to actual, coherent, structured sentences. This is TMLM-Haiku-2.3. It has 1 million parameters. It is small. It is trying its best. And unlike its ancestors, it usually succeeds.
|
| 11 |
+
|
| 12 |
+
## Quick Stats
|
| 13 |
+
|
| 14 |
+
- **Parameters:** 1,000,000 (Yes, really. 1M.)
|
| 15 |
+
- **Training Tokens:** 10 Billion
|
| 16 |
+
- **Context Window:** 2048 tokens
|
| 17 |
+
- **Vibe:** Chaotic good, but mostly good.
|
| 18 |
+
|
| 19 |
+
## What Is This?
|
| 20 |
+
|
| 21 |
+
Haiku-2.3 is the latest evolution of the TMLM-Haiku series. It builds on Haiku-2 by adding **SPIN (Self-Play Fine-Tuning)** to the training loop. This model represents a **roughly 3x improvement** in combined performance score over the original Haiku-1 (1.62 to 4.84). Coherence has jumped from 1.99 to 6.03. Relevance is no longer zero (0.00 for Haiku-2, 2.25 now). It is a miracle.
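
This README does not say how SPIN was wired into the training loop, so the following is only a sketch of the per-prompt objective as described in the SPIN paper (Chen et al., 2024), not this model's actual training code. It assumes you already have the summed log-probability of each response under the current model and under the frozen previous iterate; the function name `spin_loss`, its argument names, and the default `lam` are illustrative.

```python
import math


def spin_loss(logp_real_new, logp_real_old, logp_synth_new, logp_synth_old, lam=0.1):
    """Logistic SPIN-style loss for a single prompt (illustrative sketch).

    logp_real_*  : log-probability of the human-written response
    logp_synth_* : log-probability of the response sampled from the previous iterate
    *_new / *_old: scored by the model being trained / the frozen previous iterate
    lam          : scaling factor on the margin (hypothetical default)
    """
    # How much more the new model likes each response than the old model did.
    real_gain = logp_real_new - logp_real_old
    synth_gain = logp_synth_new - logp_synth_old
    margin = lam * (real_gain - synth_gain)

    # log(1 + exp(-margin)), written to avoid overflow for large |margin|.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# No preference yet: the loss equals log(2).
baseline = spin_loss(0.0, 0.0, 0.0, 0.0)
# The new model favors the real response and disfavors its own old sample.
improved = spin_loss(-5.0, -10.0, -10.0, -5.0)
```

The loss falls as the model raises the likelihood of real data relative to its own earlier generations, which is the self-play idea in one line.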
|
| 22 |
+
|
| 23 |
+
### The Journey
|
| 24 |
+
|
| 25 |
+
| Model | Era | Typical Output | Combined Score |
|
| 26 |
+
| :--- | :--- | :--- | :--- |
|
| 27 |
+
| **Haiku-1** | The Dark Ages | `couldcouldoldbloodbloodbodybody` | 1.62 |
|
| 28 |
+
| **Haiku-1.3** | The Pipe Character Incident | `\|fdish\|\|\|\|\|!@\|` | 1.21 |
|
| 29 |
+
| **Haiku-2** | The Awakening | `It is about **competent development**...` | 3.87 |
|
| 30 |
+
| **Haiku-2.3 (SPIN)** | **Current Era** | `The artificial intelligence is a problem...` | **4.84 ★** |
|
| 31 |
+
|
| 32 |
+
**Expected Output:**
|
| 33 |
+
> "The simple terms arrived in simulant explorers and honey are specific or forecasters. They allow the structure of their similar..."
|
| 34 |
+
|
| 35 |
+
## Disclaimer
|
| 36 |
+
|
| 37 |
+
This is a **1 million parameter model**.
|
| 38 |
+
- It is not GPT-5.
|
| 39 |
+
- It is not GPT-2.
|
| 40 |
+
- It is a tiny neural network running on a prayer and a GPU.
|
| 41 |
+
- It might still output `chuamliamce` occasionally. If it does, just try again. It is shy.
|
| 42 |
+
- For best results, use temperature around 0.7. If you crank it to 2.0, you are on your own.
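
To see why the temperature setting matters, here is a minimal, model-independent sketch of temperature-scaled sampling. It is not this repository's inference code; the function names and the toy logits are made up for illustration.

```python
import math
import random


def softmax_with_temperature(logits, temperature=0.7):
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(logits, temperature=0.7, rng=random):
    """Draw one token index according to the temperature-scaled distribution."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


toy_logits = [2.0, 1.0, 0.1]  # hypothetical scores for three tokens
calm = softmax_with_temperature(toy_logits, 0.7)  # top token dominates
wild = softmax_with_temperature(toy_logits, 2.0)  # distribution flattens out
```

At 0.7 the most likely token keeps most of the probability mass; at 2.0 the unlikely tokens get a much bigger share, which is where outputs like `chuamliamce` tend to come from.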
|
| 43 |
+
|
| 44 |
+
## Benchmarks
|
| 45 |
+
|
| 46 |
+
We benchmarked Haiku-2.3 against all previous versions using a standard 7-question suite.
|
| 47 |
+
|
| 48 |
+
| Metric | Haiku-1 | Haiku-1.3 | Haiku-2 | **Haiku-2.3 (SPIN)** |
|
| 49 |
+
| :--- | :---: | :---: | :---: | :---: |
|
| 50 |
+
| **Fluency** | 0.50 | 1.69 | 8.35 | **8.78** |
|
| 51 |
+
| **Coherence** | 1.99 | 1.56 | 5.72 | **6.03** |
|
| 52 |
+
| **Relevance** | 1.22 | 0.00 | 0.00 | **2.25** |
|
| 53 |
+
| **Format** | 3.29 | 3.29 | 3.29 | **3.29** |
|
| 54 |
+
| **Combined** | 1.62 | 1.21 | 3.87 | **4.84** |
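
The headline ratios can be checked directly from the table. This small sketch only redoes that arithmetic; the dictionary layout and the `improvement` helper are illustrative, and it makes no assumption about how the Combined score itself is derived.

```python
# Scores copied from the benchmark table above.
scores = {
    "Haiku-1":   {"Fluency": 0.50, "Coherence": 1.99, "Relevance": 1.22, "Format": 3.29, "Combined": 1.62},
    "Haiku-1.3": {"Fluency": 1.69, "Coherence": 1.56, "Relevance": 0.00, "Format": 3.29, "Combined": 1.21},
    "Haiku-2":   {"Fluency": 8.35, "Coherence": 5.72, "Relevance": 0.00, "Format": 3.29, "Combined": 3.87},
    "Haiku-2.3": {"Fluency": 8.78, "Coherence": 6.03, "Relevance": 2.25, "Format": 3.29, "Combined": 4.84},
}


def improvement(metric, new="Haiku-2.3", old="Haiku-1"):
    """Ratio of the newer model's score to the older model's score."""
    return scores[new][metric] / scores[old][metric]


for metric in ("Combined", "Coherence"):
    print(f"{metric}: {improvement(metric):.2f}x over Haiku-1")
```

Relevance is left out of any comparison against Haiku-1.3 or Haiku-2 because their score is 0.00, so a ratio is undefined there.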
|
| 55 |
+
|
| 56 |
+
## Related Models
|
| 57 |
+
|
| 58 |
+
Check out the rest of the family:
|
| 59 |
+
- [TMLM-Haiku-1](https://huggingface.co/CompactAI-O/TMLM-Haiku-1) (The ancestor)
|
| 60 |
+
- [TMLM-Haiku-1.3](https://huggingface.co/CompactAI-O/TMLM-Haiku-1.3) (The pipe character one)
|
| 61 |
+
- [TMLM-Haiku-2](https://huggingface.co/CompactAI-O/TMLM-Haiku-2) (The breakthrough)
|
| 62 |
+
|
| 63 |
+
## Acknowledgments
|
| 64 |
+
|
| 65 |
+
Built with curiosity over compute. Trained on FineWeb-Edu. SPIN optimized. And a lot of hope.
|
| 66 |
+
|
| 67 |
+
---
|
| 68 |
+
|
| 69 |
+
**Built by [CompactAI](https://huggingface.co/CompactAI-O).**
|
| 70 |
+
*If you like tiny models that try their best, give us a follow.*
|
|