CompactAI committed · Commit f0d2e46 · verified · 1 Parent(s): b72932d

Update README.md

Files changed (1): README.md (+71 -3)
---
license: mit
---


# TMLM-Haiku-2.3

**It speaks. It actually speaks. Mostly.**

We have come so far. From the dark ages of `couldcouldoldbloodblood` to actual, coherent, structured sentences. This is TMLM-Haiku-2.3. It is 1 million parameters. It is small. It is trying its best. And unlike its ancestors, it usually succeeds.

## Quick Stats

- **Parameters:** 1,000,000 (Yes, really. 1M.)
- **Training Tokens:** 10 Billion
- **Context Window:** 2048 tokens
- **Vibe:** Chaotic good, but mostly good.
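For a feel of where 1M parameters goes, here is a rough budget sketch for a decoder-only transformer of that size. The card does not state the architecture, so every hyperparameter below (vocab size, `d_model`, layer count) is a hypothetical guess, not the real config:

```python
# Rough parameter budget for a ~1M-param decoder-only transformer.
# All hyperparameters here are illustrative guesses, NOT this model's config.

def transformer_params(vocab: int, d_model: int, n_layers: int, ctx: int) -> int:
    """Count parameters, ignoring biases and LayerNorm (tiny by comparison)."""
    token_emb = vocab * d_model          # assume tied with the output head
    pos_emb = ctx * d_model              # learned positional embeddings
    per_layer = 12 * d_model * d_model   # 4*d^2 attention + 8*d^2 MLP (4x expansion)
    return token_emb + pos_emb + n_layers * per_layer

total = transformer_params(vocab=12_000, d_model=64, n_layers=4, ctx=2048)
print(total)  # 1_095_680: the right ballpark for "1M parameters"
```

At this scale the embedding table dominates the budget, which is one reason tiny models are so sensitive to vocabulary size.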

## What Is This?

Haiku-2.3 is the latest evolution of the TMLM-Haiku series. It builds on Haiku-2 by adding **SPIN (Self-Play Fine-Tuning)** to the training loop. This model represents a **3x improvement** in combined score over the original Haiku. Coherence has jumped from 1.99 to 6.03. Relevance is no longer zero. It is a miracle.
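For the curious, here is a toy sketch of the SPIN idea: a DPO-style logistic loss where the "rejected" sample is the previous iteration's own generation and the "chosen" sample is real training text. This is an illustration of the objective, not the training code used for this model, and the log-probabilities below are made up:

```python
import math

# Toy sketch of the SPIN objective: push the model toward real data and away
# from its own previous-iteration generations. Illustrative only.

def spin_loss(logp_real_cur: float, logp_real_prev: float,
              logp_gen_cur: float, logp_gen_prev: float,
              beta: float = 0.1) -> float:
    """Logistic loss on the margin between real and self-generated text.

    Each argument is a sequence log-probability: `cur` under the model being
    trained, `prev` under the frozen previous-iteration model.
    """
    margin = (logp_real_cur - logp_real_prev) - (logp_gen_cur - logp_gen_prev)
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))  # -log sigmoid

# No progress yet (margin 0) gives loss log(2); once the current model
# prefers real text over its own old samples, the loss shrinks.
stalled = spin_loss(-45.0, -45.0, -45.0, -45.0)
improved = spin_loss(-40.0, -45.0, -50.0, -45.0)
```

The appeal for a tiny model is that no reward model or human preference data is needed; the previous checkpoint supplies the negatives.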

### The Journey

| Model | Era | Typical Output | Combined Score |
| :--- | :--- | :--- | :--- |
| **Haiku-1** | The Dark Ages | `couldcouldoldbloodbloodbodybody` | 1.62 |
| **Haiku-1.3** | The Pipe Character Incident | `\|fdish\|\|\|\|\|!@\|` | 1.21 |
| **Haiku-2** | The Awakening | `It is about **competent development**...` | 3.87 |
| **Haiku-2.3 (SPIN)** | **Current Era** | `The artificial intelligence is a problem...` | **4.84 ★** |

**Sample Output:**
> "The simple terms arrived in simulant explorers and honey are specific or forecasters. They allow the structure of their similar..."

## Disclaimer

This is a **1 million parameter model**.
- It is not GPT-5.
- It is not GPT-2.
- It is a tiny neural network running on a prayer and a GPU.
- It might still output `chuamliamce` occasionally. If it does, just try again. It is shy.
- For best results, use a temperature around 0.7. If you crank it to 2.0, you are on your own.
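Why 0.7 and not 2.0: temperature rescales the logits before the softmax, so low values sharpen the distribution toward likely tokens while high values flatten it and give rare (often garbled) tokens real probability mass. A minimal sketch, with made-up logits:

```python
import math

# Temperature-scaled softmax, as used in sampling from a language model.
# The logits below are invented for illustration.

def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    scaled = [l / temperature for l in logits]
    m = max(scaled)                          # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [4.0, 2.0, 0.0]                     # e.g. "the", "a", "chuamliamce"
cool = softmax_with_temperature(logits, 0.7) # concentrates on the top token
hot = softmax_with_temperature(logits, 2.0)  # spreads mass to the junk token
print(f"top-token prob at T=0.7: {cool[0]:.2f}, at T=2.0: {hot[0]:.2f}")
```

A tiny model already has a noisy distribution, so flattening it further is how you end up back in `chuamliamce` territory.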

## Benchmarks

We benchmarked Haiku-2.3 against all previous versions using a standard 7-question suite.

| Metric | Haiku-1 | Haiku-1.3 | Haiku-2 | **Haiku-2.3 (SPIN)** |
| :--- | :---: | :---: | :---: | :---: |
| **Fluency** | 0.50 | 1.69 | 8.35 | **8.78** |
| **Coherence** | 1.99 | 1.56 | 5.72 | **6.03** |
| **Relevance** | 1.22 | 0.00 | 0.00 | **2.25** |
| **Format** | 3.29 | 3.29 | 3.29 | **3.29** |
| **Combined** | 1.62 | 1.21 | 3.87 | **4.84** |

## Related Models

Check out the rest of the family:
- [TMLM-Haiku-1](https://huggingface.co/CompactAI-O/TMLM-Haiku-1) (The ancestor)
- [TMLM-Haiku-1.3](https://huggingface.co/CompactAI-O/TMLM-Haiku-1.3) (The pipe character one)
- [TMLM-Haiku-2](https://huggingface.co/CompactAI-O/TMLM-Haiku-2) (The breakthrough)

## Acknowledgments

Built with curiosity over compute. Trained on FineWeb-Edu. SPIN-optimized. And a lot of hope.

---

**Built by [CompactAI](https://huggingface.co/CompactAI-O).**
*If you like tiny models that try their best, give us a follow.*