Update README.md
README.md
CHANGED
@@ -114,6 +114,9 @@ Math (broken, hallucinates digits or loops). Code generation (gibberish). Factua
 
 To document a reproducible recipe at this scale. The next iteration in this line moves to a 412M MoE with 3 routed experts, vocabulary 262144, distillation pretraining from frontier teachers, and a token budget that crosses the Chinchilla line. This artifact is the baseline against which that next model will be measured.
 
+# Notes
+As this model was trained by [Crownelius](https://huggingface.co/Crownelius), it does not adhere to the required specifications and therefore cannot be integrated into the inference script
+
 ## License
 
 Apache 2.0. Use freely. Attribution appreciated but not required.