---
license: mit
language:
- en
pipeline_tag: text-generation
tags:
- nrm
- nano
- reasoning
- thinking
- sub-1m
- lowparams
- custom_code
---
# 🧠 MiniAxion1-0.9M
**MiniAxion1-0.9M** is a Nano Reasoning Model (NRM) with ~920K parameters designed to explore the emergence of structured reasoning in extremely small neural networks.
Despite its minimal size, the model demonstrates strong consistency in reasoning format and step-based thinking using explicit `<THINK>` and `<STEP>` tokens.
---
## 🚀 Overview
* **Model Type:** Nano Reasoning Model (NRM)
* **Parameters:** ~920,833
* **Architecture:** Transformer (6 layers: 2 entry + 2 shared + 2 exit)
* **d_model:** 256
* **Heads:** 8
* **FFN size:** 512
* **LoRA Rank:** 16
* **Vocabulary Size:** 2048
* **Training Time:** ~80 minutes (CPU)
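
For reference, the hyperparameters above can be collected into a small config sketch. This is illustrative only; the field names below are not the model's actual config schema:

```python
from dataclasses import dataclass

@dataclass
class MiniAxionConfig:
    # Values taken from the overview above; field names are illustrative.
    n_layers: int = 6     # 2 entry + 2 shared + 2 exit
    d_model: int = 256
    n_heads: int = 8
    d_ffn: int = 512
    lora_rank: int = 16
    vocab_size: int = 2048

    @property
    def head_dim(self) -> int:
        # d_model must divide evenly across attention heads
        return self.d_model // self.n_heads

cfg = MiniAxionConfig()
print(cfg.head_dim)  # 256 / 8 = 32
```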
---
## 🧠 Key Capabilities
### ✅ Structured Reasoning
The model reliably produces structured reasoning traces:
```
<THINK>
<STEP> ...
<STEP> ...
</THINK>
<ANS>...</ANS>
```
* 100% usage of reasoning tokens
* Consistent multi-step formatting
* Stable output structure across tasks
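
Because the format is this consistent, traces can be parsed mechanically. A minimal sketch, assuming the `<THINK>`/`<STEP>`/`<ANS>` layout shown above (this helper is illustrative, not part of the released model code):

```python
import re

def parse_trace(text: str):
    """Split a MiniAxion-style trace into reasoning steps and a final answer.

    Illustrative helper assuming the <THINK>/<STEP>/<ANS> format above.
    """
    think = re.search(r"<THINK>(.*?)</THINK>", text, re.DOTALL)
    body = think.group(1) if think else ""
    # Each <STEP> occupies its own line in the format above
    steps = [s.strip() for s in re.findall(r"<STEP>(.*)", body)]
    ans = re.search(r"<ANS>(.*?)</ANS>", text, re.DOTALL)
    return steps, (ans.group(1).strip() if ans else None)

trace = "<THINK>\n<STEP> add the ones digits\n<STEP> carry the ten\n</THINK>\n<ANS>4</ANS>"
steps, answer = parse_trace(trace)
print(steps, answer)
```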
---
### ⚡ Ultra-Lightweight
* Runs efficiently on CPU
* Designed for experimentation and rapid iteration
* Suitable for embedded or game-like environments
---
### 🧪 Research-Oriented Design
MiniAxion1 is not intended to compete with large-scale models. Instead, it is built to:
* Study reasoning emergence in small models
* Explore structure vs correctness trade-offs
* Enable fast iteration cycles for AI research
---
## 📊 Evaluation Results
| Task | Accuracy |
| ----------------------- | -------- |
| Arithmetic | 3.3% |
| Two-Step Arithmetic | 10.0% |
| Even/Odd | 100.0% |
| Comparison | 5.0% |
| Pattern Completion | 0.0% |
| Word Problems | 0.0% |
| Sorting | 0.0% |
| Chain-of-Thought Format | 100.0% |
**Average Accuracy:** 16.9% (mean over the seven reasoning tasks, excluding the Chain-of-Thought Format row)
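
A quick sanity check of the reported average:

```python
# Per-task accuracies from the table above (percent)
task_acc = {
    "Arithmetic": 3.3,
    "Two-Step Arithmetic": 10.0,
    "Even/Odd": 100.0,
    "Comparison": 5.0,
    "Pattern Completion": 0.0,
    "Word Problems": 0.0,
    "Sorting": 0.0,
}
# Chain-of-Thought Format (100.0) is a formatting metric rather than a task,
# so it is excluded from the reported average.
avg = sum(task_acc.values()) / len(task_acc)
print(round(avg, 1))  # 16.9
```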
---
## ๐Ÿ” Observations
* The model learns reasoning *structure* before reasoning *correctness*
* Chain-of-thought formatting is highly reliable
* Arithmetic and symbolic reasoning remain limited at this scale
* Evidence of partial decoupling between reasoning steps and final answers
---
## โš ๏ธ Limitations
* Weak performance on arithmetic and multi-step reasoning tasks
* Susceptible to incorrect intermediate reasoning steps
* Limited generalization beyond trained patterns
* Not suitable for production use in critical systems
* Due to 920k parameters, low results on evaluation is expected
---
## 🎯 Intended Use Cases
* 🧪 AI research and experimentation
* 🎮 Game AI / NPC reasoning simulation
* 📚 Educational demonstrations of reasoning structure
* ⚙️ Lightweight reasoning prototypes
---
## Quick Start
```python
import torch
from model import NRMModel
from tokenizer import Tokenizer

# Load the model weights and tokenizer from the repository files
model = NRMModel.from_config("config.json")
model.load_state_dict(torch.load("model.pt"))
model.eval()

tokenizer = Tokenizer.load("tokenizer.json")

def generate(prompt):
    # Encode the prompt, run generation, and decode back to text
    tokens = tokenizer.encode(prompt)
    output = model.generate(tokens)
    return tokenizer.decode(output)

print(generate("<INST>What is 2 + 2?</INST>"))
```
## 🧠 Philosophy
MiniAxion1 explores a key question:
> *Can structured reasoning emerge in extremely small models?*
This model provides early evidence that:
* Reasoning format can be learned efficiently
* Structure and correctness are separable capabilities
* Useful behavior can emerge even at sub-1M scale
---
## 🔮 Future Directions
* Improved dataset alignment for arithmetic reasoning
* Scaling parameters (1M → 10M range)
* Better coupling between reasoning steps and final answers
* Task-specific specialization (e.g., math-only variants)
* Knowledge distillation from larger models
---
## ๐Ÿค Acknowledgments
This model was developed as part of ongoing experimentation in nano-scale reasoning systems.
the main question was: "How low could a model think(or mimic it)?
---
## 📎 Model
👉 https://huggingface.co/AxionLab-Co/MiniAxion1-0.9M
---
## 🧪 Disclaimer
This is an experimental research model. Outputs may be incorrect even when reasoning appears structured or convincing.