Instructions for using AxionLab-official/MiniBot-0.9M-Base with libraries and local apps.

Libraries

How to use AxionLab-official/MiniBot-0.9M-Base with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="AxionLab-official/MiniBot-0.9M-Base")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("AxionLab-official/MiniBot-0.9M-Base")
model = AutoModelForCausalLM.from_pretrained("AxionLab-official/MiniBot-0.9M-Base")

Local Apps

vLLM

How to use AxionLab-official/MiniBot-0.9M-Base with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "AxionLab-official/MiniBot-0.9M-Base"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "AxionLab-official/MiniBot-0.9M-Base",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
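
The same request can be sent from Python. The following is a minimal sketch using only the standard library, assuming the server started above is listening on `localhost:8000` and returns an OpenAI-style completions response. The helper names `build_completion_request` and `complete` are illustrative and not part of vLLM.

```python
import json
import urllib.request

MODEL_ID = "AxionLab-official/MiniBot-0.9M-Base"


def build_completion_request(prompt, base_url="http://localhost:8000",
                             max_tokens=512, temperature=0.5):
    """Build (but do not send) a POST request for the /v1/completions endpoint."""
    payload = {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request and return the generated text (needs a running server)."""
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # Assumption: OpenAI-style response shape, {"choices": [{"text": ...}, ...]}
    return body["choices"][0]["text"]
```

With the server running, `complete("Once upon a time,")` should return the generated continuation. Passing a different `base_url` points the same helper at any other server that exposes the same endpoint.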

Use Docker

docker model run hf.co/AxionLab-official/MiniBot-0.9M-Base

SGLang

How to use AxionLab-official/MiniBot-0.9M-Base with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "AxionLab-official/MiniBot-0.9M-Base" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "AxionLab-official/MiniBot-0.9M-Base",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "AxionLab-official/MiniBot-0.9M-Base" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "AxionLab-official/MiniBot-0.9M-Base",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Docker Model Runner
How to use AxionLab-official/MiniBot-0.9M-Base with Docker Model Runner:
```
docker model run hf.co/AxionLab-official/MiniBot-0.9M-Base
```


	---
	license: mit
	language:
	- pt
	pipeline_tag: text-generation
	tags:
	- base
	- pretrain
	- pretrained
	- nano
	- mini
	- chatbot
	library_name: transformers
	---

	# 🧠 MiniBot-0.9M-Base

	> Ultra-lightweight GPT-2 style language model (~900K parameters) specialized in Portuguese conversational text.

	[![Model](https://img.shields.io/badge/🤗%20Hugging%20Face-MiniBot--0.9M--Base-yellow)](https://huggingface.co/AxionLab-official/MiniBot-0.9M-Base)
	[![License](https://img.shields.io/badge/License-MIT-green.svg)](https://opensource.org/licenses/MIT)
	[![Language](https://img.shields.io/badge/Language-Portuguese-blue)](https://huggingface.co/AxionLab-official/MiniBot-0.9M-Base)
	[![Parameters](https://img.shields.io/badge/Parameters-~900K-orange)](https://huggingface.co/AxionLab-official/MiniBot-0.9M-Base)

	---

	## 📌 Overview

	MiniBot-0.9M-Base is a tiny decoder-only Transformer (~0.9M parameters) based on the GPT-2 architecture, designed for efficient text generation in Portuguese.

	This is a base (pretrained) model trained purely for next-token prediction, with no instruction tuning or alignment of any kind. It serves as the foundation for fine-tuned variants such as [MiniBot-0.9M-Instruct](https://huggingface.co/AxionLab-official/MiniBot-0.9M-Instruct).

	---

	## 🎯 Key Characteristics

	| Attribute | Detail |
	|---|---|
	| 🇧🇷 Language | Portuguese (primary) |
	| 🧠 Architecture | GPT-2 style (Transformer decoder-only) |
	| 🔤 Embeddings | GPT-2 compatible |
	| 📉 Parameters | ~900K |
	| ⚙️ Objective | Causal Language Modeling (next-token prediction) |
	| 🚫 Alignment | None (base model) |

	---

	## 🏗️ Architecture

	MiniBot-0.9M follows a scaled-down GPT-2 design:

	- Token embeddings + positional embeddings
	- Multi-head self-attention
	- Feed-forward (MLP) layers
	- Autoregressive decoding

	Despite its small size, it preserves the core inductive biases of GPT-2, which makes it well suited to experimentation and education.
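
Once the layer sizes are known, the components listed above determine the parameter count. The sketch below counts parameters for a standard GPT-2 style decoder, assuming tied input/output embeddings and an MLP width of four times the hidden size (the GPT-2 defaults). The card does not state MiniBot's vocabulary size, width, or depth, so those values would need to come from the model's `config.json`. The function name is illustrative.

```python
def gpt2_param_count(vocab_size, n_positions, n_embd, n_layer):
    """Count parameters of a GPT-2 style decoder with tied input/output embeddings."""
    d = n_embd
    embeddings = vocab_size * d + n_positions * d   # token + positional tables
    attention = (d * 3 * d + 3 * d) + (d * d + d)   # fused QKV projection + output projection
    mlp = (d * 4 * d + 4 * d) + (4 * d * d + d)     # expand to 4*d, project back to d
    layer_norms = 2 * (2 * d)                       # two LayerNorms per block (weight + bias)
    per_block = attention + mlp + layer_norms
    final_layer_norm = 2 * d
    return embeddings + n_layer * per_block + final_layer_norm


# Sanity check against the published GPT-2 small configuration.
print(gpt2_param_count(vocab_size=50257, n_positions=1024, n_embd=768, n_layer=12))
# → 124439808
```

The number of attention heads does not appear because splitting the hidden size across heads does not change the size of the projection matrices.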

	---

	## 📚 Training Dataset

	The model was trained on a Portuguese conversational dataset focused on language pattern learning.

	Training notes:
	- Pure next-token prediction objective
	- No instruction tuning (no SFT, no RLHF, no alignment)
	- Lightweight training pipeline
	- Optimized for small-scale experimentation
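
The next-token prediction objective named above can be illustrated without any framework. The following is a simplified sketch, not the training code used for this model: it computes the mean cross-entropy between the scores at each position and the token that actually follows. The function name and the toy inputs are illustrative.

```python
import math


def next_token_loss(logits, token_ids):
    """Mean cross-entropy of predicting token t+1 from the logits at position t.

    logits: one list of vocabulary scores per input position.
    token_ids: the token ids of the same sequence.
    """
    total = 0.0
    # Position t is scored against the token at t+1, so the last position has no target.
    for position_logits, target in zip(logits[:-1], token_ids[1:]):
        peak = max(position_logits)  # subtract the max for numerical stability
        log_norm = peak + math.log(sum(math.exp(x - peak) for x in position_logits))
        total += log_norm - position_logits[target]
    return total / (len(token_ids) - 1)


# With uniform scores over a 4-token vocabulary the loss equals ln(4).
uniform = [[0.0] * 4 for _ in range(3)]
print(round(next_token_loss(uniform, [2, 0, 3]), 4))  # → 1.3863
```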

	---

	## 💡 Capabilities

	### ✅ Strengths

	- Portuguese text generation
	- Basic dialogue structure
	- Simple prompt continuation
	- Linguistic pattern learning

	### ❌ Limitations

	- Very limited reasoning ability
	- Loses context in long conversations
	- Inconsistent outputs
	- Prone to repetition or incoherence

	> ⚠️ This model behaves as a statistical language generator, not a reasoning system.

	---

	## 🚀 Getting Started

	### Installation

	```bash
	pip install transformers torch
	```

	### Usage with Hugging Face Transformers

	```python
	from transformers import AutoTokenizer, AutoModelForCausalLM

	model_name = "AxionLab-official/MiniBot-0.9M-Base"

	tokenizer = AutoTokenizer.from_pretrained(model_name)
	model = AutoModelForCausalLM.from_pretrained(model_name)

	prompt = "User: Me explique o que é gravidade\nBot:"
	inputs = tokenizer(prompt, return_tensors="pt")

	outputs = model.generate(
	    **inputs,
	    max_new_tokens=50,
	    temperature=0.8,
	    top_p=0.95,
	    do_sample=True,
	)

	print(tokenizer.decode(outputs[0], skip_special_tokens=True))
	```
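
The example prompt above uses a `User: ...` / `Bot:` layout. A base model has no notion of turns and may keep writing further `User:` lines, so it can help to build prompts consistently and cut the output after the first reply. The helpers below are a hypothetical sketch that assumes this layout matches the training data, which is inferred only from the example prompt.

```python
def build_prompt(turns, bot_label="Bot"):
    """Join (speaker, text) turns and leave the prompt open for the bot's reply."""
    lines = [f"{speaker}: {text}" for speaker, text in turns]
    lines.append(f"{bot_label}:")
    return "\n".join(lines)


def extract_reply(generated, prompt, user_label="User"):
    """Keep only the bot's first reply from the decoded output."""
    continuation = generated[len(prompt):] if generated.startswith(prompt) else generated
    # A base model may continue with an invented user turn; stop before it.
    reply = continuation.split(f"\n{user_label}:", 1)[0]
    return reply.strip()


prompt = build_prompt([("User", "Me explique o que é gravidade")])
print(repr(prompt))  # → 'User: Me explique o que é gravidade\nBot:'
```

Passing the decoded output of `model.generate` and the original prompt to `extract_reply` returns the text between `Bot:` and the next `User:` line, if any.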

	### ⚙️ Recommended Settings

	| Parameter | Recommended Value | Description |
	|---|---|---|
	| `temperature` | `0.7 – 1.0` | Controls randomness |
	| `top_p` | `0.9 – 0.95` | Nucleus sampling |
	| `do_sample` | `True` | Enable sampling |
	| `max_new_tokens` | `30 – 80` | Response length |

	> 💡 Base models generally benefit from higher temperature values compared to instruct variants, since there is no fine-tuning to constrain the output distribution.
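
To make the effect of `temperature` and `top_p` concrete, here is a simplified, framework-free sketch of one sampling step. It is not the Transformers implementation: it only illustrates the idea of dividing the scores by the temperature and then sampling from the smallest set of tokens whose probability mass reaches `top_p`. The function names are illustrative.

```python
import math
import random


def nucleus(probs, top_p):
    """Return the indices of the smallest high-probability set whose mass reaches top_p."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for index in ranked:
        kept.append(index)
        mass += probs[index]
        if mass >= top_p:
            break
    return kept


def sample_next_token(logits, temperature=0.8, top_p=0.95, rng=random):
    """Sample a token index using temperature scaling and nucleus (top-p) filtering."""
    # Temperature: divide logits before the softmax; values < 1 sharpen, > 1 flatten.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]
    total = sum(weights)
    probs = [w / total for w in weights]

    kept = nucleus(probs, top_p)
    # random.choices renormalises the weights over the kept tokens.
    return rng.choices(kept, weights=[probs[i] for i in kept], k=1)[0]


print(nucleus([0.5, 0.3, 0.15, 0.05], top_p=0.9))  # → [0, 1, 2]
```

Lowering `top_p` shrinks the candidate set, while raising `temperature` spreads probability toward lower-ranked tokens before the set is chosen.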

	---

	## 🧪 Intended Use Cases

	| Use Case | Suitability |
	|---|---|
	| 🧠 Fine-tuning (chat, instruction, roleplay) | ✅ Ideal |
	| 🎮 Prompt playground & experimentation | ✅ Ideal |
	| 🔬 Research on tiny LLMs | ✅ Ideal |
	| 📉 Benchmarking small architectures | ✅ Ideal |
	| ⚡ Local / CPU-only applications | ✅ Ideal |
	| 🏭 Critical production environments | ❌ Not recommended |

	---

	## ⚠️ Disclaimer

	- Extremely small model (~900K parameters)
	- Limited world knowledge and weak generalization
	- No safety or alignment measures
	- Not suitable for production use

	---

	## 🔮 Future Work

	- [x] 🎯 Instruction-tuned version → [`MiniBot-0.9M-Instruct`](https://huggingface.co/AxionLab-official/MiniBot-0.9M-Instruct)
	- [ ] 📚 Larger and more diverse dataset
	- [ ] 🔤 Tokenizer improvements
	- [ ] 📈 Scaling to 1M–10M parameters
	- [ ] 🧠 Experimental reasoning fine-tuning

	---

	## 📜 License

	Distributed under the MIT License. See [`LICENSE`](LICENSE) for more details.

	---

	## 👤 Author

	Developed by [AxionLab](https://huggingface.co/AxionLab-official) 🔬

	---

	<div align="center">
	<sub>MiniBot-0.9M-Base · AxionLab · MIT License</sub>
	</div>