---
license: mit
language:
- pt
pipeline_tag: text-generation
base_model:
- AxionLab-official/MiniBot-0.9M-Base
library_name: transformers
---
# MiniBot-0.9M-Instruct

> **Instruction-tuned GPT-2 style language model (~900K parameters) optimized for Portuguese conversational tasks.**
---

## Overview

**MiniBot-0.9M-Instruct** is the instruction-tuned version of [MiniBot-0.9M-Base](https://huggingface.co/AxionLab-official/MiniBot-0.9M-Base). It is designed to follow prompts more accurately, respond to user inputs, and generate more coherent conversational output in **Portuguese**.

Built on a GPT-2 style architecture (~0.9M parameters), the model was fine-tuned on conversational and instruction-style data to improve usability in real-world interactions.

---
## Key Characteristics

| Attribute | Detail |
|---|---|
| **Language** | Portuguese (primary) |
| **Architecture** | GPT-2 style (Transformer decoder-only) |
| **Embeddings** | GPT-2 compatible |
| **Parameters** | ~900K |
| **Base Model** | MiniBot-0.9M-Base |
| **Fine-tuning** | Instruction tuning (supervised) |
| **Alignment** | Basic prompt-following behavior |

---
## What Changed from Base?

Instruction tuning introduced significant behavioral improvements with no architectural changes:

| Feature | Base | Instruct |
|---|---|---|
| Prompt understanding | ❌ | ✅ |
| Conversational flow | ⚠️ Partial | ✅ |
| Instruction following | ❌ | ✅ |
| Overall coherence | Low | Improved |
| Practical usability | Experimental | Functional |

> The model is now significantly more usable in chat scenarios.

---
## Architecture

The core architecture is identical to the base model. No structural changes were made; fine-tuning changed only the model's behavior. The components are:

- **Decoder-only Transformer** (GPT-2 style)
- Token embeddings + positional embeddings
- Self-attention + MLP blocks
- Autoregressive generation
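
To make the component list above concrete, the sketch below counts the parameters of a generic GPT-2 style decoder from its hyperparameters. The function name `gpt2_param_count` and every hyperparameter value in the example are hypothetical, picked only to show how such a decoder can stay below one million parameters; the actual MiniBot configuration is not stated here and may differ. The sketch assumes GPT-2's usual layout: a 4x MLP expansion, biases on all linear layers, and an output head tied to the token embeddings.

```python
def gpt2_param_count(vocab_size, n_positions, d_model, n_layers, tie_embeddings=True):
    """Count parameters of a GPT-2 style decoder, biases and LayerNorms included."""
    # Token embeddings + learned positional embeddings
    embeddings = vocab_size * d_model + n_positions * d_model
    # Self-attention: fused QKV projection plus output projection
    attention = (d_model * 3 * d_model + 3 * d_model) + (d_model * d_model + d_model)
    # MLP: expand to 4 * d_model, then project back
    mlp = (d_model * 4 * d_model + 4 * d_model) + (4 * d_model * d_model + d_model)
    # Two LayerNorms per block, each with a scale and a shift vector
    layer_norms = 2 * (2 * d_model)
    per_layer = attention + mlp + layer_norms
    final_norm = 2 * d_model
    # A tied output head reuses the token embedding matrix and adds nothing
    lm_head = 0 if tie_embeddings else vocab_size * d_model
    return embeddings + n_layers * per_layer + final_norm + lm_head


# Hypothetical hyperparameters, NOT the published MiniBot configuration.
total = gpt2_param_count(vocab_size=4096, n_positions=256, d_model=96, n_layers=4)
print(f"{total:,} parameters")
```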
---

## Fine-Tuning Dataset

The model was fine-tuned on a Portuguese instruction-style conversational dataset composed of:

- Questions and answers
- Simple instructions
- Assistant-style chat
- Basic roleplay
- Natural conversations

**Expected format:**

```
User: Me explique o que é gravidade
Bot: A gravidade é a força que atrai objetos com massa...
```
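
Since the model is sensitive to prompt layout, it can help to build prompts with a small helper rather than by hand. The sketch below wraps a user message in the `User:` / `Bot:` format shown above. The helper name `build_prompt` is hypothetical, and collapsing whitespace onto a single line is an assumption about what keeps the input closest to the training format.

```python
def build_prompt(user_message: str) -> str:
    """Format a single-turn prompt in the 'User: ... / Bot:' layout."""
    # Collapse newlines and repeated spaces so the message stays on one line
    text = " ".join(user_message.split())
    # End with "Bot:" so the model continues with the assistant's reply
    return f"User: {text}\nBot:"


prompt = build_prompt("Me explique o que é gravidade")
print(prompt)
```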
**Training strategy:**

- Supervised Fine-Tuning (SFT)
- Pattern learning for instruction-following
- No RLHF or preference optimization

---

## Capabilities

### Strengths

- Following simple instructions
- Answering basic questions
- Conversing more naturally
- Higher coherence in short responses
- More consistent dialogue structure

### Limitations

- Reasoning is still limited
- May generate incorrect facts
- Does not retain long context
- Sensitive to poorly structured prompts

> Even with instruction tuning, this remains an extremely small model. Adjust expectations accordingly.
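
Because the model does not retain long context and is sensitive to prompt structure, one possible mitigation is to send only the most recent turns in a consistent layout. The sketch below does that. The helper name `build_chat_prompt` and the default of `max_turns=2` are hypothetical. Whether the fine-tuning data contained multi-turn exchanges is not stated, so multi-turn prompting should be treated as an experiment rather than a supported mode.

```python
def build_chat_prompt(turns, new_message, max_turns=2):
    """Build a 'User:/Bot:' prompt from the last `max_turns` exchanges.

    `turns` is a list of (user_text, bot_text) pairs, oldest first.
    """
    # turns[-0:] would return the whole list, so handle zero explicitly
    recent = turns[-max_turns:] if max_turns > 0 else []
    lines = []
    for user_text, bot_text in recent:
        lines.append(f"User: {user_text}")
        lines.append(f"Bot: {bot_text}")
    # The new message goes last, followed by an open "Bot:" for the model to complete
    lines.append(f"User: {new_message}")
    lines.append("Bot:")
    return "\n".join(lines)


history = [("Oi", "Olá!"), ("Tudo bem?", "Tudo ótimo."), ("Qual seu nome?", "Sou o MiniBot.")]
print(build_chat_prompt(history, "Me conte uma curiosidade", max_turns=1))
```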
---

## Getting Started

### Installation

```bash
pip install transformers torch
```

### Usage with Hugging Face Transformers

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "AxionLab-official/MiniBot-0.9M-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Prompts follow the "User: ...\nBot:" format used during fine-tuning
prompt = "User: Me diga uma curiosidade sobre o espaço\nBot:"
inputs = tokenizer(prompt, return_tensors="pt")

# Sample a short completion
outputs = model.generate(
    **inputs,
    max_new_tokens=80,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
)

# The decoded text contains the prompt followed by the generated reply
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
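
The decoded output above includes the prompt itself, and a small model may keep writing further `User:` turns after its answer. The sketch below is one way to keep only the first reply. The helper name `extract_reply` is hypothetical, and it assumes the decoded text starts with the exact prompt string; if the tokenizer does not round-trip the prompt exactly, the prompt is left in place and only the trailing turns are cut.

```python
def extract_reply(decoded: str, prompt: str) -> str:
    """Return only the first bot reply from a decoded generation."""
    # Drop the echoed prompt when the decoded text starts with it
    text = decoded[len(prompt):] if decoded.startswith(prompt) else decoded
    # Cut at the next "User:" turn the model may have invented
    text = text.split("\nUser:", 1)[0]
    return text.strip()


# Hand-written example string, not actual model output
example_prompt = "User: Oi\nBot:"
example_decoded = "User: Oi\nBot: Olá! Como posso ajudar?\nUser: Tchau\nBot: Até logo"
print(extract_reply(example_decoded, example_prompt))
```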
### Recommended Settings

| Parameter | Recommended Value | Description |
|---|---|---|
| `temperature` | `0.6 – 0.8` | Controls randomness |
| `top_p` | `0.85 – 0.95` | Nucleus sampling |
| `do_sample` | `True` | Enable sampling |
| `max_new_tokens` | `40 – 100` | Response length |

> Instruct models tend to perform better at lower temperatures. Try values around `0.65` for more accurate and focused responses.
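
To show what `temperature` and `top_p` do to the next-token distribution, the sketch below applies both to a toy list of logits in plain Python. It is a simplified illustration, not the code Transformers runs internally; the function name `top_p_distribution` and the toy logits are hypothetical. Lower temperatures sharpen the distribution, and nucleus filtering keeps the smallest set of most likely tokens whose cumulative probability reaches `top_p`.

```python
import math


def top_p_distribution(logits, temperature=0.7, top_p=0.9):
    """Return {token_index: probability} after temperature scaling and nucleus filtering."""
    # Temperature scaling: dividing by a value below 1 sharpens the distribution
    scaled = [x / temperature for x in logits]
    # Numerically stable softmax
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Nucleus filtering: keep the most likely tokens until their mass reaches top_p
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    # Renormalize the surviving tokens so they sum to 1
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}


toy_logits = [2.0, 1.0, 0.5, -1.0]  # hypothetical values for four tokens
for t in (0.65, 1.0):
    print(t, top_p_distribution(toy_logits, temperature=t))
```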
---

## Intended Use Cases

| Use Case | Suitability |
|---|---|
| Lightweight Portuguese chatbots | ✅ Ideal |
| NPCs and games | ✅ Ideal |
| Fine-tuning experiments | ✅ Ideal |
| NLP education | ✅ Ideal |
| Local / CPU-only applications | ✅ Ideal |
| Critical production environments | ❌ Not recommended |

---
## Disclaimer

- Extremely small model (~900K parameters)
- No robust alignment (no RLHF)
- May generate incorrect or nonsensical responses
- **Not suitable for critical production environments**

---
## Future Work

- [ ] Reasoning-tuned version (`MiniBot-Reason`)
- [ ] Scaling to 1M–10M parameters
- [ ] Larger and more diverse dataset
- [ ] Improved response alignment
- [ ] Tool-use experiments

---
## License

Distributed under the **MIT License**. See [`LICENSE`](LICENSE) for more details.

---

## Author

Developed by **[AxionLab](https://huggingface.co/AxionLab-official)**

---

<div align="center">
<sub>MiniBot-0.9M-Instruct · AxionLab · MIT License</sub>
</div>