---
license: apache-2.0
language:
- en
tags:
- text-generation
- cli
- shell
- command-line
- sft
- instruction-following
pipeline_tag: text-generation
widget:
- text: "Instruction: List all files in the current directory\nCommand:"
  example_title: List files
- text: "Instruction: Find all Python files\nCommand:"
  example_title: Find Python files
- text: "Instruction: Show disk usage\nCommand:"
  example_title: Disk usage
---

# Tiny-LLM CLI SFT (54M)

A **54 million parameter** language model fine-tuned for CLI command generation.

## Model Description

This model is a Supervised Fine-Tuned (SFT) version of [jonmabe/tiny-llm-54m](https://huggingface.co/jonmabe/tiny-llm-54m), trained to generate Unix/Linux shell commands from natural-language instructions.

### Training Data

- **Geddy's NL2Bash dataset**: ~2,300 natural-language-to-Bash command pairs
- **NL2Bash benchmark**: Standard benchmark for command translation
- **Synthetic examples**: Additional generated pairs
- **Total**: ~13,000 training pairs

### Training Details

| Parameter | Value |
|-----------|-------|
| Base Model | tiny-llm-54m |
| Training Steps | 2,000 |
| Best Checkpoint | Step 1,000 |
| Best Val Loss | 1.2456 |
| Learning Rate | 5e-5 |
| Batch Size | 16 |
| Hardware | NVIDIA RTX 5090 |
| Training Time | ~9 minutes |

## Architecture

- **Parameters**: 54.93M
- **Layers**: 12
- **Hidden Size**: 512
- **Attention Heads**: 8
- **Intermediate Size**: 1408
- **Max Position**: 512
- **Vocabulary**: 32,000 tokens
- **Features**: RoPE, RMSNorm, SwiGLU, Weight Tying

## Usage

### Prompt Format

```
Instruction: <natural-language request>
Command:
```

### Example

```python
from model import TinyLLM
import torch

# Load the fine-tuned checkpoint on CPU
checkpoint = torch.load("best_model.pt", map_location="cpu")
model = TinyLLM(checkpoint["config"]["model"])
model.load_state_dict(checkpoint["model_state_dict"])
model.eval()

# Generate
prompt = "Instruction: Find all Python files modified in the last day\nCommand:"
# ... tokenize and generate (see the generation sketch in the appendix below)
```

## Limitations

⚠️ **Known Issues:**

- Tokenizer decode shows raw BPE tokens (Ġ = space, Ċ = newline); see the decode cleanup sketch in the appendix below
- Model generates fragments of correct commands, but output can be noisy
- Needs more training steps for reliable generation
- Small model size limits command complexity

## Improvement Plan

1. **Fix tokenizer decode** - Proper BPE-to-text conversion
2. **Longer training** - 5,000-10,000 steps
3. **Data quality** - Curate cleaner training pairs
4. **Lower LR** - More stable convergence with 1e-5

## License

Apache 2.0

## Citation

```bibtex
@misc{tiny-llm-cli-sft-2026,
  author = {Jon Mabe},
  title = {Tiny-LLM CLI SFT: Small Language Model for Command Generation},
  year = {2026},
  publisher = {HuggingFace},
  url = {https://huggingface.co/jonmabe/tiny-llm-cli-sft}
}
```
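
## Appendix: Generation Sketch

The Example above stops at "tokenize and generate". The sketch below fills that gap under explicit assumptions: the checkpoint ships with a Hugging Face `tokenizers` file (the `tokenizer.json` path here is hypothetical), and `TinyLLM`'s forward pass returns next-token logits of shape `(batch, seq, vocab)` when called as `model(input_ids)`. Both assumptions should be checked against the actual repository; this is an illustrative sketch, not the project's confirmed API.

```python
import torch
from tokenizers import Tokenizer

from model import TinyLLM

# Load model weights (same as the Example section above).
checkpoint = torch.load("best_model.pt", map_location="cpu")
model = TinyLLM(checkpoint["config"]["model"])
model.load_state_dict(checkpoint["model_state_dict"])
model.eval()

# Assumption: a Hugging Face `tokenizers` BPE file accompanies the checkpoint.
tokenizer = Tokenizer.from_file("tokenizer.json")  # hypothetical path

prompt = "Instruction: Find all Python files modified in the last day\nCommand:"
input_ids = torch.tensor([tokenizer.encode(prompt).ids])

# Greedy decoding, capped at 64 new tokens and the 512-token context limit.
with torch.no_grad():
    for _ in range(64):
        logits = model(input_ids)  # assumed to return (batch, seq, vocab) logits
        next_id = logits[:, -1, :].argmax(dim=-1, keepdim=True)
        input_ids = torch.cat([input_ids, next_id], dim=-1)
        if input_ids.shape[1] >= 512:
            break

print(tokenizer.decode(input_ids[0].tolist()))
```

Greedy decoding is chosen here only for reproducibility; sampling with a low temperature may produce more natural commands once training stabilizes.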
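
## Appendix: Tokenizer Decode Cleanup

The Limitations section notes that decoded output shows raw byte-level BPE markers (`Ġ` for space, `Ċ` for newline). Assuming the tokenizer follows the GPT-2-style byte-level convention, the snippet below is a rough cleanup for those two markers only; the more complete fix, if the tokenizer was built with the Hugging Face `tokenizers` library, is to attach a `ByteLevel` decoder so `decode()` reverses the byte-level mapping itself. The `tokenizer.json` path is again hypothetical.

```python
from tokenizers import Tokenizer, decoders

def rough_clean(text: str) -> str:
    """Replace the two most common byte-level BPE markers with real characters."""
    return text.replace("Ġ", " ").replace("Ċ", "\n")

# Preferred fix (assumes a Hugging Face `tokenizers` file): attach a ByteLevel
# decoder so decode() handles the full byte-to-text mapping, not just these two markers.
tokenizer = Tokenizer.from_file("tokenizer.json")  # hypothetical path
tokenizer.decoder = decoders.ByteLevel()
```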