---
library_name: transformers
datasets:
- yahma/alpaca-cleaned
license: mit
language:
- en
base_model:
- SlitherCode/tiny-edu-166m
---

# tiny-edu-166M-instruct-v0

A naively instruction-tuned version of [tiny-edu-166M](https://huggingface.co/SlitherCode/tiny-edu-166m), a 166M parameter language model built on the ParchmentLM architecture and pretrained from scratch on FineWeb.

This is **v0**: a baseline instruct model trained on Alpaca-Cleaned with no filtering, curation, or preference optimization. It exists to establish a benchmark before more principled data pipeline work in future versions.

## Model Details

| | |
|---|---|
| **Base Model** | SlitherCode/tiny-edu-166m |
| **Architecture** | ParchmentLM (LLaMA-style, tiktoken cl100k_base tokenizer) |
| **Parameters** | 166M |
| **Pretraining Data** | FineWeb (~4B tokens) |
| **SFT Data** | yahma/alpaca-cleaned (52k examples) |
| **Training Epochs** | 3 |
| **Precision** | bfloat16 |

For full architecture details see the base model repo.
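
For a rough sense of scale, the parameter count and precision in the table imply the size of the weights in memory. The sketch below is plain arithmetic, not a measurement. It assumes the count is exactly 166 million and that every parameter is stored in bfloat16 (2 bytes), and it ignores activations, the KV cache, and framework overhead. The names `NUM_PARAMS`, `BYTES_PER_BF16`, and `weight_memory_bytes` are illustrative and do not come from the model repo.

```python
# Back-of-the-envelope weight memory for the figures in the table above.
# Assumption: "166M" is treated as exactly 166,000,000 parameters.
NUM_PARAMS = 166_000_000
BYTES_PER_BF16 = 2  # bfloat16 is a 16-bit format


def weight_memory_bytes(num_params: int, bytes_per_param: int) -> int:
    """Bytes needed to hold the weights alone, with no runtime overhead."""
    return num_params * bytes_per_param


total = weight_memory_bytes(NUM_PARAMS, BYTES_PER_BF16)
print(f"{total / 1e6:.0f} MB ({total / 2**20:.0f} MiB)")
```

Under these assumptions the weights come to about 332 MB, so the model should fit comfortably in CPU memory or on a small GPU.
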
## Usage

Load the tokenizer from the base model and the weights from this repo:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# The tokenizer (and its chat template) comes from the base model repo.
tokenizer = AutoTokenizer.from_pretrained("SlitherCode/tiny-edu-166m", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("SlitherCode/tiny-edu-166m-instruct-v0", trust_remote_code=True)
model.eval()

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of France?"},
]

# Render the conversation to a prompt string that ends with the assistant header.
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt")
input_len = inputs["input_ids"].shape[1]

# Greedy decoding with a mild repetition penalty.
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=200,
        do_sample=False,
        repetition_penalty=1.1,
        eos_token_id=tokenizer.eos_token_id,
        pad_token_id=tokenizer.eos_token_id,
    )

# Decode only the newly generated tokens, skipping the prompt.
response = tokenizer.decode(outputs[0][input_len:], skip_special_tokens=True)
print(response)
```

## Chat Template

This model uses a custom chat template with `<|endoftext|>` as the turn separator:

```
system
You are a helpful assistant.<|endoftext|>
user
What is the capital of France?<|endoftext|>
assistant
```

## Limitations

- 166M parameters: limited factual knowledge and reasoning capacity
- Arithmetic and multi-step reasoning are unreliable at this scale
- Naively trained on Alpaca-Cleaned with no quality filtering or preference optimization
- Not suitable for production use

## Training Details

Trained using the Hugging Face `Trainer` with the following configuration:

- Optimizer: AdamW
- Learning rate: 2e-5 with cosine decay
- Warmup ratio: 0.03
- Batch size: 32
- Precision: bfloat16

## Roadmap

- **v1**: Retrain on a curated, category-balanced dataset derived from real-world queries with higher quality responses
- **v2**: Retrain on a synthetically generated and curated dataset with further optimizations

## License

The model weights are released under the **MIT License**, inherited from
the base model [tiny-edu-166M](https://huggingface.co/SlitherCode/tiny-edu-166m).

The SFT training data [yahma/alpaca-cleaned](https://huggingface.co/datasets/yahma/alpaca-cleaned) is licensed under **CC-BY-4.0**. Per the license terms, attribution is given to the original Alpaca dataset authors (Stanford University) and the cleaned version maintainers.

> Taori et al., "Alpaca: A Strong, Replicable Instruction-Following Model", Stanford University, 2023.
>
> Cleaned version: https://github.com/gururise/AlpacaDataCleaned
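
## Appendix: Learning Rate Schedule Sketch

The training configuration listed in this card (52k examples, 3 epochs, batch size 32, learning rate 2e-5 with cosine decay, warmup ratio 0.03) implies an approximate step count and learning-rate curve. The sketch below works that out in plain Python. It rests on several assumptions the card does not confirm: that 52,000 is the exact example count, that 32 is the effective batch size (no gradient accumulation or multi-device scaling on top), that warmup is linear, and that the cosine decay ends at zero. The constants and the `lr_at` helper are illustrative names, not part of the training code.

```python
import math

# Figures taken from the card; see the assumptions in the paragraph above.
NUM_EXAMPLES = 52_000   # "52k examples", assumed exact
EPOCHS = 3
BATCH_SIZE = 32         # assumed to be the effective batch size
PEAK_LR = 2e-5
WARMUP_RATIO = 0.03

steps_per_epoch = math.ceil(NUM_EXAMPLES / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)


def lr_at(step: int) -> float:
    """Linear warmup to PEAK_LR, then cosine decay to zero (assumed shape)."""
    if step < warmup_steps:
        return PEAK_LR * step / warmup_steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


print(f"steps/epoch={steps_per_epoch} total={total_steps} warmup={warmup_steps}")
for step in (0, warmup_steps, total_steps // 2, total_steps):
    print(f"step {step:>5}: lr = {lr_at(step):.2e}")
```

Under these assumptions the run is 1,625 steps per epoch and 4,875 steps in total, with the learning rate peaking after 147 warmup steps. The actual run may differ if the true dataset size or effective batch size was different.
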