# {MOMO_VERSION}
Momo is a friendly 336M parameter language model trained from scratch, designed to feel like chatting with a warm, knowledgeable friend.
## Model Details
- Parameters: ~336M
- Architecture: Transformer (RoPE + RMSNorm + GQA + SwiGLU)
- Trained on: WikiText-103 + Alpaca + Custom reasoning data
- Context length: {MAX_SEQ_LEN} tokens
- Vocabulary: {VOCAB_FINAL} tokens
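One of the architecture components listed above, RMSNorm, normalizes activations by their root-mean-square instead of mean-centering like LayerNorm. A minimal pure-Python sketch of the idea (illustrative only, not Momo's actual implementation):

```python
import math

def rms_norm(x, weight, eps=1e-6):
    # Scale each feature by the inverse root-mean-square of the vector
    # (no mean subtraction, unlike LayerNorm), then apply a learned gain.
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [v / rms * w for v, w in zip(x, weight)]

print(rms_norm([3.0, 4.0], [1.0, 1.0]))  # roughly [0.8485, 1.1314]
```

Dropping the mean-centering step makes RMSNorm slightly cheaper than LayerNorm while working comparably well in practice, which is why it is common in recent decoder-only models.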
## Capabilities
- Friendly, casual chat
- Reasoning with `<think>` tags
- Question answering
- Emotional support
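Since reasoning is emitted inside `<think>` tags, you may want to separate the hidden chain of thought from the visible reply before showing it to a user. A small helper sketch (assumes at most one think block precedes the answer; `split_reasoning` is a hypothetical name, not part of the model's tooling):

```python
import re

def split_reasoning(text):
    # Pull the contents of a <think>...</think> block out of the model
    # output and return (thought, visible_answer).
    m = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if m:
        thought = m.group(1).strip()
        answer = (text[:m.start()] + text[m.end():]).strip()
        return thought, answer
    return None, text.strip()

thought, answer = split_reasoning("<think>2 + 2 = 4</think>The answer is 4.")
print(answer)  # The answer is 4.
```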
## Quick Start
```python
# Load and chat with Momo
from transformers import AutoTokenizer

# MomoForCausalLM is the custom model class shipped with this checkpoint
model = MomoForCausalLM.from_pretrained('path/to/Momo-336M')
tokenizer = AutoTokenizer.from_pretrained('path/to/Momo-336M')

messages = [{'role': 'user', 'content': 'Hey Momo! How are you?'}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

inputs = tokenizer(prompt, return_tensors='pt')
output = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.75)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```
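The `temperature=0.75` argument above controls how sharply the model's logits are converted into a sampling distribution. A pure-Python sketch of temperature sampling (illustrative of the concept, not the internals of `generate()`):

```python
import math
import random

def sample_with_temperature(logits, temperature=0.75):
    # Divide logits by the temperature before softmax: T < 1 sharpens the
    # distribution toward the top token, T > 1 flattens it, and T -> 0
    # approaches greedy decoding.
    scaled = [l / temperature for l in logits]
    mx = max(scaled)
    exps = [math.exp(s - mx) for s in scaled]
    total = sum(exps)
    r = random.random()
    acc = 0.0
    for i, e in enumerate(exps):
        acc += e / total
        if r <= acc:
            return i
    return len(exps) - 1
```

A temperature of 0.75 keeps Momo's replies varied while still favoring high-probability tokens; note that `do_sample=True` is required for `temperature` to have any effect in `generate()`.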
## Training Setup
- GPU: 2× NVIDIA T4 (Kaggle)
- Precision: float16 AMP
- Gradient checkpointing: enabled
- Training stages: Pretrain → SFT → Reasoning
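float16 AMP depends on dynamic loss scaling to keep small gradients from underflowing in half precision. A toy sketch of the scaler logic, modeled loosely on the documented behavior of `torch.cuda.amp.GradScaler` (constants and class name here are illustrative, not the actual training code):

```python
class ToyGradScaler:
    # The loss is multiplied by `scale` before backward so tiny fp16
    # gradients stay representable. Gradients are unscaled before the
    # optimizer step; if any overflowed (inf/nan), the step is skipped
    # and the scale backs off, while a run of clean steps grows it again.
    def __init__(self, scale=2.0 ** 16, growth_interval=2000):
        self.scale = scale
        self.growth_interval = growth_interval
        self._clean_steps = 0

    def step(self, scaled_grads):
        unscaled = [g / self.scale for g in scaled_grads]
        if any(g != g or abs(g) == float("inf") for g in unscaled):
            self.scale /= 2.0          # overflow: skip step, back off
            self._clean_steps = 0
            return None
        self._clean_steps += 1
        if self._clean_steps >= self.growth_interval:
            self.scale *= 2.0          # stable: grow the scale again
            self._clean_steps = 0
        return unscaled
```

Gradient checkpointing complements this on the memory side: activations are recomputed during the backward pass instead of stored, trading extra compute for fitting a 336M-parameter model onto two 16 GB T4s.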