Nandi-Mini-150M-Instruct

Introduction

Nandi-Mini-150M-Instruct is a compact, efficient multilingual language model designed for strong performance in resource-constrained environments. It is pre-trained from scratch on 525 billion tokens and further enhanced through instruction tuning and Direct Preference Optimization (DPO). The model supports English and 10 Indic languages.

Nandi-Mini-150M-Instruct focuses on maximizing performance per parameter through architectural efficiency rather than scale, and is optimized for edge devices, on-prem deployments, and low-latency applications. Nandi-Mini-150M-Instruct brings the following key features:

  • Strong multilingual capability across English and Indic languages
  • Efficient design enabling high performance at small scale (150M parameters)
  • Reduced memory footprint using factorized embeddings
  • Better parameter efficiency through layer sharing
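To make the last two points concrete, here is a back-of-the-envelope sketch of how factorized embeddings and layer sharing cut parameter count. The sizes below (vocabulary, hidden, and factorization dimensions, layer count) are hypothetical placeholders, not the model's published configuration:

```python
# Hypothetical configuration -- the card does not publish these values.
V = 64_000   # vocabulary size
H = 768      # hidden size
E = 128      # factorized embedding dimension (E << H)
N = 12       # number of transformer layers

# Factorized embeddings: replace one V x H table with a V x E lookup
# followed by an E x H projection (ALBERT-style factorization).
standard_emb = V * H
factorized_emb = V * E + E * H
print(f"embedding params: {standard_emb:,} -> {factorized_emb:,}")

# Layer sharing: a single set of layer weights is reused at every depth,
# so layer parameters no longer grow with the number of layers.
per_layer = 12 * H * H           # rough attention + feed-forward count
unshared = N * per_layer
shared = per_layer
print(f"layer params: {unshared:,} -> {shared:,}")
```

With these illustrative numbers, the embedding table shrinks by roughly 6x and layer parameters by N x, which is how a 150M-parameter model can afford a deeper stack and a large vocabulary.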

πŸ“ Upcoming Releases & Roadmap

We’re just getting started with the Nandi series πŸš€

  • Nandi-Mini-150M-Tool-Calling (Specialized-Model) β€” Coming Soon this week
  • Nandi-Mini-500M (Base + Instruct) β€” Pre-Training Going On
  • Nandi-Mini-1B (Base + Instruct) β€” Pre-Training Going On

📒 Blogs & technical deep-dives coming soon, where we'll share:

  • Architecture decisions and design trade-offs
  • Training insights and dataset composition
  • Benchmarks and real-world applications

Stay tuned!

🌍 Supported Languages

The model is trained on English and a diverse set of Indic languages, including:

  • Hindi, Bengali, Tamil, Telugu, Marathi, Gujarati, Kannada, Malayalam, Punjabi, Odia

🚀 Usage

pip install transformers==5.4.0

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_name = "FrontiersMind/Nandi-Mini-150M-Instruct"

device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    trust_remote_code=True,
    dtype=torch.bfloat16
).to(device).eval()

prompt = "Explain Newton's second law of motion"

messages = [
    {"role": "user", "content": prompt}
]

prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **inputs,
    max_new_tokens=500,
    do_sample=True,
    temperature=0.3,
    top_p=0.90,
    top_k=20,
    repetition_penalty=1.1,
)

generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]

print(response)

📬 Feedback & Suggestions

We'd love to hear your thoughts, feedback, and ideas!
