| --- |
| language: |
| - en |
| license: apache-2.0 |
| tags: |
| - ssm |
| - state-space-model |
| - causal-lm |
| - rabbit |
| - rtaforge |
| - india |
| - sovereign-ai |
| pipeline_tag: text-generation |
| --- |
| |
| # Anvaya-Rabbit 2.7B |
|
|
| **India's first sovereign SSM-based language model.** |
|
|
| Non-transformer architecture. No attention mechanism. Constitutional training via Gurukul. 7 patents filed at IP India. |
|
|
| --- |
|
|
| ## ⚠️ Checkpoint Deprecation Notice |
|
|
| | Checkpoint | Status | Notes | |
| |---|---|---| |
| | `Anvaya-Rabbit-2.7B-0.55-base.pt` | ✅ **CURRENT** | Wikipedia warmup complete, CE ratio 0.993× |
| | Any prior checkpoint | ⚠️ **DEPRECATED** | Do not use for inference | |
|
|
| Prior checkpoints are retained for research transparency. |
| The current checkpoint reflects iterative refinement of the |
| ANVAYA RtaSSM architecture and training pipeline. |
|
|
| **Always use the latest `-base.pt` for any downstream work.** |
|
|
| --- |
|
|
| ## What's in this repo |
|
|
| | Tier | File | Use this when… | |
| |---|---|---| |
| | **Base** | `base/Anvaya-Rabbit-2.7B-0.55-base.pt` | You want raw pretrained weights for your own fine-tuning | |
|
|
| Instruct and Imprint tiers are in preparation (epoch 2 → SFT → imprint pipeline). |
|
|
| --- |
|
|
| ## Quickstart |
|
|
| ```bash |
| # accelerate is needed for device_map="auto" in the snippet below
| pip install rtaforge transformers accelerate
| ``` |
|
|
| ```python |
| from transformers import AutoModelForCausalLM, AutoTokenizer |
| |
| tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b") |
| tokenizer.add_special_tokens({"additional_special_tokens": ["<|im_start|>", "<|im_end|>"]}) |
| |
| model = AutoModelForCausalLM.from_pretrained(
|     "RtaForge/Anvaya-Rabbit-2.7B",
|     trust_remote_code=True,
|     torch_dtype="bfloat16",
|     device_map="auto",
| )
| |
| prompt = "Rabbit is a helpful and honest assistant.\n\nUser: Who are you?\nRabbit:" |
| inputs = tokenizer(prompt, return_tensors="pt").to(model.device) |
| outputs = model.generate(**inputs, max_new_tokens=60, repetition_penalty=1.3) |
| print(tokenizer.decode(outputs[0], skip_special_tokens=True)) |
| ``` |
|
|
| > The `rtaforge` runtime package provides the compiled architecture. Source is not distributed. |
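The Quickstart prompt uses a plain `User:` / `Rabbit:` layout. As a minimal sketch, the helper below rebuilds that exact string and extends it to earlier turns. The function name `build_prompt`, the `history` argument, and the multi-turn layout are assumptions made for illustration; only the single-turn string comes from the Quickstart.

```python
# Hypothetical helper: reproduces the plain-text prompt layout shown in the Quickstart.
# The multi-turn layout (one "User:" / "Rabbit:" pair per line pair) is an assumption.
DEFAULT_SYSTEM = "Rabbit is a helpful and honest assistant."


def build_prompt(user_message, history=(), system=DEFAULT_SYSTEM):
    """Return a prompt ending in 'Rabbit:' so the model continues as Rabbit.

    history is an iterable of (user_text, rabbit_reply) pairs from earlier turns.
    """
    turns = "".join(f"User: {u}\nRabbit: {r}\n" for u, r in history)
    return f"{system}\n\n{turns}User: {user_message}\nRabbit:"


if __name__ == "__main__":
    # Single turn: identical to the prompt string in the Quickstart.
    print(build_prompt("Who are you?"))
```

The result can be passed to `tokenizer(...)` in place of the hard-coded `prompt`. Because v0.55 is a base checkpoint, treat any reply as a raw continuation rather than a tuned chat response.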
|
|
| --- |
|
|
| ## Why SSM? |
|
|
| > Self-attention compares every token with every other token, so transformer compute grows quadratically with context length, and the key-value cache used during generation grows linearly with it. SSMs replace attention with a fixed-size recurrent state: inference cost and state memory stay **constant per token** regardless of context length. At the same parameter count, this lowers VRAM use and raises throughput on long documents, and the advantage widens as the context grows.
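To make the fixed-size state concrete, here is a toy diagonal linear recurrence in pure Python. It is a generic textbook-style SSM step, not the RtaSSM layer (whose source is not distributed), and every name in it is hypothetical. The `kv_cache_bytes` helper estimates the cache of a hypothetical attention model with the same depth and width, assuming standard multi-head attention and 2-byte values.

```python
# Toy diagonal linear state-space recurrence (illustrative only; NOT the RtaSSM layer).
#   h_t = a * h_{t-1} + b * x_t      (elementwise over the state)
#   y_t = sum(c * h_t)


def ssm_step(state, x, a, b, c):
    """Advance the recurrence by one scalar input; return (new_state, output)."""
    new_state = [a_i * h_i + b_i * x for a_i, h_i, b_i in zip(a, state, b)]
    y = sum(c_i * h_i for c_i, h_i in zip(c, new_state))
    return new_state, y


def run(xs, a, b, c):
    """Process a whole sequence; the state never grows with sequence length."""
    state = [0.0] * len(a)
    ys = []
    for x in xs:
        state, y = ssm_step(state, x, a, b, c)
        ys.append(y)
    return state, ys


def kv_cache_bytes(context_len, layers=64, hidden=2560, bytes_per_value=2):
    """KV-cache size of a HYPOTHETICAL attention model of the same depth and width.

    Assumes one key and one value vector of width `hidden` per layer per token.
    """
    return 2 * layers * context_len * hidden * bytes_per_value


if __name__ == "__main__":
    state, _ = run([1.0] * 10_000, a=[0.5, 0.9], b=[1.0, 1.0], c=[1.0, 1.0])
    print("state size after 10,000 tokens:", len(state))
    for n in (1_024, 4_096, 32_768):
        print(n, "tokens ->", kv_cache_bytes(n) / 2**30, "GiB of KV cache")
```

The recurrent state keeps the same length however many tokens are processed, while the attention cache in this estimate grows by 655,360 bytes per token.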
|
|
| --- |
|
|
| ## Architecture |
|
|
| Rabbit is built on **RtaSSM v7.2.2-FU "Fortress Unbroken"**, a custom state-space model developed at RtaForge: |
|
|
| - **No attention mechanism:** purely recurrent SSM layers with learned state dynamics
| - **64 layers, hidden size 2560:** 2.7B parameters, stored in bfloat16
| - **Constitutional training:** Gurukul curriculum of wiki pretraining → instruct SFT → persona imprint (only the wiki stage is complete in v0.55)
| - **Vocabulary:** 50,280 tokens (GPT-NeoX tokenizer)
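The figures above allow a quick back-of-the-envelope size check. The sketch below uses only the stated parameter count, dtype, vocabulary size, and hidden size. The helper names are hypothetical, and it assumes a single input embedding matrix of shape vocabulary × hidden; whether the output head shares those weights is not stated here.

```python
# Rough size arithmetic from the published figures (helper names are hypothetical).

def weight_bytes(n_params, bytes_per_param=2):
    """Bytes needed to hold the weights; bfloat16 uses 2 bytes per parameter."""
    return n_params * bytes_per_param


def embedding_params(vocab_size, hidden_size):
    """Parameters in one vocab x hidden embedding matrix (assumed layout)."""
    return vocab_size * hidden_size


if __name__ == "__main__":
    n_params = 2_700_000_000  # "2.7B parameters" (rounded figure)
    print(f"weights in bfloat16: {weight_bytes(n_params) / 2**30:.2f} GiB")

    emb = embedding_params(50_280, 2_560)
    print(f"embedding matrix: {emb:,} parameters "
          f"({100 * emb / n_params:.1f}% of the total)")
```

This covers weights only; activations, the recurrent state, and framework overhead add to the memory needed at inference time.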
|
|
| --- |
|
|
| ## Training |
|
|
| | Stage | Data | Notes | |
| |---|---|---| |
| | Wiki warmup (v0.55) | Wikipedia (en) | 700 constitutional proposals via Gurukul — **complete** | |
| | Epoch 2 (planned) | RedPajama | Gate-only, ~3,350 proposals | |
| | Instruct SFT (planned) | ChatML instruction pairs | `gate_only` trainable strategy | |
| | Persona imprint (planned) | Rabbit constitutional corpus | Identity and value alignment | |
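The planned instruct stage uses ChatML instruction pairs, and the Quickstart already registers the `<|im_start|>` and `<|im_end|>` tokens. As a rough illustration of that format, the sketch below renders a list of messages in the commonly used ChatML layout. The function name `to_chatml` is hypothetical, and the exact template used in the Gurukul pipeline is not published here, so treat the layout as an assumption.

```python
# Hypothetical formatter for the widely used ChatML layout:
#   <|im_start|>{role}\n{content}<|im_end|>\n
# The template actually used for Rabbit's SFT stage may differ.
IM_START = "<|im_start|>"
IM_END = "<|im_end|>"


def to_chatml(messages, add_generation_prompt=False):
    """Render [{'role': ..., 'content': ...}, ...] as a single ChatML string."""
    parts = [f"{IM_START}{m['role']}\n{m['content']}{IM_END}\n" for m in messages]
    if add_generation_prompt:
        # Leave an open assistant turn for the model to complete.
        parts.append(f"{IM_START}assistant\n")
    return "".join(parts)


if __name__ == "__main__":
    pair = [
        {"role": "user", "content": "Who are you?"},
        {"role": "assistant", "content": "I am Rabbit."},
    ]
    print(to_chatml(pair))
```

An instruction pair rendered this way is one training example; at inference time, `add_generation_prompt=True` leaves the assistant turn open.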
|
|
| --- |
|
|
| ## Evaluation Access |
|
|
| The weights are publicly available, and the runtime package can be installed with pip:
|
|
| ```bash |
| pip install rtaforge |
| ``` |
|
|
| To evaluate Rabbit or discuss deployment: |
| 📧 guha@rtaforge.in |
| 🌐 rtaforge.in |
|
|
| Runtime documentation coming soon. |
|
|
| --- |
|
|
| ## Maturity and Roadmap |
|
|
| **v0.55 is a base pretrained checkpoint** — Wikipedia warmup complete, CE ratio 0.993×. |
| Usable conversational behaviour is targeted at **v0.8–v0.9**, currently in training. |
|
|
| - Evaluating for deployment? Wait for v0.9. |
| - Evaluating the architecture or training methodology? v0.55-base is exactly what you need. |
|
|
| ## Limitations |
|
|
| v0.55 has not been evaluated on standard benchmarks, and as a base checkpoint it is not yet tuned for conversation. She is small, she is new, and she is learning. Feedback welcome at guha@rtaforge.in.
|
|
| --- |
|
|
| ## Citation |
|
|
| ```bibtex |
| @misc{anvaya-rabbit-2026, |
| title = {Anvaya-Rabbit: A Sovereign SSM Language Model}, |
| author = {RtaForge}, |
| year = {2026}, |
| url = {https://huggingface.co/RtaForge/Anvaya-Rabbit-2.7B} |
| } |
| ``` |
|
|
| --- |
|
|
| *Anvaya (अन्वय) — logical connection, coherence. Rabbit — the fast runner.* |
|
|