RtaForge/Anvaya-Rabbit-2.7B

@@ -8,57 +8,119 @@ tags:
 - causal-lm
 - rabbit
 - rtaforge
-base_model: RtaForge/Anvaya-Rabbit-2.7B
 ---
 # Anvaya-Rabbit 2.7B
-A 2.7B parameter State-Space Model (SSM) trained by RtaForge using the Gurukul
-constitutional training protocol.
-## Architecture
-- **Type**: Ṛta-SSM v7.2.2, Fortress Unbroken — recurrent SSM, no attention
-- **Parameters**: ~2.78B
-- **Layers**: 64
-- **d_model / d_state**: 2560
-- **Vocabulary**: 50,280 (GPT-NeoX tokenizer)
-- **Precision**: bfloat16
-## Weights
-This repository contains the base pretrained checkpoint (`base/Anvaya-Rabbit-2.7B-0.1-alpha-base.pt`)
-and the SFT imprint checkpoint (`imprint/Anvaya-Rabbit-2.7B-0.1-alpha-imprint.pt`).
-Load the base weights directly:
-```python
-from white_rabbit.rabbit_model import create_rabbit_model
-from transformers import AutoTokenizer
-import torch
-model = create_rabbit_model(vocab_size=50280, durga_variant="fu-64")
-sd = torch.load("base/Anvaya-Rabbit-2.7B-0.1-alpha-base.pt", map_location="cpu")
-model.load_state_dict(sd, strict=False)
-model.eval()
 tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
 ```
-## Benchmarks
-*Benchmarks pending — will be updated after evaluation run completes.*
-| Task | Metric | Score |
-|------|--------|-------|
-| HellaSwag | acc_norm | — |
-| ARC-Challenge | acc_norm | — |
-| MMLU | acc | — |
-| WinoGrande | acc | — |
-| TruthfulQA MC1 | mc1 | — |
 ## Training
-Trained with the Anvaya Gurukul protocol: a constitutional Sisya/Guru loop
-where Sisya proposes weight deltas and Guru applies them after validation.
-SFT imprint applied using surface-only gate-layer fine-tuning.

 - causal-lm
 - rabbit
 - rtaforge
+- india
+- sovereign-ai
+pipeline_tag: text-generation
 ---
 # Anvaya-Rabbit 2.7B
+**India's first sovereign SSM-based language model.**
+Non-transformer architecture. No attention mechanism. Constitutional training via Gurukul. 7 patents filed at IP India.
+---
+## What's in this repo
+Three model tiers are available, each built on the same 2.7B-parameter base:
+| Tier | File | Use this when… |
+|---|---|---|
+| **Base** | `base/Anvaya-Rabbit-2.7B-0.5-alpha-base.pt` | You want raw pretrained weights for your own fine-tuning |
+| **Instruct** | `instruct/Anvaya-Rabbit-2.7B-0.5-alpha-instruct.pt` | You want a general-purpose assistant that follows instructions |
+| **Imprint** | `imprint/Anvaya-Rabbit-2.7B-0.5-alpha-imprint.pt` | You want the full Rabbit persona — opinionated, constitutional, identity-aware |
+If you're not sure which to use, start with **Instruct**.
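A minimal sketch of fetching a single tier, assuming the `huggingface_hub` package is installed. The `TIER_FILES` mapping and the `checkpoint_path` and `download_tier` helpers are hypothetical names introduced for illustration; only the repository ID and the file paths come from the tier table.

```python
REPO_ID = "RtaForge/Anvaya-Rabbit-2.7B"

# File paths copied from the tier table; the dict itself is illustrative.
TIER_FILES = {
    "base": "base/Anvaya-Rabbit-2.7B-0.5-alpha-base.pt",
    "instruct": "instruct/Anvaya-Rabbit-2.7B-0.5-alpha-instruct.pt",
    "imprint": "imprint/Anvaya-Rabbit-2.7B-0.5-alpha-imprint.pt",
}


def checkpoint_path(tier: str = "instruct") -> str:
    """Return the in-repo path for a tier, defaulting to Instruct."""
    try:
        return TIER_FILES[tier.lower()]
    except KeyError:
        raise ValueError(
            f"unknown tier {tier!r}; expected one of {sorted(TIER_FILES)}"
        ) from None


def download_tier(tier: str = "instruct") -> str:
    """Download the checkpoint for a tier and return its local cache path."""
    # Imported lazily so checkpoint_path works without huggingface_hub.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=REPO_ID, filename=checkpoint_path(tier))
```

Calling `download_tier()` with no argument would fetch the Instruct checkpoint, matching the recommendation to start there. How the downloaded `.pt` file is then loaded depends on the `rtaforge` runtime and is not shown here.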
+---
+## Quickstart
+```bash
+pip install rtaforge transformers accelerate
+```
+```python
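+from transformers import AutoModelForCausalLM, AutoTokenizer
+# trust_remote_code=True allows the repository's custom model code to run.
+# device_map="auto" requires the `accelerate` package to be installed.
+model = AutoModelForCausalLM.from_pretrained(
+    "RtaForge/Anvaya-Rabbit-2.7B",
+    trust_remote_code=True,
+    torch_dtype="auto",
+    device_map="auto",
+)
+# Rabbit reuses the GPT-NeoX tokenizer (50,280-token vocabulary).
 tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
+# Tokenize a prompt, move it to the model's device, and generate a continuation.
+inputs = tokenizer("Hello, I am Rabbit.", return_tensors="pt").to(model.device)
+outputs = model.generate(**inputs, max_new_tokens=200)
+print(tokenizer.decode(outputs[0], skip_special_tokens=True))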
 ```
+> The `rtaforge` runtime package provides the compiled architecture. Source is not distributed.
+---
+## Why SSM?
+> Self-attention compares every token with every other token, so Transformer compute grows quadratically with context length, and the key-value cache kept during generation grows linearly. SSMs replace attention with a fixed-size recurrent state, so per-token inference cost and state memory stay **constant** regardless of context length. At the same parameter count, this lowers VRAM use and raises long-document throughput, and the gain grows with context length.
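A rough worked calculation of the memory argument, under stated assumptions. It compares the key-value cache of a hypothetical attention model with Rabbit's layer count and hidden size (64 and 2560, bfloat16) against a fixed recurrent state. The per-channel state width `d_state=16` is an assumed placeholder, not a published RtaSSM value, and the attention model is assumed to cache full-width keys and values with no grouped-query sharing.

```python
BYTES_BF16 = 2  # bfloat16 stores each value in 2 bytes


def kv_cache_bytes(context_len: int, layers: int = 64, d_model: int = 2560) -> int:
    """KV cache size for a hypothetical attention model of the same shape.

    Each layer keeps one key and one value vector of width d_model per token.
    """
    return 2 * layers * d_model * BYTES_BF16 * context_len


def ssm_state_bytes(layers: int = 64, d_model: int = 2560, d_state: int = 16) -> int:
    """Recurrent state size; d_state=16 is an assumption, not RtaSSM's value."""
    return layers * d_model * d_state * BYTES_BF16


if __name__ == "__main__":
    for n in (1_024, 32_768, 1_048_576):
        kv_gib = kv_cache_bytes(n) / 2**30
        state_mib = ssm_state_bytes() / 2**20
        print(f"{n:>9} tokens: KV cache {kv_gib:8.2f} GiB, SSM state {state_mib:.2f} MiB")
```

Under these assumptions the attention cache costs 640 KiB per token of context, while the recurrent state stays at 5 MiB at any context length. The real figures for Rabbit depend on its actual state layout.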
+---
+## Architecture
+Rabbit is built on **RtaSSM v7.2.2-FU "Fortress Unbroken"**, a custom state-space model developed at RtaForge:
+- **No attention mechanism** — purely recurrent SSM layers with learned state dynamics
+- **64 layers, hidden size 2560**, 2.7B parameters, bfloat16 weights
+- **Constitutional training** — Gurukul curriculum with wiki pretraining → instruct SFT → persona imprint
+- **Vocabulary**: 50,280 tokens (GPT-NeoX tokenizer)
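To illustrate what "purely recurrent" means in the list above, the sketch below runs a generic diagonal linear state-space recurrence, `h_t = a * h_(t-1) + b * x_t` with output `y_t = c . h_t`, in plain Python. It is a textbook form chosen for illustration, not the RtaSSM layer, whose source is not distributed; `DiagonalSSM` and its fields are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class DiagonalSSM:
    a: List[float]  # per-channel state decay
    b: List[float]  # input projection into the state
    c: List[float]  # output projection from the state

    def step(self, state: List[float], x: float) -> Tuple[List[float], float]:
        """Advance one token: the state is updated in place of any attention."""
        new_state = [a_i * h_i + b_i * x for a_i, h_i, b_i in zip(self.a, state, self.b)]
        y = sum(c_i * h_i for c_i, h_i in zip(self.c, new_state))
        return new_state, y

    def run(self, xs: List[float]) -> List[float]:
        """Process a sequence while carrying only a fixed-size state."""
        state = [0.0] * len(self.a)
        ys = []
        for x in xs:
            state, y = self.step(state, x)
            ys.append(y)
        return ys


if __name__ == "__main__":
    ssm = DiagonalSSM(a=[0.5, 0.0], b=[1.0, 1.0], c=[1.0, 1.0])
    print(ssm.run([1.0, 0.0, 0.0]))  # [2.0, 0.5, 0.25]
```

Each call to `step` touches only the state vector, so the work per token does not depend on how many tokens came before. In a trained model the values of `a`, `b`, and `c` would be learned.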
+---
 ## Training
+| Stage | Data | Notes |
+|---|---|---|
+| Wiki pretraining | Wikipedia (en) | 732 constitutional proposals via Gurukul |
+| Instruct SFT | ChatML instruction pairs | `gate_only` trainable strategy |
+| Persona imprint | Rabbit constitutional corpus | Identity and value alignment |
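One plausible reading of the `gate_only` strategy is that only gate parameters receive gradients while everything else stays frozen. The sketch below assumes gate parameters can be recognized by the substring `gate` in their names, which the source does not confirm. `apply_gate_only` is a hypothetical helper written against the `named_parameters()` and `requires_grad_()` interface that PyTorch modules and parameters expose.

```python
from typing import Iterable, List, Tuple


def apply_gate_only(
    named_params: Iterable[Tuple[str, object]], keyword: str = "gate"
) -> List[str]:
    """Freeze every parameter whose name lacks `keyword`; return trainable names.

    Each parameter must provide requires_grad_(bool), as torch.nn.Parameter does.
    """
    trainable = []
    for name, param in named_params:
        is_gate = keyword in name  # assumed naming convention
        param.requires_grad_(is_gate)  # True keeps gradients, False freezes
        if is_gate:
            trainable.append(name)
    return trainable
```

With a PyTorch model this would be called as `apply_gate_only(model.named_parameters())` before building the optimizer, so that only the selected parameters are updated during SFT.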
+---
+## Evaluation Access
+The weights are publicly available in this repository, and the runtime package can be installed with pip:
+```bash
+pip install rtaforge
+```
+To evaluate Rabbit or discuss deployment:
+📧 guha@rtaforge.in
+🌐 rtaforge.in
+Runtime documentation is coming soon.
+---
+## Limitations
+v0.5-alpha is an early research release. Rabbit has not been evaluated on standard benchmarks. She is small, she is new, and she is learning. Feedback welcome at guha@rtaforge.in.
+---
+## Citation
+```bibtex
+@misc{anvaya-rabbit-2026,
+  title  = {Anvaya-Rabbit: A Sovereign SSM Language Model},
+  author = {RtaForge},
+  year   = {2026},
+  url    = {https://huggingface.co/RtaForge/Anvaya-Rabbit-2.7B}
+}
+```
+---
+*Anvaya (अन्वय) — logical connection, coherence. Rabbit — the fast runner.*