RtaForge/Anvaya-Rabbit-2.7B

@@ -8,119 +8,57 @@ tags:
 - causal-lm
 - rabbit
 - rtaforge
-- india
-- sovereign-ai
-pipeline_tag: text-generation
 ---
 # Anvaya-Rabbit 2.7B
-**India's first sovereign SSM-based language model.**
-Non-transformer architecture. No attention mechanism. Constitutional training via Gurukul. 7 patents filed at IP India.
----
-## What's in this repo
-Three model tiers are available, each built on the same 2.7B parameter base:
-| Tier | File | Use this when… |
-|---|---|---|
-| **Base** | `base/Anvaya-Rabbit-2.7B-0.5-alpha-base.pt` | You want raw pretrained weights for your own fine-tuning |
-| **Instruct** | `instruct/Anvaya-Rabbit-2.7B-0.5-alpha-instruct.pt` | You want a general-purpose assistant that follows instructions |
-| **Imprint** | `imprint/Anvaya-Rabbit-2.7B-0.5-alpha-imprint.pt` | You want the full Rabbit persona — opinionated, constitutional, identity-aware |
-If you're not sure which to use, start with **Instruct**.
----
-## Quickstart
-```bash
-pip install rtaforge transformers
-```
 ```python
-from transformers import AutoModelForCausalLM, AutoTokenizer
-model = AutoModelForCausalLM.from_pretrained(
-    "RtaForge/Anvaya-Rabbit-2.7B",
-    trust_remote_code=True,
-    torch_dtype="auto",
-    device_map="auto",
-)
-tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
-inputs = tokenizer("Hello, I am Rabbit.", return_tensors="pt").to(model.device)
-outputs = model.generate(**inputs, max_new_tokens=200)
-print(tokenizer.decode(outputs[0], skip_special_tokens=True))
 ```
-> The `rtaforge` runtime package provides the compiled architecture. Source is not distributed.
----
-## Why SSM?
-> Transformers scale quadratically with context length because every token attends to every other token. SSMs replace attention with a fixed-size recurrent state: inference cost stays **constant per token** regardless of context length, VRAM footprint shrinks dramatically, and long-document throughput improves by orders of magnitude — all at the same parameter count.
----
-## Architecture
-Rabbit is built on **RtaSSM v7.2.2-FU "Fortress Unbroken"**, a custom state-space model developed at RtaForge:
-- **No attention mechanism** — purely recurrent SSM layers with learned state dynamics
-- **64 layers, 2560 hidden dimensions**, 2.7B parameters, bfloat16
-- **Constitutional training** — Gurukul curriculum with wiki pretraining → instruct SFT → persona imprint
-- **Vocabulary** 50,280 tokens (GPT-NeoX tokenizer)
----
 ## Training
-| Stage | Data | Notes |
-|---|---|---|
-| Wiki pretraining | Wikipedia (en) | 732 constitutional proposals via Gurukul |
-| Instruct SFT | ChatML instruction pairs | `gate_only` trainable strategy |
-| Persona imprint | Rabbit constitutional corpus | Identity and value alignment |
----
-## Evaluation Access
-Weights are publicly available. Runtime package is live:
-```bash
-pip install rtaforge
-```
-To evaluate Rabbit or discuss deployment:
-📧 guha@rtaforge.in
-🌐 rtaforge.in
-Runtime documentation coming soon.
----
-## Limitations
-v0.5-alpha is an early research release. Rabbit has not been evaluated on standard benchmarks. She is small, she is new, and she is learning. Feedback welcome at guha@rtaforge.in.
----
-## Citation
-```bibtex
-@misc{anvaya-rabbit-2026,
-  title  = {Anvaya-Rabbit: A Sovereign SSM Language Model},
-  author = {RtaForge},
-  year   = {2026},
-  url    = {https://huggingface.co/RtaForge/Anvaya-Rabbit-2.7B}
-}
-```
----
-*Anvaya (अन्वय) — logical connection, coherence. Rabbit — the fast runner.*

 - causal-lm
 - rabbit
 - rtaforge
+base_model: RtaForge/Anvaya-Rabbit-2.7B
 ---
 # Anvaya-Rabbit 2.7B
+A 2.7B parameter State-Space Model (SSM) trained by RtaForge using the Gurukul
+constitutional training protocol.
+## Architecture
+- **Type**: Ṛta-SSM v7.2.2, Fortress Unbroken — recurrent SSM, no attention
+- **Parameters**: ~2.78B
+- **Layers**: 64
+- **d_model / d_state**: 2560
+- **Vocabulary**: 50,280 (GPT-NeoX tokenizer)
+- **Precision**: bfloat16
+## Weights
+This repository contains the base pretrained checkpoint (`base/Anvaya-Rabbit-2.7B-0.1-alpha-base.pt`)
+and the SFT imprint checkpoint (`imprint/Anvaya-Rabbit-2.7B-0.1-alpha-imprint.pt`).
+Load the base weights directly:
 ```python
+from white_rabbit.rabbit_model import create_rabbit_model
+from transformers import AutoTokenizer
+import torch
+model = create_rabbit_model(vocab_size=50280, durga_variant="fu-64")
+sd = torch.load("base/Anvaya-Rabbit-2.7B-0.1-alpha-base.pt", map_location="cpu")
+model.load_state_dict(sd, strict=False)
+model.eval()
+tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
 ```
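Because the loading snippet above passes `strict=False`, parameter names that do not match are skipped silently instead of raising an error. `torch.nn.Module.load_state_dict` returns a result with `missing_keys` and `unexpected_keys`, which is worth inspecting after the call. The same check can be sketched in plain Python by comparing key sets. Here `diff_state_dict_keys` is a hypothetical helper (not part of `white_rabbit` or PyTorch), and the key names in the example are invented for illustration.

```python
from typing import Iterable, List, NamedTuple


class KeyDiff(NamedTuple):
    missing: List[str]      # expected by the model, absent from the checkpoint
    unexpected: List[str]   # present in the checkpoint, unknown to the model


def diff_state_dict_keys(model_keys: Iterable[str],
                         checkpoint_keys: Iterable[str]) -> KeyDiff:
    """Compare parameter names the way a non-strict load would see them."""
    model_set, ckpt_set = set(model_keys), set(checkpoint_keys)
    return KeyDiff(
        missing=sorted(model_set - ckpt_set),
        unexpected=sorted(ckpt_set - model_set),
    )


# Illustrative key names only; the real parameter names are an assumption.
example = diff_state_dict_keys(
    model_keys=["embed.weight", "layers.0.gate.weight", "head.weight"],
    checkpoint_keys=["embed.weight", "layers.0.gate.weight", "extra.bias"],
)
print("missing:", example.missing)
print("unexpected:", example.unexpected)
```

With the real objects this would be called as `diff_state_dict_keys(model.state_dict().keys(), sd.keys())`, using `model` and `sd` from the loading snippet. Two empty lists mean the checkpoint matches the architecture exactly.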
+## Benchmarks
+*Benchmarks pending — will be updated after evaluation run completes.*
+| Task | Metric | Score |
+|------|--------|-------|
+| HellaSwag | acc_norm | — |
+| ARC-Challenge | acc_norm | — |
+| MMLU | acc | — |
+| WinoGrande | acc | — |
+| TruthfulQA MC1 | mc1 | — |
 ## Training
+Trained with the Anvaya Gurukul protocol: a constitutional Sisya/Guru loop
+where Sisya proposes weight deltas and Guru applies them after validation.
+The SFT imprint is applied using surface-only gate-layer fine-tuning.
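The Sisya/Guru loop described above (Sisya proposes weight deltas, Guru applies them after validation) can be pictured with a toy propose-validate-apply loop. This is only a sketch under stated assumptions: the card does not say how Guru validates a proposal, so the rule used here (accept a delta only if a loss does not increase) is invented, as are the names `sisya_propose`, `guru_apply`, and `loss`. It is not the Gurukul implementation.

```python
from typing import Callable, Dict, Optional

Weights = Dict[str, float]


def sisya_propose(weights: Weights, step: float, sign: int) -> Weights:
    """Toy proposal: nudge every weight by a fixed signed step."""
    return {name: sign * step for name in weights}


def guru_apply(weights: Weights,
               delta: Weights,
               loss_fn: Callable[[Weights], float]) -> Optional[Weights]:
    """Return updated weights if the delta passes validation, else None.

    Validation rule (an assumption): accept only if the loss does not increase.
    """
    candidate = {name: value + delta.get(name, 0.0)
                 for name, value in weights.items()}
    if loss_fn(candidate) <= loss_fn(weights):
        return candidate
    return None


def loss(w: Weights) -> float:
    # Toy objective with its minimum at w["a"] == 1.0.
    return (w["a"] - 1.0) ** 2


weights: Weights = {"a": 0.0}
accepted = rejected = 0
for i in range(6):
    sign = 1 if i % 2 == 0 else -1   # alternate helpful and harmful proposals
    delta = sisya_propose(weights, step=0.25, sign=sign)
    updated = guru_apply(weights, delta, loss)
    if updated is None:
        rejected += 1                # Guru refuses the proposal
    else:
        weights = updated            # Guru applies the validated delta
        accepted += 1

print(weights, accepted, rejected)
```

The point of the split is that the proposer never writes to the weights directly: every change passes through one validation gate, so a bad proposal leaves the model untouched.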