tvastr committed on
Commit
c8a4bfd
·
verified ·
1 Parent(s): 5367fde

docs: restore full model card (pipeline overwrote README)

Files changed (1)
  1. README.md +96 -34
README.md CHANGED
@@ -8,57 +8,119 @@ tags:
  - causal-lm
  - rabbit
  - rtaforge
- base_model: RtaForge/Anvaya-Rabbit-2.7B
  ---

  # Anvaya-Rabbit 2.7B

- A 2.7B parameter State-Space Model (SSM) trained by RtaForge using the Gurukul
- constitutional training protocol.

- ## Architecture

- - **Type**: Ṛta-SSM v7.2.2, Fortress Unbroken (recurrent SSM, no attention)
- - **Parameters**: ~2.78B
- - **Layers**: 64
- - **d_model / d_state**: 2560
- - **Vocabulary**: 50,280 (GPT-NeoX tokenizer)
- - **Precision**: bfloat16

- ## Weights

- This repository contains the base pretrained checkpoint (`base/Anvaya-Rabbit-2.7B-0.1-alpha-base.pt`)
- and the SFT imprint checkpoint (`imprint/Anvaya-Rabbit-2.7B-0.1-alpha-imprint.pt`).
- Load the base weights directly:

- ```python
- from white_rabbit.rabbit_model import create_rabbit_model
- from transformers import AutoTokenizer
- import torch

- model = create_rabbit_model(vocab_size=50280, durga_variant="fu-64")
- sd = torch.load("base/Anvaya-Rabbit-2.7B-0.1-alpha-base.pt", map_location="cpu")
- model.load_state_dict(sd, strict=False)
- model.eval()

  tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
  ```

- ## Benchmarks

- *Benchmarks pending; this table will be updated after the evaluation run completes.*

- | Task | Metric | Score |
- |------|--------|-------|
- | HellaSwag | acc_norm | pending |
- | ARC-Challenge | acc_norm | pending |
- | MMLU | acc | pending |
- | WinoGrande | acc | pending |
- | TruthfulQA MC1 | mc1 | pending |


  ## Training

- Trained with the Anvaya Gurukul protocol: a constitutional Sisya/Guru loop
- where Sisya proposes weight deltas and Guru applies them after validation.
- SFT imprint applied using surface-only gate-layer fine-tuning.
 
  - causal-lm
  - rabbit
  - rtaforge
+ - india
+ - sovereign-ai
+ pipeline_tag: text-generation
  ---

  # Anvaya-Rabbit 2.7B

+ **India's first sovereign SSM-based language model.**

+ Non-transformer architecture. No attention mechanism. Constitutional training via Gurukul. 7 patents filed at IP India.

+ ---

+ ## What's in this repo

+ Three model tiers are available, each built on the same 2.7B parameter base:

+ | Tier | File | Use this when… |
+ |---|---|---|
+ | **Base** | `base/Anvaya-Rabbit-2.7B-0.5-alpha-base.pt` | You want raw pretrained weights for your own fine-tuning |
+ | **Instruct** | `instruct/Anvaya-Rabbit-2.7B-0.5-alpha-instruct.pt` | You want a general-purpose assistant that follows instructions |
+ | **Imprint** | `imprint/Anvaya-Rabbit-2.7B-0.5-alpha-imprint.pt` | You want the full Rabbit persona: opinionated, constitutional, identity-aware |
+
+ If you're not sure which to use, start with **Instruct**.
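
The tier table above can be captured in a small lookup helper. This is only an illustrative sketch: `TIER_FILES` and `checkpoint_path` are hypothetical names, not part of the `rtaforge` package, and the default simply follows the recommendation to start with Instruct.

```python
# Hypothetical helper: maps a tier name to the checkpoint file listed in the table.
TIER_FILES = {
    "base": "base/Anvaya-Rabbit-2.7B-0.5-alpha-base.pt",
    "instruct": "instruct/Anvaya-Rabbit-2.7B-0.5-alpha-instruct.pt",
    "imprint": "imprint/Anvaya-Rabbit-2.7B-0.5-alpha-imprint.pt",
}


def checkpoint_path(tier: str = "instruct") -> str:
    """Return the repo-relative checkpoint file for a tier (case-insensitive)."""
    key = tier.strip().lower()
    if key not in TIER_FILES:
        raise ValueError(f"unknown tier {tier!r}; expected one of {sorted(TIER_FILES)}")
    return TIER_FILES[key]


if __name__ == "__main__":
    # With no argument, the helper falls back to the Instruct tier.
    print(checkpoint_path())
```

Keeping the paths in one dictionary means a version bump (for example from `0.5-alpha`) only has to be edited in one place.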
+
+ ---
+
+ ## Quickstart

+ ```bash
+ pip install rtaforge transformers
+ ```
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer

+ # trust_remote_code=True executes model code shipped with the repository.
+ # device_map="auto" requires the `accelerate` package to be installed.
+ model = AutoModelForCausalLM.from_pretrained(
+     "RtaForge/Anvaya-Rabbit-2.7B",
+     trust_remote_code=True,
+     torch_dtype="auto",
+     device_map="auto",
+ )
  tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
+
+ inputs = tokenizer("Hello, I am Rabbit.", return_tensors="pt").to(model.device)
+ outputs = model.generate(**inputs, max_new_tokens=200)
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
  ```

+ > The `rtaforge` runtime package provides the compiled architecture. Source is not distributed.
+
+ ---
+
+ ## Why SSM?
+
+ > Self-attention in a transformer compares every token with every other token, so prefill cost grows quadratically with context length and the key-value cache grows linearly during decoding. SSMs replace attention with a fixed-size recurrent state, so per-token inference cost and state memory stay **constant** regardless of context length. At the same parameter count, this can reduce VRAM use and raise throughput on long documents.
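
To make the fixed-size state concrete, here is a minimal sketch of a diagonal linear state-space recurrence in plain Python. It is a textbook toy and not the RtaSSM architecture, whose source is not distributed; `ssm_step`, `run_ssm`, and the parameter values are assumptions chosen for illustration. The state `h` keeps the same length however many tokens are processed, whereas an attention cache would hold one entry per past token.

```python
from typing import List, Sequence, Tuple


def ssm_step(h: Sequence[float], x: float, a: Sequence[float],
             b: Sequence[float], c: Sequence[float]) -> Tuple[List[float], float]:
    """One recurrence step: h_t = a * h_{t-1} + b * x_t (elementwise), y_t = sum(c * h_t)."""
    h_next = [a_i * h_i + b_i * x for a_i, h_i, b_i in zip(a, h, b)]
    y = sum(c_i * h_i for c_i, h_i in zip(c, h_next))
    return h_next, y


def run_ssm(xs: Sequence[float], a: Sequence[float], b: Sequence[float],
            c: Sequence[float]) -> Tuple[List[float], List[float]]:
    """Process a whole sequence while carrying only the fixed-size state h."""
    h = [0.0] * len(a)  # the state size depends on the model, not on len(xs)
    ys = []
    for x in xs:
        h, y = ssm_step(h, x, a, b, c)
        ys.append(y)
    return h, ys


if __name__ == "__main__":
    # Illustrative parameters for a two-dimensional state.
    a, b, c = [0.5, 0.0], [1.0, 1.0], [1.0, 1.0]
    final_state, outputs = run_ssm([1.0, 1.0, 1.0], a, b, c)
    print(outputs)           # one output per input token
    print(len(final_state))  # state length is unchanged by sequence length
```

Each step touches only `h` and the current input, so the work per token does not depend on how many tokens came before.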

+ ---
+
+ ## Architecture

+ Rabbit is built on **RtaSSM v7.2.2-FU "Fortress Unbroken"**, a custom state-space model developed at RtaForge:

+ - **No attention mechanism**: purely recurrent SSM layers with learned state dynamics
+ - **64 layers, hidden size 2560**, 2.7B parameters, bfloat16
+ - **Constitutional training**: Gurukul curriculum with wiki pretraining → instruct SFT → persona imprint
+ - **Vocabulary**: 50,280 tokens (GPT-NeoX tokenizer)
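
As a rough worked example of what these figures imply for memory, the sketch below computes the size of the token-embedding table and the total weight footprint in bfloat16 (2 bytes per parameter). It assumes the nominal 2.7B parameter count and an embedding table of shape vocabulary × hidden size. Whether the output projection shares that table is not stated, so it is left out, and activation and recurrent-state memory are not counted.

```python
BYTES_PER_BF16 = 2              # bfloat16 stores each parameter in 2 bytes
VOCAB_SIZE = 50_280             # from the model card
HIDDEN_SIZE = 2_560             # from the model card
TOTAL_PARAMS = 2_700_000_000    # nominal "2.7B"; the exact count is an assumption


def embedding_params(vocab_size: int, hidden_size: int) -> int:
    """Number of parameters in a vocab_size x hidden_size embedding table."""
    return vocab_size * hidden_size


def weight_bytes(n_params: int, bytes_per_param: int = BYTES_PER_BF16) -> int:
    """Storage needed for n_params parameters at the given precision."""
    return n_params * bytes_per_param


if __name__ == "__main__":
    emb = embedding_params(VOCAB_SIZE, HIDDEN_SIZE)
    print(f"embedding parameters: {emb:,}")                           # 128,716,800
    print(f"embedding table: {weight_bytes(emb) / 1e6:.1f} MB")        # 257.4 MB
    print(f"all weights: {weight_bytes(TOTAL_PARAMS) / 1e9:.1f} GB")   # 5.4 GB
```

Under these assumptions the weights alone need roughly 5.4 GB, which is a lower bound on the memory required to load the model in bfloat16.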
+
+ ---

  ## Training

+ | Stage | Data | Notes |
+ |---|---|---|
+ | Wiki pretraining | Wikipedia (en) | 732 constitutional proposals via Gurukul |
+ | Instruct SFT | ChatML instruction pairs | `gate_only` trainable strategy |
+ | Persona imprint | Rabbit constitutional corpus | Identity and value alignment |
+
+ ---
+
+ ## Evaluation Access
+
+ The weights are publicly available, and the runtime package is live:
+
+ ```bash
+ pip install rtaforge
+ ```
+
+ To evaluate Rabbit or discuss deployment:
+ 📧 guha@rtaforge.in
+ 🌐 rtaforge.in
+
+ Runtime documentation coming soon.
+
+ ---
+
+ ## Limitations
+
+ v0.5-alpha is an early research release. Rabbit has not been evaluated on standard benchmarks. She is small, she is new, and she is learning. Feedback welcome at guha@rtaforge.in.
+
+ ---
+
+ ## Citation
+
+ ```bibtex
+ @misc{anvaya-rabbit-2026,
+   title  = {Anvaya-Rabbit: A Sovereign SSM Language Model},
+   author = {RtaForge},
+   year   = {2026},
+   url    = {https://huggingface.co/RtaForge/Anvaya-Rabbit-2.7B}
+ }
+ ```
+
+ ---
+
+ *Anvaya (अन्वय): logical connection, coherence. Rabbit: the fast runner.*