---
language:
- en
license: apache-2.0
tags:
- ssm
- state-space-model
- causal-lm
- rabbit
- rtaforge
- india
- sovereign-ai
pipeline_tag: text-generation
---

# Anvaya-Rabbit 2.7B

**India's first sovereign SSM-based language model.**

Non-transformer architecture. No attention mechanism. Constitutional training via Gurukul. 7 patents filed at IP India.

---

## ⚠️ Checkpoint Deprecation Notice

| Checkpoint | Status | Notes |
|---|---|---|
| `Anvaya-Rabbit-2.7B-0.55-base.pt` | ✅ **CURRENT** | Wikipedia warmup complete, CE ratio 0.993× |
| Any prior checkpoint | ⚠️ **DEPRECATED** | Do not use for inference |

Prior checkpoints are retained for research transparency.  
The current checkpoint reflects iterative refinement of the  
ANVAYA RtaSSM architecture and training pipeline.

**Always use the latest `-base.pt` for any downstream work.**

---

## What's in this repo

| Tier | File | Use this when… |
|---|---|---|
| **Base** | `base/Anvaya-Rabbit-2.7B-0.55-base.pt` | You want raw pretrained weights for your own fine-tuning |

Instruct and Imprint tiers are in preparation (epoch 2 → SFT → imprint pipeline).

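If you only need the raw checkpoint file, one option is to pull it with `huggingface_hub`. The sketch below is a minimal example rather than an official loader: the repo id and file path are taken from this card, `fetch_base_checkpoint` is a hypothetical helper name, and nothing here assumes how the `.pt` file is structured internally.

```python
"""Fetch the current base checkpoint from the Hugging Face Hub (sketch)."""

REPO_ID = "RtaForge/Anvaya-Rabbit-2.7B"                   # repo named in this card
BASE_CHECKPOINT = "base/Anvaya-Rabbit-2.7B-0.55-base.pt"  # path from the table above


def fetch_base_checkpoint(repo_id: str = REPO_ID, filename: str = BASE_CHECKPOINT) -> str:
    """Download the file into the local Hub cache (or reuse it) and return its path."""
    # Imported lazily so that defining this helper does not require the package.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=repo_id, filename=filename)


# Example (needs network access and `pip install huggingface_hub`):
# local_path = fetch_base_checkpoint()
```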
---

## Quickstart

```bash
pip install rtaforge transformers
```

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Rabbit uses the GPT-NeoX tokenizer, extended with the two ChatML markers.
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
tokenizer.add_special_tokens({"additional_special_tokens": ["<|im_start|>", "<|im_end|>"]})

# trust_remote_code=True is needed because the architecture is custom
# and not built into transformers.
model = AutoModelForCausalLM.from_pretrained(
    "RtaForge/Anvaya-Rabbit-2.7B",
    trust_remote_code=True,
    torch_dtype="bfloat16",
    device_map="auto",
)

# Plain-text prompt: v0.55 is a base checkpoint, not an instruction-tuned one.
prompt = "Rabbit is a helpful and honest assistant.\n\nUser: Who are you?\nRabbit:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=60, repetition_penalty=1.3)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

> The `rtaforge` runtime package provides the compiled architecture. Source is not distributed.

---

## Why SSM?

> Self-attention compares every token with every other token, so a transformer's compute grows quadratically with context length, and its key-value cache grows linearly during generation. SSMs replace attention with a fixed-size recurrent state: per-token inference cost and state memory stay **constant** regardless of context length. At the same parameter count, this reduces VRAM use on long contexts and raises long-document throughput, with the advantage widening as the context grows.

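The fixed-size state can be illustrated with a toy recurrence. The sketch below is a generic diagonal linear state-space update in pure Python, not the RtaSSM architecture (whose source is not distributed). The state width, decay values, and projections are all made-up assumptions chosen for readability. It shows only that the carried state keeps the same size whether 10 or 10,000 inputs have been consumed.

```python
"""Toy diagonal linear state-space recurrence (illustrative only, not RtaSSM)."""
import math

STATE_SIZE = 4  # hypothetical state width; real models use far larger states


def make_params(state_size):
    # Per-channel decay in (0, 1) keeps the recurrence stable.
    decay = [math.exp(-(i + 1) / state_size) for i in range(state_size)]
    b = [1.0] * state_size               # input projection
    c = [1.0 / state_size] * state_size  # output projection
    return decay, b, c


def step(state, x, params):
    """Consume one scalar input x and return (new_state, output)."""
    decay, b, c = params
    new_state = [a * h + bi * x for a, h, bi in zip(decay, state, b)]
    y = sum(ci * h for ci, h in zip(c, new_state))
    return new_state, y


def run(sequence, state_size=STATE_SIZE):
    """Feed a whole sequence through the recurrence, one input at a time."""
    params = make_params(state_size)
    state = [0.0] * state_size
    outputs = []
    for x in sequence:
        state, y = step(state, x, params)  # same amount of work at every step
        outputs.append(y)
    return state, outputs


short_state, _ = run([1.0] * 10)
long_state, _ = run([1.0] * 10_000)

# The carried state has the same size however many inputs were consumed,
# whereas an attention key-value cache holds one entry per past token.
print(len(short_state), len(long_state))  # 4 4
```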
---

## Architecture

Rabbit is built on **RtaSSM v7.2.2-FU "Fortress Unbroken"**, a custom state-space model developed at RtaForge:

- **No attention mechanism** — purely recurrent SSM layers with learned state dynamics
- **64 layers, hidden size 2560**, 2.7B parameters, bfloat16
- **Constitutional training** — Gurukul curriculum with wiki pretraining → instruct SFT → persona imprint
- **Vocabulary** 50,280 tokens (GPT-NeoX tokenizer)

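For hardware sizing, the figures above give a rough lower bound on weight memory. This back-of-the-envelope sketch uses the nominal 2.7B parameter count (a rounded figure) and the 2 bytes per value of bfloat16. It covers the weights only and leaves out activations, the recurrent state, and any runtime overhead, none of which are specified in this card.

```python
PARAMS = 2.7e9        # nominal parameter count stated in this card (rounded)
BYTES_PER_PARAM = 2   # bfloat16 stores each value in 16 bits

weight_bytes = PARAMS * BYTES_PER_PARAM

print(f"{weight_bytes / 1e9:.1f} GB")     # 5.4 GB
print(f"{weight_bytes / 2**30:.2f} GiB")  # 5.03 GiB
```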
---

## Training

| Stage | Data | Notes |
|---|---|---|
| Wiki warmup (v0.55) | Wikipedia (en) | 700 constitutional proposals via Gurukul — **complete** |
| Epoch 2 (planned) | RedPajama | Gate-only, ~3,350 proposals |
| Instruct SFT (planned) | ChatML instruction pairs | `gate_only` trainable strategy |
| Persona imprint (planned) | Rabbit constitutional corpus | Identity and value alignment |

---

## Evaluation Access

Weights are publicly available. Runtime package is live:

```bash
pip install rtaforge
```

To evaluate Rabbit or discuss deployment:

- 📧 guha@rtaforge.in
- 🌐 rtaforge.in

Runtime documentation coming soon.

---

## Maturity and Roadmap

**v0.55 is a base pretrained checkpoint** — Wikipedia warmup complete, CE ratio 0.993×.  
Usable conversational behaviour is targeted at **v0.8–v0.9**, currently in training.

- Evaluating for deployment? Wait for v0.9.
- Evaluating the architecture or training methodology? v0.55-base is exactly what you need.

## Limitations

v0.55 has not been evaluated on standard benchmarks. She is small, she is new, and she is learning. Feedback welcome at guha@rtaforge.in.

---

## Citation

```bibtex
@misc{anvaya-rabbit-2026,
  title  = {Anvaya-Rabbit: A Sovereign SSM Language Model},
  author = {RtaForge},
  year   = {2026},
  url    = {https://huggingface.co/RtaForge/Anvaya-Rabbit-2.7B}
}
```

---

*Anvaya (अन्वय) — logical connection, coherence. Rabbit — the fast runner.*