Commit 9fdf4c5 by tvastr (verified) · 1 parent: 4774b2d

restore: README to v0.5-alpha model card

  1. README.md +97 -10
README.md CHANGED
@@ -17,28 +17,111 @@ pipeline_tag: text-generation
17
 
18
  **India's first sovereign SSM-based language model.**
19
 
20
  ---
21
 
22
- ## Status Update (2026-05-19)
23
 
24
- **v0.5-alpha weights have been withdrawn.**
25
- A regression was identified in the Guru governance layer during the v0.5 SFT phase, leading to sub-optimal weights.
26
- We are restarting the SFT process from the v0.1 baseline with a fixed governance harness.
27
 
28
  ---
29
 
30
- ## Available Tiers (v0.1-alpha)
31
 
32
- | Tier | File |
33
- |---|---|
34
- | **Base** | `base/Anvaya-Rabbit-2.7B-0.1-alpha-base.pt` |
35
- | **Imprint** | `imprint/Anvaya-Rabbit-2.7B-0.1-alpha-imprint.pt` |
36
 
37
  ---
38
 
39
  ## Architecture
40
 
41
- Rabbit is built on **RtaSSM v7.2.2-FU "Fortress Unbroken"**, a custom state-space model developed at RtaForge.
42
 
43
  ---
44
 
@@ -52,3 +135,7 @@ Rabbit is built on **RtaSSM v7.2.2-FU "Fortress Unbroken"**, a custom state-spac
52
  url = {https://huggingface.co/RtaForge/Anvaya-Rabbit-2.7B}
53
  }
54
  ```
 
17
 
18
  **India's first sovereign SSM-based language model.**
19
 
20
+ Non-transformer architecture. No attention mechanism. Constitutional training via Gurukul. 7 patents filed at IP India.
21
+
22
+ ---
23
+
24
+ ## What's in this repo
25
+
26
+ Three model tiers are available, each built on the same 2.7B parameter base:
27
+
28
+ | Tier | File | Use this when… |
29
+ |---|---|---|
30
+ | **Base** | `base/Anvaya-Rabbit-2.7B-0.5-alpha-base.pt` | You want raw pretrained weights for your own fine-tuning |
31
+ | **Instruct** | `instruct/Anvaya-Rabbit-2.7B-0.5-alpha-instruct.pt` | You want a general-purpose assistant that follows instructions |
32
+ | **Imprint** | `imprint/Anvaya-Rabbit-2.7B-0.5-alpha-imprint.pt` | You want the full Rabbit persona — opinionated, constitutional, identity-aware |
33
+
34
+ If you're not sure which to use, start with **Instruct**.
35
+
36
  ---
37
 
38
+ ## Quickstart
39
+
40
+ ```bash
41
+ pip install rtaforge transformers
42
+ ```
43
+
44
+ ```python
45
+ from transformers import AutoModelForCausalLM, AutoTokenizer
46
 
47
+ tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
48
+ tokenizer.add_special_tokens({"additional_special_tokens": ["<|im_start|>", "<|im_end|>"]})
49
+
50
+ model = AutoModelForCausalLM.from_pretrained(
51
+ "RtaForge/Anvaya-Rabbit-2.7B",
52
+ trust_remote_code=True,
53
+ torch_dtype="bfloat16",
54
+ device_map="auto",
55
+ )
56
+
57
+ # v0.5-alpha uses raw completion format
58
+ prompt = "Rabbit is a helpful and honest assistant.\n\nUser: Who are you?\nRabbit:"
59
+ inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
60
+ outputs = model.generate(**inputs, max_new_tokens=60, repetition_penalty=1.3)
61
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
62
+ ```
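
The raw completion format used in the Quickstart can be wrapped in a small helper. This is only a sketch: `build_prompt` and `extract_reply` are hypothetical names, not part of the `rtaforge` package or `transformers`. The layout (system line, blank line, `User:` turn, trailing `Rabbit:` cue) is copied from the Quickstart prompt.

```python
DEFAULT_SYSTEM = "Rabbit is a helpful and honest assistant."


def build_prompt(user_message: str, system: str = DEFAULT_SYSTEM) -> str:
    """Format one turn in the raw completion layout shown in the Quickstart.

    Hypothetical helper for illustration only.
    """
    return f"{system}\n\nUser: {user_message.strip()}\nRabbit:"


def extract_reply(decoded: str, prompt: str) -> str:
    """Return only the newly generated text.

    Strips the echoed prompt when the decoded string starts with it, then
    cuts at the next "User:" turn in case the model keeps writing the dialogue.
    """
    reply = decoded[len(prompt):] if decoded.startswith(prompt) else decoded
    return reply.split("\nUser:", 1)[0].strip()
```

Pass the result of `build_prompt(...)` to the tokenizer in place of the hard-coded `prompt` string, and run `extract_reply` on the decoded output. If the tokenizer does not round-trip the prompt exactly, `extract_reply` falls back to the full decoded text.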
63
+
64
+ > *v0.5-alpha uses raw completion format. Chat template support (ChatML) coming in v0.9.*
65
+
66
+ > The `rtaforge` runtime package provides the compiled architecture. Source is not distributed.
67
 
68
  ---
69
 
70
+ ## Why SSM?
71
 
72
+ > Transformers scale quadratically with context length because every token attends to every other token. SSMs replace attention with a fixed-size recurrent state, so per-token inference cost and state memory stay **constant** regardless of context length, with no key-value cache that grows with the sequence. At the same parameter count, this lowers VRAM use and raises throughput on long documents.
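
To make the fixed-size state concrete, here is a toy diagonal linear recurrence in plain Python. It is a generic state-space sketch, not RtaSSM (whose source is not distributed). The function names and the coefficients `a`, `b`, `c` are illustrative assumptions.

```python
def ssm_step(state, x, a, b):
    """One recurrent update: h_i <- a_i * h_i + b_i * x for each state channel."""
    return [a_i * h_i + b_i * x for a_i, h_i, b_i in zip(a, state, b)]


def run_ssm(xs, a, b, c):
    """Scan a scalar input sequence and return (outputs, state_size).

    The state holds len(a) floats no matter how long xs is, which is the
    property that keeps per-token cost and memory constant.
    """
    state = [0.0] * len(a)
    ys = []
    for x in xs:
        state = ssm_step(state, x, a, b)
        ys.append(sum(c_i * h_i for c_i, h_i in zip(c, state)))
    return ys, len(state)
```

Running `run_ssm` on 10 inputs or 10,000 inputs returns the same `state_size`. An attention layer would instead keep one cached key and value per past token.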
73
 
74
  ---
75
 
76
  ## Architecture
77
 
78
+ Rabbit is built on **RtaSSM v7.2.2-FU "Fortress Unbroken"**, a custom state-space model developed at RtaForge:
79
+
80
+ - **No attention mechanism** — purely recurrent SSM layers with learned state dynamics
81
+ - **64 layers, 2560 hidden dimensions**, 2.7B parameters, bfloat16
82
+ - **Constitutional training** — Gurukul curriculum with wiki pretraining → instruct SFT → persona imprint
83
+ - **Vocabulary** 50,280 tokens (GPT-NeoX tokenizer)
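
The figures in this list allow a quick back-of-the-envelope size check. The sketch below uses only the numbers stated above (2.7B parameters, bfloat16, 50,280 tokens, 2560 hidden dimensions). It assumes a single input embedding table and ignores activations, recurrent state, and runtime overhead, none of which the card specifies.

```python
PARAMS = 2.7e9        # rounded parameter count from the model card
BYTES_PER_PARAM = 2   # bfloat16 stores each value in 16 bits
VOCAB = 50_280
HIDDEN = 2_560


def weight_bytes(params=PARAMS, bytes_per_param=BYTES_PER_PARAM):
    """Approximate bytes needed to hold the weights alone."""
    return int(params * bytes_per_param)


def embedding_params(vocab=VOCAB, hidden=HIDDEN):
    """Parameters in one vocab-by-hidden embedding table (assumed layout)."""
    return vocab * hidden
```

Under these assumptions, `weight_bytes()` comes to 5.4 × 10⁹ bytes (about 5.4 GB) and `embedding_params()` to 128,716,800 parameters, roughly 5% of the total.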
84
+
85
+ ---
86
+
87
+ ## Training
88
+
89
+ | Stage | Data | Notes |
90
+ |---|---|---|
91
+ | Wiki pretraining | Wikipedia (en) | 732 constitutional proposals via Gurukul |
92
+ | Instruct SFT | ChatML instruction pairs | `gate_only` trainable strategy |
93
+ | Persona imprint | Rabbit constitutional corpus | Identity and value alignment |
94
+
95
+ ---
96
+
97
+ ## Evaluation Access
98
+
99
+ Weights are publicly available. Runtime package is live:
100
+
101
+ ```bash
102
+ pip install rtaforge
103
+ ```
104
+
105
+ To evaluate Rabbit or discuss deployment:
106
+ 📧 guha@rtaforge.in
107
+ 🌐 rtaforge.in
108
+
109
+ Runtime documentation coming soon.
110
+
111
+ ---
112
+
113
+ ## Maturity and Roadmap
114
+
115
+ **v0.5-alpha is a proof of concept.** It demonstrates that the RtaSSM architecture trains end-to-end, the Gurukul constitutional pipeline works, and the weights are real.
116
+
117
+ Usable conversational behaviour is targeted at **v0.8–v0.9**, currently in training.
118
+
119
+ - Evaluating for deployment? Wait for v0.9.
120
+ - Evaluating the architecture or training methodology? v0.5-alpha is exactly what you need.
121
+
122
+ ## Limitations
123
+
124
+ v0.5-alpha has not been evaluated on standard benchmarks. She is small, she is new, and she is learning. Feedback welcome at guha@rtaforge.in.
125
 
126
  ---
127
 
 
135
  url = {https://huggingface.co/RtaForge/Anvaya-Rabbit-2.7B}
136
  }
137
  ```
138
+
139
+ ---
140
+
141
+ *Anvaya (अन्वय) — logical connection, coherence. Rabbit — the fast runner.*