tvastr committed on
Commit bcfbdc0 · verified · 1 Parent(s): cbf9aed

Revert to v0.1-alpha baseline and withdraw v0.5-alpha

Files changed (1)
  1. README.md +10 -97
README.md CHANGED
@@ -17,111 +17,28 @@ pipeline_tag: text-generation
17
 
18
  **India's first sovereign SSM-based language model.**
19
 
20
- Non-transformer architecture. No attention mechanism. Constitutional training via Gurukul. 7 patents filed at IP India.
21
-
22
- ---
23
-
24
- ## What's in this repo
25
-
26
- Three model tiers are available, each built on the same 2.7B parameter base:
27
-
28
- | Tier | File | Use this when… |
29
- |---|---|---|
30
- | **Base** | `base/Anvaya-Rabbit-2.7B-0.5-alpha-base.pt` | You want raw pretrained weights for your own fine-tuning |
31
- | **Instruct** | `instruct/Anvaya-Rabbit-2.7B-0.5-alpha-instruct.pt` | You want a general-purpose assistant that follows instructions |
32
- | **Imprint** | `imprint/Anvaya-Rabbit-2.7B-0.5-alpha-imprint.pt` | You want the full Rabbit persona — opinionated, constitutional, identity-aware |
33
-
34
- If you're not sure which to use, start with **Instruct**.
35
-
36
  ---
37
 
38
- ## Quickstart
39
-
40
- ```bash
41
- pip install rtaforge transformers
42
- ```
43
-
44
- ```python
45
- from transformers import AutoModelForCausalLM, AutoTokenizer
46
 
47
- tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
48
- tokenizer.add_special_tokens({"additional_special_tokens": ["<|im_start|>", "<|im_end|>"]})
49
-
50
- model = AutoModelForCausalLM.from_pretrained(
51
- "RtaForge/Anvaya-Rabbit-2.7B",
52
- trust_remote_code=True,
53
- torch_dtype="bfloat16",
54
- device_map="auto",
55
- )
56
-
57
- # v0.5-alpha uses raw completion format
58
- prompt = "Rabbit is a helpful and honest assistant.\n\nUser: Who are you?\nRabbit:"
59
- inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
60
- outputs = model.generate(**inputs, max_new_tokens=60, repetition_penalty=1.3)
61
- print(tokenizer.decode(outputs[0], skip_special_tokens=True))
62
- ```
63
-
64
- > *v0.5-alpha uses raw completion format. Chat template support (ChatML) coming in v0.9.*
65
-
66
- > The `rtaforge` runtime package provides the compiled architecture. Source is not distributed.
67
 
68
  ---
69
 
70
- ## Why SSM?
71
 
72
- > Transformers scale quadratically with context length because every token attends to every other token. SSMs replace attention with a fixed-size recurrent state: inference cost stays **constant per token** regardless of context length, VRAM footprint shrinks dramatically, and long-document throughput improves by orders of magnitude — all at the same parameter count.
73
 
74
  ---
75
 
76
  ## Architecture
77
 
78
- Rabbit is built on **RtaSSM v7.2.2-FU "Fortress Unbroken"**, a custom state-space model developed at RtaForge:
79
-
80
- - **No attention mechanism** — purely recurrent SSM layers with learned state dynamics
81
- - **64 layers, 2560 hidden dimensions**, 2.7B parameters, bfloat16
82
- - **Constitutional training** — Gurukul curriculum with wiki pretraining → instruct SFT → persona imprint
83
- - **Vocabulary** 50,280 tokens (GPT-NeoX tokenizer)
84
-
85
- ---
86
-
87
- ## Training
88
-
89
- | Stage | Data | Notes |
90
- |---|---|---|
91
- | Wiki pretraining | Wikipedia (en) | 732 constitutional proposals via Gurukul |
92
- | Instruct SFT | ChatML instruction pairs | `gate_only` trainable strategy |
93
- | Persona imprint | Rabbit constitutional corpus | Identity and value alignment |
94
-
95
- ---
96
-
97
- ## Evaluation Access
98
-
99
- Weights are publicly available. Runtime package is live:
100
-
101
- ```bash
102
- pip install rtaforge
103
- ```
104
-
105
- To evaluate Rabbit or discuss deployment:
106
- 📧 guha@rtaforge.in
107
- 🌐 rtaforge.in
108
-
109
- Runtime documentation coming soon.
110
-
111
- ---
112
-
113
- ## Maturity and Roadmap
114
-
115
- **v0.5-alpha is a proof of concept.** It demonstrates that the RtaSSM architecture trains end-to-end, the Gurukul constitutional pipeline works, and the weights are real.
116
-
117
- Usable conversational behaviour is targeted at **v0.8–v0.9**, currently in training.
118
-
119
- - Evaluating for deployment? Wait for v0.9.
120
- - Evaluating the architecture or training methodology? v0.5-alpha is exactly what you need.
121
-
122
- ## Limitations
123
-
124
- v0.5-alpha has not been evaluated on standard benchmarks. She is small, she is new, and she is learning. Feedback welcome at guha@rtaforge.in.
125
 
126
  ---
127
 
@@ -135,7 +52,3 @@ v0.5-alpha has not been evaluated on standard benchmarks. She is small, she is n
135
  url = {https://huggingface.co/RtaForge/Anvaya-Rabbit-2.7B}
136
  }
137
  ```
138
-
139
- ---
140
-
141
- *Anvaya (अन्वय) — logical connection, coherence. Rabbit — the fast runner.*
 
17
 
18
  **India's first sovereign SSM-based language model.**
19
 
20
  ---
21
 
22
+ ## Status Update (2026-05-19)
23
 
24
+ **v0.5-alpha weights have been withdrawn.**
25
+ A regression was identified in the Guru governance layer during the v0.5 SFT phase, leading to sub-optimal weights.
26
+ We are restarting the SFT process from the v0.1 baseline with a fixed governance harness.
27
 
28
  ---
29
 
30
+ ## Available Tiers (v0.1-alpha)
31
 
32
+ | Tier | File |
33
+ |---|---|
34
+ | **Base** | `base/Anvaya-Rabbit-2.7B-0.1-alpha-base.pt` |
35
+ | **Imprint** | `imprint/Anvaya-Rabbit-2.7B-0.1-alpha-imprint.pt` |
36
 
37
  ---
38
 
39
  ## Architecture
40
 
41
+ Rabbit is built on **RtaSSM v7.2.2-FU "Fortress Unbroken"**, a custom state-space model developed at RtaForge.
42
 
43
  ---
44
 
 
52
  url = {https://huggingface.co/RtaForge/Anvaya-Rabbit-2.7B}
53
  }
54
  ```