docs: v0.55 — wiki warmup complete, checkpoint deprecation notice

91b7e4a verified 7 days ago

4.3 kB

	---
	language:
	- en
	license: apache-2.0
	tags:
	- ssm
	- state-space-model
	- causal-lm
	- rabbit
	- rtaforge
	- india
	- sovereign-ai
	pipeline_tag: text-generation
	---

	# Anvaya-Rabbit 2.7B

	India's first sovereign SSM-based language model.

	Non-transformer architecture. No attention mechanism. Constitutional training via Gurukul. 7 patents filed at IP India.

	---

	## ⚠️ Checkpoint Deprecation Notice

	\| Checkpoint \| Status \| Notes \|
	\|---\|---\|---\|
	\| `Anvaya-Rabbit-2.7B-0.55-base.pt` \| ✅ CURRENT \| Wikipedia warmup complete, CE 0.993x \|
	\| Any prior checkpoint \| ⚠️ DEPRECATED \| Do not use for inference \|

	Prior checkpoints are retained for research transparency.
	The current checkpoint reflects iterative refinement of the
	ANVAYA RtaSSM architecture and training pipeline.

	Always use the latest `-base.pt` for any downstream work.

	---

	## What's in this repo

	\| Tier \| File \| Use this when… \|
	\|---\|---\|---\|
	\| Base \| `base/Anvaya-Rabbit-2.7B-0.55-base.pt` \| You want raw pretrained weights for your own fine-tuning \|

	Instruct and Imprint tiers are in preparation (epoch 2 → SFT → imprint pipeline).

	---

	## Quickstart

	```bash
	pip install rtaforge transformers
	```

	```python
	from transformers import AutoModelForCausalLM, AutoTokenizer

	tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
	tokenizer.add_special_tokens({"additional_special_tokens": ["<\|im_start\|>", "<\|im_end\|>"]})

	model = AutoModelForCausalLM.from_pretrained(
	"RtaForge/Anvaya-Rabbit-2.7B",
	trust_remote_code=True,
	torch_dtype="bfloat16",
	device_map="auto",
	)

	prompt = "Rabbit is a helpful and honest assistant.\n\nUser: Who are you?\nRabbit:"
	inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
	outputs = model.generate(**inputs, max_new_tokens=60, repetition_penalty=1.3)
	print(tokenizer.decode(outputs[0], skip_special_tokens=True))
	```

	> The `rtaforge` runtime package provides the compiled architecture. Source is not distributed.

	---

	## Why SSM?

	> Transformers scale quadratically with context length because every token attends to every other token. SSMs replace attention with a fixed-size recurrent state: inference cost stays constant per token regardless of context length, VRAM footprint shrinks dramatically, and long-document throughput improves by orders of magnitude — all at the same parameter count.

	---

	## Architecture

	Rabbit is built on RtaSSM v7.2.2-FU "Fortress Unbroken", a custom state-space model developed at RtaForge:

	- No attention mechanism — purely recurrent SSM layers with learned state dynamics
	- 64 layers, 2560 hidden dimensions, 2.7B parameters, bfloat16
	- Constitutional training — Gurukul curriculum with wiki pretraining → instruct SFT → persona imprint
	- Vocabulary 50,280 tokens (GPT-NeoX tokenizer)

	---

	## Training

	\| Stage \| Data \| Notes \|
	\|---\|---\|---\|
	\| Wiki warmup (v0.55) \| Wikipedia (en) \| 700 constitutional proposals via Gurukul — complete \|
	\| Epoch 2 (planned) \| RedPajama \| Gate-only, ~3,350 proposals \|
	\| Instruct SFT (planned) \| ChatML instruction pairs \| `gate_only` trainable strategy \|
	\| Persona imprint (planned) \| Rabbit constitutional corpus \| Identity and value alignment \|

	---

	## Evaluation Access

	Weights are publicly available. Runtime package is live:

	```bash
	pip install rtaforge
	```

	To evaluate Rabbit or discuss deployment:
	📧 guha@rtaforge.in
	🌐 rtaforge.in

	Runtime documentation coming soon.

	---

	## Maturity and Roadmap

	v0.55 is a base pretrained checkpoint — Wikipedia warmup complete, CE ratio 0.993×.
	Usable conversational behaviour is targeted at v0.8–v0.9, currently in training.

	- Evaluating for deployment? Wait for v0.9.
	- Evaluating the architecture or training methodology? v0.55-base is exactly what you need.

	## Limitations

	v0.55 has not been evaluated on standard benchmarks. She is small, she is new, and she is learning. Feedback welcome at guha@rtaforge.in.

	---

	## Citation

	```bibtex
	@misc{anvaya-rabbit-2026,
	title = {Anvaya-Rabbit: A Sovereign SSM Language Model},
	author = {RtaForge},
	year = {2026},
	url = {https://huggingface.co/RtaForge/Anvaya-Rabbit-2.7B}
	}
	```

	---

	Anvaya (अन्वय) — logical connection, coherence. Rabbit — the fast runner.