	---
	license: apache-2.0
	language:
	- en
	tags:
	- robotics
	- instruction-following
	- structured-generation
	- text-to-json
	- ros
	- ros2
	- sparse-transformer
	- embedded-ai
	- on-device
	- temporal-control
	- control-loop
	pipeline_tag: text-generation
	inference: false
	---

	# Foros — Robotics Action Engine

	Foros is an ultra-compact, ~10.3M-parameter instruction-to-JSON model designed
	for low-latency, on-device robotics control. It translates plain-English robot
	commands (including temporal loops, timed sequences, and FSM transitions)
	directly into structured JSON arrays of operations compatible with ROS / ROS2.

	Developed by AMEFORGE — https://huggingface.co/AMFORGE.
	Built on the in-house SparseMind architecture (sparse token attention +
	sparse channel FFN + dynamic neuron typing).

	---

	## Benchmark Results (Measured — Kaggle T4 GPU)

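	Evaluated on a 200-example robotics test suite (natural language → exact ROS JSON). Rows marked "Measured" were run on a Kaggle T4 GPU; the Phi-3-Mini row is a literature estimate, not a measurement on this suite.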

	| Model | JSON Valid (%) | Exact Match (%) | Latency (ms) | Size (MB) | Source |
	|---|---|---|---|---|---|
	| 🚀 Foros (AMEFORGE, ours) | 95.5 | 94.0 | 345.4 | 38.3 | Measured |
	| ❌ TinyLlama-1.1B | 50.0 | 10.0 | 2448.1 | 2098.2 | Measured |
	| ⚠️ Phi-3-Mini-3.8B | 85.0 | 55.0 | 1250.0 | 7600.0 | Literature Estimate |

	Key takeaways:
	- Foros beats TinyLlama-1.1B by 84 percentage points in exact match on robotics commands (94.0% vs 10.0%).
	- Foros responds about 7× faster than TinyLlama (345.4 ms vs 2448.1 ms on the T4).
	- At 38.3 MB, Foros is about 55× smaller than TinyLlama (2098.2 MB), small enough to target Jetson, Raspberry Pi, or other embedded CPUs with no cloud dependency.
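
The evaluation harness is not part of this README, so the following is only a minimal sketch of how the two accuracy columns could be computed. It assumes "JSON Valid" means the output parses with `json.loads` and "Exact Match" means the parsed output equals the parsed reference (which ignores key order and whitespace). The function name `score` and the sample strings are hypothetical.

```python
import json

def score(predictions, references):
    """Return (json_valid_pct, exact_match_pct) for paired output/reference strings."""
    valid = exact = 0
    for pred, ref in zip(predictions, references):
        try:
            parsed = json.loads(pred)
        except json.JSONDecodeError:
            continue  # unparseable output counts against both metrics
        valid += 1
        if parsed == json.loads(ref):
            exact += 1
    n = len(references)
    return 100.0 * valid / n, 100.0 * exact / n

# Three toy predictions: one correct, one valid but wrong, one truncated.
preds = [
    '[{"op":"wait","seconds":3.5}]',
    '[{"op":"wait","seconds":3}]',
    '[{"op":"wait"',
]
refs = ['[{"op":"wait","seconds":3.5}]'] * 3
json_valid_pct, exact_match_pct = score(preds, refs)
```

If the original benchmark compared raw strings instead of parsed values, exact-match scores would be stricter than what this sketch reports.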

	---

	## What it does

	### Atomic Commands
	| Natural Language Input | Structured Output (ROS JSON) |
	|---|---|
	| `move to x=0.5 y=-1.2 z=0.8` | `[{"op":"move","x":0.5,"y":-1.2,"z":0.8}]` |
	| `rotate joints to [0.0, 45.0, 90.0, 0.0, 0.0, 0.0]` | `[{"op":"joint_move","joints":[0.0,45.0,90.0,0.0,0.0,0.0]}]` |
	| `close gripper with force 0.75` | `[{"op":"gripper","action":"close","force":0.75}]` |
	| `wait for 3.5 seconds` | `[{"op":"wait","seconds":3.5}]` |
	| `set velocity to 0.75 m/s` | `[{"op":"speed","velocity":0.75}]` |
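
Before dispatching an output to a robot, a consumer may want to check that each step carries the fields its op needs. The sketch below infers the required keys from the example rows above; it is not an official schema, and `REQUIRED` and `missing_fields` are hypothetical names. Ops outside these five are skipped.

```python
import json

# Required keys per op, inferred from the example rows (an assumption, not a spec).
REQUIRED = {
    "move": {"x", "y", "z"},
    "joint_move": {"joints"},
    "gripper": {"action"},  # "force" looks optional in the examples
    "wait": {"seconds"},
    "speed": {"velocity"},
}

def missing_fields(text):
    """Parse model output and return (index, op, missing_keys) for incomplete steps."""
    problems = []
    for i, step in enumerate(json.loads(text)):
        required = REQUIRED.get(step.get("op"))
        if required is None:
            continue  # op not covered by this sketch
        missing = required - set(step)
        if missing:
            problems.append((i, step["op"], missing))
    return problems

ok = missing_fields('[{"op":"move","x":0.5,"y":-1.2,"z":0.8}]')
bad = missing_fields('[{"op":"gripper","force":0.75},{"op":"wait","seconds":3.5}]')
```

Here `ok` is empty, while `bad` reports that step 0 (`gripper`) lacks `action`.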

	### Temporal / Loop Commands (v2)
	| Natural Language Input | Structured Output |
	|---|---|
	| `repeat 5 times: move arm` | `[{"op":"repeat","times":5,"body":[{"op":"move",...}]}]` |
	| `keep doing move arm until obstacle` | `[{"op":"repeat_until","cond":"obstacle","body":[...]}]` |
	| `run control loop at 100Hz for 2.5 seconds` | `[{"op":"control_loop","frequency_hz":100,"duration_s":2.5,"body":[...]}]` |
	| `every 0.5s do rotate joints for 4 steps` | `[{"op":"timed_seq","interval_s":0.5,"count":4,"body":[...]}]` |
	| `simultaneously move arm and set speed` | `[{"op":"parallel","branches":[[...],[...]]}]` |
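
Because loop ops nest their steps under `body`, a consumer has to walk the structure recursively. As a rough illustration, the sketch below unrolls fixed-count `repeat` ops into a flat list. It leaves `repeat_until`, `control_loop`, `timed_seq`, and `parallel` untouched, since those depend on runtime conditions or timing. The function name `expand_repeats` and the coordinates in the sample program are made up for the example.

```python
def expand_repeats(ops):
    """Unroll fixed-count "repeat" ops into a flat list; other ops pass through unchanged."""
    flat = []
    for step in ops:
        if step.get("op") == "repeat":
            body = expand_repeats(step["body"])  # handle nested repeats first
            for _ in range(step["times"]):
                flat.extend(body)
        else:
            flat.append(step)
    return flat

program = [
    {"op": "repeat", "times": 2, "body": [
        {"op": "move", "x": 0.1, "y": 0.0, "z": 0.2},
        {"op": "wait", "seconds": 0.5},
    ]},
    {"op": "gripper", "action": "open"},
]
flat = expand_repeats(program)
op_names = [step["op"] for step in flat]
```

After unrolling, `op_names` is `move, wait, move, wait, gripper`.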

	### Complex Sequences (Multi-step planning)
	```text
	Input: pick up the red_box at 0.5 0.5 0.0 and place it at -0.5 1.0 0.0 =>
	Output: [
	{"op":"move","x":0.5,"y":0.5,"z":0.0},
	{"op":"gripper","action":"close"},
	{"op":"move","x":0.5,"y":0.5,"z":0.3},
	{"op":"move","x":-0.5,"y":1.0,"z":0.3},
	{"op":"move","x":-0.5,"y":1.0,"z":0.0},
	{"op":"gripper","action":"open"}
	]
	```
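
The example above follows a recognizable pattern: approach, grasp, lift, carry, lower, release. A deterministic builder like the one below can generate reference outputs for such commands, for instance to test exact match. It is a sketch: `pick_and_place` is a hypothetical helper, and treating the 0.3 lift as an offset relative to the pick and place heights is an assumption (the example only shows it with z = 0.0).

```python
import json

def pick_and_place(pick, place, lift=0.3):
    """Build the six-step sequence from the example: grasp, lift, carry, lower, release."""
    px, py, pz = pick
    qx, qy, qz = place
    return [
        {"op": "move", "x": px, "y": py, "z": pz},
        {"op": "gripper", "action": "close"},
        {"op": "move", "x": px, "y": py, "z": pz + lift},  # lift straight up
        {"op": "move", "x": qx, "y": qy, "z": qz + lift},  # carry at lift height
        {"op": "move", "x": qx, "y": qy, "z": qz},         # lower onto target
        {"op": "gripper", "action": "open"},
    ]

steps = pick_and_place((0.5, 0.5, 0.0), (-0.5, 1.0, 0.0))
# Compact separators match the whitespace-free JSON style used in this README.
reference = json.dumps(steps, separators=(",", ":"))
```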

	---

	## Supported Operations (v2)

	| Category | Operations |
	|---|---|
	| Motion | `move`, `joint_move`, `move_tcp`, `move_joint`, `home`, `trajectory` |
	| End Effector | `gripper`, `tool`, `get_joint_values` |
	| Control Flow | `wait`, `safety`, `stop`, `repeat`, `repeat_until` |
	| Temporal | `timed_seq`, `control_loop`, `parallel`, `state_transition` |
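
The table can double as an allow-list on the consumer side. The sketch below walks an output recursively, following the `body` and `branches` keys seen in the temporal examples, and collects any op names outside the list. `SUPPORTED_OPS` and `unsupported_ops` are hypothetical names. `speed` is included because it appears in the atomic examples, even though the table does not list it.

```python
SUPPORTED_OPS = {
    # Motion
    "move", "joint_move", "move_tcp", "move_joint", "home", "trajectory",
    # End effector
    "gripper", "tool", "get_joint_values",
    # Control flow
    "wait", "safety", "stop", "repeat", "repeat_until",
    # Temporal
    "timed_seq", "control_loop", "parallel", "state_transition",
    # Used in the atomic examples, though not listed in the table
    "speed",
}

def unsupported_ops(ops):
    """Recursively collect op names that are not in SUPPORTED_OPS."""
    found = set()
    for step in ops:
        if step.get("op") not in SUPPORTED_OPS:
            found.add(step.get("op"))
        found |= unsupported_ops(step.get("body", []))
        for branch in step.get("branches", []):
            found |= unsupported_ops(branch)
    return found

program = [
    {"op": "parallel", "branches": [
        [{"op": "move", "x": 0.0, "y": 0.0, "z": 0.1}],
        [{"op": "speed", "velocity": 0.5}, {"op": "fly"}],
    ]},
    {"op": "repeat", "times": 2, "body": [{"op": "dance"}]},
]
rejected = unsupported_ops(program)
```

In this toy program the made-up ops `fly` and `dance` are rejected, while everything else passes.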

	---

	## Model Details

	| Property | Value |
	|---|---|
	| Architecture | SparseMind (decoder-only, sparse) |
	| Parameters | 10,347,395 (~10.3M) |
	| Hidden size / Layers | 256 / 6 |
	| Context length | 384 tokens |
	| Tokenizer | SparsForos Tokenizer (SentencePiece-BPE, vocab 3000) |
	| Precision | FP32 |
	| Model Size | 38.3 MB |
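
The 384-token context length limits how long a command can be. Assuming the prompt and the generated tokens share that window, as is usual for decoder-only models (the README does not say how `generate` handles overflow), the prompt budget is the context length minus the number of new tokens requested. The helper names below are hypothetical.

```python
CONTEXT_LENGTH = 384  # from the Model Details table

def prompt_budget(max_new_tokens, context_length=CONTEXT_LENGTH):
    """Largest prompt length (in tokens) that still leaves room for max_new_tokens."""
    if not 0 < max_new_tokens < context_length:
        raise ValueError("max_new_tokens must be between 1 and context_length - 1")
    return context_length - max_new_tokens

def fits(prompt_ids, max_new_tokens):
    """True if the tokenized prompt plus the requested output fits in the window."""
    return len(prompt_ids) <= prompt_budget(max_new_tokens)

budget = prompt_budget(128)
```

With 128 new tokens requested, this leaves 256 tokens for the prompt.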

	---

	## Training Data

	Trained on a hybrid dataset of roughly 46,800 rows: about 96% synthetic commands, supplemented with real robot instructions and manipulation trajectories:

	| Source | Type | Rows |
	|---|---|---|
	| Synthetic (AMEFORGE) | Atomic + temporal ROS/JSON commands | 45,000 |
	| [`milistu/robot-instructions`](https://huggingface.co/datasets/milistu/robot-instructions) | Real function-call robot instructions | ~887 |
	| [`jat-project/jat-dataset`](https://huggingface.co/datasets/jat-project/jat-dataset) | Meta-World manipulation trajectories | ~500 |
	| [`lerobot/pusht`](https://huggingface.co/datasets/lerobot/pusht) | Push-T manipulation action sequences | ~400 |
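
The actual synthetic generator is not published here. Purely as an illustration of what an atomic synthetic row might look like, the sketch below produces prompt/target pairs in the `move` format shown earlier in this README. The function name, coordinate range, and rounding are all assumptions.

```python
import json
import random

def make_move_example(rng):
    """Return (prompt, target) for one synthetic `move` command."""
    # Coordinate range and 2-decimal rounding are arbitrary choices for this sketch.
    x, y, z = (round(rng.uniform(-1.5, 1.5), 2) for _ in range(3))
    prompt = f"move to x={x} y={y} z={z} =>"
    target = json.dumps([{"op": "move", "x": x, "y": y, "z": z}], separators=(",", ":"))
    return prompt, target

rng = random.Random(0)  # seeded for reproducibility
pairs = [make_move_example(rng) for _ in range(3)]
```

Each pair keeps the numbers in the prompt and the target identical, which is what an exact-match metric would require.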

	---

	## Local Inference

	```python
	import os

	import sentencepiece as spm
	import torch
	from huggingface_hub import hf_hub_download

	# Import SparseMind and Config from the training script (must be on your PYTHONPATH)
	from sparsemind_robotics_train import SparseMind, Config

	# Download
	# The tokenizer repository is private, so pass a token with read access.
	# Read it from the environment rather than hard-coding it in the script.
	hf_token = os.environ["HF_TOKEN"]
	model_file = hf_hub_download(repo_id="AMFORGE/foros", filename="foros.pt")
	tok_file = hf_hub_download(repo_id="AMFORGE/foros_tok", filename="sparsforos_tokenizer.model", token=hf_token)

	# Tokenizer
	sp = spm.SentencePieceProcessor()
	sp.Load(tok_file)

	# Model
	# weights_only=False unpickles arbitrary objects; only load checkpoints you trust.
	ckpt = torch.load(model_file, map_location="cpu", weights_only=False)
	cfg = Config(**{k: v for k, v in ckpt["config"].items() if k in Config.__dataclass_fields__})
	model = SparseMind(cfg)
	model.load_state_dict(ckpt["model"])
	model.eval()

	# Inference (top_k=1 keeps only the most likely token at each step)
	prompt = "move to x=0.5 y=-1.2 z=0.8 =>"
	input_ids = torch.tensor([sp.EncodeAsIds(prompt)])
	with torch.no_grad():
	    out_ids = model.generate(input_ids, max_new=128, temp=1.0, top_k=1)
	# Drop the prompt tokens and decode only the newly generated ones
	result = sp.DecodeIds(out_ids[0, input_ids.shape[1]:].tolist())
	print(result)  # expected: [{"op":"move","x":0.5,"y":-1.2,"z":0.8}]
	```
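
With a measured JSON validity of 95.5%, a small fraction of outputs will not parse, so `result` should not be passed to a controller unchecked. The sketch below is one possible guard: it returns the decoded op list, or a fallback when the text is not a JSON array of op objects. `parse_or_stop` is a hypothetical helper, and both the bare `{"op":"stop"}` form and the choice of halting as the safe default are assumptions.

```python
import json

# Assumption: halting is the safe default, and "stop" takes no extra fields.
SAFE_FALLBACK = [{"op": "stop"}]

def parse_or_stop(result):
    """Return the decoded op list, or SAFE_FALLBACK if the text is not a JSON array of ops."""
    try:
        ops = json.loads(result)
    except (json.JSONDecodeError, TypeError):
        return SAFE_FALLBACK
    if not isinstance(ops, list):
        return SAFE_FALLBACK
    if not all(isinstance(step, dict) and "op" in step for step in ops):
        return SAFE_FALLBACK
    return ops

good = parse_or_stop('[{"op":"wait","seconds":3.5}]')
truncated = parse_or_stop('[{"op":"wait"')
not_a_list = parse_or_stop('{"op":"wait","seconds":3.5}')
```

Only `good` passes through unchanged. The truncated output and the bare object both fall back to the stop command.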

	---

	## Citation

	```bibtex
	@misc{foros_robotics,
	title = {Foros: An On-Device Instruction-to-JSON Engine for Robotics},
	author = {AMEFORGE},
	year = {2026},
	note = {Built on the SparseMind architecture. https://huggingface.co/AMFORGE}
	}
	```