metadata
license: apache-2.0
language:
- en
tags:
- robotics
- instruction-following
- structured-generation
- text-to-json
- ros
- ros2
- sparse-transformer
- embedded-ai
- on-device
- temporal-control
- control-loop
pipeline_tag: text-generation
inference: false
Foros: Robotics Action Engine
Foros is an ultra-compact 10M-parameter instruction-to-JSON model designed for low-latency, on-device robotics control. It translates plain-English robot commands, including temporal loops, timed sequences, and finite-state machine (FSM) transitions, directly into structured JSON arrays of operations compatible with ROS / ROS2.
Developed by AMEFORGE (https://huggingface.co/AMFORGE).
Built on the in-house SparseMind architecture (sparse token attention +
sparse channel FFN + dynamic neuron typing).
Benchmark Results (Measured on a Kaggle T4 GPU)
Evaluated on a 200-example robotics test suite (natural language → exact ROS JSON).
| Model | JSON Valid (%) | Exact Match (%) | Latency (ms) | Size (MB) | Source |
|---|---|---|---|---|---|
| Foros (AMEFORGE, ours) | 95.5% | 94.0% | 345.4 | 38.3 | Measured |
| TinyLlama-1.1B | 50.0% | 10.0% | 2448.1 | 2098.2 | Measured |
| Phi-3-Mini-3.8B | 85.0% | 55.0% | 1250.0 | 7600.0 | Literature estimate |
Key takeaways:
- Foros outperforms TinyLlama-1.1B by 84 percentage points in exact match on robotics commands (94.0% vs. 10.0%).
- Foros responds about 7× faster than TinyLlama (345.4 ms vs. 2448.1 ms).
- At 38.3 MB, Foros is about 55× smaller than TinyLlama and runs on Jetson, Raspberry Pi, or any embedded CPU with no cloud dependency.
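
The ratios in the takeaways can be re-derived from the benchmark table. The short check below uses only the Foros and TinyLlama figures reported there; the variable names are illustrative.

```python
# Figures copied from the benchmark table (Foros and TinyLlama rows).
foros = {"exact_match": 94.0, "latency_ms": 345.4, "size_mb": 38.3}
tinyllama = {"exact_match": 10.0, "latency_ms": 2448.1, "size_mb": 2098.2}

# Difference in exact match, in percentage points.
exact_match_gain = foros["exact_match"] - tinyllama["exact_match"]
# How many times lower the Foros latency and size are.
speedup = tinyllama["latency_ms"] / foros["latency_ms"]
size_ratio = tinyllama["size_mb"] / foros["size_mb"]

print(f"exact-match gain: {exact_match_gain:+.1f} points")  # +84.0 points
print(f"latency ratio:    {speedup:.1f}x")                  # 7.1x
print(f"size ratio:       {size_ratio:.1f}x")               # 54.8x
```
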
What it does
Atomic Commands
| Natural Language Input | Structured Output (ROS JSON) |
|---|---|
| `move to x=0.5 y=-1.2 z=0.8` | `[{"op":"move","x":0.5,"y":-1.2,"z":0.8}]` |
| `rotate joints to [0.0, 45.0, 90.0, 0.0, 0.0, 0.0]` | `[{"op":"joint_move","joints":[0.0,45.0,90.0,0.0,0.0,0.0]}]` |
| `close gripper with force 0.75` | `[{"op":"gripper","action":"close","force":0.75}]` |
| `wait for 3.5 seconds` | `[{"op":"wait","seconds":3.5}]` |
| `set velocity to 0.75 m/s` | `[{"op":"speed","velocity":0.75}]` |
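
Since roughly 1 in 20 outputs in the benchmark is not an exact match, it can help to check each op before it reaches the robot. The sketch below is a minimal validator for the atomic ops in the table. The required-field map is read off the example rows and is an assumption, not an official schema; `force` is treated as optional because the pick-and-place example omits it. `validate_atomic` and `REQUIRED_FIELDS` are hypothetical names.

```python
import json

# Required fields per op, inferred from the example rows (assumed, not an official schema).
REQUIRED_FIELDS = {
    "move": {"x", "y", "z"},
    "joint_move": {"joints"},
    "gripper": {"action"},
    "wait": {"seconds"},
    "speed": {"velocity"},
}

def validate_atomic(raw: str) -> list:
    """Return a list of problems found in a model output string (empty if none)."""
    try:
        ops = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(ops, list):
        return ["top-level value is not a JSON array"]
    problems = []
    for i, op in enumerate(ops):
        if not isinstance(op, dict):
            problems.append(f"op {i}: not an object")
            continue
        name = op.get("op")
        if not isinstance(name, str) or name not in REQUIRED_FIELDS:
            problems.append(f"op {i}: unknown op {name!r}")
            continue
        missing = REQUIRED_FIELDS[name] - set(op)
        if missing:
            problems.append(f"op {i}: missing {sorted(missing)}")
    return problems

print(validate_atomic('[{"op":"move","x":0.5,"y":-1.2,"z":0.8}]'))  # []
print(validate_atomic('[{"op":"wait"}]'))  # ["op 0: missing ['seconds']"]
```
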
Temporal / Loop Commands (v2)
| Natural Language Input | Structured Output |
|---|---|
| `repeat 5 times: move arm` | `[{"op":"repeat","times":5,"body":[{"op":"move",...}]}]` |
| `keep doing move arm until obstacle` | `[{"op":"repeat_until","cond":"obstacle","body":[...]}]` |
| `run control loop at 100Hz for 2.5 seconds` | `[{"op":"control_loop","frequency_hz":100,"duration_s":2.5,"body":[...]}]` |
| `every 0.5s do rotate joints for 4 steps` | `[{"op":"timed_seq","interval_s":0.5,"count":4,"body":[...]}]` |
| `simultaneously move arm and set speed` | `[{"op":"parallel","branches":[[...],[...]]}]` |
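
Temporal ops nest other ops under `body`, so an executor has to walk them recursively. As a rough illustration, the sketch below unrolls `repeat` and `timed_seq` into a flat list. The semantics are assumptions, since the model card does not define them: `repeat` runs its body `times` times, and `timed_seq` runs its body `count` times with a `wait` of `interval_s` between runs. Ops that depend on runtime state (`repeat_until`, `control_loop`, `parallel`) are passed through unchanged. `flatten` is a hypothetical helper.

```python
def flatten(ops):
    """Unroll `repeat` and `timed_seq` into a flat list of ops (assumed semantics)."""
    flat = []
    for op in ops:
        kind = op["op"]
        if kind == "repeat":
            # Run the body `times` times in a row.
            for _ in range(op["times"]):
                flat.extend(flatten(op["body"]))
        elif kind == "timed_seq":
            # Run the body `count` times, waiting `interval_s` between runs.
            for i in range(op["count"]):
                if i > 0:
                    flat.append({"op": "wait", "seconds": op["interval_s"]})
                flat.extend(flatten(op["body"]))
        else:
            # Atomic ops and runtime-dependent ops are kept as-is.
            flat.append(op)
    return flat

plan = [
    {"op": "repeat", "times": 2,
     "body": [{"op": "gripper", "action": "close"}]},
    {"op": "timed_seq", "interval_s": 0.5, "count": 2,
     "body": [{"op": "gripper", "action": "open"}]},
]
for step in flatten(plan):
    print(step)
```

The example prints five steps: two `close` ops, then `open`, a 0.5 s `wait`, and `open` again.
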
Complex Sequences (Multi-step planning)
Input: pick up the red_box at 0.5 0.5 0.0 and place it at -0.5 1.0 0.0 =>
Output: [
{"op":"move","x":0.5,"y":0.5,"z":0.0},
{"op":"gripper","action":"close"},
{"op":"move","x":0.5,"y":0.5,"z":0.3},
{"op":"move","x":-0.5,"y":1.0,"z":0.3},
{"op":"move","x":-0.5,"y":1.0,"z":0.0},
{"op":"gripper","action":"open"}
]
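
The example follows a fixed six-step pattern: approach, grasp, lift, transfer, lower, release. The sketch below rebuilds that pattern by hand, which can be useful for generating reference outputs when testing the model. It does not describe how Foros produces the plan. The `clearance` of 0.3 is read off the example, and treating it as an offset added to `z` (rather than an absolute height) is an assumption. `pick_and_place` is a hypothetical helper.

```python
def pick_and_place(pick, place, clearance=0.3):
    """Build the six-step pick-and-place sequence from the example.

    `pick` and `place` are (x, y, z) tuples. `clearance` is added to z for the
    lift and transfer moves (assumed to be a relative offset).
    """
    px, py, pz = pick
    qx, qy, qz = place
    return [
        {"op": "move", "x": px, "y": py, "z": pz},              # approach the object
        {"op": "gripper", "action": "close"},                   # grasp
        {"op": "move", "x": px, "y": py, "z": pz + clearance},  # lift
        {"op": "move", "x": qx, "y": qy, "z": qz + clearance},  # transfer
        {"op": "move", "x": qx, "y": qy, "z": qz},              # lower
        {"op": "gripper", "action": "open"},                    # release
    ]

reference = pick_and_place((0.5, 0.5, 0.0), (-0.5, 1.0, 0.0))
for step in reference:
    print(step)
```
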
Supported Operations (v2)
| Category | Operations |
|---|---|
| Motion | move, joint_move, move_tcp, move_joint, home, trajectory |
| End Effector | gripper, tool, get_joint_values |
| Control Flow | wait, safety, stop, repeat, repeat_until |
| Temporal | timed_seq, control_loop, parallel, state_transition |
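
The operation table can double as an allow-list when screening model output. The sketch below collects any op name in a possibly nested plan that is not listed. It assumes nested ops live under `body` (loops) and `branches` (`parallel`), as in the temporal examples. Note that `speed` appears in the atomic command examples but not in this table, so it is added here explicitly. `unsupported_ops` is a hypothetical helper.

```python
SUPPORTED_OPS = {
    # Motion
    "move", "joint_move", "move_tcp", "move_joint", "home", "trajectory",
    # End effector
    "gripper", "tool", "get_joint_values",
    # Control flow
    "wait", "safety", "stop", "repeat", "repeat_until",
    # Temporal
    "timed_seq", "control_loop", "parallel", "state_transition",
    # Seen in the atomic command examples, but not listed in the table.
    "speed",
}

def unsupported_ops(ops):
    """Return the set of op names in a (possibly nested) plan that are not allowed."""
    unknown = set()
    for op in ops:
        name = op.get("op")
        if name not in SUPPORTED_OPS:
            unknown.add(name)
        # Recurse into loop bodies and parallel branches.
        unknown |= unsupported_ops(op.get("body", []))
        for branch in op.get("branches", []):
            unknown |= unsupported_ops(branch)
    return unknown

plan = [{"op": "repeat", "times": 2,
         "body": [{"op": "move", "x": 0.0, "y": 0.0, "z": 0.1}, {"op": "fly"}]}]
print(unsupported_ops(plan))  # {'fly'}
```
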
Model Details
| Property | Value |
|---|---|
| Architecture | SparseMind (decoder-only, sparse) |
| Parameters | 10,347,395 (~10.3M) |
| Hidden size / Layers | 256 / 6 |
| Context length | 384 tokens |
| Tokenizer | SparsForos Tokenizer (SentencePiece-BPE, vocab 3000) |
| Precision | FP32 |
| Model Size | 38.3 MB |
Training Data
Trained on a rich hybrid dataset combining real execution data and synthetic temporal patterns:
| Source | Type | Rows |
|---|---|---|
| Synthetic (AMEFORGE) | Atomic + temporal ROS/JSON commands | 45,000 |
| `milistu/robot-instructions` | Real function-call robot instructions | ~887 |
| `jat-project/jat-dataset` | Meta-World manipulation trajectories | ~500 |
| `lerobot/pusht` | Push-T manipulation action sequences | ~400 |
Local Inference
import torch, sentencepiece as spm
from huggingface_hub import hf_hub_download
# Download the checkpoint and the tokenizer
model_file = hf_hub_download(repo_id="AMFORGE/foros", filename="foros.pt")
# The tokenizer repository is private, so pass your Hugging Face access token
tok_file = hf_hub_download(repo_id="AMFORGE/foros_tok", filename="sparsforos_tokenizer.model", token="YOUR_HF_TOKEN")
# Tokenizer
sp = spm.SentencePieceProcessor()
sp.Load(tok_file)
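# Model (import SparseMind and Config from the training script, which must be on your Python path)
from sparsemind_robotics_train import SparseMind, Config
# weights_only=False unpickles arbitrary Python objects, so only load checkpoints you trust
ckpt = torch.load(model_file, map_location="cpu", weights_only=False)
# Keep only the checkpoint config keys that Config actually defines
cfg = Config(**{k: v for k, v in ckpt["config"].items() if k in Config.__dataclass_fields__})
model = SparseMind(cfg)
model.load_state_dict(ckpt["model"])
model.eval()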
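# Inference (the prompt ends with "=>", as in the examples above)
prompt = "move to x=0.5 y=-1.2 z=0.8 =>"
input_ids = torch.tensor([sp.EncodeAsIds(prompt)])
# No gradients are needed at inference time
with torch.no_grad():
    out_ids = model.generate(input_ids, max_new=128, temp=1.0, top_k=1)
# Decode only the newly generated tokens, skipping the prompt
result = sp.DecodeIds(out_ids[0, input_ids.shape[1]:].tolist())
print(result) # [{"op":"move","x":0.5,"y":-1.2,"z":0.8}]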
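
The benchmark reports 95.5% valid JSON, so a caller should expect the occasional output that does not parse. The sketch below is one way to turn the decoded `result` string into a Python list. It assumes generation may continue past the closing bracket, and it uses `json.JSONDecoder.raw_decode` from the standard library to parse the first JSON array and ignore anything after it. `extract_plan` is a hypothetical helper, not part of the Foros code.

```python
import json

def extract_plan(text: str):
    """Parse the first JSON array in `text`, ignoring anything after it.

    Returns the decoded list, or None if no valid JSON array is found.
    """
    start = text.find("[")
    if start == -1:
        return None
    try:
        # raw_decode stops at the end of the first JSON value and tolerates trailing text.
        plan, _end = json.JSONDecoder().raw_decode(text, start)
    except json.JSONDecodeError:
        return None
    return plan if isinstance(plan, list) else None

print(extract_plan('[{"op":"wait","seconds":3.5}] extra tokens'))
print(extract_plan('[{"op":"wait"'))  # None: truncated output
```

What to do when `extract_plan` returns `None` (skip the command, ask again, or stop the robot) is an application decision and is left open here.
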
Citation
@misc{foros_robotics,
title = {Foros: An On-Device Instruction-to-JSON Engine for Robotics},
author = {AMEFORGE},
year = {2026},
note = {Built on the SparseMind architecture. https://huggingface.co/AMFORGE}
}