Image-Text-to-Text
PEFT
Safetensors
laboratory
protocol-conditioned-action-prediction
lora
qwen
long-horizon-planning
conversational
Instructions to use Stanford-CongLab/LabHorizon-Model with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- PEFT
How to use Stanford-CongLab/LabHorizon-Model with PEFT:
from peft import PeftModel
from transformers import AutoModelForCausalLM
base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3.6-35B-A3B")
model = PeftModel.from_pretrained(base_model, "Stanford-CongLab/LabHorizon-Model")
- Notebooks
- Google Colab
- Kaggle
Remove standalone trained row from model card
README.md CHANGED
@@ -192,7 +192,7 @@ Main training settings:
 
 ## 🧠 Training Result
 
-The table compares direct-prompting SOTA/baseline systems, the base Qwen model,
+The table compares direct-prompting SOTA/baseline systems, the base Qwen model, and the trained+agents system evaluated on the same LabHorizon test splits.
 
 | System | Level 1 Next Action Accuracy | Level 2 Action Sequence Similarity | Level 2 Parameter Accuracy | Level 2 Final Score |
 |:---|---:|---:|---:|---:|
@@ -201,7 +201,6 @@ The table compares direct-prompting SOTA/baseline systems, the base Qwen model,
 | GPT-5.5 | 0.535 | 0.2092 | 0.2459 | 0.2276 |
 | Kimi K2.6 | 0.550 | 0.2845 | 0.3456 | 0.3150 |
 | Qwen3.6-35B-A3B | 0.475 | 0.2585 | 0.2483 | 0.2534 |
-| Qwen3.6-35B-A3B(trained) | 0.635 | 0.4030 | 0.4170 | 0.4100 |
 | Qwen3.6-35B-A3B(trained+agents) | **0.665** | **0.4485** | **0.4580** | **0.4532** |
 
 Agent setting: `Qwen3.6-35B-A3B(trained)` is used as Actor, and Gemini 3.1 Pro is used as Simulator/Selector. The Simulator/Selector choice is the current setting and has not been exhaustively ablated.
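The Level 2 columns in the table above appear to be related: in every row, the reported Final Score matches the arithmetic mean of Action Sequence Similarity and Parameter Accuracy to within rounding. The model card text shown here does not state this formula, so treat it as an inference from the numbers, not a documented definition. The sketch below checks it; the names `LEVEL2_SCORES`, `mean_of_components`, and `max_deviation` are hypothetical helpers, and the values are copied from the table, including the `(trained)` row that this commit removes.

```python
# Scores copied from the results table in the diff.
# Tuple layout: (Level 2 Action Sequence Similarity,
#                Level 2 Parameter Accuracy,
#                Level 2 Final Score as reported)
LEVEL2_SCORES = {
    "GPT-5.5": (0.2092, 0.2459, 0.2276),
    "Kimi K2.6": (0.2845, 0.3456, 0.3150),
    "Qwen3.6-35B-A3B": (0.2585, 0.2483, 0.2534),
    "Qwen3.6-35B-A3B(trained)": (0.4030, 0.4170, 0.4100),
    "Qwen3.6-35B-A3B(trained+agents)": (0.4485, 0.4580, 0.4532),
}


def mean_of_components(similarity, parameter_accuracy):
    """Assumed formula (not confirmed by the source): plain average."""
    return (similarity + parameter_accuracy) / 2


def max_deviation(scores=LEVEL2_SCORES):
    """Largest absolute gap between the assumed mean and the reported score."""
    return max(
        abs(mean_of_components(sim, param) - final)
        for sim, param, final in scores.values()
    )


if __name__ == "__main__":
    for system, (sim, param, final) in LEVEL2_SCORES.items():
        recomputed = mean_of_components(sim, param)
        print(f"{system}: reported={final:.4f} recomputed={recomputed:.5f}")
    print(f"max deviation: {max_deviation():.6f}")
```

If the largest deviation stays below the table's four-decimal rounding step, the averaging assumption is consistent with the reported results. This does not rule out a different definition that happens to produce the same values.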