black-yt committed on
Commit b031d6a · 1 Parent(s): c03a16b

Remove standalone trained row from model card

  1. README.md +1 -2
README.md CHANGED
@@ -192,7 +192,7 @@ Main training settings:
 
 ## 🧠 Training Result
 
-The table compares direct-prompting SOTA/baseline systems, the base Qwen model, this trained LoRA adapter, and the trained+agents system evaluated on the same LabHorizon test splits.
+The table compares direct-prompting SOTA/baseline systems, the base Qwen model, and the trained+agents system evaluated on the same LabHorizon test splits.
 
 | System | Level 1 Next Action Accuracy | Level 2 Action Sequence Similarity | Level 2 Parameter Accuracy | Level 2 Final Score |
 |:---|---:|---:|---:|---:|
@@ -201,7 +201,6 @@ The table compares direct-prompting SOTA/baseline systems, the base Qwen model,
 | GPT-5.5 | 0.535 | 0.2092 | 0.2459 | 0.2276 |
 | Kimi K2.6 | 0.550 | 0.2845 | 0.3456 | 0.3150 |
 | Qwen3.6-35B-A3B | 0.475 | 0.2585 | 0.2483 | 0.2534 |
-| Qwen3.6-35B-A3B(trained) | 0.635 | 0.4030 | 0.4170 | 0.4100 |
 | Qwen3.6-35B-A3B(trained+agents) | **0.665** | **0.4485** | **0.4580** | **0.4532** |
 
 Agent setting: `Qwen3.6-35B-A3B(trained)` is used as Actor, and Gemini 3.1 Pro is used as Simulator/Selector. The Simulator/Selector choice is the current setting and has not been exhaustively ablated.
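
Review note: in the rows visible in this diff, the Level 2 Final Score looks consistent with the arithmetic mean of Level 2 Action Sequence Similarity and Level 2 Parameter Accuracy. That relationship is an inference from the numbers, not something stated in these hunks, so the sketch below only checks it under that assumption. The names `ROWS`, `mean_gap`, and `max_gap` are hypothetical helpers introduced for illustration.

```python
# Rows copied from the table shown in the diff:
# (sequence similarity, parameter accuracy, reported final score).
# The "(trained)" row is the one this commit removes; it is kept here
# only to check the assumed relationship on every visible row.
ROWS = {
    "GPT-5.5": (0.2092, 0.2459, 0.2276),
    "Kimi K2.6": (0.2845, 0.3456, 0.3150),
    "Qwen3.6-35B-A3B": (0.2585, 0.2483, 0.2534),
    "Qwen3.6-35B-A3B(trained)": (0.4030, 0.4170, 0.4100),
    "Qwen3.6-35B-A3B(trained+agents)": (0.4485, 0.4580, 0.4532),
}


def mean_gap(seq_sim, param_acc, reported):
    """Absolute gap between the reported score and the assumed mean."""
    return abs((seq_sim + param_acc) / 2 - reported)


def max_gap(rows=ROWS):
    """Largest gap across all rows; small values support the assumption."""
    return max(mean_gap(*values) for values in rows.values())


if __name__ == "__main__":
    for name, values in ROWS.items():
        print(f"{name}: gap={mean_gap(*values):.5f}")
```

If every gap stays within four-decimal rounding, the averaging assumption holds for the visible rows; it says nothing about the rows elided between the two hunks.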