Add SWE-bench Verified evaluation result (74.0%)

#35

by SaylorTwift HF Staff - opened 2 days ago

base: refs/heads/main ← from: refs/pr/35

YAML Metadata Error: Invalid content in Eval Result file .eval_results/swe_bench_verified_with_tools.yaml

Check out the documentation for more information.

Task ID "swe_bench_verified" does not match any task in dataset "SWE-bench/SWE-bench_Verified". Available: swe_bench_%_resolved, swe_bench_average_cost
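
The check behind this error can be sketched roughly as follows. This is a minimal illustration, not the Hub's actual validation code: the helper name `check_task_id` and the dictionary layout are hypothetical, and the task IDs are copied from the error message above. Since 74.0 is a resolved percentage, `swe_bench_%_resolved` is probably the intended task ID, but that is an assumption the PR author would need to confirm.

```python
# Hypothetical helper; the Hub's real validator may work differently.
def check_task_id(task_id, available):
    """Return None if task_id is known, otherwise an error message."""
    if task_id in available:
        return None
    return (
        f'Task ID "{task_id}" does not match any task. '
        f'Available: {", ".join(available)}'
    )


# Task IDs listed in the error message for SWE-bench/SWE-bench_Verified.
available_tasks = ["swe_bench_%_resolved", "swe_bench_average_cost"]

# The entry as submitted in this PR (only the relevant fields).
entry = {
    "dataset": {
        "id": "SWE-bench/SWE-bench_Verified",
        "task_id": "swe_bench_verified",
    },
    "value": 74.0,
}

error = check_task_id(entry["dataset"]["task_id"], available_tasks)
print(error)  # a non-empty message, since the submitted ID is not listed

# Assumed fix: use one of the listed IDs instead.
fixed = check_task_id("swe_bench_%_resolved", available_tasks)
print(fixed)  # None, meaning the ID is accepted by this sketch
```

Under that assumption, changing `task_id` in the YAML file to one of the listed IDs should clear the metadata error.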

Files changed (1)

.eval_results/swe_bench_verified_with_tools.yaml +10 -0

@@ -0,0 +1,10 @@

+- dataset:
+    id: SWE-bench/SWE-bench_Verified
+    task_id: swe_bench_verified
+  value: 74.0
+  date: '2026-03-17'
+  source:
+    url: https://huggingface.co/MiniMaxAI/MiniMax-M2.1
+    name: Model Card
+    user: burtenshaw
+  notes: "With tools"