Commit History
add complete humaneval output for gpt-4o 45710d9
add complete mmlu output for gpt-4o 0948b4d
add complete math output for gpt-4o 7d377c3
add viz tab for mint 38a40d1
add some outputs da7aaba
update results fe6c7e5
Xingyao Wang committed
plot success rate with cost when available 743d952
Xingyao Wang committed
add results for deepseek chat v2 126490f
Xingyao Wang committed
add codeact swe agent 9b33edf
Xingyao Wang committed
update gitignore 1c3a57d
Xingyao Wang committed
add gpt4o result for 1.5 5dbfa12
Xingyao Wang committed
move data to swe_bench_lite 23df10d
Xingyao Wang committed
Merge commit 'f6d9f43457bdadd36685181efda2fd45e813a02c' d61638c
Xingyao Wang committed
visualize swe-bench-lite & fix stuck in loop 4deac19
Xingyao Wang committed
add cost info when exists f6d9f43
Xingyao Wang committed
show errors 565afe1
Xingyao Wang committed
rename dir 0d2d477
Xingyao Wang committed
add result for deepseek f07fb3e
Xingyao Wang committed
fix visualizer for json 260700f
Xingyao Wang committed
fix glob 3c245bf
Xingyao Wang committed
update visualizer on multi-page 1412295
Xingyao Wang committed
add results for gpt-4o 72c2e93
Xingyao Wang committed
change to only load merged 3bf3aaa
Xingyao Wang committed
update results cd893a5
Xingyao Wang committed
Update README.md f995976 verified
add absolute number of solved 886e465
Xingyao Wang committed
update float c6f2aaa
Xingyao Wang committed
change to pct 5864960
Xingyao Wang committed
add benchmark code edcb2c1
Xingyao Wang committed
support multi-page 4e9c2f0
Xingyao Wang committed
also show metadata for exp results 5f8e68b
Xingyao Wang committed
update gitignore a6f521f
Xingyao Wang committed
update app 87b70a8
Xingyao Wang committed
support the visualization of refactored arch 525d2f3
Xingyao Wang committed
update gitignore 4bbc5ff
Xingyao Wang committed
remove all logs 3f290ce
Xingyao Wang committed
initial results 2e05a39
Xingyao Wang committed