evalstate/birch-html/analysis/deep-dives/opus-performance-summary.csv
suite,model,quality,generation_ok,generation_total,duration_s,total_tokens,input_tokens,output_tokens,effective_input_tokens,cache_tokens,cache_pct_input,tokens_per_s,output_tokens_per_s,turns,tool_calls,det_failures,vlm_failures,vlm_warnings,rank_quality_efficiency
publish,opus47,100.0,5,5,872.87,2041367,1980822,60545,228388,1752434,88.47,2338.68,69.36,67,83,0,0,0,2
new-model-day,opus46,98.8,5,5,997.508,1851922,1793023,58899,191646,1601377,89.31,1856.55,59.05,73,98,2,0,0,7
new-model-day,opus48,100.0,5,5,979.386,2354120,2286911,67209,267975,2018936,88.28,2403.67,68.62,71,91,0,0,0,6
new-model-day,opus?task_budget=50000,87.0,4,5,325.213,516007,488908,27099,184283,304625,62.31,1586.67,83.33,28,29,4,1,1,8
new-model-day,opus?task_budget=200000,97.4,5,5,1189.632,3680965,3584777,96188,311493,3273284,91.31,3094.2,80.86,88,105,4,0,2,10
