Different Score on BrowseComp for Claude Opus-4.5

#56
by Shaobo1103 - opened

https://huggingface.co/MiniMaxAI/MiniMax-M2.1
In M2.1's report, the BrowseComp score for Opus-4.5 with context management is 57.8.
But this report lists it as 67.8. Why did it increase?
