Benchmarks of reasoning levels?

#6
by coder543 - opened

Are there any benchmarks of the reasoning levels that are mentioned in the Step-3.7-Flash-GGUF README? It would be nice to see how token usage varies across the reasoning levels for a set of standardized benchmarks, and how it affects the scores in those benchmarks.

coder543 changed discussion title from Benchmarks of reasoning levels to Benchmarks of reasoning levels?

Well, the mention of reasoning levels was removed from the README without any explanation: https://huggingface.co/stepfun-ai/Step-3.7-Flash-GGUF/commit/348385622260399c185368a98586f649a75bf57e

Does this model support different reasoning levels? That would be extremely nice.

coder543 changed discussion status to closed
coder543 changed discussion status to open
