Stack-2-9-finetuned / evaluation

Commit History

feat: add evaluation datasets (HumanEval 50, MBPP 100, Tool scenarios 50)
20a06fb

walidsobhie-code committed on