EvalEvalBot committed on
Commit c92b925 · verified · 1 Parent(s): 87e6514

Add EvalEval community eval results

Adds EvalEval Community Evals YAML entries with source backlinks to EEE aggregate records.

Contributor: evaleval

.eval_results/gpqa-diamond.yaml CHANGED
@@ -1,6 +1,6 @@
 - dataset:
     id: Idavidrein/gpqa
-    task_id: gpqa_diamond
+    task_id: diamond
   date: '2026-04-17'
   notes: GPQA Diamond
   source:
.eval_results/mmlu-pro.yaml CHANGED
@@ -1,6 +1,6 @@
 - dataset:
     id: TIGER-Lab/MMLU-Pro
-    task_id: mmlu-pro
+    task_id: mmlu_pro
   source:
     name: EvalEval
     url: https://huggingface.co/datasets/evaleval/EEE_datastore/blob/b11a260fe158662bb63b4a144be2b5690615414d/flat/objects/c5/4c/c54c4ee8-ff99-4cda-a81f-a2e3a4347fb8.json
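
For reviewers, the two task_id renames in this commit can be reproduced with a small script. The following is a minimal sketch, not the tooling EvalEvalBot actually uses: the function name fix_task_ids and the EXPECTED_TASK_IDS mapping are hypothetical, and the mapping contains only the two values visible in the hunks above. It works line by line on the YAML text instead of parsing it, so it assumes each task_id line follows the id line of the same dataset entry.

```python
import re

# Hypothetical mapping built from the two hunks in this commit:
# dataset id -> task_id value after the change.
EXPECTED_TASK_IDS = {
    "Idavidrein/gpqa": "diamond",
    "TIGER-Lab/MMLU-Pro": "mmlu_pro",
}

# Match "id: <value>" and "task_id: <value>" lines, keeping the indentation.
ID_RE = re.compile(r"^(\s*)id:\s*(\S+)\s*$")
TASK_RE = re.compile(r"^(\s*)task_id:\s*(\S+)\s*$")


def fix_task_ids(yaml_text, expected=EXPECTED_TASK_IDS):
    """Rewrite each task_id line to the expected value for the preceding dataset id."""
    current_id = None
    out = []
    for line in yaml_text.splitlines():
        id_match = ID_RE.match(line)
        task_match = TASK_RE.match(line)
        if id_match:
            # Remember which dataset the following task_id belongs to.
            current_id = id_match.group(2)
        elif task_match and current_id in expected:
            # Keep the original indentation, replace only the value.
            line = f"{task_match.group(1)}task_id: {expected[current_id]}"
        out.append(line)
    # Preserve a trailing newline if the input had one.
    return "\n".join(out) + ("\n" if yaml_text.endswith("\n") else "")
```

Entries whose dataset id is not in the mapping are left untouched, so running the sketch over other files in .eval_results/ should not change them.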