OpenRA-Bench / app.py

Commit History

Training-parity minimap (real terrain + legend) + viewer (system/thinking/debrief)

39fba02

yxc20098 committed 2 days ago

Adversarial 1v1 spotlight: ladder family + rating + Elo wiring

f5e23f8

yxc20098 committed 3 days ago

Unified Battle Viewer in app.py + run/model playback identity

0a488d3

yxc20098 committed 3 days ago

Generalization-gap metric: held-out split in run_eval + leaderboard

03e4efa

yxc20098 committed 3 days ago

#6 leaderboard: data layer + run_eval publish + Gradio tab

b98ab1a

yxc20098 committed 3 days ago

Add HF identity verification and anonymous submission support

6f326d5

yxc20098 committed on Mar 2

Add server-side game aggregation with minimum 5-game threshold

e642d5b

yxc20098 committed on Mar 2

Security hardening: XSS prevention, input validation, rate limiting

9422de7

yxc20098 committed on Feb 26

Add agent URL hyperlinks, replay downloads, and submit_with_replay endpoint

9fead46

yxc20098 committed on Feb 26

Add upload form, API endpoint, 5 difficulty tiers, real game data

45ef63c

yxc20098 committed on Feb 26

Move Try experience to OpenRA-RL Space, remove Try tab from Bench

8ce66d2

yxc20098 committed on Feb 24

Rename Evaluate tab to Try, stream LLM agent gameplay

2771ddd

yxc20098 committed on Feb 24

Add in-browser evaluation via Evaluate tab

824262a

yxc20098 committed on Feb 24

Connect evaluation harness to HF-hosted OpenRA-RL environment

44493a3

yxc20098 committed on Feb 24

Move HF Space sync to openra-rl org

5962a76

yxc20098 committed on Feb 24

Add OpenRA-Bench leaderboard, evaluation harness, and rubrics

f96ea53

yxc20098 committed on Feb 19
