---
title: MiniGridEnv Blog
emoji: 🐠
colorFrom: green
colorTo: pink
sdk: static
pinned: false
license: apache-2.0
short_description: Blog for MiniGridEnv for OpenEnv Comp in AgentX
---
# MiniGridEnv Blog
Static blog post for the OpenEnv track of the AgentX competition (UC Berkeley RDI), covering:
- An OpenEnv-native wrapper around Farama's MiniGrid / BabyAI with text observations and natural-language actions.
- GRPO post-training (`MiniGridPT`) with **cross-episodic, LLM-rewritten, line-budgeted markdown memory**.
- **Branch-stable** memory-file naming so each GRPO chain keeps a stable file across optimizer steps.
## Files
- `index.html`: main blog (self-contained: inline CSS, Mermaid via CDN).
- `banner.png`: 3-panel hero image (Observe → Act → Remember).
- `style.css`: legacy placeholder from the Spaces scaffold; `index.html` inlines all styling.
## Rebuild the banner
The banner is generated by a matplotlib script kept alongside the other implementation docs:
```bash
# from the repo root
python impl-context/build_blog_images.py
# writes MiniGridEnv_Blog/banner.png at 200 DPI
```
Dependencies: `pip install matplotlib numpy`.
## Open locally
```bash
# macOS; on Linux use xdg-open instead of open
open MiniGridEnv_Blog/index.html
# or serve it over HTTP: python -m http.server --directory MiniGridEnv_Blog 8080
```
## `<INSERT>` placeholders
The blog ships with a handful of `<INSERT: ...>` placeholders that must be filled before publishing:
- `<INSERT: GitHub URL>`: repo URL (hero badges, buttons, quickstart `git clone`, footer).
- `<INSERT: HF Space URL>`: live environment Space (topnav, hero buttons, footer).
- `<INSERT: Voyager arXiv URL>` / `<INSERT: Reflexion arXiv URL>` / `<INSERT: Generative Agents arXiv URL>`: arXiv links in the Foundations table (pre-filled paper IDs are in the surrounding text: `2305.16291`, `2303.11366`, `2304.03442`).
- `<INSERT: Lottery HF Space URL>`: sibling project Space in the Foundations table.
- `<INSERT>` cells in the Results table: measured completion rates for GRPO and GRPO+Memory per level once converged checkpoints are available.
- `<INSERT: verbatim memory snapshot per checkpoint>`: optional. Replace the illustrative memory-evolution cards with verbatim snapshots after a memory-mode training run.
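A quick pre-publish check can list any placeholders that are still unfilled. The script below is a minimal sketch and not part of the repo: the function name `find_placeholders` is hypothetical, and it assumes placeholders appear in `index.html` either literally (`<INSERT: ...>`) or HTML-escaped (`&lt;INSERT: ...&gt;`).

```python
import re
from pathlib import Path

# Matches `<INSERT>` and `<INSERT: description>`, in literal or
# HTML-escaped form. Assumption: a placeholder never spans lines.
PLACEHOLDER_RE = re.compile(r"(?:<|&lt;)INSERT(?::.*?)?(?:>|&gt;)")


def find_placeholders(text: str) -> list[tuple[int, str]]:
    """Return (line_number, placeholder) pairs for each unfilled placeholder."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in PLACEHOLDER_RE.finditer(line):
            hits.append((lineno, match.group(0)))
    return hits


if __name__ == "__main__":
    # Path taken from the "Open locally" section; adjust if your layout differs.
    page = Path("MiniGridEnv_Blog/index.html")
    if page.is_file():
        for lineno, placeholder in find_placeholders(page.read_text(encoding="utf-8")):
            print(f"{page}:{lineno}: {placeholder}")
    else:
        print(f"{page} not found; run this from the repo root")
```

Run it from the repo root; no output means no placeholders remain in the page.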
See the Spaces configuration reference at https://huggingface.co/docs/hub/spaces-config-reference.