# Exploration

Ideas, technology research, and ad-hoc investigation notes. This is a scratchpad -- content here is not system-of-record.

**Diataxis type:** Exploration (learning-oriented, not yet distilled)

## What Goes Here

- Technology evaluations and comparisons
- Prototype findings
- External API exploration
- Performance investigations
- Ideation and backlog notes

## What Does NOT Go Here

- Durable learnings (go to `docs/learnings/`)
- Design decisions (go to `docs/design-docs/`)
- Implementation specs (go to `specs/`)
- Operational how-to guides (go to `docs/guides/`)

## Exploration Index

| Topic | Type | Date | Summary |
|-------|------|------|---------|
| [grpo-collapse-analysis.md](grpo-collapse-analysis.md) | Investigation | 2026-04 | Post-mortem on Qwen3-1.7B GRPO collapse into degenerate null-argument tool calls |
| [grpo-plateau-plan.md](grpo-plateau-plan.md) | Investigation | 2026-04 | Interventions to push past 30-40% accuracy plateau in GRPO training |
| [grpo-training-session-log.md](grpo-training-session-log.md) | Investigation | 2026-04 | Running log of SFT warmup + GRPO training sessions on Colab L4 |
| [rl-vs-icl-research.md](rl-vs-icl-research.md) | Comparison | 2026-04 | When GRPO training adds value over pure prompting for small SQL agents |
| [train-grpo-walkthrough.md](train-grpo-walkthrough.md) | Prototype | 2026-04 | Step-by-step companion guide for train_grpo.ipynb |

## Types

- **Tech Eval:** Evaluating a library, framework, or service
- **Prototype:** Findings from exploratory prototyping
- **Investigation:** Deep dive into a specific problem
- **Comparison:** Side-by-side analysis of options

## Graduating Content

When exploration produces durable insights:
1. Extract patterns to `docs/learnings/<category>.md`
2. Create reference files in `docs/references/` for agent context
3. Create how-to guides in `docs/guides/` for operational procedures