# SQLEnv Team Coordination
## Team
| Person | Role | Contact |
|--------|------|---------|
| You (Hjerp) | Coordinator | - |
| Kevlar | Contributor | - |
| Jindal | Contributor | - |
---
## Kickoff Meeting - Feb 9
### 1. Personal Wins
> "What would make this project a win for you personally?"
| Person | Personal Win |
|--------|--------------|
| You | |
| Kevlar | |
| Jindal | |
### 2. Availability
| Person | Hours/Week | Best Times | Known Conflicts |
|--------|------------|------------|-----------------|
| You | | | |
| Kevlar | | | |
| Jindal | | | |
**Total available:** _____ hours/week
**Estimated need:** ~40-60 hours total over 16 days (~15-20 hrs/week combined)
**Gap?** If yes, scope discussion needed.
### 3. Ownership Areas
The project brief suggests five natural ownership areas. Let people **claim** areas; don't assign them.
| Area | Owner | Notes |
|------|-------|-------|
| Environment Engineering | | OpenEnv integration, WebSocket, Docker, action handlers |
| Reward Design | | 3-layer rewards, progress metrics, anti-gaming |
| Dataset Curation | | Spider questions, answer verification, difficulty balance |
| Training Pipeline | | GRPO setup, prompts, evaluation, Green Agent |
| Storytelling/Blog | | Blog post, demos, results visualization |
**Note:** Some areas can be shared or split. Training Pipeline depends on Environment + Reward being done first.
---
## First Deliverables (Full Tickets)
Use the **full ticket format** for first deliverables; they double as a commitment test.
### Ticket 1
**WHO:**
**WHAT:**
**WHY:**
**CHALLENGE:**
**DUE:**
**DELIVERED:**
### Ticket 2
**WHO:**
**WHAT:**
**WHY:**
**CHALLENGE:**
**DUE:**
**DELIVERED:**
### Ticket 3
**WHO:**
**WHAT:**
**WHY:**
**CHALLENGE:**
**DUE:**
**DELIVERED:**
---
## Suggested First Deliverables (If Needed)
Based on Phase 1 requirements, here are ready-to-use tickets:
### Environment Scaffold
**WHO:** [Claim during meeting]
**WHAT:** Run `openenv init sql_env`, customize Pydantic models (SQLAction, SQLObservation), get `openenv validate` passing
**WHY:** Unblocks all environment work; proves Docker/WebSocket setup works
**CHALLENGE:** Docker configuration; WebSocket timeout settings; understanding OpenEnv API
**DUE:** [Wednesday EOD?]
**DELIVERED:**
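
To make the scope of "customize models" concrete, here is a minimal sketch of what the two models might carry. It uses stdlib dataclasses as a stand-in for the Pydantic models named in the ticket, and it does not use the OpenEnv base classes. The action kinds and every field name below are assumptions for discussion, not the scaffold's actual output.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical action kinds; the real set is for the team to decide.
ACTION_TYPES = ("QUERY", "ANSWER")


@dataclass
class SQLAction:
    """What the agent sends to the environment on each step (assumed shape)."""
    action_type: str  # one of ACTION_TYPES
    argument: str     # SQL text for QUERY, final answer for ANSWER

    def __post_init__(self):
        # Reject unknown action kinds early instead of failing inside a handler.
        if self.action_type not in ACTION_TYPES:
            raise ValueError(f"unknown action_type: {self.action_type!r}")


@dataclass
class SQLObservation:
    """What the environment returns after each step (assumed shape)."""
    result: str = ""             # rendered query result, if any
    error: Optional[str] = None  # error message when the action failed
    done: bool = False           # True once the episode has ended
    reward: float = 0.0          # reward for this step


action = SQLAction(action_type="QUERY", argument="SELECT 1")
obs = SQLObservation(result="1")
```

Whoever claims this ticket should replace these fields with whatever the generated scaffold and `openenv validate` actually require.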
### Initial Question Set
**WHO:** [Claim during meeting]
**WHAT:** Select 30 questions from Spider dev set (12 easy, 12 medium, 6 hard) with gold answers in JSON format
**WHY:** Enables manual testing of environment in Phase 2; needed for reward computation
**CHALLENGE:** Balancing difficulty; avoiding questions needing unsupported SQL features
**DUE:** [Wednesday EOD?]
**DELIVERED:**
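
One way to hit the 12/12/6 split is a seeded random draw per difficulty level, sketched below. The record fields (`id`, `db_id`, `question`, `gold_sql`, `gold_answer`, `difficulty`) and the placeholder pool are assumptions for illustration; the real pool would be loaded from the Spider dev set, and hand-picking remains an option.

```python
import json
import random
from collections import Counter

# Difficulty quotas from the ticket: 12 easy, 12 medium, 6 hard (30 total).
QUOTAS = {"easy": 12, "medium": 12, "hard": 6}


def sample_questions(pool, quotas, seed=0):
    """Draw a fixed number of questions per difficulty level from the pool."""
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    selected = []
    for difficulty, count in quotas.items():
        candidates = [q for q in pool if q["difficulty"] == difficulty]
        if len(candidates) < count:
            raise ValueError(
                f"only {len(candidates)} {difficulty} questions, need {count}"
            )
        selected.extend(rng.sample(candidates, count))
    return selected


# Hypothetical stand-in pool; real entries would come from the Spider dev set.
pool = [
    {"id": f"{d}-{i}", "db_id": "placeholder_db", "question": "...",
     "gold_sql": "...", "gold_answer": None, "difficulty": d}
    for d in QUOTAS for i in range(20)
]

questions = sample_questions(pool, QUOTAS)
counts = Counter(q["difficulty"] for q in questions)  # per-difficulty tally
questions_json = json.dumps(questions, indent=2)      # the JSON deliverable
```

Checking `counts` against `QUOTAS` before committing the file is a cheap guard against an unbalanced set.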
### OpenEnv Tutorial Review
**WHO:** [Claim during meeting]
**WHAT:** Complete OpenEnv tutorial notebook, document key learnings and gotchas for team
**WHY:** Reduces ramp-up time for others; surfaces unknowns early
**CHALLENGE:** Tutorial may have gaps; need to map to our SQL use case
**DUE:** [Tuesday EOD?]
**DELIVERED:**
---
## Coordination Rituals
### Daily Async Standup
Post in shared doc/channel (30 seconds):
```
[Date] [Name]
Did: [What you accomplished]
Blocked: [Nothing / specific blocker]
```
**Where:** [TBD - decide in kickoff]
### Weekly Sync
**When:** [TBD - decide in kickoff]
**Duration:** 30 min
**Agenda:**
1. Blockers (10 min) - resolve or escalate
2. Decisions (10 min) - use decision format below
3. Next deliverables (10 min) - create tickets for next week
---
## Decisions Log
### Decision: [Topic]
**Context:** [Why needed now]
**Options:**
1. [Option A]
   - Pro:
   - Con:
2. [Option B]
   - Pro:
   - Con:
**Recommendation:**
**Decided:** [Date] - [Choice] - [Who consulted]
---
## Open Questions for Kickoff
From the project brief; these need team input:
1. **Reward components**: Expose as separate rewards to TRL, or sum into single scalar?
2. **Question selection**: Hand-pick for diversity, or random sample by difficulty?
3. **HINT action**: Add a hint mechanism, or keep it pure exploration?
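
To ground question 1, the sketch below shows the two shapes side by side: a list of separate component values versus a single weighted sum. The component names and weights are hypothetical placeholders, and how either shape would be passed to TRL is left out.

```python
# Hypothetical component names and weights, for illustration only.
WEIGHTS = {"correctness": 1.0, "progress": 0.3, "penalty": 0.1}


def split_rewards(components, order=tuple(WEIGHTS)):
    """Keep components separate: one value each, in a fixed order."""
    return [components[name] for name in order]


def combine_rewards(components, weights=WEIGHTS):
    """Collapse per-component rewards into one weighted scalar."""
    return sum(weights[name] * value for name, value in components.items())


# Made-up values for one episode.
episode = {"correctness": 1.0, "progress": 0.5, "penalty": -1.0}

separate = split_rewards(episode)  # one entry per component
scalar = combine_rewards(episode)  # 1.0*1.0 + 0.3*0.5 + 0.1*(-1.0)
```

Separate components keep each signal inspectable during training, while a scalar fixes the trade-off between them up front through the weights.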
---
## Timeline Summary
| Phase | Days | Key Milestone | Dependencies |
|-------|------|---------------|--------------|
| 1. Scaffold | 1-2 | `openenv validate` passes | None |
| 2. Core Loop | 3-5 | Full episode works manually | Phase 1 |
| 3. Dense Reward | 6-8 | Reward varies meaningfully | Phase 2 |
| 4. Training | 9-13 | Trained model beats random | Phases 1-3 |
| 5. Polish | 14-16 | All artifacts submitted | Phase 4 |
**Submission deadline:** ~16 days from kickoff
---
## Communication Channels
| Channel | Purpose |
|---------|---------|
| [TBD] | Daily standups |
| [TBD] | Quick questions / blockers |
| [TBD] | Code (GitHub repo) |
| Google Drive | Shared docs (this doc, project brief) |
---
## Next Sync
**When:** [Fill in after kickoff]
**Where:** [Fill in after kickoff]
---
## Completed Tickets Archive
Move completed tickets here with DELIVERED filled in.
*(None yet)*