# SQLEnv Team Coordination

## Team

| Person | Role | Contact |
|--------|------|---------|
| You (Hjerp) | Coordinator | - |
| Kevlar | Contributor | - |
| Jindal | Contributor | - |

---

## Kickoff Meeting - Feb 9

### 1. Personal Wins

> "What would make this project a win for you personally?"

| Person | Personal Win |
|--------|--------------|
| You | |
| Kevlar | |
| Jindal | |

### 2. Availability

| Person | Hours/Week | Best Times | Known Conflicts |
|--------|------------|------------|-----------------|
| You | | | |
| Kevlar | | | |
| Jindal | | | |

**Total available:** _____ hours/week
**Estimated need:** ~40-60 hours total over 16 days (~18-26 hrs/week combined)
**Gap?** If yes, scope discussion needed.
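
The gap check is simple arithmetic and can be sketched as follows. This is a minimal sketch, assuming the 16-day window is treated as 16/7 weeks. The function name `capacity_gap` and the weekly hours are placeholders, since the availability table is not filled in yet.

```python
def capacity_gap(hours_per_week, total_need_hours, days=16):
    """Return (available_hours, gap_hours) over the project window.

    A gap above zero means the team is short and scope should be discussed.
    """
    weeks = days / 7
    available = sum(hours_per_week.values()) * weeks
    return available, max(0.0, total_need_hours - available)


# Placeholder numbers only; replace with the values from the availability table.
team = {"You": 7, "Kevlar": 7, "Jindal": 7}
available, gap = capacity_gap(team, total_need_hours=60)
print(f"available={available:.0f}h gap={gap:.0f}h")
```

Running it against both ends of the 40-60 hour estimate shows whether the gap depends on scope or exists either way.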

### 3. Ownership Areas

The project brief suggests five natural ownership areas. Let people **claim** areas; don't assign them.

| Area | Owner | Notes |
|------|-------|-------|
| Environment Engineering | | OpenEnv integration, WebSocket, Docker, action handlers |
| Reward Design | | 3-layer rewards, progress metrics, anti-gaming |
| Dataset Curation | | Spider questions, answer verification, difficulty balance |
| Training Pipeline | | GRPO setup, prompts, evaluation, Green Agent |
| Storytelling/Blog | | Blog post, demos, results visualization |

**Note:** Some areas can be shared or split. Training Pipeline depends on Environment + Reward being done first.

---

## First Deliverables (Full Tickets)

Use **full format** for first deliverables (commitment test).

### Ticket 1

**WHO:** 
**WHAT:** 
**WHY:** 
**CHALLENGE:** 
**DUE:** 
**DELIVERED:** 

### Ticket 2

**WHO:** 
**WHAT:** 
**WHY:** 
**CHALLENGE:** 
**DUE:** 
**DELIVERED:** 

### Ticket 3

**WHO:** 
**WHAT:** 
**WHY:** 
**CHALLENGE:** 
**DUE:** 
**DELIVERED:** 

---

## Suggested First Deliverables (If Needed)

Based on Phase 1 requirements, here are ready-to-use tickets:

### Environment Scaffold

**WHO:** [Claim during meeting]
**WHAT:** Run `openenv init sql_env`, customize Pydantic models (SQLAction, SQLObservation), get `openenv validate` passing
**WHY:** Unblocks all environment work; proves Docker/WebSocket setup works
**CHALLENGE:** Docker configuration; WebSocket timeout settings; understanding OpenEnv API
**DUE:** [Wednesday EOD?]
**DELIVERED:** 
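
To make the model customization step concrete, here is a rough sketch of what `SQLAction` and `SQLObservation` might carry. It uses stdlib dataclasses as a stand-in so it runs without dependencies; the real models would be Pydantic classes built on whatever `openenv init` generates. Every field name and the two action kinds are assumptions, not taken from the project brief or the OpenEnv API.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical action kinds; the real set is a team decision.
ALLOWED_KINDS = ("query", "answer")


@dataclass
class SQLAction:
    """One agent step: the kind of action and its argument."""
    kind: str
    argument: str

    def __post_init__(self):
        # Reject malformed actions early, before they reach a handler.
        if self.kind not in ALLOWED_KINDS:
            raise ValueError(f"unknown action kind: {self.kind!r}")
        if not self.argument.strip():
            raise ValueError("argument must be non-empty")


@dataclass
class SQLObservation:
    """What the environment returns after a step (assumed fields)."""
    question: str
    result_rows: List[tuple] = field(default_factory=list)
    error: Optional[str] = None
    done: bool = False


action = SQLAction(kind="query", argument="SELECT COUNT(*) FROM singer")
obs = SQLObservation(question="How many singers are there?")
```

Agreeing on these fields early lets the reward and dataset work proceed against a stable interface while the Docker/WebSocket setup is still in progress.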

### Initial Question Set

**WHO:** [Claim during meeting]
**WHAT:** Select 30 questions from Spider dev set (12 easy, 12 medium, 6 hard) with gold answers in JSON format
**WHY:** Enables manual testing of environment in Phase 2; needed for reward computation
**CHALLENGE:** Balancing difficulty; avoiding questions needing unsupported SQL features
**DUE:** [Wednesday EOD?]
**DELIVERED:** 
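
If the team goes with random sampling by difficulty rather than hand-picking, the 12/12/6 split could be drawn along these lines. This is a sketch under assumptions: each record is assumed to already carry a `difficulty` label and a `gold_answer` field (both hypothetical names), and the records below are synthetic stand-ins, not Spider data.

```python
import json
import random

# Target split from the ticket: 12 easy, 12 medium, 6 hard.
QUOTA = {"easy": 12, "medium": 12, "hard": 6}


def select_questions(records, quota=QUOTA, seed=0):
    """Draw a fixed number of questions per difficulty level, reproducibly."""
    rng = random.Random(seed)
    selected = []
    for level, count in quota.items():
        pool = [r for r in records if r["difficulty"] == level]
        if len(pool) < count:
            raise ValueError(f"need {count} {level} questions, found {len(pool)}")
        selected.extend(rng.sample(pool, count))
    return selected


# Synthetic records for illustration only.
records = [
    {"question": f"{level}-{i}", "gold_answer": None, "difficulty": level}
    for level in QUOTA
    for i in range(20)
]
subset = select_questions(records)
json_text = json.dumps(subset, indent=2)  # the JSON deliverable
print(len(subset))
```

A fixed seed keeps the selection reproducible, so a question dropped for an unsupported SQL feature can be replaced without reshuffling the rest.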

### OpenEnv Tutorial Review

**WHO:** [Claim during meeting]
**WHAT:** Complete OpenEnv tutorial notebook, document key learnings and gotchas for team
**WHY:** Reduces ramp-up time for others; surfaces unknowns early
**CHALLENGE:** Tutorial may have gaps; need to map to our SQL use case
**DUE:** [Tuesday EOD?]
**DELIVERED:** 

---

## Coordination Rituals

### Daily Async Standup

Post in shared doc/channel (30 seconds):

```
[Date] [Name]
Did: [What you accomplished]
Blocked: [Nothing / specific blocker]
```

**Where:** [TBD - decide in kickoff]

### Weekly Sync

**When:** [TBD - decide in kickoff]
**Duration:** 30 min
**Agenda:**
1. Blockers (10 min) - resolve or escalate
2. Decisions (10 min) - use decision format below
3. Next deliverables (10 min) - create tickets for next week

---

## Decisions Log

### Decision: [Topic]

**Context:** [Why needed now]

**Options:**
1. [Option A]
   - Pro: 
   - Con: 

2. [Option B]
   - Pro: 
   - Con: 

**Recommendation:** 

**Decided:** [Date] - [Choice] - [Who consulted]

---

## Open Questions for Kickoff

From project brief - need team input:

1. **Reward components**: Expose as separate rewards to TRL, or sum into single scalar?
2. **Question selection**: Hand-pick for diversity, or random sample by difficulty?
3. **HINT action**: Add a hint mechanism, or keep it pure exploration?
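
For question 1, the two options can be compared side by side. The sketch below is only illustrative: the component names (`correctness`, `progress`, `penalty`) and the weights are hypothetical, and it does not use any TRL API. It just shows the shape of the data each option would hand to the trainer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardComponents:
    # Hypothetical component names; the actual layers are still to be designed.
    correctness: float
    progress: float
    penalty: float


# Hypothetical weights; tuning them is part of the open question.
DEFAULT_WEIGHTS = {"correctness": 1.0, "progress": 0.5, "penalty": 1.0}


def as_scalar(r, weights=DEFAULT_WEIGHTS):
    """Option A: collapse the components into one number."""
    return (
        weights["correctness"] * r.correctness
        + weights["progress"] * r.progress
        - weights["penalty"] * r.penalty
    )


def as_separate(r):
    """Option B: keep the components apart so each can be logged or weighted later."""
    return [r.correctness, r.progress, r.penalty]


r = RewardComponents(correctness=1.0, progress=0.5, penalty=0.25)
print(as_scalar(r), as_separate(r))
```

Keeping the components separate inside the environment and summing only at the boundary leaves both options open until the team decides.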

---

## Timeline Summary

| Phase | Days | Key Milestone | Dependencies |
|-------|------|---------------|--------------|
| 1. Scaffold | 1-2 | `openenv validate` passes | None |
| 2. Core Loop | 3-5 | Full episode works manually | Phase 1 |
| 3. Dense Reward | 6-8 | Reward varies meaningfully | Phase 2 |
| 4. Training | 9-13 | Trained model beats random | Phases 1-3 |
| 5. Polish | 14-16 | All artifacts submitted | Phase 4 |

**Submission deadline:** ~16 days from kickoff

---

## Communication Channels

| Channel | Purpose |
|---------|---------|
| [TBD] | Daily standups |
| [TBD] | Quick questions / blockers |
| [TBD] | Code (GitHub repo) |
| Google Drive | Shared docs (this doc, project brief) |

---

## Next Sync

**When:** [Fill in after kickoff]
**Where:** [Fill in after kickoff]

---

## Completed Tickets Archive

Move completed tickets here with DELIVERED filled in.

*(None yet)*