Wave 7: Phase 2-4 of deep work loop — backlog, parallel research, three ADRs


	# Deep Work Loop Log — Composer 2.5 Replication Framework

	Started: 2026-05-26
	Operator: Codeseys (Hermes Agent autonomous loop)
	Skill: `deep-work-loop` v1.0.0

	## Vision

	> Take any Hugging Face model and further train it with RL using:
	> 1. RLVR (tests-pass reward),
	> 2. SDPO/hint-distillation (Composer 2.5's "targeted RL with textual feedback"),
	> 3. multi-teacher trace-replay DPO,
	> integrated against TRL/VeRL/OpenEnv with DiLoCo-style outer loop sync.
	>
	> Output: a published, reproducible framework — the "Composer 2.5 replication" the open ecosystem is missing.

	## Starting state

	- HEAD: `040eff8` (Wave 6: vision validation self-audit, 5/10 scorecard)
	- Tests: 38/38 green in `spikes/005-integrated-trainer-skeleton/`
	- Working tree: clean

	## Phase ledger

	| Phase | Description | Status | Started | Done |
	|---|---|---|---|---|
	| 1 | commit-state | ✅ | 2026-05-26 | 2026-05-26 |
	| 2 | backlog-audit (BACKLOG.md from VISION_VALIDATION) | ✅ | 2026-05-26 | 2026-05-26 |
	| 3 | parallel-research (3 subagents) | 🟡 | 2026-05-26 | |
	| 4 | architect with ADRs (ADR-001..003) | ⏳ | | |
	| 5 | plan in waves (W7–W10) | ⏳ | | |
	| 6 | execute W7 — Spike 006 (real HF model smoke) | ⏳ | | |
	| 7 | execute W8 — Spike 007 (real trace ingestion) | ⏳ | | |
	| 8 | execute W9 — Spike 008 (DiLoCo smoke) | ⏳ | | |
	| 9 | execute W10 — packaging | ⏳ | | |
	| 10 | (Modal-gated) Spike 002a-mini real GPU smoke | ⏳ | | |
	| 11 | cross-model-final-review | ⏳ | | |
	| 12 | update scorecard + push | ⏳ | | |

	## Constraints

	- Verify ALL claims against primary sources (Wave 2 lesson — subagent synthesis is not evidence).
	- Tests must pass before commit.
	- Memory L1 is at 99% — write to L2 wiki + L3 fact_store, not L1.
	- Modal budget: $20 hard cap for this loop. Anything beyond that requires user approval.
	- Do not mix `upload_file` with `git push`; push with `git push hf master:main` only.
	- Pass commit messages via `-F /tmp/<wave>-commit-msg.txt`.