docs: main is canonical branch; retire main-lags-master foot-gun

Record the 2026-06-09 decision that `main` is the canonical branch on both
HF repos. `main` and `master` converged via fast-forward, so push to `main`
going forward. Reframe gotcha #1 from "main lags master" to "RESOLVED, keep it
synced": the Modal SHA pins stay correct, but their stale-main rationale
no longer applies.
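
The "keep it synced" rule above could be checked mechanically. The following is a minimal sketch, not part of this commit: it assumes the check is done by comparing branch heads from `git ls-remote --heads`, and the helper names (`parse_heads`, `branches_in_sync`, `remote_heads`) are hypothetical.

```python
import subprocess


def parse_heads(ls_remote_output: str) -> dict[str, str]:
    """Map branch name -> commit SHA from `git ls-remote --heads` output.

    Each non-empty line is expected to look like "<sha>\trefs/heads/<branch>".
    """
    prefix = "refs/heads/"
    heads = {}
    for line in ls_remote_output.splitlines():
        if not line.strip():
            continue
        sha, ref = line.split(maxsplit=1)
        ref = ref.strip()
        if ref.startswith(prefix):
            heads[ref[len(prefix):]] = sha
    return heads


def branches_in_sync(heads: dict[str, str], a: str = "main", b: str = "master") -> bool:
    """True only if both branches exist and point at the same commit."""
    return a in heads and b in heads and heads[a] == heads[b]


def remote_heads(repo_url: str) -> dict[str, str]:
    """Query a remote for its branch heads (needs git and network access)."""
    result = subprocess.run(
        ["git", "ls-remote", "--heads", repo_url],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_heads(result.stdout)
```

Calling `branches_in_sync(remote_heads(repo_url))` with each repo's clone URL before a push would flag drift between `main` and `master`; where such a check would run (CI, a pre-push hook, or by hand) is left open here.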

Files changed (1)

docs/PROJECT_STATE_AND_REMAINING_WORK.md +14 -2

@@ -62,10 +62,22 @@ installable, with worked GSM8K-GRPO + SDPO-real-trace + A1-8B examples.
 - **`…-dd7b` (P3):** A4 combined arm + final A0–A4 comparison table. Blocked on A2 **and** A3
   (its value is reading the combined effect against the isolated baselines).
 ## Load-bearing gotchas (carry these forward)
-1. **`main` LAGS `master` on both HF repos.** Any Modal `git clone … && pip install` MUST
-   `git checkout master` / pin a master SHA, or `ImportError: make_dr_grpo_config`.
 2. **SDPO on real agent traces requires `strip_thinking=False`** — ~67% of error-recovery
    turns are pure thinking; stripping yields empty masks. Keep `max_seq_len ≥ 1536`.
 3. **OUTPUT_DIR clobber:** any sweep dimension (objective/lr/seed) you'll compare side-by-side

 - **`…-dd7b` (P3):** A4 combined arm + final A0–A4 comparison table. Blocked on A2 **and** A3
   (its value is reading the combined effect against the isolated baselines).
+## Branch convention (canonical: `main`)
+**`main` is the canonical branch on both HF repos** (decided 2026-06-09). As of that date
+`main == master == fb13ea3` (framework) / `37c0ea5` (LMA) — converged via clean fast-forward.
+**Push to `main`** (or to both in lockstep). `master` is retained only as a mirror; do not let
+the two drift. A fresh `git clone` now defaults to `main` and gets the complete tree (incl.
+`make_dr_grpo_config` + ADR-014), so the old "must checkout master" foot-gun is RETIRED as long
+as `main` stays current.
 ## Load-bearing gotchas (carry these forward)
+1. **Branch sync (RESOLVED 2026-06-09, keep it that way).** `main` previously LAGGED `master`
+   (frozen at Wave 19), which is why older Modal images pin a `master` SHA "because main predates
+   `make_dr_grpo_config`." That divergence is now fixed (`main == master`). **Keep pushing to
+   `main`** so it never lags again; the SHA pins in Modal images remain correct but their
+   "main is stale" rationale no longer applies once both branches stay in sync.
 2. **SDPO on real agent traces requires `strip_thinking=False`** — ~67% of error-recovery
    turns are pure thinking; stripping yields empty masks. Keep `max_seq_len ≥ 1536`.
 3. **OUTPUT_DIR clobber:** any sweep dimension (objective/lr/seed) you'll compare side-by-side