Codeseys committed · Commit d61036a · 1 Parent(s): 6806cf7

Wave 21b: skip zero-signal SDPO on empty-recovery error turns + real-trace validation

Found by validating the Wave 21 pipeline against REAL Claude Code session logs
(738 local sessions, 66 with is_error:true) rather than only the synthetic
fixture. Two outcomes: a collator fix and a permanent validation example.

## Collator fix: empty-recovery error turns

On real traces, ~67% of error sites have EMPTY recovery content when
strip_thinking=True — because the error-RECOVERY turn is frequently pure
[THINKING] (the model reasons about the failure, then silently retries a tool),
and stripping thinking empties it. The old code set any_errors=True and injected
a hint the moment hint_text existed, BEFORE checking recovery content, so an
empty-recovery turn produced an all-ignore_index sdpo_loss_mask: a zero-signal
SDPO row that wastes a forward pass and dilutes the channel.

Fix: only treat a turn as an SDPO error site when BOTH a hint was produced AND
the recovery turn has content (`hint_text and turn.get("content")`). Applied
symmetrically to `_build_hint_injected_trace` (teacher) and
`_build_aligned_student_one` (student) so the student/teacher message lists stay
in lockstep and the SDPO shape-match gate never breaks. Empty-recovery turns
fall through to the passthrough branch, which also skips them because their
content is empty.
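
The guard can be sketched in isolation. This is a minimal stand-in, not the collator's code: `is_sdpo_error_site` and `build_messages` are hypothetical names, the student placeholder is a simple same-word-count filler rather than `_make_placeholder_for_hint_length`, and it assumes a hint is only requested for turns that carry `tool_error`. The point it illustrates is that both sides call the same predicate, so an empty-recovery turn is skipped on both sides and the message lists keep the same length.

```python
from typing import Callable, Optional

HintFn = Callable[[str, dict], Optional[str]]


def is_sdpo_error_site(hint_text: Optional[str], turn: dict) -> bool:
    """Shared predicate: a hint was produced AND the recovery turn has content."""
    return bool(hint_text and turn.get("content"))


def build_messages(turns: list, hint_fn: HintFn, side: str) -> tuple:
    """Build the teacher or student message list (hypothetical simplification)."""
    messages = []
    any_errors = False
    for turn in turns:
        kind = turn.get("tool_error")
        # Assumption: hints are only requested for turns flagged with tool_error.
        hint = hint_fn(kind, turn.get("error_meta", {})) if kind else None
        if is_sdpo_error_site(hint, turn):
            any_errors = True
            if side == "teacher":
                slot = hint
            else:
                # Stand-in placeholder: one filler word per hint word.
                slot = " ".join(["<hint-pad>"] * len(hint.split()))
            messages.append({"role": "system", "content": slot})
            messages.append({"role": turn.get("role", "assistant"), "content": turn["content"]})
            continue
        # Passthrough: turns with empty content are skipped entirely.
        if turn.get("content"):
            messages.append({"role": turn.get("role", "assistant"), "content": turn["content"]})
    return messages, any_errors


def hint_for_tnf(kind, _meta):
    return "HINT use a real tool" if kind == "tool_not_found" else None


empty_trace = [
    {"role": "user", "content": "do the thing"},
    {"role": "assistant", "content": "", "tool_error": "tool_not_found"},
    {"role": "user", "content": "tool not found"},
]
teacher, t_err = build_messages(empty_trace, hint_for_tnf, "teacher")
student, s_err = build_messages(empty_trace, hint_for_tnf, "student")
print(t_err, s_err, len(teacher) == len(student))  # False False True
```

Putting the condition in one shared function is a design choice of this sketch; the commit instead repeats the same condition in both builders and documents the requirement in a comment.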

## The strip_thinking x SDPO lesson

SDPO hint-distillation on real agent traces REQUIRES strip_thinking=False — the
recovery reasoning IS the thinking. With it kept: 0% empty-recovery, real signal.
With it stripped: ~67% empty, channel goes mostly dark. Documented in the new
example's README.
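
A toy illustration of why stripping empties the recovery turn. The `[THINKING]...[/THINKING]` delimiter format, the `render_recovery` helper, and the three sample strings are assumptions made for this sketch; the real ingester's block format may differ. The 2/3 rate below comes from the toy list, not from the measured traces.

```python
import re

# Assumption: thinking is rendered as [THINKING]...[/THINKING].
_THINKING = re.compile(r"\[THINKING\].*?\[/THINKING\]", re.DOTALL)


def render_recovery(content: str, strip_thinking: bool) -> str:
    """Return the recovery content, optionally with thinking blocks removed."""
    return _THINKING.sub("", content).strip() if strip_thinking else content


def empty_recovery_rate(recovery_turns: list[str], strip_thinking: bool) -> float:
    """Fraction of recovery turns whose rendered content is empty."""
    rendered = [render_recovery(c, strip_thinking) for c in recovery_turns]
    return sum(1 for c in rendered if not c.strip()) / len(rendered)


# Two pure-thinking recoveries and one that also has visible text.
recoveries = [
    "[THINKING]The path was wrong; retry with the absolute path.[/THINKING]",
    "[THINKING]Tool name typo, call the right one.[/THINKING]",
    "[THINKING]Missing arg.[/THINKING] Retrying with the required argument.",
]
print(empty_recovery_rate(recoveries, strip_thinking=True))
print(empty_recovery_rate(recoveries, strip_thinking=False))
```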

## Real-trace validation result (10 sessions, strip_thinking=False)
- 10/10 processed, 0 crashes
- 141 error sites, 170 structural-flagged user messages, 0 string-tag-only
- 0% empty-recovery, SDPO alignment 832/832 = 100.0%
Confirms the Wave 21 _build_chat_aligned_mask fix holds at population scale.
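
How the alignment ratio is counted, sketched on plain lists instead of tensors. `alignment_counts` is a hypothetical helper and the token ids are made up; the only rule taken from run.py is that a position counts as in-loss when the mask equals 1. The `-100` fill value is an assumption (it is PyTorch's default `ignore_index`).

```python
IGNORE = -100  # assumption: value used for out-of-loss positions


def alignment_counts(student_ids, teacher_ids, loss_mask):
    """Count in-loss positions and how many have identical student/teacher ids."""
    aligned = in_loss = 0
    for s_row, t_row, m_row in zip(student_ids, teacher_ids, loss_mask):
        for s, t, m in zip(s_row, t_row, m_row):
            if m == 1:
                in_loss += 1
                aligned += int(s == t)
    return aligned, in_loss


# Positions 1-2 hold the hint (teacher) vs. the placeholder (student);
# positions 3-4 hold the shared recovery content.
student = [[5, 9, 9, 7, 8]]
teacher = [[5, 3, 3, 7, 8]]

good_mask = [[IGNORE, IGNORE, IGNORE, 1, 1]]     # lands on recovery content
drifted_mask = [[IGNORE, IGNORE, 1, 1, IGNORE]]  # shifted one slot left

print(alignment_counts(student, teacher, good_mask))     # (2, 2)
print(alignment_counts(student, teacher, drifted_mask))  # (1, 2)
```

A mask that drifts onto the hint/placeholder slot immediately shows up as a ratio below 100%, which is why the ratio works as a regression check for chat-template drift.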

## Added
- examples/validate_real_trace_alignment/ (run.py + README): auto-discovers
error-bearing ~/.claude sessions, runs ingestion->adapter->collator->SDPO,
reports alignment ratio + empty-recovery rate. Exit 0 PASS / 1 FAIL / 2 no-data.
- 3 stub-based empty-recovery tests in trainer/tests (always run, no model):
empty -> no SDPO; mixed -> fires on non-empty; shapes stay matched.
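
The final PASS / FAIL / no-data decision can be sketched as follows. `exit_code` is a hypothetical helper that mirrors only the last step of run.py; the script also returns 2 earlier when the projects directory is missing, no error-bearing sessions are found, or the tokenizer has no chat template.

```python
def exit_code(tot_aligned: int, tot_inloss: int, n_crashes: int,
              pass_threshold: float = 0.95) -> int:
    """Map aggregate counts to the script's exit code."""
    if tot_inloss == 0:
        return 2  # nothing measurable: alignment cannot be assessed
    ratio = tot_aligned / tot_inloss
    ok = ratio >= pass_threshold and n_crashes == 0
    return 0 if ok else 1


print(exit_code(832, 832, 0))  # 0: the run reported in this commit
print(exit_code(832, 832, 1))  # 1: perfect alignment but one crash
print(exit_code(0, 0, 0))      # 2: no in-loss positions
```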

## Tests
Full package + spike collator: 164 passed, 16 skipped, 0 failed.

Note: empty-recovery tests live in composer_replication/trainer/tests/ (the
PACKAGE collator), not the spike — spikes/005 imports the legacy trl_path copy.

composer_replication/trainer/data_collator.py CHANGED
@@ -296,25 +296,35 @@ class ComposerDataCollator:
                     turn.get("tool_error", "unknown"),
                     turn.get("error_meta", {}),
                 )
-                if hint_text:
+                # Only treat this as an SDPO error site when BOTH a hint was
+                # produced AND the recovery turn has content to distill against.
+                # Real Claude Code traces frequently have empty recovery content
+                # — e.g. when strip_thinking=True nukes a recovery turn that was
+                # pure [THINKING] reasoning (observed ~67% of real error sites).
+                # Injecting a hint with no recovery content produces an
+                # all-ignore_index mask: a zero-signal SDPO row that wastes a
+                # forward pass and silently dilutes the channel. Skip it; the
+                # turn then falls through to the (also-skipped) empty passthrough.
+                if hint_text and turn.get("content"):
                     any_errors = True
+                    recovery_content = turn.get("content") or ""
                     # Inject hint as a system-style addendum BEFORE the assistant's response
                     teacher_messages.append({"role": "system", "content": hint_text})
                     teacher_loss_segments.append((False, hint_text))
-                    if turn.get("content"):
-                        teacher_messages.append({
-                            "role": turn.get("role", "assistant"),
-                            "content": turn["content"],
-                        })
-                        teacher_loss_segments.append((True, turn["content"]))  # post-hint tokens = loss
+                    teacher_messages.append({
+                        "role": turn.get("role", "assistant"),
+                        "content": recovery_content,
+                    })
+                    teacher_loss_segments.append((True, recovery_content))  # post-hint tokens = loss
                     continue
-            # Non-error turn (or hint generator returned None) — passthrough
-            if turn.get("content"):
+            # Non-error turn (or hint generator returned None / empty recovery) — passthrough
+            content = turn.get("content")
+            if content:
                 teacher_messages.append({
                     "role": turn.get("role", "assistant"),
-                    "content": turn["content"],
+                    "content": content,
                 })
-                teacher_loss_segments.append((False, turn["content"]))
+                teacher_loss_segments.append((False, content))
 
         # Tokenize the full teacher conversation
         teacher_ids = self._tokenize_messages(teacher_messages)
@@ -449,28 +459,33 @@ class ComposerDataCollator:
                     turn.get("tool_error", "unknown"),
                     turn.get("error_meta", {}),
                 )
-                if hint_text:
+                # MUST mirror the teacher path's condition exactly (hint AND
+                # recovery content) or the student/teacher message lists diverge
+                # and the SDPO shape-match gate breaks. Empty-recovery error
+                # turns are skipped on both sides — see _build_hint_injected_trace.
+                if hint_text and turn.get("content"):
                     any_errors = True
+                    recovery_content = turn.get("content") or ""
                     placeholder = self._make_placeholder_for_hint_length(hint_text)
                     # Student gets a placeholder system-msg at the SAME slot
                     # the teacher gets the hint system-msg.
                     student_messages.append({"role": "system", "content": placeholder})
                     student_loss_segments.append((False, placeholder))
-                    if turn.get("content"):
-                        student_messages.append({
-                            "role": turn.get("role", "assistant"),
-                            "content": turn["content"],
-                        })
-                        is_assistant = turn.get("role") == "assistant"
-                        student_loss_segments.append((is_assistant, turn["content"]))
+                    student_messages.append({
+                        "role": turn.get("role", "assistant"),
+                        "content": recovery_content,
+                    })
+                    is_assistant = turn.get("role") == "assistant"
+                    student_loss_segments.append((is_assistant, recovery_content))
                     continue
-            if turn.get("content"):
+            content = turn.get("content")
+            if content:
                 student_messages.append({
                     "role": turn.get("role", "assistant"),
-                    "content": turn["content"],
+                    "content": content,
                 })
                 is_assistant = turn.get("role") == "assistant"
-                student_loss_segments.append((is_assistant, turn["content"]))
+                student_loss_segments.append((is_assistant, content))
 
         # Tokenize the full student conversation via apply_chat_template
         # (mirrors teacher's path so chat-template markers are identical).
composer_replication/trainer/tests/test_chat_template_alignment.py CHANGED
@@ -144,3 +144,102 @@ def test_real_chat_template_student_teacher_shapes_match(real_chat_tok, multitur
     collator = ComposerDataCollator(tokenizer=real_chat_tok, config=cfg)
     batch = collator([multiturn_error_trace])
     assert batch["input_ids"].shape == batch["ctx_teacher_input_ids"].shape
+
+
+# ----------------------------------------------------------------------------
+# Empty-recovery guard (Wave 21 — discovered on real Claude Code traces)
+# ----------------------------------------------------------------------------
+#
+# ~67% of real error sites have EMPTY recovery content: when strip_thinking=True
+# the recovery turn (which was pure [THINKING] reasoning) becomes empty. Injecting
+# an SDPO hint with no recovery content yields an all-ignore_index mask — a
+# zero-signal row that wastes a forward pass and dilutes the channel. The collator
+# must treat empty-recovery error turns as non-error sites. These use a stub
+# tokenizer (pure logic, no model needed) so they always run.
+
+
+class _StubTok:
+    """Word-level deterministic tokenizer; apply_chat_template space-joins."""
+
+    pad_token_id = 0
+
+    def __init__(self) -> None:
+        self._v: dict[str, int] = {"<pad>": 0, "<bos>": 1, "<eos>": 2}
+
+    def _id(self, w: str) -> int:
+        if w not in self._v:
+            self._v[w] = len(self._v)
+        return self._v[w]
+
+    def __call__(self, text, **_k):
+        return {"input_ids": [self._id(w) for w in text.split()] if text else []}
+
+    def apply_chat_template(self, messages, tokenize=True, **_k):  # noqa: ARG002
+        return [self._id(w) for w in " ".join(m.get("content", "") for m in messages).split()]
+
+
+def _hint_for_tnf(kind, _meta):
+    return "HINT use a real tool" if kind == "tool_not_found" else None
+
+
+def test_empty_recovery_does_not_fire_sdpo():
+    """An error turn with empty recovery content must NOT emit an SDPO mask."""
+    tok = _StubTok()
+    trace = {
+        "trace_id": "empty-recovery",
+        "turns": [
+            {"role": "user", "content": "do the thing"},
+            {"role": "assistant", "content": "", "tool_error": "tool_not_found", "error_meta": {}},
+            {"role": "user", "content": "tool not found"},
+        ],
+        "final_reward": 0.0,
+    }
+    cfg = CollatorConfig(hint_generator=_hint_for_tnf)
+    collator = ComposerDataCollator(tokenizer=tok, config=cfg)
+    batch = collator([trace])
+    assert "sdpo_loss_mask" not in batch, (
+        "Empty-recovery error turn fired a zero-signal SDPO mask; it must be skipped."
+    )
+
+
+def test_mixed_recovery_fires_on_nonempty_only():
+    """A trace mixing empty + non-empty recovery turns fires SDPO from the
+    non-empty one and has loss-active positions."""
+    tok = _StubTok()
+    trace = {
+        "trace_id": "mixed-recovery",
+        "turns": [
+            {"role": "user", "content": "first task"},
+            {"role": "assistant", "content": "", "tool_error": "tool_not_found", "error_meta": {}},
+            {"role": "user", "content": "tool not found"},
+            {"role": "assistant", "content": "let me use a real tool instead",
+             "tool_error": "tool_not_found", "error_meta": {}},
+        ],
+        "final_reward": 0.0,
+    }
+    cfg = CollatorConfig(hint_generator=_hint_for_tnf)
+    collator = ComposerDataCollator(tokenizer=tok, config=cfg)
+    batch = collator([trace])
+    assert "sdpo_loss_mask" in batch
+    assert int((batch["sdpo_loss_mask"] == 1).sum()) > 0
+
+
+def test_empty_recovery_keeps_student_teacher_shapes_matched():
+    """Even with a skipped empty-recovery turn, when SDPO DOES fire elsewhere
+    the student/teacher shapes must still match (lockstep skip on both sides)."""
+    tok = _StubTok()
+    trace = {
+        "trace_id": "mixed-shape",
+        "turns": [
+            {"role": "user", "content": "task"},
+            {"role": "assistant", "content": "", "tool_error": "tool_not_found", "error_meta": {}},
+            {"role": "user", "content": "tool not found"},
+            {"role": "assistant", "content": "recover now with a real tool",
+             "tool_error": "tool_not_found", "error_meta": {}},
+        ],
+        "final_reward": 0.0,
+    }
+    cfg = CollatorConfig(hint_generator=_hint_for_tnf)
+    collator = ComposerDataCollator(tokenizer=tok, config=cfg)
+    batch = collator([trace])
+    assert batch["input_ids"].shape == batch["ctx_teacher_input_ids"].shape
examples/validate_real_trace_alignment/README.md ADDED
@@ -0,0 +1,63 @@
+# Real-trace SDPO alignment validation
+
+Runs the full **ingestion → adapter → collator → SDPO** data path against your
+own local Claude Code session logs (`~/.claude/projects/**/*.jsonl`) and reports
+the live SDPO mask alignment ratio. This is the population-level proof that
+Wave 21's `_build_chat_aligned_mask` fix holds on real-world data, not just the
+synthetic fixture.
+
+## Run
+
+```bash
+python examples/validate_real_trace_alignment/run.py
+# options:
+#   --projects-dir ~/.claude/projects      where to discover sessions
+#   --max-sessions 8                       how many error-bearing sessions to sample
+#   --model Qwen/Qwen2.5-0.5B-Instruct     a real chat-template tokenizer
+#   --pass-threshold 0.95                  min alignment ratio to PASS
+#   --strip-thinking                       (default OFF — see below)
+```
+
+Exit code: `0` PASS (alignment ≥ threshold, no crashes), `1` FAIL, `2` no
+error-bearing sessions found / no chat template.
+
+## What it measures
+
+- **ingestion yield** — states emitted, error sites detected
+- **structural vs string-only flagging** — the Wave 21 `is_error` fix. The
+  ingester sets a structural `tool_error: True` boolean; `string-tag-only`
+  should be ~0 (the brittle `[TOOL_RESULT (ERROR)]` grep is fallback-only).
+- **empty-recovery rate** — see below.
+- **SDPO alignment** — fraction of in-loss `sdpo_loss_mask` positions where
+  student token id == teacher token id. ~100% means the mask lands exactly on
+  content tokens; <95% means chat-template drift has regressed.
+
+## The `--strip-thinking` gotcha (important for SDPO)
+
+`ClaudeCodeIngester(strip_thinking=...)` controls whether `[THINKING]` blocks
+survive. For most ingestion you strip them. **For SDPO hint-distillation you
+must NOT** — on real Claude Code traces the error-*recovery* turn is very often
+**pure thinking** (the model reasons about the failure, then silently retries a
+tool). Strip it and that turn's content goes empty, so ~67% of error sites carry
+no recovery content to distill against and produce a zero-signal SDPO row.
+
+This script therefore defaults to `strip_thinking=False`. The collator also
+guards against the empty case (an empty-recovery error turn is treated as a
+non-error site rather than firing an all-`ignore_index` mask), but the *signal*
+only exists if you keep the thinking. Pass `--strip-thinking` to see the
+empty-recovery warning fire.
+
+## Representative result (Codeseys' machine, 2026-05-28)
+
+```
+sessions processed:        10/10
+total error sites:         141
+structural-flagged users:  170
+string-tag-only users:     0
+empty-recovery sites:      0/141 (0%)   # strip_thinking=False
+SDPO alignment (REAL):     832/832 = 100.0%
+RESULT: PASS ✅
+```
+
+With `--strip-thinking` the same sessions report ~67% empty-recovery and the
+measurable in-loss positions collapse accordingly — the lever is visible.
examples/validate_real_trace_alignment/run.py ADDED
@@ -0,0 +1,203 @@
+"""Validate the full ingestion -> adapter -> collator -> SDPO data path against
+REAL Claude Code session logs, and report the live SDPO alignment ratio.
+
+Why this exists
+---------------
+The synthetic fixture in `spikes/007-real-trace-ingestion/fixtures/` proves the
+pipeline works on hand-built data. This script proves it on REAL traces — long
+tool outputs, multi-block content, thinking blocks, genuinely weird tool errors
+— which is where the Wave 19 chat-template drift bug (residual ~33%
+misalignment) actually bit. Wave 21's `_build_chat_aligned_mask` fix is verified
+here at the population level.
+
+What it measures
+----------------
+  * ingestion yield (states emitted, error sites detected)
+  * structural vs string-only error flagging (the Wave 21 TOOL_ERROR_TAG fix —
+    structural should dominate; string-only should be ~0)
+  * SDPO alignment ratio: fraction of in-loss `sdpo_loss_mask` positions where
+    student token id == teacher token id. ~100% means the mask lands exactly on
+    content tokens; <95% means chat-template drift has regressed.
+
+Usage
+-----
+    python examples/validate_real_trace_alignment/run.py \
+        [--projects-dir ~/.claude/projects] \
+        [--max-sessions 8] [--model Qwen/Qwen2.5-0.5B-Instruct]
+
+Requires a real chat-template tokenizer (transformers + a cached/instruct model)
+and at least one local Claude Code session containing `is_error: true`. Exits 0
+on PASS (>=95% alignment), 1 on FAIL, 2 if no error-bearing sessions were found.
+"""
+from __future__ import annotations
+
+import argparse
+import os
+import sys
+import traceback
+from pathlib import Path
+
+
+def _discover_error_sessions(projects_dir: Path, limit: int) -> list[Path]:
+    """Find session JSONLs that contain at least one is_error:true tool_result,
+    skipping subagent (`agent-*`) files. Returns up to `limit`, smallest first
+    (faster to process, still representative)."""
+    hits: list[tuple[int, Path]] = []
+    for p in projects_dir.rglob("*.jsonl"):
+        if p.name.startswith("agent-"):
+            continue
+        try:
+            text = p.read_text(encoding="utf-8", errors="ignore")
+        except OSError:
+            continue
+        if '"is_error":true' in text or '"is_error": true' in text:
+            hits.append((p.stat().st_size, p))
+    hits.sort(key=lambda t: t[0])
+    return [p for _, p in hits[:limit]]
+
+
+def main() -> int:
+    ap = argparse.ArgumentParser()
+    ap.add_argument("--projects-dir", default=str(Path.home() / ".claude" / "projects"))
+    ap.add_argument("--max-sessions", type=int, default=8)
+    ap.add_argument("--model", default="Qwen/Qwen2.5-0.5B-Instruct")
+    ap.add_argument("--pass-threshold", type=float, default=0.95)
+    ap.add_argument(
+        "--strip-thinking",
+        action="store_true",
+        help="Strip [THINKING] blocks. DEFAULT IS FALSE for SDPO: on real "
+        "Claude Code traces the error-recovery turn is frequently pure "
+        "thinking, so stripping it empties ~67%% of error sites and the SDPO "
+        "channel sees no signal. Keep thinking for hint-distillation.",
+    )
+    args = ap.parse_args()
+
+    os.environ.setdefault("HF_HUB_OFFLINE", "1")
+    os.environ.setdefault("TRANSFORMERS_OFFLINE", "1")
+
+    from transformers import AutoTokenizer
+
+    from composer_replication.ingestion import ClaudeCodeIngester
+    from composer_replication.ingestion.trace_examples import (
+        TOOL_ERROR_TAG,
+        claude_states_to_trace_examples,
+    )
+    from composer_replication.trainer.data_collator import (
+        CollatorConfig,
+        ComposerDataCollator,
+    )
+
+    projects_dir = Path(args.projects_dir).expanduser()
+    if not projects_dir.exists():
+        print(f"projects dir not found: {projects_dir}")
+        return 2
+
+    sessions = _discover_error_sessions(projects_dir, args.max_sessions)
+    if not sessions:
+        print(f"no error-bearing sessions under {projects_dir}")
+        return 2
+
+    tok = AutoTokenizer.from_pretrained(args.model)
+    if not getattr(tok, "chat_template", None):
+        print(f"{args.model} has no chat template; pick an -Instruct model")
+        return 2
+
+    def hint_gen(kind, _meta):
+        return f"Recover from the {kind}: re-check the path/args before retrying."
+
+    cfg = CollatorConfig(hint_generator=hint_gen, enable_replay_dpo=False, max_seq_len=8192)
+    collator = ComposerDataCollator(tokenizer=tok, config=cfg)
+
+    tot_states = tot_err_sites = 0
+    tot_aligned = tot_inloss = 0
+    n_struct = n_string_only = 0
+    n_empty_recovery = n_nonempty_recovery = 0
+    sessions_with_sdpo = 0
+    crashes: list[tuple[str, str]] = []
+
+    for path in sessions:
+        label = path.name[:18]
+        try:
+            ing = ClaudeCodeIngester(skip_sidechain=True, strip_thinking=args.strip_thinking)
+            states = list(ing.ingest(path))
+            for s in states:
+                for m in s["messages"]:
+                    if m.get("role") != "user":
+                        continue
+                    if m.get("tool_error") is True:
+                        n_struct += 1
+                    elif isinstance(m.get("content"), str) and TOOL_ERROR_TAG in m["content"]:
+                        n_string_only += 1
+            examples = claude_states_to_trace_examples(states)
+            # Count empty vs non-empty recovery content among detected error turns.
+            for ex in examples:
+                for t in ex["turns"]:
+                    if t.get("tool_error"):
+                        if (t.get("content") or "").strip():
+                            n_nonempty_recovery += 1
+                        else:
+                            n_empty_recovery += 1
+            err_examples = [
+                ex for ex in examples if any(t.get("tool_error") for t in ex["turns"])
+            ]
+            tot_states += len(states)
+            tot_err_sites += sum(
+                sum(1 for t in ex["turns"] if t.get("tool_error")) for ex in examples
+            )
+
+            if err_examples:
+                batch = collator(err_examples[:4])
+                if "sdpo_loss_mask" in batch:
+                    sessions_with_sdpo += 1
+                    s_in = batch["input_ids"]
+                    t_in = batch["ctx_teacher_input_ids"]
+                    m_in = batch["sdpo_loss_mask"]
+                    for row in range(s_in.shape[0]):
+                        il = m_in[row] == 1
+                        if int(il.sum()) == 0:
+                            continue
+                        tot_aligned += int((s_in[row][il] == t_in[row][il]).sum().item())
+                        tot_inloss += int(il.sum().item())
+            print(f" OK {label}: {len(states):4d} states, {len(err_examples):3d} err-examples")
+        except Exception as e:  # noqa: BLE001 — report-and-continue is the point
+            crashes.append((path.name, repr(e)))
+            print(f" CRASH {label}: {e!r}")
+            traceback.print_exc()
+
+    print("\n" + "=" * 64)
+    print("REAL-TRACE PIPELINE VALIDATION")
+    print("=" * 64)
+    print(f" sessions processed: {len(sessions) - len(crashes)}/{len(sessions)}")
+    print(f" total states emitted: {tot_states}")
+    print(f" total error sites: {tot_err_sites}")
+    print(f" structural-flagged users: {n_struct}")
+    print(f" string-tag-only users: {n_string_only} (Wave 21: should be ~0)")
+    _tot_recovery = n_empty_recovery + n_nonempty_recovery
+    if _tot_recovery:
+        pct_empty = 100 * n_empty_recovery / _tot_recovery
+        print(
+            f" empty-recovery sites: {n_empty_recovery}/{_tot_recovery} "
+            f"({pct_empty:.0f}%) — these fire NO SDPO signal"
+        )
+        if args.strip_thinking and pct_empty > 30:
+            print(
+                " ⚠ high empty-recovery rate with --strip-thinking: the recovery "
+                "turns are pure [THINKING]. Re-run WITHOUT --strip-thinking to "
+                "recover SDPO signal on these sites."
+            )
+    print(f" sessions firing SDPO: {sessions_with_sdpo}")
+
+    if not tot_inloss:
+        print(" no in-loss positions measured — cannot assess alignment")
+        return 2
+    ratio = tot_aligned / tot_inloss
+    print(f" SDPO alignment (REAL): {tot_aligned}/{tot_inloss} = {100 * ratio:.1f}%")
+    ok = ratio >= args.pass_threshold and not crashes
+    print(f" RESULT: {'PASS ✅' if ok else 'FAIL ❌'} (threshold {100*args.pass_threshold:.0f}%)")
+    if crashes:
+        print(f" {len(crashes)} crash(es): {[c[0] for c in crashes]}")
+    return 0 if ok else 1
+
+
+if __name__ == "__main__":
+    sys.exit(main())