"""composer_replication.replaysim — N-teacher trace replay + dataset normalization.

Per ADR-004, this package consolidates the framework's
"replay an LLM trace through N teachers, get a DPO/preference dataset" flow:

    raw trace
        ↓ (existing teacher_replay.replay_trace)
    list[TeacherCallResult]
        ↓ (existing teacher_replay.extract_dpo_pairs)
    list[DPOPair]
        ↓ (NEW — composer_replication.replaysim.normalize.DJNormalizer)
    list[NormalizedDPOPair]   # length-filtered, dedup'd, chat-template-validated

The pre-normalization pipeline is unchanged. The normalizer is opt-in: call
the new convenience function `replay_and_normalize_trace(...)`, which wraps
the existing `replay_trace` + `extract_dpo_pairs` and pipes their output
through a `data-juicer` op-graph.
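
As a rough illustration of what the normalization stage does to a list of
pairs, here is a minimal pure-Python sketch. It is not the `DJNormalizer`
implementation and does not use `data-juicer`; `SketchPair`, its fields,
`normalize_pairs`, and the `max_chars` threshold are hypothetical names
chosen for this example only.

```python
from dataclasses import dataclass

VALID_ROLES = {"system", "user", "assistant"}


@dataclass(frozen=True)
class SketchPair:
    # Hypothetical stand-in for a DPO pair; the real DPOPair fields may differ.
    messages: tuple  # tuple of (role, content) tuples forming the prompt
    chosen: str
    rejected: str


def normalize_pairs(pairs, max_chars=4000):
    seen = set()
    kept = []
    for pair in pairs:
        # Length filter: drop empty or oversized completions.
        if not (0 < len(pair.chosen) <= max_chars):
            continue
        if not (0 < len(pair.rejected) <= max_chars):
            continue
        # Chat-template check: known roles only, prompt ends on a user turn.
        roles = [role for role, _ in pair.messages]
        if not roles or roles[-1] != "user" or not set(roles) <= VALID_ROLES:
            continue
        # Exact-match dedup; a frozen dataclass with hashable fields is hashable.
        if pair in seen:
            continue
        seen.add(pair)
        kept.append(pair)
    return kept
```

The three checks mirror the "length-filtered, dedup'd,
chat-template-validated" annotation in the flow above; the real op-graph
may apply them in a different order or with different criteria.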

Adopting `data-juicer` (Alibaba, Apache-2.0) was the verdict of the
2026-05-26 reconnaissance; see docs/research/REPLAYSIM_NORMALIZATION_RECONNAISSANCE.md.
That survey found it to be the only mature library with native multi-turn
`messages` and DPO preference-pair ops that runs CPU-only for the ops we need.

Optional dependency: `pip install -e .[replaysim]` pulls `data-juicer`.
Without it, the normalizer raises `ImportError` at use time but the
package still imports cleanly.
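
One common way to get this "imports cleanly, fails at use time" behaviour
is to defer the import until the feature is first used. The helper below
is a sketch of that pattern under that assumption, not the code in
`normalize`; `require_optional` is a hypothetical name.

```python
import importlib


def require_optional(module_name, extra):
    # Import an optional dependency on first use, not at package import time.
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Re-raise with an install hint that names the extra to install.
        raise ImportError(
            f"{module_name!r} is required for this feature; "
            f"install it with: pip install -e .[{extra}]"
        ) from exc
```

A normalizer written this way would call the helper inside its
constructor or first method call, so importing the package never touches
the optional dependency.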

This package also re-exports the existing `teacher_replay` API for
convenience, so users can write
`from composer_replication.replaysim import replay_trace`.
"""
from __future__ import annotations

from composer_replication.replaysim.normalize import (
    DJNormalizer,
    NormalizedDPOPair,
    replay_and_normalize_trace,
)

# Re-exports from the pre-existing teacher_replay module (unchanged):
from composer_replication.teacher_replay import (
    DPOPair,
    TeacherCallResult,
    extract_dpo_pairs,
    replay_trace,
)

__all__ = [
    "DJNormalizer",
    "DPOPair",
    "NormalizedDPOPair",
    "TeacherCallResult",
    "extract_dpo_pairs",
    "replay_and_normalize_trace",
    "replay_trace",
]