Reinforcement Learning
Transformers
English
post-training
distillation
agentic-coding
composer-2.5
cursor
kimi-k2
grpo
dapo
diloco
openenv
trl
verl
research
methodology
"""composer_replication.replaysim: N-teacher trace replay + dataset normalization.

Per ADR-004, this package consolidates the framework's
"replay an LLM trace through N teachers, get a DPO/preference dataset" flow:

    raw trace
        ↓ (existing teacher_replay.replay_trace)
    list[TeacherCallResult]
        ↓ (existing teacher_replay.extract_dpo_pairs)
    list[DPOPair]
        ↓ (NEW: composer_replication.replaysim.normalize.DJNormalizer)
    list[NormalizedDPOPair]   # length-filtered, dedup'd, chat-template-validated

The pre-normalization pipeline is unchanged. The normalizer is opt-in via
the new convenience function `replay_and_normalize_trace(...)` which wraps
the existing `replay_trace` + `extract_dpo_pairs` and pipes their output
through a `data-juicer` op-graph.

Adopting `data-juicer` (Alibaba, Apache-2.0) was the verdict from the
2026-05-26 reconnaissance; see docs/research/REPLAYSIM_NORMALIZATION_RECONNAISSANCE.md.
It's the only mature library with NATIVE multi-turn `messages` + DPO
preference-pair ops that runs CPU-only on the ops we need.

Optional dependency: `pip install -e .[replaysim]` pulls `data-juicer`.
Without it, the normalizer raises `ImportError` at use time but the
package still imports cleanly.

This module re-exports the existing `teacher_replay` API for convenience
so users can `from composer_replication.replaysim import replay_trace`.
"""
from __future__ import annotations

from composer_replication.replaysim.normalize import (
    DJNormalizer,
    NormalizedDPOPair,
    replay_and_normalize_trace,
)

# Re-exports from the pre-existing teacher_replay module (unchanged):
from composer_replication.teacher_replay import (
    DPOPair,
    TeacherCallResult,
    extract_dpo_pairs,
    replay_trace,
)

__all__ = [
    "DJNormalizer",
    "DPOPair",
    "NormalizedDPOPair",
    "TeacherCallResult",
    "extract_dpo_pairs",
    "replay_and_normalize_trace",
    "replay_trace",
]
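
The docstring above describes the normalization stage as length filtering, deduplication, and chat-template validation of DPO pairs. The following is a minimal pure-Python sketch of those three steps, not the `data-juicer` op-graph that `DJNormalizer` is said to use. `SketchPair`, `has_valid_chat_shape`, `normalize_pairs`, the character thresholds, and the validity rule are all hypothetical names and assumptions chosen for illustration; the real `DPOPair` fields and filter settings are not shown in this module and may differ.

```python
from __future__ import annotations

from dataclasses import dataclass


# Hypothetical stand-in for the framework's DPOPair (real field names may differ).
@dataclass(frozen=True)
class SketchPair:
    messages: tuple[tuple[str, str], ...]  # (role, content) turns forming the prompt
    chosen: str                            # preferred completion
    rejected: str                          # dispreferred completion


VALID_ROLES = {"system", "user", "assistant"}


def has_valid_chat_shape(pair: SketchPair) -> bool:
    """Assumed validity rule: non-empty prompt, known roles, non-blank
    content, and a final prompt turn that comes from the user."""
    if not pair.messages:
        return False
    for role, content in pair.messages:
        if role not in VALID_ROLES or not content.strip():
            return False
    return pair.messages[-1][0] == "user"


def normalize_pairs(
    pairs: list[SketchPair],
    min_chars: int = 1,      # assumed lower bound on completion length
    max_chars: int = 4000,   # assumed upper bound on completion length
) -> list[SketchPair]:
    """Length-filter, validate, and exact-deduplicate pairs, keeping input order."""
    seen: set[SketchPair] = set()
    kept: list[SketchPair] = []
    for pair in pairs:
        # 1. Length filter on both completions.
        lengths_ok = all(
            min_chars <= len(text) <= max_chars
            for text in (pair.chosen, pair.rejected)
        )
        if not lengths_ok:
            continue
        # 2. Chat-shape validation of the prompt turns.
        if not has_valid_chat_shape(pair):
            continue
        # 3. Exact-match deduplication (frozen dataclasses are hashable).
        if pair in seen:
            continue
        seen.add(pair)
        kept.append(pair)
    return kept


if __name__ == "__main__":
    prompt = (("user", "Fix the failing test."),)
    good = SketchPair(prompt, "patched code", "unpatched code")
    too_long = SketchPair(prompt, "x" * 5000, "short")
    cleaned = normalize_pairs([good, good, too_long])
    print(len(cleaned))  # prints 1: the duplicate and the over-long pair are dropped
```

Exact-match deduplication is the simplest choice for a sketch; a production normalizer would more plausibly use fuzzy or hash-based deduplication operators, which is one reason to delegate this stage to a dedicated library.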