# composer_replication

The Composer 2.5 Replication Framework, packaged for `pip install`.

This package re-exports the verified APIs that live in the
[`spikes/`](../spikes/) directory of the parent repository, so that downstream
code can `import composer_replication` instead of poking at `sys.path`.

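
As a quick post-install check, the sketch below tries to import each submodule and reports which ones load. The module names are the ones listed in this README's package map; the helper name `check_imports` and the `MODULES` list are hypothetical and not part of the package. It only assumes that a missing module or a missing optional dependency surfaces as `ImportError`.

```python
import importlib

# Submodule names as listed in this README's package map.
MODULES = [
    "loss",
    "batch",
    "opsd",
    "teacher_replay",
    "hint_generator",
    "trainer",
    "ingestion",
    "diloco",
]


def check_imports(package="composer_replication"):
    """Return {dotted module name: True/False} for each expected submodule.

    Hypothetical helper for illustration; not shipped with the package.
    """
    status = {}
    for name in MODULES:
        full_name = f"{package}.{name}"
        try:
            importlib.import_module(full_name)
            status[full_name] = True
        except ImportError:
            # Covers both "package not installed" and "a dependency is missing".
            status[full_name] = False
    return status


if __name__ == "__main__":
    for module, ok in check_imports().items():
        print(f"{'ok     ' if ok else 'missing'}  {module}")
```

A `missing` line for `trainer` or `diloco` may simply mean that an optional dependency is not installed, rather than that the package itself is broken.
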
## Package map

| module | source spike | purpose |
|---|---|---|
| `composer_replication.loss` | spike 006 | Free function `compose_loss(model, batch, ...)`, a 3-channel loss composer, plus the `LossComponents` dataclass |
| `composer_replication.batch` | spike 006 | `build_batch(tokenizer)`: builds a real chat-template batch from any HF tokenizer |
| `composer_replication.opsd` | spike 005 | `generalized_jsd_loss` (verified port of `siyan-zhao/OPSD`) |
| `composer_replication.teacher_replay` | spike 001/005 | `replay_trace`, `extract_dpo_pairs`, `TraceState`, `TeacherSpec` (multi-teacher OpenRouter replay) |
| `composer_replication.hint_generator` | spike 005 | Hint-text construction at error sites for the SDPO channel |
| `composer_replication.trainer` | spike 005 | `ComposerReplicationTrainer` (TRL `GRPOTrainer` subclass with the 3 channels) |
| `composer_replication.ingestion` | spike 007 | `ClaudeCodeIngester` (Claude Code session JSONL → `TraceState`) |
| `composer_replication.diloco` | spike 008 | `make_diloco_outer_loop` (wraps `torchft.local_sgd.DiLoCo`) |

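
For orientation on the `opsd` row, here is a minimal pure-Python sketch of one common definition of the generalized Jensen-Shannon divergence between a teacher and a student distribution. This is an assumption about what `generalized_jsd_loss` computes, not the package's code: the function name `generalized_jsd`, the `beta` parameter, and the list-of-probabilities inputs are all illustrative, and the ported implementation may differ in conventions such as the direction of `beta` or the reduction.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions given as probability lists."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def generalized_jsd(p_teacher, q_student, beta=0.5):
    """Illustrative generalized JSD (assumed definition, not the package API).

    JSD_beta(P, Q) = beta * KL(P || M) + (1 - beta) * KL(Q || M),
    where M = beta * P + (1 - beta) * Q.
    """
    if not 0.0 < beta < 1.0:
        raise ValueError("beta must lie strictly between 0 and 1")
    mixture = [beta * p + (1.0 - beta) * q for p, q in zip(p_teacher, q_student)]
    teacher_term = kl_divergence(p_teacher, mixture)
    student_term = kl_divergence(q_student, mixture)
    return beta * teacher_term + (1.0 - beta) * student_term


if __name__ == "__main__":
    teacher = [0.7, 0.2, 0.1]
    student = [0.4, 0.4, 0.2]
    print(generalized_jsd(teacher, student, beta=0.5))
```

With `beta=0.5` this reduces to the standard symmetric Jensen-Shannon divergence, which is zero for identical distributions and at most `log 2` in nats.
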
## Why a package on top of spikes?

The spikes are research artifacts: each one has its own `README.md`, tests,
verdict, and a `sys.path` hack to find sibling modules. They live forever as
verification harnesses.

Most users want to `pip install -e . && python my_training_script.py`. This
package is the pip-installable face of the framework. The two surfaces stay
in sync because the package modules are 1:1 copies of the spike modules with
only the import paths changed (sibling-relative → package-absolute).

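
The "1:1 copy apart from imports" claim can be checked mechanically. The sketch below is a hypothetical drift check, not a tool shipped with the repository: it parses two source strings, drops top-level import statements and top-level `sys.path` manipulation, and compares what remains. The function names and the sample sibling import `from opsd import ...` are assumptions for illustration.

```python
import ast


def _is_import_plumbing(node):
    """True for top-level imports and top-level sys.path statements."""
    if isinstance(node, (ast.Import, ast.ImportFrom)):
        return True
    # e.g. `sys.path.insert(0, "..")` written as a bare expression statement
    return isinstance(node, ast.Expr) and "sys.path" in ast.unparse(node)


def body_without_imports(source):
    """Structural dump of a module with its import plumbing removed."""
    tree = ast.parse(source)
    kept = [node for node in tree.body if not _is_import_plumbing(node)]
    return "\n".join(ast.dump(node) for node in kept)


def same_modulo_imports(spike_source, package_source):
    """Hypothetical check: do the two modules differ only in their imports?"""
    return body_without_imports(spike_source) == body_without_imports(package_source)


if __name__ == "__main__":
    spike = (
        "import sys\n"
        "sys.path.insert(0, '..')\n"
        "from opsd import generalized_jsd_loss\n"
        "\n"
        "def f(x):\n"
        "    return x + 1\n"
    )
    package = (
        "from composer_replication.opsd import generalized_jsd_loss\n"
        "\n"
        "def f(x):\n"
        "    return x + 1\n"
    )
    print(same_modulo_imports(spike, package))
```

Because the comparison works on the syntax tree, it ignores comments and formatting, and it does not look at imports nested inside functions. A stricter check could diff the raw text instead.
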
## Quickstart

See [`examples/qwen_05b_quickstart/`](../examples/qwen_05b_quickstart/) at
the repo root.