# composer_replication
The Composer 2.5 Replication Framework, packaged for `pip install`.
This package re-exports the verified APIs that live in the
[`spikes/`](../spikes/) directory of the parent repository, so that downstream
code can `import composer_replication` instead of poking at `sys.path`.
## Package map
| module | source spike | purpose |
|---|---|---|
| `composer_replication.loss` | spike 006 | Free function `compose_loss(model, batch, ...)` that composes the 3-channel loss, plus the `LossComponents` dataclass |
| `composer_replication.batch` | spike 006 | `build_batch(tokenizer)` — real chat-template batch from any HF tokenizer |
| `composer_replication.opsd` | spike 005 | `generalized_jsd_loss` (verified port of `siyan-zhao/OPSD`) |
| `composer_replication.teacher_replay` | spike 001/005 | `replay_trace`, `extract_dpo_pairs`, `TraceState`, `TeacherSpec` (multi-teacher OpenRouter replay) |
| `composer_replication.hint_generator` | spike 005 | Hint-text construction at error sites for SDPO channel |
| `composer_replication.trainer` | spike 005 | `ComposerReplicationTrainer` (TRL `GRPOTrainer` subclass with the 3 channels) |
| `composer_replication.ingestion` | spike 007 | `ClaudeCodeIngester` (Claude Code session JSONL → `TraceState`) |
| `composer_replication.diloco` | spike 008 | `make_diloco_outer_loop` (wraps `torchft.local_sgd.DiLoCo`) |
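
As a quick way to confirm that an installed copy exposes the names listed in the table, the sketch below tries to import each module and reports any missing attributes. It is a minimal illustration, not part of the package: `PACKAGE_MAP` and `check_exports` are hypothetical names introduced here, and the symbol lists are copied from the table (the `hint_generator` row names no symbol, so its list is left empty).

```python
import importlib

# Module -> public names, copied from the package map table.
# PACKAGE_MAP and check_exports are hypothetical helpers, not package APIs.
PACKAGE_MAP = {
    "composer_replication.loss": ["compose_loss", "LossComponents"],
    "composer_replication.batch": ["build_batch"],
    "composer_replication.opsd": ["generalized_jsd_loss"],
    "composer_replication.teacher_replay": [
        "replay_trace", "extract_dpo_pairs", "TraceState", "TeacherSpec",
    ],
    "composer_replication.hint_generator": [],  # table names no symbol
    "composer_replication.trainer": ["ComposerReplicationTrainer"],
    "composer_replication.ingestion": ["ClaudeCodeIngester"],
    "composer_replication.diloco": ["make_diloco_outer_loop"],
}


def check_exports(package_map=PACKAGE_MAP):
    """Map each module to its missing names, or None if it cannot be imported."""
    report = {}
    for module_name, names in package_map.items():
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            # Package not installed, or one of its dependencies is missing.
            report[module_name] = None
            continue
        report[module_name] = [n for n in names if not hasattr(module, n)]
    return report


if __name__ == "__main__":
    for module_name, missing in check_exports().items():
        if missing is None:
            print(f"{module_name}: not importable")
        elif missing:
            print(f"{module_name}: missing {missing}")
        else:
            print(f"{module_name}: ok")
```

A module reported as not importable may simply lack an optional dependency in the current environment, so the report is a starting point for debugging rather than a verdict on the install.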
## Why a package on top of spikes?
The spikes are research artifacts: each one has its own `README.md`, tests,
verdict, and a `sys.path` hack to find sibling modules. They stay in the
repository permanently as verification harnesses.
Most users want to `pip install -e . && python my_training_script.py`. This
package is the pip-installable face of the framework. The two surfaces stay
in sync because the package modules are 1:1 copies of the spike modules with
only the import paths changed (sibling-relative → package-absolute).
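
To make the "sibling-relative → package-absolute" change concrete, here is a rough sketch of what such an import rewrite could look like. It assumes the spike modules import their siblings by bare name (for example `from loss import compose_loss` after the `sys.path` hack); that form, the `to_package_absolute` helper, and its regex approach are illustrative assumptions, not the repository's actual sync mechanism.

```python
import re


def to_package_absolute(source, siblings, package="composer_replication"):
    """Rewrite bare imports of sibling modules into package-absolute imports.

    Hypothetical helper: handles only simple `from X import ...` and
    `import X` lines, where X is one of the given sibling module names.
    """
    if not siblings:
        return source
    names = "|".join(re.escape(name) for name in sorted(siblings))
    from_pattern = re.compile(
        rf"^([ \t]*)from\s+({names})\s+import\b", re.MULTILINE
    )
    import_pattern = re.compile(
        rf"^([ \t]*)import\s+({names})[ \t]*$", re.MULTILINE
    )
    # from loss import x  ->  from composer_replication.loss import x
    source = from_pattern.sub(rf"\1from {package}.\2 import", source)
    # import opsd  ->  from composer_replication import opsd
    source = import_pattern.sub(rf"\1from {package} import \2", source)
    return source


if __name__ == "__main__":
    spike_source = (
        "import sys\n"
        "from loss import compose_loss\n"
        "import opsd\n"
        "from torch import nn\n"
    )
    print(to_package_absolute(spike_source, {"loss", "opsd"}))
```

Imports of third-party modules such as `torch` are left untouched because only names passed in `siblings` are rewritten.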
## Quickstart
See [`examples/qwen_05b_quickstart/`](../examples/qwen_05b_quickstart/) at
the repo root.