Reinforcement Learning
Transformers
English
post-training
distillation
agentic-coding
composer-2.5
cursor
kimi-k2
grpo
dapo
diloco
openenv
trl
verl
research
methodology
Instructions to use Codeseys/composer-replication-framework with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use Codeseys/composer-replication-framework with Transformers:
# Load model directly from transformers import AutoModel model = AutoModel.from_pretrained("Codeseys/composer-replication-framework", dtype="auto") - Notebooks
- Google Colab
- Kaggle
| """composer_replication.distillation — pluggable self-distillation losses. | |
| Per ADR-007, three losses additive to the framework's existing | |
| SDPO/OPSD (`generalized_jsd_loss`): | |
| - SimPO: reference-free DPO replacement (channel 3 alternative) | |
| - TAID: annealed teacher interpolation (wraps generalized_jsd_loss for channel 2) | |
| - Entropy-Aware OPD: token-wise gated forward/reverse KL (alternative | |
| channel-2 wrapper, per ICLR 2026 Spotlight) | |
| All three are pure PyTorch — no external deps — so they ship in the core | |
| package without optional extras. | |
| Usage in `compose_loss`: | |
| >>> from composer_replication import compose_loss | |
| >>> components = compose_loss( | |
| ... model, batch, | |
| ... dpo_variant="simpo", # channel 3: DPO -> SimPO | |
| ... sdpo_wrapper="taid", # channel 2: SDPO -> TAID | |
| ... taid_t=0.4, # current TAID interpolation coeff | |
| ... ) | |
| """ | |
| from __future__ import annotations | |
| from composer_replication.distillation.simpo import simpo_loss | |
| from composer_replication.distillation.taid import TAIDScheduler, taid_loss | |
| from composer_replication.distillation.entropy_aware_opd import entropy_aware_opd_loss | |
| __all__ = [ | |
| "simpo_loss", | |
| "taid_loss", | |
| "TAIDScheduler", | |
| "entropy_aware_opd_loss", | |
| ] | |