# Publication Release Checklist

> **Last updated:** 2026-05-25
> **Current state:** all materials drafted; nothing posted publicly yet.
> Use this checklist to coordinate the publication wave when ready to ship.

## What's drafted

| Artifact | Path | Status | Word count (approx) |
|---|---|---|---|
| Longform methodology paper | [`publications/PAPER_v0.md`](PAPER_v0.md) | ✅ DRAFTED | ~6,500 |
| Blog post (HF Blog format) | [`publications/BLOG_POST.md`](BLOG_POST.md) | ✅ DRAFTED | ~2,400 |
| HF Discussion thread (repo Community tab) | [`publications/HF_DISCUSSION_POST.md`](HF_DISCUSSION_POST.md) | ✅ DRAFTED | ~700 |
| Twitter / X thread (13-tweet + 5-tweet + LinkedIn variants) | [`publications/TWITTER_THREAD.md`](TWITTER_THREAD.md) | ✅ DRAFTED | ~1,200 |
| `CITATION.cff` (HF/GitHub Citation Format) | [`/CITATION.cff`](../CITATION.cff) | ✅ DRAFTED | n/a |
| `CITATION.bib` (BibTeX) | [`/CITATION.bib`](../CITATION.bib) | ✅ DRAFTED | n/a |
| Repo README (model card with frontmatter) | [`/README.md`](../README.md) | ✅ Already published (v3 with wave 4 status) | ~1,000 |

The draft materials in `publications/` are **not yet posted**; the citation files and the README sit at the repo root. Nothing is gated by review, so publishing is a self-publish decision. Ready to ship.
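The paths and approximate word counts in the table can be re-checked mechanically before shipping. The following is a minimal sketch, assuming the script runs against a local checkout; `draft_report` is a hypothetical helper name, and the word count is a plain whitespace split, so markup is counted as words.

```python
from pathlib import Path

# Paths copied from the table above, relative to the repo root.
DRAFTS = [
    "publications/PAPER_v0.md",
    "publications/BLOG_POST.md",
    "publications/HF_DISCUSSION_POST.md",
    "publications/TWITTER_THREAD.md",
    "CITATION.cff",
    "CITATION.bib",
    "README.md",
]


def draft_report(repo_root, paths=DRAFTS):
    """Return {path: word_count or None}; None marks a missing file."""
    root = Path(repo_root)
    report = {}
    for rel in paths:
        file = root / rel
        if file.is_file():
            # Rough figure only: splits on whitespace, markup included.
            report[rel] = len(file.read_text(encoding="utf-8").split())
        else:
            report[rel] = None
    return report
```

Calling `draft_report(".")` from the repo root returns one entry per artifact; any `None` value points at a file that has moved or was never committed.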

## Pre-flight check before shipping any of these

These items should be confirmed before posting any of the public-facing materials. Most were completed in earlier waves and are listed here for completeness:

- [x] HF repo is public (`Codeseys/composer-replication-framework`)
- [x] All linked URLs resolve (cross-checked during drafts)
- [x] Test suite passes (`38/38` as of wave 4)
- [x] Spike 001 is reproducible (deterministic states + recorded results)
- [x] Cursor blog is correctly summarized (audit notice in `research/01-composer-2.5.md`)
- [x] Upstream papers cited correctly (OPSD, SDPO, Cursor blog with arXiv IDs verified)
- [x] License is MIT and consistent across `LICENSE` + `README.md` frontmatter + `CITATION.cff`
- [ ] **`CITATION.cff` author block updated with real name/ORCID** if desired (currently just "Codeseys")
- [ ] **Choose final author identity** for the byline (Codeseys handle? real name? affiliation?)
- [ ] **HF Discussion title / tags chosen** — suggested in `HF_DISCUSSION_POST.md`
- [ ] **Blog thumbnail prepared** — placeholder path in `BLOG_POST.md` frontmatter (`/blog/assets/composer-replication-framework/thumbnail.png`); needs a real image
- [ ] **arXiv submission decided** — see § "arXiv submission" below
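The license-consistency item above lends itself to a scripted check. This is a sketch under stated assumptions, not the repo's actual tooling: it assumes the first non-empty line of `LICENSE` names the license (for example "MIT License"), that `README.md` opens with a `---` delimited frontmatter block containing a `license:` key, and that `CITATION.cff` carries a top-level `license:` key. The function names are hypothetical.

```python
import re


def license_ids(license_text, readme_text, cff_text):
    """Extract the license identifier each file declares, lowercased."""
    found = {}

    # LICENSE: assume the first non-empty line names the license.
    first = next((ln.strip() for ln in license_text.splitlines() if ln.strip()), "")
    found["LICENSE"] = first.lower().replace("license", "").strip() or None

    # README.md: look for `license:` inside the leading frontmatter block.
    fm = re.match(r"\A---\n(.*?)\n---", readme_text, flags=re.DOTALL)
    m = re.search(r"^license:\s*(\S+)", fm.group(1), flags=re.MULTILINE) if fm else None
    found["README.md"] = m.group(1).strip("\"'").lower() if m else None

    # CITATION.cff: top-level `license:` key.
    m = re.search(r"^license:\s*(\S+)", cff_text, flags=re.MULTILINE)
    found["CITATION.cff"] = m.group(1).strip("\"'").lower() if m else None
    return found


def licenses_consistent(found, expected="mit"):
    """True only if every file declares the expected identifier."""
    return all(value == expected for value in found.values())
```

Reading the three files from disk and passing their contents to `license_ids` gives a per-file view, which makes a mismatch easier to locate than a single pass/fail flag.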

## Sequencing recommendation

If publishing all materials, this order minimizes risk and maximizes signal:

1. **HF Discussion post first** (lowest-stakes — repo Community tab; anyone landing on the repo will see it; it pre-announces the methodology paper).
2. **Blog post / personal site second** (anchor narrative, ~2,400 words, easy to share).
3. **X / LinkedIn third** (after the blog post URL exists to anchor the thread).
4. **arXiv submission last** (if doing this — needs more polish; see below).

A three-day gap between (1) and (2) is reasonable: it lets the discussion post collect early feedback that can be incorporated into the blog.

## Distribution / amplification ideas

- Cross-post the blog to:
  - HuggingFace blog (PR against `huggingface/blog` repo). Their submission process is documented at https://huggingface.co/docs/hub/en/blog
  - Personal blog / Substack / Medium
- Post the discussion in:
  - r/LocalLLaMA (will be eaten by their algorithm but worth one shot)
  - r/MachineLearning if you tag `[R]` and frame as "novel methodology, no results yet — looking for feedback"
  - HackerNews "Show HN: …" — pre-experimental disclosure should be in the title
  - LessWrong / Alignment Forum if you frame the reward-hacking section as the lead
- Tag in the Twitter thread:
  - `@cursor_ai` (Cursor team)
  - `@huggingface` (TRL team)
  - `@volcanoengine` (VeRL team)
  - `@MoonshotAI` (Kimi K2.5)
  - `@PrimeIntellect`

## arXiv submission (decide later)

The methodology paper is currently in markdown. Pros and cons of a formal arXiv release:

**Pros**
- Citable DOI; appears in Google Scholar / Semantic Scholar
- Reaches a non-HF research audience
- Forces a higher polish bar, which catches errors

**Cons**
- Needs LaTeX conversion (~1 day of formatting work)
- The "no experimental results yet" framing is unusual for arXiv; readers may dismiss it
- Once posted, it's permanent — corrections live as v2/v3 markers

**Recommendation:** post the HF blog and discussion first; decide on arXiv only after spikes 002–004 produce results. Then make it a v0.1 paper *with* experimental backing. The current methodology paper becomes Sections 2–4 of that future paper, with new Sections 5+ for the empirical results.

If you do submit to arXiv now anyway: use cs.LG as the primary category with a cs.AI cross-list, keep the title from `PAPER_v0.md`, and take the abstract from the paper. In the comments field, frame it as "pre-experimental methodology release; experimental validation in follow-up."

## Embargo / coordination notes

- **Cursor team coordination:** not strictly required (their blog is public, their cited papers are public, no proprietary info), but a polite heads-up tweet on day-of release is reasonable since the post heavily engages their work. `@cursor_ai` tag on tweet 1 of the X thread.
- **OPSD authors coordination:** Siyan Zhao et al. — also not required (MIT code, public paper) but tagging the lead author on the X thread is a polite signal of citation. Their handles: try `@siyan_zhao` (verify before tagging).
- **SDPO authors coordination:** same as for OPSD. The handles of Hübotter et al. are unverified; skip tagging if they cannot be found.

## Risk register

| Risk | Likelihood | Mitigation |
|---|---|---|
| Someone runs spike 004 first and beats us to publication | Medium | Acknowledged. Trade-off accepted. The integration architecture is independently citable. |
| Methodology error caught after publication | Medium | Drafts have been audited (DeepWiki for code, primary-source-read for Cursor blog). 38 unit tests catch wiring bugs. The "what's NOT proven" section in the paper is explicit about open claims. |
| Hostile read claiming we overclaim novelty | Low | The paper explicitly compares to rStar / Math-Shepherd / Magpie / MoA and concedes "absence of evidence is not evidence of absence" in §9. |
| Cursor team objects to characterization | Low | Everything cited from their public blog with explicit `[BLOG-VERIFIED]` tags. SDPO/OPSD framing is supported by their own footnote. |
| Repo gets a flood of PRs / discussion noise | Low | Welcome the noise. Maintain `CONTRIBUTING.md` (TBD) when traffic justifies. |

## Post-publication tracking (if you ship)

Things to monitor in the first 2 weeks after publication:

- HF repo: likes and downloads (reachable via the Hub API)
- HF Discussions tab: new threads, especially anything flagging methodology errors
- X thread: replies from people working on TRL / VeRL / OpenEnv (especially extension-point critiques)
- Citations / mentions in adjacent posts (set up Google Scholar Alert)
- arXiv mentions (if any related work cites the preprint or blog)
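The repo metrics in the list above can be polled with the standard library alone. A minimal sketch, assuming the repo is a model repo served by the public `https://huggingface.co/api/models/<repo_id>` endpoint and that the payload exposes `likes` and `downloads` fields; the function names are hypothetical, and the field selection should be checked against a live response before relying on it.

```python
import json
from urllib.request import urlopen

REPO_ID = "Codeseys/composer-replication-framework"


def summarize(info):
    """Pick the tracking fields out of a model-info JSON payload."""
    return {"likes": info.get("likes", 0), "downloads": info.get("downloads", 0)}


def fetch_model_info(repo_id=REPO_ID, timeout=10):
    """GET the public Hub endpoint for a model repo (needs network access)."""
    url = f"https://huggingface.co/api/models/{repo_id}"
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

Running `summarize(fetch_model_info())` once a day and appending the result to a log file is enough to see a trend over the two-week window.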

If a methodology error surfaces, follow this response protocol:
1. Acknowledge in the Discussion thread within 24 hours.
2. Patch the affected file in the repo with a clear commit message.
3. Add an "Errata" section to `PAPER_v0.md` documenting what was wrong and what changed.
4. Don't try to silently rewrite history.

---

*Drafts ready. Ship when you decide. The repo is in a clean state to support any subset of the publication wave above.*