	# Publication Release Checklist

	> Last updated: 2026-05-25
	> Current state: all materials drafted; nothing posted publicly yet.
	> Use this checklist to coordinate the publication wave when ready to ship.

	## What's drafted

	| Artifact | Path | Status | Word count (approx.) |
	|---|---|---|---|
	| Longform methodology paper | [`publications/PAPER_v0.md`](PAPER_v0.md) | ✅ DRAFTED | ~6,500 |
	| Blog post (HF Blog format) | [`publications/BLOG_POST.md`](BLOG_POST.md) | ✅ DRAFTED | ~2,400 |
	| HF Discussion thread (repo Community tab) | [`publications/HF_DISCUSSION_POST.md`](HF_DISCUSSION_POST.md) | ✅ DRAFTED | ~700 |
	| Twitter / X thread (13-tweet + 5-tweet + LinkedIn variants) | [`publications/TWITTER_THREAD.md`](TWITTER_THREAD.md) | ✅ DRAFTED | ~1,200 |
	| `CITATION.cff` (Citation File Format, for HF/GitHub) | [`/CITATION.cff`](../CITATION.cff) | ✅ DRAFTED | n/a |
	| `CITATION.bib` (BibTeX) | [`/CITATION.bib`](../CITATION.bib) | ✅ DRAFTED | n/a |
	| Repo README (model card with frontmatter) | [`/README.md`](../README.md) | ✅ Already published (v3 with wave 4 status) | ~1,000 |

	All drafts live in `publications/`, except `CITATION.cff` and `CITATION.bib` at the repo root, and none has been posted yet. Nothing is gated by external review; every release is a self-publish decision. The drafts are ready to ship.

	## Pre-flight check before shipping any of these

	Confirm these items before posting any of the public-facing materials. Most were completed in earlier waves and are listed here for completeness:

	- [x] HF repo is public (`Codeseys/composer-replication-framework`)
	- [x] All linked URLs resolve (cross-checked during drafts)
	- [x] Test suite passes (`38/38` as of wave 4)
	- [x] Spike 001 is reproducible (deterministic states + recorded results)
	- [x] Cursor blog is correctly summarized (audit notice in `research/01-composer-2.5.md`)
	- [x] Upstream sources cited correctly (OPSD and SDPO with arXiv IDs verified; Cursor blog)
	- [x] License is MIT and consistent across `LICENSE` + `README.md` frontmatter + `CITATION.cff`
	- [ ] `CITATION.cff` author block updated with real name/ORCID if desired (currently just "Codeseys")
	- [ ] Choose final author identity for the byline (Codeseys handle? real name? affiliation?)
	- [ ] HF Discussion title / tags chosen — suggested in `HF_DISCUSSION_POST.md`
	- [ ] Blog thumbnail prepared — placeholder path in `BLOG_POST.md` frontmatter (`/blog/assets/composer-replication-framework/thumbnail.png`); needs a real image
	- [ ] arXiv submission decided — see § "arXiv submission" below
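
The license-consistency item above can be re-verified mechanically before each release. The following is a minimal sketch, assuming that `LICENSE` opens with the standard "MIT License" heading and that both the `README.md` frontmatter and `CITATION.cff` carry a top-level `license:` key. The helper names are hypothetical, and the line-based regex stands in for a real YAML parser.

```python
import re
from pathlib import Path
from typing import Dict, Optional


def license_from_text(license_text: str) -> Optional[str]:
    """Guess the license ID from the first non-blank line of a LICENSE file.

    Assumption: only the MIT heading is recognized; anything else yields None.
    """
    for line in license_text.splitlines():
        if line.strip():
            return "mit" if "mit license" in line.lower() else None
    return None


def frontmatter_or_all(text: str) -> str:
    """Return the YAML frontmatter if the text starts with '---', else the whole text."""
    if text.startswith("---"):
        end = text.find("\n---", 3)
        if end != -1:
            return text[3:end]
    return text


def license_from_yaml(text: str) -> Optional[str]:
    """Read a top-level `license: value` line (not a full YAML parser)."""
    match = re.search(
        r"^license:[ \t]*['\"]?([\w.+-]+)['\"]?[ \t]*$",
        frontmatter_or_all(text),
        re.MULTILINE,
    )
    return match.group(1).lower() if match else None


def collect_licenses(repo_root: Path) -> Dict[str, Optional[str]]:
    """Collect the license ID declared by each of the three files."""
    def read(name: str) -> str:
        return (repo_root / name).read_text(encoding="utf-8")

    return {
        "LICENSE": license_from_text(read("LICENSE")),
        "README.md": license_from_yaml(read("README.md")),
        "CITATION.cff": license_from_yaml(read("CITATION.cff")),
    }


def is_consistent(found: Dict[str, Optional[str]]) -> bool:
    """True when every file declares the same, non-missing license ID."""
    values = set(found.values())
    return len(values) == 1 and None not in values
```

Running `is_consistent(collect_licenses(Path(".")))` from the repo root gives a single pass/fail answer; printing the dictionary shows which file disagrees.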

	## Sequencing recommendation

	If publishing all materials, this order minimizes risk and maximizes signal:

	1. HF Discussion post first (lowest-stakes — repo Community tab; anyone landing on the repo will see it; it pre-announces the methodology paper).
	2. Blog post / personal site second (anchor narrative, ~2,400 words, easy to share).
	3. X / LinkedIn third (after the blog post URL exists to anchor the thread).
	4. arXiv submission last (if doing this — needs more polish; see below).

	A three-day gap between (1) and (2) is reasonable: it gives the discussion post time to collect early feedback that can be incorporated into the blog.

	## Distribution / amplification ideas

	- Cross-post the blog to:
	  - HuggingFace blog (PR against `huggingface/blog` repo). Their submission process is documented at https://huggingface.co/docs/hub/en/blog
	  - Personal blog / Substack / Medium
	- Post the discussion in:
	  - r/LocalLLaMA (will be eaten by their algorithm but worth one shot)
	  - r/MachineLearning if you tag `[R]` and frame as "novel methodology, no results yet; looking for feedback"
	  - Hacker News "Show HN: …" (the pre-experimental disclosure should be in the title)
	  - LessWrong / Alignment Forum if you frame the reward-hacking section as the lead
	- Tag in the Twitter thread:
	  - `@cursor_ai` (Cursor team)
	  - `@huggingface` (TRL team)
	  - `@volcanoengine` (VeRL team)
	  - `@MoonshotAI` (Kimi K2.5)
	  - `@PrimeIntellect`

	## arXiv submission (decide later)

	The methodology paper is currently in markdown. Pros and cons of a formal arXiv release:

	Pros
	- Citable DOI; appears in Google Scholar / Semantic Scholar
	- Reaches a non-HF research audience
	- Forces a higher polish bar, which catches errors

	Cons
	- Needs LaTeX conversion (~1 day of formatting work)
	- The "no experimental results yet" framing is unusual for arXiv; moderators and readers may dismiss it
	- Once posted, it's permanent: corrections appear as new versions (v2, v3), and earlier versions stay publicly accessible

	Recommendation: post the HF blog and discussion first; decide on arXiv only after spikes 002–004 produce results. Then make it a v0.1 paper with experimental backing. The current methodology paper becomes Sections 2–4 of that future paper, with new Sections 5+ for the empirical results.

	If you do submit to arXiv now anyway: use cs.LG as the primary category and cross-list to cs.AI. Keep the same title as `PAPER_v0.md` and take the abstract from the paper. In the submission's comments field, frame it as "pre-experimental methodology release; experimental validation in follow-up."

	## Embargo / coordination notes

	- Cursor team coordination: not strictly required (their blog is public, their cited papers are public, no proprietary info), but a polite heads-up tweet on day-of release is reasonable since the post heavily engages their work. `@cursor_ai` tag on tweet 1 of the X thread.
	- OPSD authors coordination: Siyan Zhao et al. — also not required (MIT code, public paper) but tagging the lead author on the X thread is a polite signal of citation. Their handles: try `@siyan_zhao` (verify before tagging).
	- SDPO authors coordination: same as for OPSD, not required. Handles for Hübotter et al. are unverified; skip tagging if they cannot be found.

	## Risk register

	| Risk | Likelihood | Mitigation |
	|---|---|---|
	| Someone runs spike 004 first and beats us to publication | Medium | Acknowledged. Trade-off accepted. The integration architecture is independently citable. |
	| Methodology error caught after publication | Medium | Drafts have been audited (DeepWiki for code, primary-source-read for Cursor blog). 38 unit tests catch wiring bugs. The "what's NOT proven" section in the paper is explicit about open claims. |
	| Hostile read claiming we overclaim novelty | Low | The paper explicitly compares to rStar / Math-Shepherd / Magpie / MoA and concedes "absence of evidence is not evidence of absence" in §9. |
	| Cursor team objects to characterization | Low | Everything cited from their public blog with explicit `[BLOG-VERIFIED]` tags. SDPO/OPSD framing is supported by their own footnote. |
	| Repo gets a flood of PRs / discussion noise | Low | Welcome the noise. Maintain `CONTRIBUTING.md` (TBD) when traffic justifies. |

	## Post-publication tracking (if you ship)

	Things to monitor in the first 2 weeks after publication:

	- HF repo: likes and downloads (the Hub's counterparts to stars and forks; both are reachable via the Hub API)
	- HF Discussions tab: new threads, especially anything flagging methodology errors
	- X thread: replies from people working on TRL / VeRL / OpenEnv (especially extension-point critiques)
	- Citations / mentions in adjacent posts (set up a Google Scholar alert)
	- arXiv mentions (if any related work cites the preprint or blog)
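
The repo metrics in the first bullet can be polled with a short script. This is a minimal sketch, assuming the repo is a model repo served by the public `https://huggingface.co/api/models/{repo_id}` endpoint and that the JSON payload carries `likes` and `downloads` fields. The function names are hypothetical, and the network call is kept separate from the parsing so the parsing can be checked offline.

```python
import json
from typing import Any, Dict
from urllib.request import urlopen

# Assumption: the repo is a model repo; dataset and Space repos use other API paths.
HUB_MODEL_API = "https://huggingface.co/api/models/{repo_id}"


def fetch_repo_info(repo_id: str, timeout: float = 10.0) -> Dict[str, Any]:
    """Fetch the raw JSON metadata for a public model repo (needs network access)."""
    with urlopen(HUB_MODEL_API.format(repo_id=repo_id), timeout=timeout) as response:
        return json.load(response)


def summarize_repo_info(info: Dict[str, Any]) -> Dict[str, int]:
    """Keep only the counters worth tracking; missing fields count as zero."""
    return {
        "likes": int(info.get("likes", 0)),
        "downloads": int(info.get("downloads", 0)),
    }


def delta(before: Dict[str, int], after: Dict[str, int]) -> Dict[str, int]:
    """Change in each counter between two snapshots."""
    return {key: after[key] - before.get(key, 0) for key in after}
```

Calling `summarize_repo_info(fetch_repo_info("Codeseys/composer-replication-framework"))` once a day and storing the result is enough to feed `delta` for a two-week trend.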

	If a methodology error surfaces, follow this response protocol:
	1. Acknowledge in the Discussion thread within 24 hours.
	2. Patch the affected file in the repo with a clear commit message.
	3. Add an "Errata" section to `PAPER_v0.md` documenting what was wrong and what changed.
	4. Don't try to silently rewrite history.

	---

	Drafts ready. Ship when you decide. The repo is in a clean state to support any subset of the publication wave above.