tfrere HF Staff
feat(storage): first-class data - no silent failures in the persistence pipeline
7a42df5

Collab Editor - Test Plan

Companion document to SPECIFICATION.md. Covers the critical paths that, if broken, would cause data loss, publishing failures, or security holes. Intentionally not exhaustive - the project is evolving fast, and these tests are chosen to survive refactors.

Principles

  • Test behavior, not implementation details.
  • Focus on what breaks silently (data loss, corrupt publish, auth bypass).
  • Keep tests decoupled from internals so they survive refactors.
  • Fast to run - no heavy dependencies (Playwright PDF, real HF API) in the main suite.

Stack

| Layer | Tool | Scope |
|---|---|---|
| Backend unit/integration | Vitest + Supertest | Publisher, API routes, auth guards |
| Yjs helpers | In-memory Y.Doc fixtures | Publisher extraction, storage logic |
| Frontend unit | Vitest + Testing Library | Agent tool execution, undo batching |
| E2E | Playwright (backend/e2e, frontend/tests) | Critical editor flows: load, edit, comments, publish round-trip |

E2E is opt-in (npm run test:e2e in backend/, plus Playwright specs in frontend/tests/); the fast suite (npm run test) stays hermetic and Playwright-free so it can run in CI without a browser install.


1. Publisher Pipeline (P0)

The core value of the product. If publishing breaks, the article disappears.

1.1 Y.Doc extraction

| # | Test | Given / When / Then | Priority |
|---|---|---|---|
| 1.1.1 | Extract frontmatter from Y.Doc | Given a Y.Doc with title, authors, affiliations in Y.Map("frontmatter") / When extractFromYDoc runs / Then returned object contains all frontmatter fields with correct types | P0 |
| 1.1.2 | Extract content from empty doc | Given a Y.Doc with an empty Y.XmlFragment("default") / When extracted / Then returns empty content without throwing | P0 |
| 1.1.3 | Extract with citations | Given a Y.Doc with entries in Y.Map("citations") / When extracted / Then citations map is included in output as CSL-JSON | P0 |

1.2 HTML generation

| # | Test | Given / When / Then | Priority |
|---|---|---|---|
| 1.2.1 | Produces valid self-contained HTML | Given extracted doc data / When renderArticleHTML runs / Then output is valid HTML with inline CSS, no external stylesheet links | P0 |
| 1.2.2 | Custom media queries resolved | Given template CSS with @custom-media / When resolveCustomMedia runs / Then output contains only standard @media rules, no @custom-media | P0 |
| 1.2.3 | TOC generated from headings | Given content with h2 and h3 / When rendered / Then HTML contains TOC nav with matching anchor links | P1 |
| 1.2.4 | Theme toggle present | Given any doc / When rendered / Then output contains theme toggle SVG (sun/moon) and associated script | P1 |
| 1.2.5 | Bibliography injected | Given doc with citations / When rendered / Then HTML contains bibliography section with formatted entries | P0 |
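
Test 1.2.2 is easiest to reason about with a concrete transform in mind. The following is a minimal sketch of what a custom-media resolver might do, not the project's actual resolveCustomMedia: the function names, the regex-based approach, and the supported syntax (simple `@custom-media --name (query);` definitions only) are all assumptions.

```typescript
// Hypothetical stand-in for resolveCustomMedia; name and signature are assumptions.
function resolveCustomMediaSketch(css: string): string {
  const definitions = new Map<string, string>();

  // 1. Collect `@custom-media --name <query>;` rules and drop them from the output.
  const withoutDefinitions = css.replace(
    /@custom-media\s+(--[\w-]+)\s+([^;]+);/g,
    (_match: string, name: string, query: string) => {
      definitions.set(name, query.trim());
      return "";
    },
  );

  // 2. Expand `(--name)` references inside each @media prelude.
  //    Unknown names are left untouched so the check below can flag them.
  return withoutDefinitions.replace(/@media([^{]+)\{/g, (_match: string, prelude: string) => {
    const expanded = prelude.replace(
      /\((--[\w-]+)\)/g,
      (reference: string, name: string) => definitions.get(name) ?? reference,
    );
    return `@media${expanded}{`;
  });
}

// The "Then" clause of 1.2.2 as a predicate: no definitions and no unresolved references.
function hasOnlyStandardMedia(css: string): boolean {
  return !css.includes("@custom-media") && !/@media[^{]*\(--/.test(css);
}
```

A test would feed the template CSS through the real function and assert the predicate, which keeps the test independent of how the resolver is implemented.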

1.3 Post-processing

| # | Test | Given / When / Then | Priority |
|---|---|---|---|
| 1.3.1 | Accordion to details/summary | Given HTML with accordion div / When post-processed / Then output uses <details> and <summary> tags | P1 |
| 1.3.2 | htmlEmbed to iframe | Given HTML with htmlEmbed node / When post-processed / Then output contains <iframe> with correct src | P0 |
| 1.3.3 | Mermaid to pre block | Given HTML with mermaid node / When post-processed / Then output contains <pre class="mermaid"> | P1 |

1.4 Publish idempotency and restore

| # | Test | Given / When / Then | Priority |
|---|---|---|---|
| 1.4.1 | Publish twice gives same result | Given a Y.Doc / When published twice / Then both HTML outputs are byte-identical (no timestamps or random IDs) | P0 |
| 1.4.2 | Published article restored on boot | Given published assets in HF dataset but empty local FS / When ensurePublishedRestored runs / Then data/published/default/index.html exists locally | P0 |
| 1.4.3 | GET / serves published article | Given local published index.html exists / When GET / / Then response is the published HTML with 200 | P0 |
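
Byte-identical output (1.4.1) only holds if every generated identifier is derived from the content itself. As an illustration, heading anchors can be built from the heading text with a collision counter instead of a random suffix. This is a sketch under that assumption; `slugify` and `assignHeadingIds` are hypothetical names, not functions known to exist in the publisher.

```typescript
// Turn heading text into a URL-safe slug. ASCII-only on purpose, to keep the sketch short.
function slugify(text: string): string {
  return text
    .toLowerCase()
    .trim()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

// Assign one ID per heading. Duplicates get a numeric suffix, so the same
// input list always yields the same IDs in the same order.
function assignHeadingIds(headings: string[]): string[] {
  const used = new Set<string>();
  return headings.map((heading) => {
    const base = slugify(heading) || "section";
    let id = base;
    let suffix = 1;
    while (used.has(id)) {
      id = `${base}-${suffix}`;
      suffix += 1;
    }
    used.add(id);
    return id;
  });
}
```

The idempotency test itself stays a plain string comparison of two publish runs; the sketch only shows the kind of code that makes that comparison pass.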

2. Persistence and HF Storage (P0)

Data loss is game over for a collaborative editor.

2.1 Local persistence

| # | Test | Given / When / Then | Priority |
|---|---|---|---|
| 2.1.1 | Store writes .yjs file | Given a Y.Doc update triggers debouncedSave via onChange / When the debounce (2s) elapses / Then data/<name>.yjs exists and contains valid Yjs binary | P0 |
| 2.1.2 | Fetch reads local file | Given data/default.yjs exists / When Database.fetch / Then Y.Doc is hydrated with stored content | P0 |
| 2.1.3 | Fetch falls back to HF pull | Given no local .yjs file but HF dataset has one / When Database.fetch / Then file is pulled from HF and Y.Doc is hydrated | P0 |

2.2 HF dataset sync

| # | Test | Given / When / Then | Priority |
|---|---|---|---|
| 2.2.1 | Push debounced at 10s | Given two rapid store calls / When 10s elapses / Then only one HF push is made | P1 |
| 2.2.2 | flushAll on SIGTERM | Given pending debounced pushes / When SIGTERM received / Then all pending data is pushed before exit | P0 |
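
Both tests in 2.2 are about timing, so they are much easier to write when the timer is injectable. The sketch below shows one possible shape for a debounced push queue with a flushAll; the class name, the `Scheduler` type, and the synchronous `push` callback are assumptions made to keep the example short (the real HF push is presumably asynchronous).

```typescript
type TimerHandle = { cancel(): void };
type Scheduler = (fn: () => void, delayMs: number) => TimerHandle;

// Production scheduler backed by setTimeout; tests can pass a fake instead.
const timeoutScheduler: Scheduler = (fn, delayMs) => {
  const id = setTimeout(fn, delayMs);
  return { cancel: () => clearTimeout(id) };
};

// Hypothetical debounced pusher: keeps only the latest payload per document.
class DebouncedPusher {
  private pending = new Map<string, { data: Uint8Array; timer: TimerHandle }>();
  private push: (name: string, data: Uint8Array) => void;
  private delayMs: number;
  private schedule: Scheduler;

  constructor(
    push: (name: string, data: Uint8Array) => void,
    delayMs: number,
    schedule: Scheduler = timeoutScheduler,
  ) {
    this.push = push;
    this.delayMs = delayMs;
    this.schedule = schedule;
  }

  // Each store call restarts the timer for that document.
  store(name: string, data: Uint8Array): void {
    this.pending.get(name)?.timer.cancel();
    const timer = this.schedule(() => this.flush(name), this.delayMs);
    this.pending.set(name, { data, timer });
  }

  // Push one document now, if it has pending data.
  flush(name: string): void {
    const entry = this.pending.get(name);
    if (!entry) return;
    entry.timer.cancel();
    this.pending.delete(name);
    this.push(name, entry.data);
  }

  // Intended for a SIGTERM handler: push everything still waiting.
  flushAll(): void {
    for (const name of Array.from(this.pending.keys())) this.flush(name);
  }

  pendingCount(): number {
    return this.pending.size;
  }
}
```

With Vitest, fake timers would serve the same purpose as the injected scheduler; injecting it simply keeps the test free of global timer state.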

2.3 Image upload

| # | Test | Given / When / Then | Priority |
|---|---|---|---|
| 2.3.1 | Upload returns proxy URL | Given a valid image file / When POST /api/upload / Then response contains a /d/images/... URL routed through the editor's dataset proxy (the underlying HF dataset is private) | P1 |
| 2.3.2 | Reject oversized file | Given a file > 10MB / When POST /api/upload / Then response is 413 or 400 with error message | P1 |
| 2.3.3 | Proxy serves images | Given an uploaded image / When GET /d/images/<file> / Then 200 + image bytes (proxy attaches a server-side token to fetch the private HF dataset) | P1 |
| 2.3.4 | Proxy whitelist | Given any path under /d/articles/... (raw Y.js drafts) / When GET / Then 404 - never expose drafts via the proxy | P0 |
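
The whitelist in 2.3.4 is worth testing against traversal attempts as well as the plain /d/articles/... path. A minimal sketch of an allow-list check follows; the function name, the `ALLOWED_PREFIXES` constant, and the decision to reject encoded traversal are assumptions, not the proxy's known behavior.

```typescript
// Only these dataset folders may be served through the /d/ proxy (assumed list).
const ALLOWED_PREFIXES = ["images/"];

// Hypothetical guard: returns true only for paths the proxy is allowed to serve.
function isProxyPathAllowed(requestPath: string): boolean {
  if (!requestPath.startsWith("/d/")) return false;

  let rest: string;
  try {
    // Decode first so that %2e%2e cannot smuggle a ".." segment past the check.
    rest = decodeURIComponent(requestPath.slice("/d/".length));
  } catch {
    return false; // malformed percent-encoding
  }

  if (rest.includes("\\")) return false;
  const segments = rest.split("/");
  if (segments.some((segment) => segment === "" || segment === "." || segment === "..")) {
    return false;
  }
  return ALLOWED_PREFIXES.some((prefix) => rest.startsWith(prefix));
}
```

In the route handler, a false result would map to the 404 required by 2.3.4, so drafts are indistinguishable from missing files.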

2.4 Storage status & disaster recovery

| # | Test | Given / When / Then | Priority |
|---|---|---|---|
| 2.4.1 | Status surfaces dataset error | Given createRepo returns 403 / When GET /api/storage/status / Then response lastError.stage === "dataset-create" with statusCode: 403 | P0 |
| 2.4.2 | Status clears on recovery | Given a previous push error / When the next push succeeds / Then lastError is null and lastCloudPushAt is updated | P1 |
| 2.4.3 | Status auth-gated | Given an anonymous request (oauthEnabled) / When GET /api/storage/status / Then 403 - don't leak dataset error details | P1 |
| 2.4.4 | Eager creation on login | Given a successful /api/auth/status with canEdit / When the request completes / Then ensureDatasetExists has been attempted (success surfaces within one storage-status poll, failure surfaces too) | P0 |
| 2.4.5 | Admin export streams .yjs | Given an editor user / When GET /api/admin/export-doc?name=default / Then 200 + Content-Disposition: attachment + raw .yjs body | P1 |
| 2.4.6 | Admin export auth-gated | Given a non-canEdit request / When GET /api/admin/export-doc / Then 403 | P0 |
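
Tests 2.4.1 and 2.4.2 pin down a small state machine: an error is recorded with its stage and status code, and the next successful push clears it. The sketch below models that behavior in isolation. The field names `lastError`, `stage`, `statusCode`, and `lastCloudPushAt` come from the tests above; the class, its methods, the `message` field, and the ISO-string timestamp format are assumptions.

```typescript
interface StorageError {
  stage: string;
  statusCode: number | null;
  message: string;
}

interface StorageStatus {
  lastError: StorageError | null;
  lastCloudPushAt: string | null;
}

// Hypothetical tracker behind GET /api/storage/status.
class StorageStatusTracker {
  private status: StorageStatus = { lastError: null, lastCloudPushAt: null };

  // Record a failure without touching the last successful push time.
  recordError(stage: string, message: string, statusCode: number | null = null): void {
    this.status = { ...this.status, lastError: { stage, statusCode, message } };
  }

  // A successful push clears the error and updates the timestamp.
  recordPushSuccess(now: Date): void {
    this.status = { lastError: null, lastCloudPushAt: now.toISOString() };
  }

  snapshot(): StorageStatus {
    return { ...this.status };
  }
}
```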

3. API Routes - HTTP Contracts (P1)

Test the request/response shape, not the internal logic.

3.1 Publish

| # | Test | Given / When / Then | Priority |
|---|---|---|---|
| 3.1.1 | Publish returns success | Given a valid Y.Doc in Hocuspocus / When POST /api/publish / Then response contains { success: true, htmlUrl } | P0 |
| 3.1.2 | Publish writes local HTML | Given POST /api/publish succeeds / When checking local FS / Then data/published/default/index.html exists | P0 |

3.2 Chat (AI Agent)

| # | Test | Given / When / Then | Priority |
|---|---|---|---|
| 3.2.1 | Stream returns valid SSE | Given a chat message / When POST /api/chat / Then response is text/event-stream with parseable SSE events | P1 |
| 3.2.2 | Tool calls included in stream | Given a prompt that triggers a tool / When streamed / Then SSE contains tool_call events with name and arguments | P1 |
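
"Parseable SSE events" in 3.2.1 needs a concrete parser on the test side. Below is a small sketch of one that handles the `event:` and `data:` fields of the text/event-stream format and ignores comments, `id:`, and `retry:`. The function name is hypothetical, and the assumption that a tool call arrives as an `event: tool_call` line with a JSON `data:` payload is a guess at the wire format, not something the spec excerpt confirms.

```typescript
interface SseEvent {
  event: string; // defaults to "message" when no event field is present
  data: string;
}

// Parse a complete SSE response body into events (simplified subset of the format).
function parseSse(body: string): SseEvent[] {
  const events: SseEvent[] = [];
  // Events are separated by a blank line.
  for (const block of body.split(/\r?\n\r?\n/)) {
    let event = "message";
    const dataLines: string[] = [];
    for (const line of block.split(/\r?\n/)) {
      if (line === "" || line.startsWith(":")) continue; // blank or comment line
      const colon = line.indexOf(":");
      const field = colon === -1 ? line : line.slice(0, colon);
      let value = colon === -1 ? "" : line.slice(colon + 1);
      if (value.startsWith(" ")) value = value.slice(1); // one optional leading space
      if (field === "event") event = value;
      else if (field === "data") dataLines.push(value);
    }
    // Blocks without data are not dispatched.
    if (dataLines.length > 0) events.push({ event, data: dataLines.join("\n") });
  }
  return events;
}
```

Test 3.2.2 can then filter the parsed list for tool-call events and `JSON.parse` each `data` field to check for `name` and `arguments`.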

3.3 Citations

| # | Test | Given / When / Then | Priority |
|---|---|---|---|
| 3.3.1 | Resolve DOI to CSL-JSON | Given a valid DOI string / When POST /api/citations/resolve / Then response contains CSL-JSON with title and authors | P1 |
| 3.3.2 | Import BibTeX | Given a valid BibTeX string / When POST /api/citations/import-bib / Then response contains CSL-JSON entries | P1 |
| 3.3.3 | Format to HTML bibliography | Given CSL-JSON entries + style / When POST /api/citations/format / Then response contains HTML string | P1 |

3.4 Auth status

| # | Test | Given / When / Then | Priority |
|---|---|---|---|
| 3.4.1 | Unauthenticated status | Given no cookie / When GET /api/auth/status / Then { authenticated: false, canEdit: false } | P1 |
| 3.4.2 | Authenticated status | Given valid hf_access_token cookie / When GET /api/auth/status / Then { authenticated: true, canEdit: true, user: {...} } | P1 |

4. Auth and Security (P0)

4.1 Route protection

| # | Test | Given / When / Then | Priority |
|---|---|---|---|
| 4.1.1 | Publish requires auth | Given OAuth enabled + no cookie / When POST /api/publish / Then 401 or 403 | P0 |
| 4.1.2 | Reset-document requires auth | Given OAuth enabled + no cookie / When POST /api/admin/reset-document / Then 401 or 403 | P0 |
| 4.1.3 | Upload works without auth | Given no cookie / When POST /api/upload with valid image / Then 200 (upload is not auth-gated per spec) | P1 |
| 4.1.4 | Upload rate-limited per IP | Given more than 30 uploads within 60s from the same IP / When POST /api/upload / Then 429 with Retry-After header (anti-abuse since upload is anonymous) | P1 |
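
The limit in 4.1.4 (30 uploads per 60s per IP) can be tested without waiting a minute if the limiter takes the current time as an argument. The sketch below is one possible sliding-window implementation under that assumption; the class name and return shape are hypothetical, and the real middleware may use a different algorithm (for example a fixed window).

```typescript
type RateDecision = { allowed: true } | { allowed: false; retryAfterSec: number };

// Hypothetical per-IP sliding-window limiter with an injectable clock.
class UploadRateLimiter {
  private hits = new Map<string, number[]>();
  private limit: number;
  private windowMs: number;

  constructor(limit: number = 30, windowMs: number = 60000) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  check(ip: string, nowMs: number): RateDecision {
    // Keep only the hits that are still inside the window.
    const cutoff = nowMs - this.windowMs;
    const recent = (this.hits.get(ip) ?? []).filter((t) => t > cutoff);

    if (recent.length >= this.limit) {
      this.hits.set(ip, recent);
      // The oldest hit leaving the window frees the next slot.
      const retryAfterSec = Math.ceil((recent[0] + this.windowMs - nowMs) / 1000);
      return { allowed: false, retryAfterSec };
    }

    recent.push(nowMs);
    this.hits.set(ip, recent);
    return { allowed: true };
  }
}
```

A route handler would translate a denied decision into a 429 response and copy `retryAfterSec` into the Retry-After header.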

4.2 OAuth flow

| # | Test | Given / When / Then | Priority |
|---|---|---|---|
| 4.2.1 | CSRF state validated | Given an OAuth callback with wrong state / When GET /auth/callback / Then rejected (400 or redirect to error) | P0 |
| 4.2.2 | Cookie set on success | Given valid OAuth callback / When processed / Then hf_access_token cookie is set (httpOnly, secure, sameSite: none) | P0 |

4.3 WebSocket auth

| # | Test | Given / When / Then | Priority |
|---|---|---|---|
| 4.3.1 | WS rejected without token | Given OAuth enabled / When WebSocket connects to /collab without token / Then connection rejected | P0 |
| 4.3.2 | WS accepted with valid token | Given OAuth enabled + valid token / When WebSocket connects to /collab / Then connection accepted, sync starts | P1 |

4.4 XSS (known risk from spec)

| # | Test | Given / When / Then | Priority |
|---|---|---|---|
| 4.4.1 | Licence field escaped | Given meta.licence containing <script>alert(1)</script> / When published HTML rendered / Then script tag is escaped or stripped | P0 |
| 4.4.2 | Bibliography HTML sanitized | Given biblioHtml containing malicious script / When injected into published page / Then script is escaped or stripped | P0 |
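
For plain-text fields such as the licence (4.4.1), escaping the five HTML-significant characters is enough. The sketch below shows that, plus the assertion a test might make. The function names are hypothetical, and this approach does not cover 4.4.2: biblioHtml is meant to contain markup, so it needs a real sanitizer rather than blanket escaping.

```typescript
// Escape text so it can be placed inside HTML element content or a quoted attribute.
function escapeHtml(value: string): string {
  return value
    .replace(/&/g, "&amp;") // must run first so later entities are not double-escaped
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Test-side check: the rendered page must not contain an executable script tag.
function containsScriptTag(html: string): boolean {
  return /<script[\s>]/i.test(html);
}
```

A publish test would render a doc whose licence is `<script>alert(1)</script>` and assert that `containsScriptTag` is false for the licence fragment of the output.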

5. AI Agent - Plumbing (P1)

We do NOT test LLM output quality. We test the infrastructure around it.

5.1 Context building

| # | Test | Given / When / Then | Priority |
|---|---|---|---|
| 5.1.1 | Context includes doc text | Given a doc with content / When building chat context / Then context object contains document text | P1 |
| 5.1.2 | Context includes selection | Given a text selection / When building chat context / Then context object contains selection text and position | P1 |
| 5.1.3 | Context includes frontmatter | Given doc with title and authors / When building chat context / Then context object contains frontmatter fields | P1 |

5.2 Tool execution (client-side)

| # | Test | Given / When / Then | Priority |
|---|---|---|---|
| 5.2.1 | replaceSelection applies | Given a selection in the editor / When replaceSelection tool called with new text / Then selection is replaced in the TipTap doc | P1 |
| 5.2.2 | applyDiff does search/replace | Given doc with "Hello world" / When applyDiff called with search="world" replace="editor" / Then doc contains "Hello editor" | P1 |
| 5.2.3 | updateFrontmatter modifies map | Given frontmatter with title "Old" / When updateFrontmatter called with title="New" / Then Y.Map("frontmatter").get("title") is "New" | P1 |
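
The search/replace semantics in 5.2.2 have a few edge cases worth deciding up front: a search string that is missing, one that occurs more than once, and replacement text containing `$` patterns. The sketch below works on a plain string rather than a TipTap document and picks one set of answers. The function name, the result type, and the choice to reject ambiguous matches are assumptions, not the tool's documented behavior.

```typescript
type DiffResult =
  | { ok: true; text: string }
  | { ok: false; reason: "not-found" | "ambiguous" };

// Replace exactly one occurrence of `search`; refuse to guess when there are several.
function applyDiffSketch(text: string, search: string, replace: string): DiffResult {
  if (search === "") return { ok: false, reason: "not-found" };

  const first = text.indexOf(search);
  if (first === -1) return { ok: false, reason: "not-found" };
  if (text.indexOf(search, first + 1) !== -1) return { ok: false, reason: "ambiguous" };

  // Slicing instead of String.replace keeps "$&" and "$1" in `replace` literal.
  const updated = text.slice(0, first) + replace + text.slice(first + search.length);
  return { ok: true, text: updated };
}
```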

5.3 Undo batching

| # | Test | Given / When / Then | Priority |
|---|---|---|---|
| 5.3.1 | Agent edits batch into one undo | Given agent executes 3 tool calls (replaceSelection + applyDiff + updateFrontmatter) / When user presses Cmd+Z once / Then all 3 changes are reverted | P0 |
| 5.3.2 | Manual edits not in agent batch | Given user types text then agent edits / When user presses Cmd+Z / Then only agent edits are reverted, user text remains | P1 |
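
The expected behavior in 5.3 can be stated precisely with a toy model: edits inside a batch share one restore point, and edits outside a batch each get their own. The sketch below is that model only. The real editor presumably relies on the Yjs/TipTap undo machinery, and `BatchingUndoStack` is a hypothetical class that assumes immutable state values.

```typescript
// Toy undo stack over immutable snapshots; not the editor's implementation.
class BatchingUndoStack<T> {
  private restorePoints: T[] = [];
  private state: T;
  private inBatch = false;
  private batchHasRestorePoint = false;

  constructor(initial: T) {
    this.state = initial;
  }

  current(): T {
    return this.state;
  }

  // Apply a change. Outside a batch every change is its own undo step;
  // inside a batch only the first change records a restore point.
  apply(change: (state: T) => T): void {
    if (!this.inBatch || !this.batchHasRestorePoint) {
      this.restorePoints.push(this.state);
      this.batchHasRestorePoint = this.inBatch;
    }
    this.state = change(this.state);
  }

  // Run several changes as a single undo step (for example one agent turn).
  batch(run: () => void): void {
    this.inBatch = true;
    this.batchHasRestorePoint = false;
    try {
      run();
    } finally {
      this.inBatch = false;
      this.batchHasRestorePoint = false;
    }
  }

  undo(): void {
    if (this.restorePoints.length === 0) return;
    this.state = this.restorePoints.pop() as T;
  }
}
```

In this model, a user edit followed by a three-call agent batch needs one undo to revert the agent's work and a second undo to revert the user's typing, which is the behavior 5.3.1 and 5.3.2 describe.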

What we do NOT test (and why)

| Area | Reason |
|---|---|
| Yjs real-time sync | Yjs/Hocuspocus are mature libs - testing their CRDT sync is noise |
| Visual rendering of components | Project evolves too fast, snapshot tests would break constantly |
| PDF generation (Playwright) | Heavy dependency, visual output - better as a manual smoke test |
| Slash commands / Bubble toolbar | UI that will change - covered by the opt-in Playwright suite instead of the hermetic one |
| CSS architecture | Not meaningfully unit-testable |
| TipTap extension registration | Framework internals, not business logic |

Summary

| Section | P0 tests | P1 tests | Total |
|---|---|---|---|
| 1. Publisher Pipeline | 10 | 4 | 14 |
| 2. Persistence / Storage | 8 | 7 | 15 |
| 3. API Routes | 2 | 7 | 9 |
| 4. Auth / Security | 7 | 3 | 10 |
| 5. AI Agent | 1 | 7 | 8 |
| Total | 28 | 28 | 56 |

Start with the 28 P0 tests. They cover the "if this breaks, we lose data or trust" surface. Add P1 tests as the codebase stabilizes.