# Collab Editor - Test Plan

Companion document to [SPECIFICATION.md](./SPECIFICATION.md). Covers the critical paths that, if broken, would cause data loss, publishing failures, or security holes. Intentionally **not** exhaustive - the project is evolving fast, and these tests are chosen to survive refactors.

## Principles

- Test **behavior**, not implementation details.
- Focus on what **breaks silently** (data loss, corrupt publish, auth bypass).
- Keep tests **decoupled from internals** so they survive refactors.
- Fast to run - no heavy dependencies (Playwright PDF, real HF API) in the main suite.

## Stack

| Layer | Tool | Scope |
|-------|------|-------|
| Backend unit/integration | Vitest + Supertest | Publisher, API routes, auth guards |
| Yjs helpers | In-memory Y.Doc fixtures | Publisher extraction, storage logic |
| Frontend unit | Vitest + Testing Library | Agent tool execution, undo batching |
| E2E | Playwright (`backend/e2e`, `frontend/tests`) | Critical editor flows: load, edit, comments, publish round-trip |

E2E is opt-in (`npm run test:e2e` in `backend/`, plus Playwright specs in `frontend/tests/`); the fast suite (`npm run test`) stays hermetic and Playwright-free so it can run in CI without a browser install.

---

## 1. Publisher Pipeline (P0)

The core value of the product. If publishing breaks, the article disappears.

### 1.1 Y.Doc extraction

| # | Test | Given / When / Then | Priority |
|---|------|---------------------|----------|
| 1.1.1 | Extract frontmatter from Y.Doc | Given a Y.Doc with title, authors, affiliations in `Y.Map("frontmatter")` / When `extractFromYDoc` runs / Then returned object contains all frontmatter fields with correct types | P0 |
| 1.1.2 | Extract content from empty doc | Given a Y.Doc with an empty `Y.XmlFragment("default")` / When extracted / Then returns empty content without throwing | P0 |
| 1.1.3 | Extract with citations | Given a Y.Doc with entries in `Y.Map("citations")` / When extracted / Then citations map is included in output as CSL-JSON | P0 |

### 1.2 HTML generation

| # | Test | Given / When / Then | Priority |
|---|------|---------------------|----------|
| 1.2.1 | Produces valid self-contained HTML | Given extracted doc data / When `renderArticleHTML` runs / Then output is valid HTML with inline CSS, no external stylesheet links | P0 |
| 1.2.2 | Custom media queries resolved | Given template CSS with `@custom-media` / When `resolveCustomMedia` runs / Then output contains only standard `@media` rules, no `@custom-media` | P0 |
| 1.2.3 | TOC generated from headings | Given content with h2 and h3 / When rendered / Then HTML contains TOC nav with matching anchor links | P1 |
| 1.2.4 | Theme toggle present | Given any doc / When rendered / Then output contains theme toggle SVG (sun/moon) and associated script | P1 |
| 1.2.5 | Bibliography injected | Given doc with citations / When rendered / Then HTML contains bibliography section with formatted entries | P0 |
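
Test 1.2.2 reduces to a string-level check on the generated CSS. The sketch below shows the kind of transformation being asserted, assuming custom media are declared as `@custom-media --name <query>;` and referenced as `@media (--name)`. `resolveCustomMediaSketch` is a hypothetical stand-in written for this plan, not the project's `resolveCustomMedia`, which may handle more cases.

```typescript
// Hypothetical stand-in for the project's resolveCustomMedia. It only
// handles the simple `@custom-media --name <query>;` form.
export function resolveCustomMediaSketch(css: string): string {
  const definitions: Record<string, string> = {};

  // 1. Collect and strip every `@custom-media --name <query>;` declaration.
  const withoutDeclarations = css.replace(
    /@custom-media\s+(--[\w-]+)\s+([^;]+);\s*/g,
    (_match: string, name: string, query: string) => {
      definitions[name] = query.trim();
      return "";
    },
  );

  // 2. Replace each `(--name)` reference that matches a declaration.
  //    Unknown names, such as `var(--fg)`, are left untouched.
  return withoutDeclarations.replace(
    /\((--[\w-]+)\)/g,
    (match: string, name: string) => definitions[name] ?? match,
  );
}
```

The assertion worth keeping is the negative one (no `@custom-media` left in the output), since it holds regardless of how the real resolver is implemented.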

### 1.3 Post-processing

| # | Test | Given / When / Then | Priority |
|---|------|---------------------|----------|
| 1.3.1 | Accordion to details/summary | Given HTML with accordion div / When post-processed / Then output uses `<details>` and `<summary>` tags | P1 |
| 1.3.2 | htmlEmbed to iframe | Given HTML with htmlEmbed node / When post-processed / Then output contains `<iframe>` with correct src | P0 |
| 1.3.3 | Mermaid to pre block | Given HTML with mermaid node / When post-processed / Then output contains `<pre class="mermaid">` | P1 |

### 1.4 Publish idempotency and restore

| # | Test | Given / When / Then | Priority |
|---|------|---------------------|----------|
| 1.4.1 | Publish twice gives same result | Given a Y.Doc / When published twice / Then both HTML outputs are byte-identical (no timestamps or random IDs) | P0 |
| 1.4.2 | Published article restored on boot | Given published assets in HF dataset but empty local FS / When `ensurePublishedRestored` runs / Then `data/published/default/index.html` exists locally | P0 |
| 1.4.3 | GET / serves published article | Given local published index.html exists / When GET / / Then response is the published HTML with 200 | P0 |
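
For 1.4.1, a plain equality assertion on two large HTML strings gives an unreadable failure. A small helper that reports where the outputs diverge makes stray timestamps or random IDs easy to spot. This is a minimal sketch; `firstDifference` and `describeDifference` are hypothetical names, and the comparison works on string code units, which is equivalent to byte equality once both outputs are encoded the same way.

```typescript
// Returns the index of the first differing character, or -1 when the two
// strings are identical.
export function firstDifference(a: string, b: string): number {
  const shared = Math.min(a.length, b.length);
  for (let i = 0; i < shared; i++) {
    if (a[i] !== b[i]) return i;
  }
  // Same prefix: either identical, or one string is longer.
  return a.length === b.length ? -1 : shared;
}

// Builds a short failure message with some context around the divergence.
export function describeDifference(a: string, b: string, context = 20): string {
  const at = firstDifference(a, b);
  if (at === -1) return "identical";
  const start = Math.max(0, at - context);
  const left = a.slice(start, at + context);
  const right = b.slice(start, at + context);
  return `differ at ${at}: "${left}" vs "${right}"`;
}
```

In the test itself, publishing twice and asserting that `describeDifference(first, second)` equals `"identical"` keeps the check strict while pointing straight at the offending fragment on failure.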

---

## 2. Persistence and HF Storage (P0)

Data loss is game over for a collaborative editor.

### 2.1 Local persistence

| # | Test | Given / When / Then | Priority |
|---|------|---------------------|----------|
| 2.1.1 | Store writes .yjs file | Given a Y.Doc update triggers `debouncedSave` via `onChange` / When the debounce (2s) elapses / Then `data/<name>.yjs` exists and contains valid Yjs binary | P0 |
| 2.1.2 | Fetch reads local file | Given `data/default.yjs` exists / When Database.fetch / Then Y.Doc is hydrated with stored content | P0 |
| 2.1.3 | Fetch falls back to HF pull | Given no local .yjs file but HF dataset has one / When Database.fetch / Then file is pulled from HF and Y.Doc is hydrated | P0 |

### 2.2 HF dataset sync

| # | Test | Given / When / Then | Priority |
|---|------|---------------------|----------|
| 2.2.1 | Push debounced at 10s | Given two rapid store calls / When 10s elapses / Then only one HF push is made | P1 |
| 2.2.2 | flushAll on SIGTERM | Given pending debounced pushes / When SIGTERM received / Then all pending data is pushed before exit | P0 |
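
Both tests in 2.2 depend on controlling time. The sketch below models the behavior under test with an injected clock: repeated calls collapse into one push, and a flush drains whatever is still pending. `ManualClock` and `DebouncedPusher` are hypothetical names, and the restart-on-every-call debounce is an assumption about the real storage layer. In the actual suite, Vitest's fake timers (`vi.useFakeTimers`, `vi.advanceTimersByTime`) would likely replace the manual clock.

```typescript
type Task = () => void;

// Manual clock so the sketch is deterministic and needs no real timers.
export class ManualClock {
  private now = 0;
  private nextId = 1;
  private timers: { id: number; at: number; task: Task }[] = [];

  setTimeout(task: Task, delayMs: number): number {
    const id = this.nextId++;
    this.timers.push({ id, at: this.now + delayMs, task });
    return id;
  }

  clearTimeout(id: number): void {
    this.timers = this.timers.filter((timer) => timer.id !== id);
  }

  // Advance time and run every timer that has become due.
  advance(ms: number): void {
    this.now += ms;
    const due = this.timers.filter((timer) => timer.at <= this.now);
    this.timers = this.timers.filter((timer) => timer.at > this.now);
    due.forEach((timer) => timer.task());
  }
}

export class DebouncedPusher {
  private pending = new Map<string, number>(); // doc name -> timer id
  private clock: ManualClock;
  private push: (name: string) => void;
  private delayMs: number;

  constructor(clock: ManualClock, push: (name: string) => void, delayMs = 10000) {
    this.clock = clock;
    this.push = push;
    this.delayMs = delayMs;
  }

  // Called on every store; restarts the countdown for that document.
  schedule(name: string): void {
    const existing = this.pending.get(name);
    if (existing !== undefined) this.clock.clearTimeout(existing);
    const id = this.clock.setTimeout(() => {
      this.pending.delete(name);
      this.push(name);
    }, this.delayMs);
    this.pending.set(name, id);
  }

  // Called from the shutdown handler: push everything still pending, now.
  flushAll(): void {
    this.pending.forEach((id, name) => {
      this.clock.clearTimeout(id);
      this.push(name);
    });
    this.pending.clear();
  }
}
```

Injecting the clock and the push function keeps the test hermetic: no real HF call is made, and the 10s window elapses instantly.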

### 2.3 Image upload

| # | Test | Given / When / Then | Priority |
|---|------|---------------------|----------|
| 2.3.1 | Upload returns proxy URL | Given a valid image file / When POST /api/upload / Then response contains a `/d/images/...` URL routed through the editor's dataset proxy (the underlying HF dataset is private) | P1 |
| 2.3.2 | Reject oversized file | Given a file > 10MB / When POST /api/upload / Then response is 413 or 400 with error message | P1 |
| 2.3.3 | Proxy serves images | Given an uploaded image / When GET `/d/images/<file>` / Then 200 + image bytes (proxy attaches a server-side token to fetch the private HF dataset) | P1 |
| 2.3.4 | Proxy whitelist | Given any path under `/d/articles/...` (raw Yjs drafts) / When GET / Then 404 - never expose drafts via the proxy | P0 |
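
The rule behind 2.3.4 is easiest to test as a pure function. A minimal sketch, assuming the proxy decides on the request path alone and that only `/d/images/` is ever exposed. `isProxyPathAllowed` and `ALLOWED_PREFIXES` are hypothetical names, not taken from the codebase.

```typescript
// Hypothetical allowlist: only these prefixes are ever proxied.
const ALLOWED_PREFIXES = ["/d/images/"];

export function isProxyPathAllowed(rawPath: string): boolean {
  let decoded: string;
  try {
    decoded = decodeURIComponent(rawPath);
  } catch {
    return false; // malformed percent-encoding
  }
  // Reject traversal before prefix matching so that
  // `/d/images/../articles/x` cannot slip through.
  if (decoded.split("/").some((segment) => segment === "..")) return false;
  return ALLOWED_PREFIXES.some((prefix) => decoded.startsWith(prefix));
}
```

An allowlist fails closed: any new dataset folder stays private until someone adds it explicitly, which is the property the P0 test protects.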

### 2.4 Storage status & disaster recovery

| # | Test | Given / When / Then | Priority |
|---|------|---------------------|----------|
| 2.4.1 | Status surfaces dataset error | Given `createRepo` returns 403 / When GET /api/storage/status / Then response `lastError.stage === "dataset-create"` with `statusCode: 403` | P0 |
| 2.4.2 | Status clears on recovery | Given a previous push error / When the next push succeeds / Then `lastError` is null and `lastCloudPushAt` is updated | P1 |
| 2.4.3 | Status auth-gated | Given an anonymous request (oauthEnabled) / When GET /api/storage/status / Then 403 - don't leak dataset error details | P1 |
| 2.4.4 | Eager creation on login | Given a successful /api/auth/status with canEdit / When the request completes / Then `ensureDatasetExists` has been attempted (both success and failure surface within one storage-status poll) | P0 |
| 2.4.5 | Admin export streams .yjs | Given an editor user / When GET /api/admin/export-doc?name=default / Then 200 + `Content-Disposition: attachment` + raw .yjs body | P1 |
| 2.4.6 | Admin export auth-gated | Given a non-canEdit request / When GET /api/admin/export-doc / Then 403 | P0 |
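
Tests 2.4.1 and 2.4.2 describe two state transitions: a failure records `lastError`, and the next successful push clears it and updates `lastCloudPushAt`. The sketch below models them as pure functions. The field names come from the tests above; the function names and the epoch-millisecond type for `lastCloudPushAt` are assumptions.

```typescript
export interface StorageError {
  stage: string; // e.g. "dataset-create"
  statusCode: number;
}

export interface StorageStatus {
  lastError: StorageError | null;
  lastCloudPushAt: number | null; // epoch ms; the real field type is an assumption
}

// A failure records the error but keeps the last successful push time,
// so the UI can still show how stale the cloud copy is.
export function recordFailure(status: StorageStatus, error: StorageError): StorageStatus {
  return { ...status, lastError: error };
}

// A successful push clears the error and stamps the push time.
export function recordPushSuccess(status: StorageStatus, nowMs: number): StorageStatus {
  return { ...status, lastError: null, lastCloudPushAt: nowMs };
}
```

Whatever shape the real status object takes, the route test only needs to observe these two transitions through GET /api/storage/status.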

---

## 3. API Routes - HTTP Contracts (P1)

Test the request/response shape, not the internal logic.

### 3.1 Publish

| # | Test | Given / When / Then | Priority |
|---|------|---------------------|----------|
| 3.1.1 | Publish returns success | Given a valid Y.Doc in Hocuspocus / When POST /api/publish / Then response contains `{ success: true, htmlUrl }` | P0 |
| 3.1.2 | Publish writes local HTML | Given POST /api/publish succeeds / When checking local FS / Then `data/published/default/index.html` exists | P0 |

### 3.2 Chat (AI Agent)

| # | Test | Given / When / Then | Priority |
|---|------|---------------------|----------|
| 3.2.1 | Stream returns valid SSE | Given a chat message / When POST /api/chat / Then response is `text/event-stream` with parseable SSE events | P1 |
| 3.2.2 | Tool calls included in stream | Given a prompt that triggers a tool / When streamed / Then SSE contains tool_call events with name and arguments | P1 |
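
"Parseable SSE events" in 3.2.1 can be pinned down with a small parser applied to the full response body. This is a minimal sketch of the `text/event-stream` framing (blank-line separated blocks, `event:` and `data:` fields, `:` comments). `parseSse` is a hypothetical helper, and whether the real stream uses a named `tool_call` event or encodes the type inside the JSON payload is an assumption to check against the server.

```typescript
export interface SseEvent {
  event: string; // defaults to "message" when no event field is present
  data: string;
}

// Parses a complete text/event-stream body into events.
export function parseSse(body: string): SseEvent[] {
  const events: SseEvent[] = [];
  // Events are separated by a blank line; accept both LF and CRLF.
  for (const block of body.split(/\r?\n\r?\n/)) {
    let event = "message";
    const dataLines: string[] = [];
    for (const line of block.split(/\r?\n/)) {
      if (line.startsWith(":")) continue; // comment / keep-alive
      const colon = line.indexOf(":");
      const field = colon === -1 ? line : line.slice(0, colon);
      let value = colon === -1 ? "" : line.slice(colon + 1);
      if (value.startsWith(" ")) value = value.slice(1); // strip one leading space
      if (field === "event") event = value;
      if (field === "data") dataLines.push(value);
    }
    // Blocks without any data line (comments, trailing blank) yield no event.
    if (dataLines.length > 0) events.push({ event, data: dataLines.join("\n") });
  }
  return events;
}
```

For 3.2.2, the test would then filter the parsed events for tool calls and assert that each payload carries a name and arguments.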

### 3.3 Citations

| # | Test | Given / When / Then | Priority |
|---|------|---------------------|----------|
| 3.3.1 | Resolve DOI to CSL-JSON | Given a valid DOI string / When POST /api/citations/resolve / Then response contains CSL-JSON with title and authors | P1 |
| 3.3.2 | Import BibTeX | Given a valid BibTeX string / When POST /api/citations/import-bib / Then response contains CSL-JSON entries | P1 |
| 3.3.3 | Format to HTML bibliography | Given CSL-JSON entries + style / When POST /api/citations/format / Then response contains HTML string | P1 |

### 3.4 Auth status

| # | Test | Given / When / Then | Priority |
|---|------|---------------------|----------|
| 3.4.1 | Unauthenticated status | Given no cookie / When GET /api/auth/status / Then `{ authenticated: false, canEdit: false }` | P1 |
| 3.4.2 | Authenticated status | Given valid hf_access_token cookie / When GET /api/auth/status / Then `{ authenticated: true, canEdit: true, user: {...} }` | P1 |

---

## 4. Auth and Security (P0)

### 4.1 Route protection

| # | Test | Given / When / Then | Priority |
|---|------|---------------------|----------|
| 4.1.1 | Publish requires auth | Given OAuth enabled + no cookie / When POST /api/publish / Then 401 or 403 | P0 |
| 4.1.2 | Reset-document requires auth | Given OAuth enabled + no cookie / When POST /api/admin/reset-document / Then 401 or 403 | P0 |
| 4.1.3 | Upload works without auth | Given no cookie / When POST /api/upload with valid image / Then 200 (upload is not auth-gated per spec) | P1 |
| 4.1.4 | Upload rate-limited per IP | Given more than 30 uploads within 60s from the same IP / When POST /api/upload / Then 429 with `Retry-After` header (anti-abuse since upload is anonymous) | P1 |
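
Test 4.1.4 fixes the numbers (30 uploads per 60s per IP, `Retry-After` on rejection) but not the algorithm. The sketch below uses a sliding window with an injected timestamp so the behavior can be checked without waiting. `UploadRateLimiter` is a hypothetical name, and the sliding window is an assumption: the real middleware may use a fixed window, in which case the `Retry-After` value would differ.

```typescript
export interface RateDecision {
  allowed: boolean;
  retryAfterSeconds: number; // 0 when allowed
}

// Sliding-window limiter keyed by client IP.
export class UploadRateLimiter {
  private hits = new Map<string, number[]>(); // ip -> accepted timestamps (ms)
  private limit: number;
  private windowMs: number;

  constructor(limit = 30, windowMs = 60000) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  check(ip: string, nowMs: number): RateDecision {
    // Keep only the hits that are still inside the window.
    const recent = (this.hits.get(ip) ?? []).filter((t) => nowMs - t < this.windowMs);
    if (recent.length >= this.limit) {
      this.hits.set(ip, recent);
      // The next slot frees up when the oldest hit leaves the window.
      const oldest = recent[0] ?? nowMs;
      const retryAfterMs = oldest + this.windowMs - nowMs;
      return { allowed: false, retryAfterSeconds: Math.ceil(retryAfterMs / 1000) };
    }
    recent.push(nowMs);
    this.hits.set(ip, recent);
    return { allowed: true, retryAfterSeconds: 0 };
  }
}
```

A rejected decision maps to a 429 response with `Retry-After` set to `retryAfterSeconds`. Rejected requests are not counted, so a client that keeps retrying is not locked out indefinitely.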

### 4.2 OAuth flow

| # | Test | Given / When / Then | Priority |
|---|------|---------------------|----------|
| 4.2.1 | CSRF state validated | Given an OAuth callback with wrong state / When GET /auth/callback / Then rejected (400 or redirect to error) | P0 |
| 4.2.2 | Cookie set on success | Given valid OAuth callback / When processed / Then `hf_access_token` cookie is set (httpOnly, secure, sameSite: none) | P0 |
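
Test 4.2.2 asserts on cookie attributes, which arrive as a raw `Set-Cookie` header string. A minimal sketch of a helper that extracts the flags the test cares about; `parseSetCookieFlags` is a hypothetical name, and the parser covers only the attributes listed in the test.

```typescript
export interface CookieFlags {
  name: string;
  httpOnly: boolean;
  secure: boolean;
  sameSite: string | null; // lower-cased, or null when absent
}

// Extracts the security-relevant attributes from one Set-Cookie header value.
export function parseSetCookieFlags(header: string): CookieFlags {
  const parts = header.split(";").map((part) => part.trim());
  const name = (parts[0] ?? "").split("=")[0] ?? "";
  const flags: CookieFlags = { name, httpOnly: false, secure: false, sameSite: null };
  for (const attribute of parts.slice(1)) {
    const eq = attribute.indexOf("=");
    // Attribute names are case-insensitive.
    const key = (eq === -1 ? attribute : attribute.slice(0, eq)).toLowerCase();
    const value = eq === -1 ? "" : attribute.slice(eq + 1);
    if (key === "httponly") flags.httpOnly = true;
    if (key === "secure") flags.secure = true;
    if (key === "samesite") flags.sameSite = value.toLowerCase();
  }
  return flags;
}
```

Asserting on parsed flags rather than on the exact header string keeps the test stable if attribute order or casing changes.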

### 4.3 WebSocket auth

| # | Test | Given / When / Then | Priority |
|---|------|---------------------|----------|
| 4.3.1 | WS rejected without token | Given OAuth enabled / When WebSocket connects to /collab without token / Then connection rejected | P0 |
| 4.3.2 | WS accepted with valid token | Given OAuth enabled + valid token / When WebSocket connects to /collab / Then connection accepted, sync starts | P1 |

### 4.4 XSS (known risk from spec)

| # | Test | Given / When / Then | Priority |
|---|------|---------------------|----------|
| 4.4.1 | Licence field escaped | Given `meta.licence` containing `<script>alert(1)</script>` / When published HTML rendered / Then script tag is escaped or stripped | P0 |
| 4.4.2 | Bibliography HTML sanitized | Given biblioHtml containing malicious script / When injected into published page / Then script is escaped or stripped | P0 |
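
For 4.4.1 the licence field is plain text, so escaping is sufficient. A minimal sketch of the escaping the test expects; `escapeHtml` here is a hypothetical helper, and the publisher may already use an equivalent from a library.

```typescript
const HTML_ESCAPES: Record<string, string> = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

// Escapes the five characters that can change HTML structure, which makes
// the value safe in both text and quoted-attribute contexts.
export function escapeHtml(value: string): string {
  return value.replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch] ?? ch);
}
```

This does not cover 4.4.2: bibliography HTML is markup by design, so escaping it would break formatting. That case needs an allowlist-based sanitizer, and the test should assert that `<script>` and event-handler attributes do not survive.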

---

## 5. AI Agent - Plumbing (P1)

We do NOT test LLM output quality. We test the infrastructure around it.

### 5.1 Context building

| # | Test | Given / When / Then | Priority |
|---|------|---------------------|----------|
| 5.1.1 | Context includes doc text | Given a doc with content / When building chat context / Then context object contains document text | P1 |
| 5.1.2 | Context includes selection | Given a text selection / When building chat context / Then context object contains selection text and position | P1 |
| 5.1.3 | Context includes frontmatter | Given doc with title and authors / When building chat context / Then context object contains frontmatter fields | P1 |

### 5.2 Tool execution (client-side)

| # | Test | Given / When / Then | Priority |
|---|------|---------------------|----------|
| 5.2.1 | replaceSelection applies | Given a selection in the editor / When replaceSelection tool called with new text / Then selection is replaced in the TipTap doc | P1 |
| 5.2.2 | applyDiff does search/replace | Given doc with "Hello world" / When applyDiff called with search="world" replace="editor" / Then doc contains "Hello editor" | P1 |
| 5.2.3 | updateFrontmatter modifies map | Given frontmatter with title "Old" / When updateFrontmatter called with title="New" / Then `Y.Map("frontmatter").get("title")` is "New" | P1 |
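
The search/replace contract in 5.2.2 can be stated on plain strings before it is wired to the editor. A minimal sketch, assuming the tool replaces the first occurrence and reports whether anything matched. `applyDiffSketch` is a hypothetical stand-in; the real `applyDiff` operates on the TipTap document, not on a string.

```typescript
export interface DiffResult {
  text: string;
  applied: boolean; // false when `search` was not found
}

// Replaces the first occurrence of `search` with `replace`.
export function applyDiffSketch(text: string, search: string, replace: string): DiffResult {
  // An empty search string would match everywhere; treat it as a miss.
  const at = search === "" ? -1 : text.indexOf(search);
  if (at === -1) return { text, applied: false };
  // Slice and concatenate instead of String.replace, so `$&` or `$1`
  // inside LLM-provided replacement text is inserted literally.
  return {
    text: text.slice(0, at) + replace + text.slice(at + search.length),
    applied: true,
  };
}
```

The not-found case deserves its own test: an agent that silently fails to apply a diff is exactly the kind of breakage this plan targets.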

### 5.3 Undo batching

| # | Test | Given / When / Then | Priority |
|---|------|---------------------|----------|
| 5.3.1 | Agent edits batch into one undo | Given agent executes 3 tool calls (replaceSelection + applyDiff + updateFrontmatter) / When user presses Cmd+Z once / Then all 3 changes are reverted | P0 |
| 5.3.2 | Manual edits not in agent batch | Given user types text then agent edits / When user presses Cmd+Z / Then only agent edits are reverted, user text remains | P1 |
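
The behavior in 5.3 can be illustrated with a tiny undo stack where a batch of changes counts as one step. This is a plain stand-in written for this plan: `BatchingUndoStack` is a hypothetical name, and the editor presumably relies on the undo machinery of Yjs or TipTap instead. The point is only to make the expected grouping concrete.

```typescript
type Undo = () => void;

export class BatchingUndoStack {
  private stack: Undo[][] = []; // each entry is one undo step
  private openBatch: Undo[] | null = null;

  // Record one reversible change; inside a batch it joins the group.
  record(undo: Undo): void {
    if (this.openBatch) this.openBatch.push(undo);
    else this.stack.push([undo]);
  }

  // Run `fn` so that everything it records becomes a single undo step.
  batch(fn: () => void): void {
    const group: Undo[] = [];
    this.openBatch = group;
    try {
      fn();
    } finally {
      this.openBatch = null;
      if (group.length > 0) this.stack.push(group);
    }
  }

  // Revert the most recent step; a whole batch is reverted newest-first.
  undo(): boolean {
    const step = this.stack.pop();
    if (!step) return false;
    step.slice().reverse().forEach((revert) => revert());
    return true;
  }
}
```

With this model, 5.3.1 is one `undo()` reverting every change recorded inside `batch`, and 5.3.2 is the earlier user edit staying on the stack as its own step.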

---

## What we do NOT test (and why)

| Area | Reason |
|------|--------|
| Yjs real-time sync | Yjs/Hocuspocus are mature libs - testing their CRDT sync is noise |
| Visual rendering of components | Project evolves too fast, snapshot tests would break constantly |
| PDF generation (Playwright) | Heavy dependency, visual output - better as a manual smoke test |
| Slash commands / Bubble toolbar | UI that will change - covered by the opt-in Playwright suite instead of the hermetic one |
| CSS architecture | Not meaningfully unit-testable |
| TipTap extension registration | Framework internals, not business logic |

---

## Summary

| Section | P0 tests | P1 tests | Total |
|---------|----------|----------|-------|
| 1. Publisher Pipeline | 10 | 4 | 14 |
| 2. Persistence / Storage | 8 | 7 | 15 |
| 3. API Routes | 2 | 7 | 9 |
| 4. Auth / Security | 7 | 3 | 10 |
| 5. AI Agent | 1 | 7 | 8 |
| **Total** | **28** | **28** | **56** |

Start with the 28 P0 tests. They cover the "if this breaks, we lose data or trust" surface. Add P1 tests as the codebase stabilizes.