diff --git a/docs/REWORK_PLAN.md b/docs/REWORK_PLAN.md index a622edd..028824f 100644 --- a/docs/REWORK_PLAN.md +++ b/docs/REWORK_PLAN.md @@ -1,6 +1,6 @@ # Initiative Tracker — Rework Plan -Status: **DRAFT — pending approval to execute** +Status: **APPROVED — executing** Owner: draistrick (fork → `keen99/ttrpg-initiative-tracker`, private) Upstream: `code.draft13.com/robert/ttrpg-initiative-tracker` (friend's Gitea) @@ -32,7 +32,7 @@ Upstream: `code.draft13.com/robert/ttrpg-initiative-tracker` (friend's Gitea) - Friend keeps it; we fork off it. ### Why a backend is mandatory -Browser sandbox cannot write the filesystem. No sqlite file, no `/data/db.sqlite`, nothing. Browser JS is blocked from disk by design. Therefore cross-device storage (DM ↔ tablet ↔ player view) requires a separate Node process owning the DB file and serving the browser over HTTP/WebSocket. There is no browser-only path. The backend is step one, not deferred. +Browser sandbox cannot write the filesystem. No sqlite file, no `/data/db.sqlite`, nothing. Browser JS is blocked from disk by design. Therefore cross-device storage (DM ↔ tablet ↔ player view) requires a separate Node process owning the DB file and serving the browser over HTTP/WebSocket. There is no browser-only path. **The backend is step one, not deferred.** ### Known bug: initiative skips / lost state Two failure classes observed: @@ -57,6 +57,13 @@ Current undo = stale snapshot write-back. Interleaved undos = data loss. Suspect Postgres deferred until public multiuser exposure is real. SQLite schema ports easily if that day comes. +### Backend design +- Owns SQLite file. Only writer. +- Holds authoritative state. Turn logic (initiative order, next-turn, add/remove mid-combat) runs server-side inside SQLite transaction. +- Client sends **action** (e.g. `NEXT_TURN`, not the resulting state). Server computes result, persists, broadcasts diff. +- Kills last-write-wins races by construction. +- WS broadcast on every state change → all connected clients (DM view, player display, tablet) update instantly. + ### Three storage impls, one interface (frontend) The storage interface is the test seam and the upstream-compat layer. @@ -72,16 +79,9 @@ The storage interface is the test seam and the upstream-compat layer. - `deleteDoc(path)`, `batch(ops)` - `subscribeDoc(path, cb)` / `subscribeCollection(path, cb)` → real-time push -Firebase impl: existing `onSnapshot` + SDK calls, moved verbatim behind interface. -WS impl: thin client; dispatches **actions** to backend, receives **state updates** via WS subscribe. -Memory impl: in-memory Map + EventEmitter, for tests that don't need the backend. - -### Backend design -- Owns SQLite file. Only writer. -- Holds authoritative state. FSM (turn logic) runs server-side inside SQLite transaction. -- Client sends **action** (e.g. `NEXT_TURN`, not the resulting state). Server computes result, persists, broadcasts diff. -- Kills last-write-wins races by construction. -- WS broadcast on every state change → all connected clients (DM view, player display, tablet) update instantly. +Firebase impl: existing `onSnapshot` + SDK calls, moved verbatim behind interface (M2). +WS impl: thin client; dispatches **actions** to backend, receives **state updates** via WS subscribe (M2). +Memory impl: in-memory Map + EventEmitter, for tests (M3). ### Repo layout (npm workspaces) @@ -98,17 +98,16 @@ Memory impl: in-memory Map + EventEmitter, for tests that don't need the backend server/ # NEW index.js # Express + ws bootstrap db.js # better-sqlite3, schema, migrations - fsm/ - turn.js # turn-order state machine (pure) - handlers/ # action handlers (call fsm, persist, broadcast) + turn.js # turn-order logic (pure, server-authoritative) + handlers/ # action handlers (call turn logic, persist, broadcast) server.test.js # API + WS integration tests shared/ # pure logic, no I/O, importable by client + server + tests - turn.js # turn FSM re-export (single source) + turn.js # turn logic (single source; server imports, tests import) types.js - shared.test/ # FSM unit tests (characterization + desired) + shared.test/ # turn logic unit tests (characterization + desired) turn.characterization.test.js turn.desired.test.js - docker-compose.yml # NEW — Milestone E + docker-compose.yml # NEW — M5 docs/ REWORK_PLAN.md # this file ``` @@ -123,34 +122,36 @@ Memory impl: in-memory Map + EventEmitter, for tests that don't need the backend Each milestone = independently mergeable PR upstream (unless marked ❌). -### Milestone 0 — Repo + branch setup -- Fresh branch off `main` (not `dsr-rework` — avoid contamination). Name: `rework-backend`. -- Add `upstream` remote (friend's Gitea, read-only fetch). +| M | Does | Tests? | +|---|---|---| +| 0 | repo, branch, remotes | no | +| 1 | build backend (Node+Express+ws+better-sqlite3) | unit tests as built | +| 2 | frontend WS adapter — app runs vs backend, cross-device works | yes | +| 3 | characterization tests lock current behavior (skip bug included) | yes | +| 4 | skip fix + manual override, regression-protected | yes | +| 5 | docker compose in-house | smoke | +| 6 | undo rework (tx events) | unit | +| 7 | playwright multi-window e2e (deferred) | e2e | + +### Milestone 0 — Repo + branch setup ✅ +- Fresh branch off `main` (not `dsr-rework`). Name: `rework-backend`. +- `upstream` remote = friend's Gitea (read-only fetch). - Push origin = `keen99/ttrpg-initiative-tracker` (private). - npm workspaces root config. - Commit this plan. -- **Exit criteria:** clean branch, plan committed, remotes set. +- **Exit criteria:** clean branch, plan committed, remotes set. ✅ DONE (commit ad7979d, then plan-restored). - **Upstream-PRable:** n/a (fork infra) -### Milestone 1 — Turn FSM extraction + characterization tests -- Extract pure turn-order logic from `App.js` (`handleNextTurn`, `computeTurnOrder*`, sort, add/remove mid-combat) into `shared/turn.js`. -- Pure function: `(state, action) → state`. No I/O, no Firebase, no React. -- Port **verbatim** — bugs included. -- Write characterization tests capturing current behavior (including the skip bug). Lock reality. -- FSM unit-testable in Node with zero infra. -- **Exit criteria:** all characterization tests green. Behavior provably identical to current. -- **Upstream-PRable:** ✅ pure refactor, zero behavior change, no Firebase dependency introduced. - -### Milestone 2 — Backend skeleton +### Milestone 1 — Build backend - `server/`: Express + ws + better-sqlite3. - Schema mirrors current Firestore doc tree (campaigns, encounters subcoll, activeDisplay, logs). -- FSM (from Milestone 1) runs server-side inside SQLite transaction. -- WS broadcast on every state change. -- Backend integration tests: spin server on random port, assert WS pushes + SQLite persists. -- **Exit criteria:** backend boots, serves state over WS, persists to SQLite, tests green. +- Turn logic (initiative order, next-turn, add/remove mid-combat) ported from `App.js` into `server/turn.js` (pure function, server-authoritative). Port verbatim — bugs included for now. +- Actions dispatched to backend; server computes result, persists in SQLite tx, broadcasts via WS. +- Unit tests as built: turn logic unit tests (characterization capturing current behavior), plus basic API/WS smoke. +- **Exit criteria:** backend boots, serves state over WS, persists to SQLite, unit tests green. - **Upstream-PRable:** ❌ divergence (friend stays Firebase). -### Milestone 3 — Frontend WS adapter +### Milestone 2 — Frontend WS adapter - Define `storage/types.js` interface. - Move all ~30 Firestore call sites from `App.js` into `storage/firebase.js` behind interface (verbatim). - Implement `storage/ws.js` per interface, talking to backend. Dispatches actions, subscribes to WS. @@ -161,15 +162,23 @@ Each milestone = independently mergeable PR upstream (unless marked ❌). - **Exit criteria:** app runs fully against local backend, no Firebase. Multi-device sync works. - **Upstream-PRable:** ⚠️ partial. Storage interface + firebase extract = ✅. WS impl = ❌. -### Milestone 4 — Red tests + fix skip bug + manual turn override +### Milestone 3 — Characterization tests lock current behavior +- Lock current behavior end-to-end via integration tests against running backend (turn logic now server-side). +- Capture the skip bug as a characterization test (whatever current does = locked, bugs included). +- Cover: START, NEXT_TURN, PAUSE, RESUME, ADD_PARTICIPANT, REMOVE_PARTICIPANT, TOGGLE_ACTIVE, REORDER, APPLY_DAMAGE/HEAL, DEATH_SAVE, END. +- Iterate until confident: baseline solid, regressions impossible to silently slip. +- **Exit criteria:** characterization suite green against backend. Baseline locked. +- **Upstream-PRable:** ✅ if kept storage-agnostic (tests target turn logic shape). + +### Milestone 4 — Skip fix + manual turn override - Write desired-behavior tests (red): - Never-skip invariant: after `NEXT_TURN`, current participant is always a valid active participant, or encounter cleanly ends. - Mid-combat add enters turn order correctly. - Remove mid-combat doesn't skip next. - Pause/resume preserves order. -- Fix FSM until red tests go green. Skip bug dies. -- Add new action: `JUMP_TURN_TO(participantId)`. DM clicks participant → cursor jumps → that participant's turn now → future `NEXT_TURN` continues from there. UI button label: "Make This Turn" (candidates: "Force Turn Here"). -- Regression-protected by Milestone 1 characterization + new desired tests. +- Fix turn logic until red tests go green. Skip bug dies. +- Add new action: `JUMP_TURN_TO(participantId)`. DM clicks participant → cursor jumps → that participant's turn now → future `NEXT_TURN` continues from there. UI button label: "Make This Turn". +- Regression-protected by M3 characterization + new desired tests. - **Exit criteria:** skip bug gone + provably cannot regress. Manual override works. - **Upstream-PRable:** ✅ logic fix + new feature, both beneficial. @@ -187,12 +196,12 @@ Each milestone = independently mergeable PR upstream (unless marked ❌). - Replaces current fragile `/logs` snapshot-write undo. - Migration: keep old undo working for existing entries until cleared; new format for new entries. - **Exit criteria:** undo works transactionally; interleaved undos don't corrupt. -- **Upstream-PRable:** ✅ if logic kept storage-agnostic (FSM-level). Backend-specific events table = ❌. +- **Upstream-PRable:** ⚠️ partial. Turn-logic-level undo = ✅. Backend events table = ❌. ### Milestone 7 — Playwright E2E (deferred) - Multi-window E2E: DM view + display + player view in separate browser contexts against running backend. - Verify realtime sync end-to-end. -- **Only build if sync regresses or we deviate significantly.** FSM + backend integration tests cover most regression risk cheaper. +- **Only build if sync regresses or we deviate significantly.** Turn-logic unit + backend integration tests cover most regression risk cheaper. - **Exit criteria:** e2e green for core combat flow across 3 windows. - **Upstream-PRable:** ✅ if test infra shared. @@ -207,14 +216,14 @@ Each milestone = independently mergeable PR upstream (unless marked ❌). ## Testing strategy ### Layers -1. **FSM unit tests** (Jest, pure functions) — every turn transition, skip invariants, manual override. Cheap, essential. Covers skip bug permanently. (Milestones 1, 4) -2. **Backend integration tests** (Jest) — spin server on random port, assert WS pushes + SQLite persists + transactional correctness. (Milestone 2) -3. **Frontend adapter contract tests** (Jest, `memory`) — impl parity against interface. (Milestone 3) -4. **Playwright multi-window E2E** — deferred. Only realtime sync glue FSM can't reach. (Milestone 7) +1. **Turn logic unit tests** (Jest, pure functions) — every turn transition, skip invariants, manual override. Built in M1 (characterization), extended in M4 (desired). Cheap, essential. +2. **Backend integration tests** (Jest) — spin server on random port, assert WS pushes + SQLite persists + transactional correctness. (M1+) +3. **Frontend adapter contract tests** (Jest, `memory`) — impl parity against interface. (M2) +4. **Playwright multi-window E2E** — deferred. Only realtime sync glue turn logic can't reach. (M7) -### Two-pass on FSM (Milestones 1 → 4) -1. **Characterization** — capture current behavior exactly (bugs included). Locks extraction as provably identical. Lets later refactor port safely. -2. **Desired-behavior (red)** — write what *should* happen. Fail today. Fix → green. Bug dies, stays dead. +### Two-pass on turn logic (M1 → M4) +1. **Characterization** (M1/M3) — capture current behavior exactly (bugs included). Locks extraction/port as provably identical. Lets later fix be provable. +2. **Desired-behavior (red)** (M4) — write what *should* happen. Fail today. Fix → green. Bug dies, stays dead. ### Manual smoke via config flags - `STORAGE=firebase` → current behavior (friend's path, upstream default). @@ -232,12 +241,12 @@ Each milestone = independently mergeable PR upstream (unless marked ❌). | Milestone | Upstream-PRable? | Why | |---|---|---| | 0 repo setup | n/a | fork infra | -| 1 FSM extract + characterization | ✅ | pure refactor, identical behavior | -| 2 backend | ❌ | divergence (friend stays Firebase) | -| 3 WS adapter | ⚠️ partial | interface + firebase extract ✅, WS ❌ | +| 1 backend | ❌ | divergence (friend stays Firebase) | +| 2 WS adapter | ⚠️ partial | interface + firebase extract ✅, WS ❌ | +| 3 characterization tests | ✅ | if storage-agnostic | | 4 skip fix + manual override | ✅ | logic fix + beneficial feature | | 5 docker compose | ❌ | divergence | -| 6 undo rework | ⚠️ partial | FSM logic ✅, events table ❌ | +| 6 undo rework | ⚠️ partial | turn-logic-level ✅, events table ❌ | | 7 playwright | ✅ | if test infra shared | Default `STORAGE=firebase` + `AUTH_MODE=none` (unset) = upstream sees literally zero change. @@ -246,23 +255,23 @@ Default `STORAGE=firebase` + `AUTH_MODE=none` (unset) = upstream sees literally ## Risks -- **CRA + workspaces friction.** Create React App may resist monorepo layout. Mitigation: keep `src/` as CRA root, `shared/` as separate workspace imported via alias. Eject/craco only if forced. -- **FSM extraction correctness.** Current logic tangled; verbatim port risks subtle drift. Mitigation: characterization tests first, diff behavior before any fix. +- **CRA + workspaces friction.** Create React App may resist monorepo layout. Mitigation: keep `src/` as CRA root, `server/` + `shared/` as separate workspaces imported via alias. Eject/craco only if forced. +- **Turn logic port correctness.** Current logic tangled; verbatim port risks subtle drift. Mitigation: characterization tests in M1/M3 lock behavior before any fix. - **Firebase drift untested.** Mitigation: interface contract; friend's path his to maintain. - **Undo history migration.** Existing log entries use old snapshot format. Mitigation: keep old undo working until cleared, new format for new entries. - **WS reconnect/state-sync edge cases.** Transient drop mid-combat. Mitigation: client requests full state resync on (re)connect; server is source of truth. --- -## Open decisions (need answers before Milestone 0) +## Decisions (locked) -1. **Branch name** off `main`. Propose: `rework-backend`. Confirm or rename. -2. **Manual turn override.** Action `JUMP_TURN_TO(participantId)`. UI button label "Make This Turn" (alt: "Force Turn Here"). Pick. -3. **npm workspaces** for `shared/` + `server/` alongside CRA `src/`. Prefer yes. CRA may fight → alias as fallback. +1. **Branch:** `rework-backend` off `main`. +2. **Manual turn override:** action `JUMP_TURN_TO(participantId)`. UI button "Make This Turn". +3. **npm workspaces** for `server/` + `shared/` alongside CRA `src/`. Fallback alias if CRA fights. --- -## Next action (on approval) +## Next action -Milestone 0: create branch `rework-backend` off `main`, set remotes, commit this plan. -Then Milestone 1 kickoff: trace `handleNextTurn` full path + write first characterization test capturing the skip bug. +M0 ✅ DONE. +M1 kickoff: scaffold `server/` workspace, set up better-sqlite3 + Express + ws, port turn logic from `App.js` into `server/turn.js`, write first unit tests.