Files
david raistrick a2c63cc77f docs(REWORK_PLAN): M5 done, PRable
M5 Docker: single container (caddy+node) verified working. REST roundtrip,
WS push, 20-round replay CLEAN, UI styled. Done.

PRable: separate docker/ tree, root Dockerfile untouched, firebase
default preserved (STORAGE=firebase). Friend merges, gets our docker
infra without touching his firebase path.

M0-M5 all done.
2026-07-01 19:26:23 -04:00

267 lines
13 KiB
Markdown

# Initiative Tracker — Rework Plan
Status: **APPROVED — executing**
Owner: draistrick (fork → `keen99/ttrpg-initiative-tracker`, private)
Upstream: `code.draft13.com/robert/ttrpg-initiative-tracker` (friend's Gitea)
---
## Goals
1. **Replace Firebase with self-hosted backend.** Browser cannot own a DB file (sandbox). Cross-device (DM + tablet + player view) requires a real backend. Backend is the foundation, built first.
2. **Automated test ecosystem as the baseline.** Lock current behavior before changing it.
3. **Remain mergeable upstream.** Default behavior (Firebase) preserved behind flag. Upstream `main` stays clean. Friend keeps Firebase path.
4. **Self-hostable in local Docker** (in-house network). Public exposure = future, only after auth + multiuser safety.
## Non-Goals (this plan)
- Ripping Firebase. Kept as default adapter upstream.
- Public/multiuser deployment. Deferred.
- Rewriting the entire 2935-line `App.js`. Only extract what testability demands.
- Feature/bug work. That lives in `TODO.md`. This plan = infra + backend + test harness only.
---
## Problem Statement
### Why Firebase is wrong here (for this fork)
- Requires Google account + network for a single-user tabletop tool.
- Realtime value (DM view ↔ player display) is real but solvable locally.
- API key baked into client bundle (CRA `REACT_APP_*` at build); security depends entirely on console rules not in repo.
- Vendor lock + quota; `onSnapshot` on collections burns reads.
- Friend keeps it; we fork off it.
### Why a backend is mandatory
Browser sandbox cannot write the filesystem. No sqlite file, no `/data/db.sqlite`, nothing. Browser JS is blocked from disk by design. Therefore cross-device storage (DM ↔ tablet ↔ player view) requires a separate Node process owning the DB file and serving the browser over HTTP/WebSocket. There is no browser-only path. **The backend is step one, not deferred.**
---
## Architecture
### Stack (locked)
- **Node.js** runtime
- **Express** web framework
- **ws** WebSocket lib (realtime push, replaces `onSnapshot`)
- **better-sqlite3** SQLite driver (synchronous, simple, fast)
- **SQLite** DB (single file, docker volume, trivial backup)
- **Jest** test runner (already in CRA deps)
Postgres deferred until public multiuser exposure is real. SQLite schema ports easily if that day comes.
### Backend design
- Owns SQLite file. Only writer.
- Holds authoritative state.
- Generic KV doc store (firebase mirror): single `docs` table (path PK, parent, data JSON, updated_at). Opaque JSON at arbitrary path strings. No shape-specific endpoints. App logic stays client-side.
- WS broadcast on every state change → all connected clients (DM view, player display, tablet) update instantly.
### Three storage impls, one interface (frontend)
The storage interface is the test seam and the upstream-compat layer.
| Impl | When used | Automated-tested? |
|---|---|---|
| `firebase.js` | default (`STORAGE=firebase`) — upstream path | No — requires live Firebase project |
| `ws.js` | `STORAGE=ws` — our fork, talks to backend | Yes — against running backend |
| `memory.js` | test-only, in-process | Yes — fast, deterministic |
**Frontend interface contract** (all three implement):
- `getDoc(path)`, `setDoc(path, data, opts)`, `updateDoc(path, patch)`
- `deleteDoc(path)`, `batch(ops)`
- `subscribeDoc(path, cb)` / `subscribeCollection(path, cb)` → real-time push
Firebase impl: existing `onSnapshot` + SDK calls, moved verbatim behind interface (M2).
WS impl: thin adapter; generic KV ops, receives **state updates** via WS subscribe (M2).
Memory impl: in-memory Map + EventEmitter, for tests (M3).
### Repo layout (npm workspaces)
```
/
package.json # workspaces root
src/ # React frontend (existing, refactored behind storage interface)
storage/
index.js # factory: pick impl from STORAGE env
firebase.js # extracted from current App.js (verbatim)
ws.js # NEW — talks to backend
memory.js # NEW — test only
contract.js # interface spec (runStorageContract)
tests/ # frontend tests
server/ # NEW
index.js # Express + ws bootstrap, generic KV REST
db.js # better-sqlite3, docs table (KV), broadcast
handlers.js # REST handlers
tests/ # adapter vs live backend (Layer 2 test)
shared/ # pure logic, no I/O, importable by client + server + tests
turn.js # turn logic (single source; tests import)
tests/ # turn logic unit tests (characterization + desired)
data/ # gitignored sqlite DB
docker-compose.yml # NEW — M5
docs/
REWORK_PLAN.md # this file
DEVELOPMENT.md
GLOSSARY.md
TODO.md # bugs + features (separate from this plan)
```
### Auth
- **Now:** `AUTH_MODE=none`. App gated by nginx HTTP basic auth (reuse friend's existing pattern). In-house only. Risk acceptable: someone sees your initiative counter.
- **Future:** `AUTH_MODE=token` — real login, real users. Only if/when publicly exposed. Not built this plan.
---
## Milestones
Each milestone = independently mergeable PR upstream (unless marked ❌).
| M | Does | Tests? |
|---|---|---|
| 0 | repo, branch, remotes | no |
| 1 | build backend (Node+Express+ws+better-sqlite3) | unit tests as built |
| 2 | frontend WS adapter — app runs vs backend, cross-device works | yes |
| 3 | characterization tests lock current behavior | yes |
| 4 | resolve initiative rotation corruption (BUG-5) | yes |
| 5 | docker single container (caddy+node) | smoke ✅ |
| 6 | _moved to TODO backlog (feature work)_ | - |
| 7 | playwright multi-window e2e (deferred) | e2e |
| 8 | (future) public exposure | - |
### Milestone 0 — Repo + branch setup ✅
- Fresh branch off `main` (not `dsr-rework`). Name: `rework-backend`.
- `upstream` remote = friend's Gitea (read-only fetch).
- Push origin = `keen99/ttrpg-initiative-tracker` (private).
- npm workspaces root config.
- Commit this plan.
- **Exit criteria:** clean branch, plan committed, remotes set. ✅ DONE.
- **Upstream-PRable:** n/a (fork infra)
### Milestone 1 — Build backend ✅
- `server/`: Express + ws + better-sqlite3.
- Generic KV doc store (firebase mirror): `docs` table (path PK, parent, data JSON, updated_at). REST: GET/PUT/PATCH/DELETE `/api/doc?path=`, GET `/api/collection?path=`, POST `/api/collection`, POST `/api/batch`. WS: subscribe by path.
- Server holds authoritative state. No turn logic server-side (logic stays client-side in `shared/turn.js`).
- **Exit criteria:** backend boots, serves state over WS, persists to SQLite, unit tests green. ✅ DONE.
- **Upstream-PRable:** ❌ divergence (friend stays Firebase).
### Milestone 2 — Frontend WS adapter ✅
- Define `storage/contract.js` interface spec.
- Move all Firestore call sites from `App.js` into `storage/firebase.js` behind interface (verbatim).
- Implement `storage/ws.js` per interface, talking to backend. Generic KV ops, subscribes to WS.
- Implement `storage/memory.js` for frontend unit tests.
- `storage/index.js` factory: `STORAGE` env → pick impl. Default `firebase` (upstream unchanged).
- App runs against backend with `STORAGE=ws`.
- Cross-device verified manually: DM view + player display + tablet.
- **Exit criteria:** app runs fully against local backend, no Firebase. Multi-device sync works. ✅ DONE.
- **Upstream-PRable:** ⚠️ partial. Storage interface + firebase extract = ✅. WS impl = ❌.
### Milestone 3 — Characterization tests lock current behavior ✅
- Lock current behavior via tests.
- Cover: START, NEXT_TURN, PAUSE, RESUME, ADD_PARTICIPANT, REMOVE_PARTICIPANT, TOGGLE_ACTIVE, REORDER, APPLY_DAMAGE/HEAL, DEATH_SAVE, END.
- Two layers: Layer 1 (App + firebase mock, proves call shape), Layer 2 (ws adapter vs live backend, proves translation).
- Iterate until confident: baseline solid, regressions impossible to silently slip.
- **Exit criteria:** characterization suite green. Baseline locked. ✅ DONE.
- **Upstream-PRable:** ✅ if kept storage-agnostic (tests target turn logic shape).
### Milestone 4 — Resolve initiative rotation corruption (BUG-5) ✅
- **Fixed** (commit `494327f`).
- Slot-array turn order model + DRY advance core (`nextActiveAfter`).
Both `nextTurn` + `computeTurnOrderAfterRemoval` delegate → one advance
path, no drift.
- 500-round replay: 0 skips, 0 double-acts.
- Tests: `turn.skip.test.js`, `turn.dry.test.js` (advance parity lock).
- **Upstream-PRable:** ✅ bug fix.
### Milestone 5 — Docker compose ✅
- Single container: caddy (front, static + proxy) + node backend (internal :4001).
- Files in `docker/` tree (kept separate from upstream root Dockerfile):
- `docker/Dockerfile` — build FE + BE, runtime caddy+node
- `docker/Caddyfile` — proxy /api + /ws to node, static SPA fallback
- `docker/entrypoint.sh` — node bg + caddy fg
- `docker/docker-compose.yml` — one `app` service, volume for sqlite
- Run: `docker compose -f docker/docker-compose.yml up --build` (or `cd docker && docker compose up --build`). Port 8080.
- No `image:` field => compose auto-names, never pulls service image (private).
- **Exit criteria:** `docker compose up` runs full stack in-house. ✅ DONE.
- Verified: REST roundtrip, WS subscribe+push, replay 20 rounds CLEAN (0 skips/doubles/shifts), UI styled (Tailwind compiles).
- **Upstream-PRable:** ✅ separate docker/ tree, root Dockerfile untouched, firebase default preserved.
### Milestone 6 — Undo rework — _MOVED to TODO backlog_
- Moved: feature work (transactional undo), not infra. Lives in `TODO.md` now.
- Scope: events table `(type, payload, undo_payload, undone, ts)`; undo = apply undo_payload in tx.
### Milestone 7 — Playwright E2E (deferred)
- Multi-window E2E: DM view + display + player view in separate browser contexts against running backend.
- Verify realtime sync end-to-end.
- **Only build if sync regresses or we deviate significantly.** Turn-logic unit + backend integration tests cover most regression risk cheaper.
- **Exit criteria:** e2e green for core combat flow across 3 windows.
- **Upstream-PRable:** ✅ if test infra shared.
### Milestone 8 — (Future) Public exposure
- Real auth (`AUTH_MODE=token`).
- Rate limiting, CSRF, hardening.
- Postgres migration if load warrants.
- Only if we decide to expose publicly + multiuser.
---
## Testing strategy
### Layers
1. **Turn logic unit tests** (Jest, pure functions, `shared/tests/`). Characterization + desired. Cheap, essential.
2. **Backend integration tests** (Jest, `server/tests/`) — spin server on random port, assert WS pushes + SQLite persists + transactional correctness.
3. **Frontend adapter contract tests** (Jest, `src/tests/`) — impl parity against interface (memory). Firebase mock harness for Layer 1 App tests.
### Characterization → desired
1. **Characterization** — capture current behavior exactly (bugs included). Locks extraction/port as provably identical. Lets later fix be provable.
2. **Desired-behavior (red)** — write what *should* happen. Fail today. Fix → green. Bug stays dead. (Bug fixes live in TODO.md, tracked separately.)
### Manual smoke via config flags
- `STORAGE=firebase` → current behavior (friend's path, upstream default).
- `STORAGE=ws` → our path, local backend.
- docker-compose profiles mirror the above.
### Accepted test gap
- Firebase adapter untested (requires live project). Accepted cost.
- Mitigated by: interface contract; if firebase impl drifts, integration smoke only.
---
## Mergeability upstream
| Milestone | Upstream-PRable? | Why |
|---|---|---|
| 0 repo setup | n/a | fork infra |
| 1 backend | ❌ | divergence (friend stays Firebase) |
| 2 WS adapter | ⚠️ partial | interface + firebase extract ✅, WS ❌ |
| 3 characterization tests | ✅ | if storage-agnostic |
| 4 BUG-5 rotation fix | ✅ | bug fix |
| 5 docker | ✅ | separate docker/ tree, root Dockerfile untouched, firebase preserved |
| 6 undo (moved to TODO) | - | - |
| 7 playwright | ✅ | if test infra shared |
Default `STORAGE=firebase` + `AUTH_MODE=none` (unset) = upstream sees literally zero change.
---
## Risks
- **CRA + workspaces friction.** Create React App may resist monorepo layout. Mitigation: keep `src/` as CRA root, `server/` + `shared/` as separate workspaces imported via alias. Eject/craco only if forced.
- **Firebase drift untested.** Mitigation: interface contract; friend's path his to maintain.
- **Undo history migration.** Existing log entries use old snapshot format. Mitigation: keep old undo working until cleared, new format for new entries.
- **WS reconnect/state-sync edge cases.** Transient drop mid-combat. Mitigation: client requests full state resync on (re)connect; server is source of truth.
---
## Decisions (locked)
1. **Branch:** `rework-backend` off `main`.
2. **npm workspaces** for `server/` + `shared/` alongside CRA `src/`. Fallback alias if CRA fights.
3. **Backend = generic KV doc store** (firebase mirror), not shape-specific endpoints. Thin adapter passthrough. Opaque JSON at arbitrary path strings.
---
## Current status
- M0 ✅, M1 ✅, M2 ✅, M3 ✅, M4 ✅, M5 ✅
- Backend live: port 4001, db `./data/tracker.sqlite`
- Frontend: port 3999 with `REACT_APP_STORAGE=ws`
- Test suite: ~160 tests (shared + server + FE). Bugs tracked in `TODO.md`.
- Next milestones: M5 docker-compose. Undo moved to TODO backlog.