ttrpg-initiative-tracker/docs/REWORK_PLAN.md
david raistrick c6d3b7e1a6 docs: move dead-not-skipped (FEAT-1) to TODO backlog, M4 = BUG-5 fix
REWORK_PLAN.md M4 = resolve initiative rotation corruption (BUG-5).
Mid-round add/revive corrupts rotation. RED locked.

TODO.md FEAT-1 = dead participants stay in turn order (user request,
Saturday game). Feature backlog, not milestone.
2026-06-30 16:33:02 -04:00


Initiative Tracker — Rework Plan

Status: APPROVED, executing
Owner: draistrick (fork → keen99/ttrpg-initiative-tracker, private)
Upstream: code.draft13.com/robert/ttrpg-initiative-tracker (friend's Gitea)


Goals

  1. Replace Firebase with self-hosted backend. Browser cannot own a DB file (sandbox). Cross-device (DM + tablet + player view) requires a real backend. Backend is the foundation, built first.
  2. Automated test ecosystem as the baseline. Lock current behavior before changing it.
  3. Remain mergeable upstream. Default behavior (Firebase) preserved behind flag. Upstream main stays clean. Friend keeps Firebase path.
  4. Self-hostable in local Docker (in-house network). Public exposure = future, only after auth + multiuser safety.

Non-Goals (this plan)

  • Ripping Firebase. Kept as default adapter upstream.
  • Public/multiuser deployment. Deferred.
  • Rewriting the entire 2935-line App.js. Only extract what testability demands.
  • Feature/bug work. That lives in TODO.md. This plan = infra + backend + test harness only.

Problem Statement

Why Firebase is wrong here (for this fork)

  • Requires Google account + network for a single-user tabletop tool.
  • Realtime value (DM view ↔ player display) is real but solvable locally.
  • API key baked into client bundle (CRA REACT_APP_* at build); security depends entirely on console rules not in repo.
  • Vendor lock + quota; onSnapshot on collections burns reads.
  • Friend keeps it; we fork off it.

Why a backend is mandatory

The browser sandbox cannot write to the host filesystem. No sqlite file, no /data/db.sqlite, nothing. Browser JS is blocked from disk by design. Therefore cross-device storage (DM ↔ tablet ↔ player view) requires a separate Node process that owns the DB file and serves the browser over HTTP/WebSocket. There is no browser-only path. The backend is step one, not deferred.


Architecture

Stack (locked)

  • Node.js runtime
  • Express web framework
  • ws WebSocket lib (realtime push, replaces onSnapshot)
  • better-sqlite3 SQLite driver (synchronous, simple, fast)
  • SQLite DB (single file, docker volume, trivial backup)
  • Jest test runner (already in CRA deps)

Postgres deferred until public multiuser exposure is real. SQLite schema ports easily if that day comes.

Backend design

  • Owns SQLite file. Only writer.
  • Holds authoritative state.
  • Generic KV doc store (firebase mirror): single docs table (path PK, parent, data JSON, updated_at). Opaque JSON at arbitrary path strings. No shape-specific endpoints. App logic stays client-side.
  • WS broadcast on every state change → all connected clients (DM view, player display, tablet) update instantly.
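
The docs table and broadcast-on-write behavior above can be sketched with an in-memory stand-in. This is a minimal sketch, not the server code: `parentOf`, `createDocsTable`, and the change-message shape are hypothetical names, and it assumes a doc's parent is its path minus the last segment. The real store is SQLite via better-sqlite3, and listeners would be WebSocket clients.

```javascript
// Assumption: parent = path without its last segment,
// e.g. "encounters/e1/participants/p1" -> "encounters/e1/participants".
function parentOf(path) {
  const i = path.lastIndexOf("/");
  return i === -1 ? "" : path.slice(0, i);
}

// In-memory stand-in for the docs table (path PK, parent, data JSON, updated_at).
function createDocsTable(now = Date.now) {
  const rows = new Map(); // path -> row
  const listeners = new Set(); // stand-in for connected WS clients

  // Every state change is pushed to every listener.
  function broadcast(change) {
    for (const send of listeners) send(change);
  }

  return {
    put(path, data) {
      const row = { path, parent: parentOf(path), data: JSON.stringify(data), updated_at: now() };
      rows.set(path, row);
      broadcast({ type: "set", path, parent: row.parent, data });
    },
    get(path) {
      const row = rows.get(path);
      return row ? JSON.parse(row.data) : null; // data is opaque JSON
    },
    remove(path) {
      if (rows.delete(path)) broadcast({ type: "delete", path, parent: parentOf(path) });
    },
    // A "collection" is every row sharing the same parent.
    list(parent) {
      return [...rows.values()]
        .filter((row) => row.parent === parent)
        .map((row) => ({ path: row.path, data: JSON.parse(row.data) }));
    },
    onChange(send) {
      listeners.add(send);
      return () => listeners.delete(send);
    },
  };
}
```

Keeping `parent` as its own column means a collection read is a single equality filter, which is why no shape-specific endpoints are needed.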

Three storage impls, one interface (frontend)

The storage interface is the test seam and the upstream-compat layer.

| Impl | When used | Automated-tested? |
|---|---|---|
| firebase.js | default (STORAGE=firebase), upstream path | No, requires live Firebase project |
| ws.js | STORAGE=ws, our fork, talks to backend | Yes, against running backend |
| memory.js | test-only, in-process | Yes, fast and deterministic |

Frontend interface contract (all three implement):

  • getDoc(path), setDoc(path, data, opts), updateDoc(path, patch)
  • deleteDoc(path), batch(ops)
  • subscribeDoc(path, cb) / subscribeCollection(path, cb) → real-time push

  • Firebase impl: existing onSnapshot + SDK calls, moved verbatim behind interface (M2).
  • WS impl: thin adapter; generic KV ops, receives state updates via WS subscribe (M2).
  • Memory impl: in-memory Map + EventEmitter, for tests (M3).
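
A minimal sketch of what a Map + EventEmitter memory impl might look like against the interface above. Several things here are assumptions rather than the repo's code: the calls are synchronous for brevity (a real adapter would likely return Promises), `opts.merge` is modeled on Firestore's merge flag as a shallow merge, the batch op shape `{ op, path, data }` is invented for illustration, and the batch is not atomic.

```javascript
const { EventEmitter } = require("node:events");

function createMemoryStorage() {
  const docs = new Map(); // path -> plain object
  const events = new EventEmitter();

  const collectionOf = (path) => {
    const i = path.lastIndexOf("/");
    return i < 0 ? "" : path.slice(0, i);
  };
  const read = (path) => (docs.has(path) ? docs.get(path) : null);
  const list = (coll) =>
    [...docs].filter(([p]) => collectionOf(p) === coll).map(([path, data]) => ({ path, data }));

  // Push the new doc value and the new collection listing to subscribers.
  function notify(path) {
    events.emit(`doc:${path}`, read(path));
    const coll = collectionOf(path);
    events.emit(`coll:${coll}`, list(coll));
  }

  // Deliver the current value right away, then every later change.
  function subscribe(channel, current, cb) {
    events.on(channel, cb);
    cb(current());
    return () => events.off(channel, cb);
  }

  const api = {
    getDoc: read,
    setDoc(path, data, opts = {}) {
      const prev = opts.merge ? read(path) || {} : {};
      docs.set(path, { ...prev, ...data });
      notify(path);
    },
    updateDoc(path, patch) {
      // Firestore's updateDoc also fails when the doc does not exist.
      if (!docs.has(path)) throw new Error(`no such doc: ${path}`);
      docs.set(path, { ...docs.get(path), ...patch });
      notify(path);
    },
    deleteDoc(path) {
      docs.delete(path);
      notify(path);
    },
    batch(ops) {
      // Applied in order, one notification per op; not atomic in this sketch.
      for (const { op, path, data } of ops) {
        if (op === "set") api.setDoc(path, data);
        else if (op === "update") api.updateDoc(path, data);
        else if (op === "delete") api.deleteDoc(path);
        else throw new Error(`unknown op: ${op}`);
      }
    },
    subscribeDoc: (path, cb) => subscribe(`doc:${path}`, () => read(path), cb),
    subscribeCollection: (path, cb) => subscribe(`coll:${path}`, () => list(path), cb),
  };
  return api;
}
```

Both subscribe calls return an unsubscribe function, matching the shape of Firestore's onSnapshot, so call sites can stay the same across all three impls.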

Repo layout (npm workspaces)

/
  package.json              # workspaces root
  src/                      # React frontend (existing, refactored behind storage interface)
    storage/
      index.js              # factory: pick impl from STORAGE env
      firebase.js           # extracted from current App.js (verbatim)
      ws.js                 # NEW — talks to backend
      memory.js             # NEW — test only
      contract.js           # interface spec (runStorageContract)
    tests/                  # frontend tests
  server/                   # NEW
    index.js                # Express + ws bootstrap, generic KV REST
    db.js                   # better-sqlite3, docs table (KV), broadcast
    handlers.js             # REST handlers
    tests/                  # adapter vs live backend (Layer 2 test)
  shared/                   # pure logic, no I/O, importable by client + server + tests
    turn.js                 # turn logic (single source; tests import)
    tests/                  # turn logic unit tests (characterization + desired)
  data/                     # gitignored sqlite DB
  docker-compose.yml        # NEW — M5
  docs/
    REWORK_PLAN.md          # this file
    DEVELOPMENT.md
    GLOSSARY.md
  TODO.md                   # bugs + features (separate from this plan)

Auth

  • Now: AUTH_MODE=none. App gated by nginx HTTP basic auth (reuse friend's existing pattern). In-house only. Risk acceptable: someone sees your initiative counter.
  • Future: AUTH_MODE=token — real login, real users. Only if/when publicly exposed. Not built this plan.

Milestones

Each milestone = independently mergeable PR upstream (unless marked otherwise).

| M | Does | Tests? |
|---|---|---|
| 0 | repo, branch, remotes | no |
| 1 | build backend (Node+Express+ws+better-sqlite3) | unit tests as built |
| 2 | frontend WS adapter: app runs vs backend, cross-device works | yes |
| 3 | characterization tests lock current behavior | yes |
| 4 | resolve initiative rotation corruption (BUG-5) | yes |
| 5 | docker compose in-house | smoke |
| 6 | undo rework (tx events) | unit |
| 7 | playwright multi-window e2e (deferred) | e2e |

Milestone 0 — Repo + branch setup

  • Fresh branch off main (not dsr-rework). Name: rework-backend.
  • upstream remote = friend's Gitea (read-only fetch).
  • Push origin = keen99/ttrpg-initiative-tracker (private).
  • npm workspaces root config.
  • Commit this plan.
  • Exit criteria: clean branch, plan committed, remotes set. DONE.
  • Upstream-PRable: n/a (fork infra)

Milestone 1 — Build backend

  • server/: Express + ws + better-sqlite3.
  • Generic KV doc store (firebase mirror): docs table (path PK, parent, data JSON, updated_at). REST: GET/PUT/PATCH/DELETE /api/doc?path=, GET /api/collection?path=, POST /api/collection, POST /api/batch. WS: subscribe by path.
  • Server holds authoritative state. No turn logic server-side (logic stays client-side in shared/turn.js).
  • Exit criteria: backend boots, serves state over WS, persists to SQLite, unit tests green. DONE.
  • Upstream-PRable: no, divergence (friend stays Firebase).
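
Seen from the adapter side, the REST surface above reduces to a small lookup from storage op to request. The sketch below is illustrative only: `toRequest` and the `getCollection` op name are hypothetical, JSON request bodies are an assumption, and POST /api/collection is left out because the plan does not spell out its payload.

```javascript
// Hypothetical mapping from storage-interface ops to the generic KV endpoints.
function toRequest(op, path, data) {
  const query = `?path=${encodeURIComponent(path)}`;
  switch (op) {
    case "getDoc":
      return { method: "GET", url: `/api/doc${query}` };
    case "setDoc":
      return { method: "PUT", url: `/api/doc${query}`, body: JSON.stringify(data) };
    case "updateDoc":
      return { method: "PATCH", url: `/api/doc${query}`, body: JSON.stringify(data) };
    case "deleteDoc":
      return { method: "DELETE", url: `/api/doc${query}` };
    case "getCollection":
      return { method: "GET", url: `/api/collection${query}` };
    case "batch":
      // Batch carries its paths inside the body, so no path query here.
      return { method: "POST", url: "/api/batch", body: JSON.stringify(data) };
    default:
      throw new Error(`unsupported op: ${op}`);
  }
}
```

Because paths contain slashes, they travel as an encoded query parameter rather than as URL segments, which keeps the server routes fixed no matter how deep a path goes.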

Milestone 2 — Frontend WS adapter

  • Define storage/contract.js interface spec.
  • Move all Firestore call sites from App.js into storage/firebase.js behind interface (verbatim).
  • Implement storage/ws.js per interface, talking to backend. Generic KV ops, subscribes to WS.
  • Implement storage/memory.js for frontend unit tests.
  • storage/index.js factory: STORAGE env → pick impl. Default firebase (upstream unchanged).
  • App runs against backend with STORAGE=ws.
  • Cross-device verified manually: DM view + player display + tablet.
  • Exit criteria: app runs fully against local backend, no Firebase. Multi-device sync works. DONE.
  • Upstream-PRable: ⚠️ partial. Storage interface + firebase extract = yes. WS impl = no.
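
The factory step can be sketched as a small selector with firebase as the default. This is one possible shape, not the repo's storage/index.js: `pickStorageName` and `createStorage` are hypothetical names, and it assumes the flag reaches the browser as REACT_APP_STORAGE (CRA only exposes variables with that prefix), with plain STORAGE as a fallback for Node-side tests.

```javascript
const KNOWN_IMPLS = ["firebase", "ws", "memory"];

// Resolve the impl name; unset means firebase, so upstream behavior is unchanged.
function pickStorageName(env = process.env) {
  const name = env.REACT_APP_STORAGE || env.STORAGE || "firebase";
  if (!KNOWN_IMPLS.includes(name)) {
    throw new Error(`unknown STORAGE value: ${name}`);
  }
  return name;
}

// impls maps each name to a zero-argument constructor for that adapter.
function createStorage(impls, env = process.env) {
  return impls[pickStorageName(env)]();
}
```

Failing loudly on an unknown value avoids silently falling back to Firebase when the flag is mistyped.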

Milestone 3 — Characterization tests lock current behavior

  • Lock current behavior via tests.
  • Cover: START, NEXT_TURN, PAUSE, RESUME, ADD_PARTICIPANT, REMOVE_PARTICIPANT, TOGGLE_ACTIVE, REORDER, APPLY_DAMAGE/HEAL, DEATH_SAVE, END.
  • Two layers: Layer 1 (App + firebase mock, proves call shape), Layer 2 (ws adapter vs live backend, proves translation).
  • Iterate until confident: baseline solid, regressions cannot slip through silently.
  • Exit criteria: characterization suite green. Baseline locked. DONE.
  • Upstream-PRable: yes, if kept storage-agnostic (tests target turn logic shape).

Milestone 4 — Resolve initiative rotation corruption (BUG-5)

  • Real bug. Mid-round add/revive corrupts rotation.
  • 13 dupes / 100 rounds (deterministic seeded test).
  • Root cause: computeTurnOrderAfterAddition appends id to turnOrderIds end. Round wrap re-sorts by initiative. currentTurnParticipantId pointer stale → nextTurn revisits.
  • RED test locked: shared/tests/turn.combat.test.js.
  • Detail in TODO.md BUG-5.
  • Exit criteria: RED test goes green. Rotation invariant holds across add/remove/revive.
  • Upstream-PRable: yes, bug fix.
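
One way to state the rotation invariant is "nobody acts twice in the same round", checked over a log of turns produced by a seeded run. The sketch below shows only the checker and a seeded generator; it is not the contents of shared/tests/turn.combat.test.js. `findDuplicateTurns` and the `{ round, id }` log shape are hypothetical, and mulberry32 is a commonly used small PRNG swapped in here for illustration.

```javascript
// mulberry32: small seeded PRNG, so a failing sequence of adds/revives replays exactly.
function mulberry32(seed) {
  let a = seed >>> 0;
  return function next() {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296; // float in [0, 1)
  };
}

// turnLog: [{ round, id }] in the order turns were taken.
// Returns one entry per participant that acted more than once in a round.
function findDuplicateTurns(turnLog) {
  const counts = new Map(); // "round:id" -> times acted
  for (const { round, id } of turnLog) {
    const key = `${round}:${id}`;
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return [...counts]
    .filter(([, count]) => count > 1)
    .map(([key, count]) => ({ key, count }));
}
```

A test driver would use the generator to decide when to add or revive participants mid-round, record each turn, and assert that `findDuplicateTurns` returns an empty list.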

Milestone 5 — Docker compose

  • docker-compose.yml:
    • backend service (Node + sqlite volume)
    • nginx service (static frontend + reverse proxy + http basic auth)
  • Profiles: firebase (frontend only, current behavior) vs backend (full stack).
  • Exit criteria: docker compose up runs full stack in-house.
  • Upstream-PRable: no, divergence.

Milestone 6 — Undo rework

  • Events table: every mutating action writes (type, payload, undo_payload, undone, ts).
  • Undo = apply undo_payload in same SQLite tx, flip undone. Transactional, no stale clobber.
  • Replaces current fragile /logs snapshot-write undo.
  • Migration: keep old undo working for existing entries until cleared; new format for new entries.
  • Exit criteria: undo works transactionally; interleaved undos don't corrupt.
  • Upstream-PRable: ⚠️ partial. Turn-logic-level undo = yes. Backend events table = no.
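
The event-row idea above can be modeled in memory to show why per-event undo payloads avoid the stale-clobber problem of whole-snapshot undo. This is a sketch under assumptions: `createEventLog` is a hypothetical name, state changes are modeled as shallow patches, and the undo payload is taken to be the prior values of only the keys an action touched. In the plan, both steps of `undo` would run inside a single SQLite transaction.

```javascript
function createEventLog(state) {
  const events = []; // rows: { id, type, payload, undo_payload, undone, ts }

  // Shallow patch stand-in for "apply this payload to the state".
  const apply = (patch) => Object.assign(state, patch);

  return {
    events,
    record(type, payload, ts = Date.now()) {
      // Capture prior values of exactly the keys being changed.
      // Keys absent before the change are restored as undefined in this sketch.
      const undo_payload = {};
      for (const key of Object.keys(payload)) undo_payload[key] = state[key];
      apply(payload);
      const event = { id: events.length + 1, type, payload, undo_payload, undone: false, ts };
      events.push(event);
      return event.id;
    },
    undo(id) {
      const event = events.find((e) => e.id === id);
      if (!event || event.undone) return false; // unknown or already undone
      apply(event.undo_payload); // step 1: restore the touched keys
      event.undone = true; // step 2: flip the flag
      return true;
    },
  };
}
```

Undoing an older event leaves later, unrelated changes intact because only the keys that event touched are restored.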

Milestone 7 — Playwright E2E (deferred)

  • Multi-window E2E: DM view + display + player view in separate browser contexts against running backend.
  • Verify realtime sync end-to-end.
  • Only build if sync regresses or we deviate significantly. Turn-logic unit + backend integration tests cover most regression risk cheaper.
  • Exit criteria: e2e green for core combat flow across 3 windows.
  • Upstream-PRable: yes, if test infra shared.

Milestone 8 — (Future) Public exposure

  • Real auth (AUTH_MODE=token).
  • Rate limiting, CSRF, hardening.
  • Postgres migration if load warrants.
  • Only if we decide to expose publicly + multiuser.

Testing strategy

Layers

  1. Turn logic unit tests (Jest, pure functions, shared/tests/). Characterization + desired. Cheap, essential.
  2. Backend integration tests (Jest, server/tests/) — spin server on random port, assert WS pushes + SQLite persists + transactional correctness.
  3. Frontend adapter contract tests (Jest, src/tests/) — impl parity against interface (memory). Firebase mock harness for Layer 1 App tests.
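
Impl parity in layer 3 can be pictured as one list of checks run against a fresh instance of each impl. The sketch below borrows the `runStorageContract` name from the repo layout, but its body, the check list, and the synchronous call style are assumptions made for illustration. The real suite runs under Jest and would likely await each call.

```javascript
// Each check receives a fresh storage instance and returns true when the behavior holds.
const CONTRACT_CHECKS = {
  "getDoc returns null for a missing path": (s) => s.getDoc("x/1") === null,
  "setDoc then getDoc round-trips": (s) => {
    s.setDoc("x/1", { a: 1 });
    return JSON.stringify(s.getDoc("x/1")) === JSON.stringify({ a: 1 });
  },
  "updateDoc merges into the existing doc": (s) => {
    s.setDoc("x/1", { a: 1 });
    s.updateDoc("x/1", { b: 2 });
    const doc = s.getDoc("x/1");
    return doc.a === 1 && doc.b === 2;
  },
  "subscribeDoc sees later writes": (s) => {
    const seen = [];
    s.subscribeDoc("x/1", (doc) => seen.push(doc));
    s.setDoc("x/1", { a: 1 });
    return seen.some((doc) => doc !== null && doc !== undefined && doc.a === 1);
  },
};

// Returns the names of failed checks; an empty array means the impl conforms.
function runStorageContract(makeStorage) {
  const failures = [];
  for (const [name, check] of Object.entries(CONTRACT_CHECKS)) {
    let ok = false;
    try {
      ok = check(makeStorage()) === true;
    } catch (err) {
      ok = false; // a thrown error counts as a failed check
    }
    if (!ok) failures.push(name);
  }
  return failures;
}
```

Running the same list against memory.js and ws.js is what makes the two interchangeable in tests, while firebase.js stays outside automated coverage.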

Characterization → desired

  1. Characterization — capture current behavior exactly (bugs included). Locks extraction/port as provably identical. Lets later fix be provable.
  2. Desired-behavior (red) — write what should happen. Fail today. Fix → green. Bug stays dead. (Bug fixes live in TODO.md, tracked separately.)

Manual smoke via config flags

  • STORAGE=firebase → current behavior (friend's path, upstream default).
  • STORAGE=ws → our path, local backend.
  • docker-compose profiles mirror the above.

Accepted test gap

  • Firebase adapter untested (requires live project). Accepted cost.
  • Mitigated by the interface contract. If the firebase impl drifts, only a manual integration smoke test will catch it.

Mergeability upstream

| Milestone | Upstream-PRable? | Why |
|---|---|---|
| 0 repo setup | n/a | fork infra |
| 1 backend | no | divergence (friend stays Firebase) |
| 2 WS adapter | ⚠️ partial | interface + firebase extract yes, WS no |
| 3 characterization tests | yes | if storage-agnostic |
| 4 BUG-5 rotation fix | yes | bug fix |
| 5 docker compose | no | divergence |
| 6 undo rework | ⚠️ partial | turn-logic-level yes, events table no |
| 7 playwright | yes | if test infra shared |

Default STORAGE=firebase + AUTH_MODE=none (unset) = upstream sees literally zero change.


Risks

  • CRA + workspaces friction. Create React App may resist monorepo layout. Mitigation: keep src/ as CRA root, server/ + shared/ as separate workspaces imported via alias. Eject/craco only if forced.
  • Firebase drift untested. Mitigation: interface contract; friend's path is his to maintain.
  • Undo history migration. Existing log entries use old snapshot format. Mitigation: keep old undo working until cleared, new format for new entries.
  • WS reconnect/state-sync edge cases. Transient drop mid-combat. Mitigation: client requests full state resync on (re)connect; server is source of truth.
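
The reconnect mitigation can be sketched as a client wrapper that asks for full state every time a socket opens. Everything named here is hypothetical: `createResyncingClient`, the `{ type: "resync" }` request, and the `{ type: "state", state }` reply are illustrative message shapes, not the backend's protocol. `connect` is assumed to return an object with the browser WebSocket shape (onopen, onmessage, onclose, send, close).

```javascript
function createResyncingClient(connect, onState, retryMs = 1000, schedule = setTimeout) {
  let socket = null;
  let closedByUser = false;

  function open() {
    socket = connect();
    socket.onopen = () => {
      // Ask for everything on every (re)connect; the server is the source of truth.
      socket.send(JSON.stringify({ type: "resync" }));
    };
    socket.onmessage = (event) => {
      const msg = JSON.parse(event.data);
      if (msg.type === "state") onState(msg.state);
    };
    socket.onclose = () => {
      // Transient drop: try again later unless the caller closed us on purpose.
      if (!closedByUser) schedule(open, retryMs);
    };
  }

  open();
  return {
    close() {
      closedByUser = true;
      if (socket) socket.close();
    },
  };
}
```

Requesting a full snapshot instead of replaying missed deltas keeps the client simple: whatever happened during the drop, the next message replaces local state wholesale.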

Decisions (locked)

  1. Branch: rework-backend off main.
  2. npm workspaces for server/ + shared/ alongside CRA src/. Fallback alias if CRA fights.
  3. Backend = generic KV doc store (firebase mirror), not shape-specific endpoints. Thin adapter passthrough. Opaque JSON at arbitrary path strings.

Current status

  • M0, M1, M2, M3 done.
  • Backend live: port 4001, db ./data/tracker.sqlite
  • Frontend: port 3999 with REACT_APP_STORAGE=ws
  • Test suite: ~160 tests (shared + server + FE). Bugs tracked in TODO.md.
  • Next milestones: M5 docker-compose, M6 undo rework.