Commit Graph

2 Commits

Author SHA1 Message Date
Diseri Pearson
3333202f3a Dual-token rotation grace window (24h default)
Token rotation used to be immediate cutover — push gap from when
ops rotates to when the customer's .env is updated and portal
restarted. Now the old token keeps working for 24h after rotation,
so customer ops has a full workday to swap it in without dropping a
single push tick.

Backend
- Customer entity gains PreviousTokenHash + PreviousTokenExpiresAt
  (both nullable). Non-unique index on PreviousTokenHash so the
  OR-lookup in FindByTokenAsync stays cheap.
- CustomerService.RotateTokenAsync(id, graceWindow=null, ct):
  copies the existing TokenHash into PreviousTokenHash with
  PreviousTokenExpiresAt = now + graceWindow (default 24h, lifted
  to CustomerService.DefaultTokenGracePeriod), then issues a new
  current token. Second rotation overwrites the previous slot —
  at most one previous token is ever honoured.
- CustomerService.FindByTokenAsync matches either current OR
  (previous AND PreviousTokenExpiresAt > now). IsActive=false
  still rejects both.
- DTO exposes PreviousTokenExpiresAt so the UI can render the
  grace window status.
- New EF migration AddPreviousTokenGraceWindow on AdminDbContext.

Frontend
- Customers table "Token" column shows an "Old token valid until …"
  orange tag with a tooltip whenever the grace window is active,
  plus the issue/rotation date as before.
- TokenShownOnceModal mentions the 24h grace window so ops knows
  they have time to update .env without urgency.
- Rotate-token popconfirm copy updated to reflect the new behavior.

Tests (+5, 61/61 passing)
- CustomerTokenGraceTests covers: create doesn't set previous;
  rotate moves current into previous slot with future expiry;
  zero grace window rejects original immediately; second rotation
  overwrites previous (original dies, first-rotation becomes the
  new previous); inactive customer rejects both current AND previous.

Verified end-to-end on the dev host
- Migration applied cleanly on the existing admin_fleet DB (existing
  DEV0001 customer got NULL previous columns, no data loss).
- Created GRACE01 → got token1.
- Rotated → got token2. PreviousTokenExpiresAt = +24h. Both token1
  and token2 push successfully (200).
- Rotated again → got token3. token1 push now returns 401 (gone).
  token2 push still 200 (now the previous). token3 push 200 (current).

Docs
- FLEET-DESIGN.md §6 rewritten — no longer "immediate cutover".
- §11 "open seams" row for this feature marked as shipped.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 10:45:31 +02:00
Diseri Pearson
880525b306 Add Fleet ingest design doc (portal/docs/FLEET-DESIGN.md)
Locked design for Admin / cross-customer aggregation feature.
Implementation lands in phases 13-15.

Key decisions captured:
- Same portal binary, RunMode=Client|Admin config flag.
- Two DbContext classes (ClientDbContext + AdminDbContext) to keep
  schemas cleanly separated and migrations sane.
- Fleet ingest is opt-in (FleetIngest__Enabled=false works exactly
  as today, no data leaves customer stack).
- Push by ReceivedAt, not Time, so firmware offline-buffer replays
  are picked up automatically.
- Per-tick batch cap so a back-fill wave from one customer doesn't
  starve other customers' pushes.
- SHA-256 token hash (not bcrypt) for the high-throughput ingest
  endpoint; tokens shown once on Admin Customers page.
- Realtime continuous aggregates with wide start_offset so late
  back-fills materialize on the next refresh tick.
- No retention policy. TimescaleDB compression on chunks older than
  7 days handles long-term storage cost.
- Open seams (tariff sync, RLS, GDPR delete, dual-token rotation,
  sharding) documented with v2 extension paths.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 09:55:07 +02:00