Tau Acuvim Portal

Customer-facing, white-labeled power monitoring portal. One stack per customer, deployed behind Traefik with the customer ID (lowercased — Docker Compose v2 requirement) as the container prefix: customer ABC0001 produces abc0001_portal, abc0001_grafana, abc0001_timescale.

This project lives next to console/ (internal management interface) and firmware/ (ESP32) in the same repo. The three projects share no code; the portal stands alone.


Contents

  1. Overview
  2. Architecture
  3. Configuration template
  4. Local setup
  5. Docker Compose
  6. Database migrations
  7. Accessing the app
  8. Accessing Grafana
  9. Default local test credentials
  10. Production deployment notes
  11. Security notes
  12. Testing
  13. Operations

Overview

A customer signs in to their own branded portal, sees their meters' live + historical power readings via embedded Grafana dashboards, and (for admins) configures branding, municipality tariffs, users, sites, and devices. Each customer gets a fully isolated stack: their own database, their own Grafana, their own branding. Traefik routes <customer>.portal.example.com to the right containers.

Tech stack

| Layer | Technology |
|---|---|
| Run modes | RunMode=Client (per-customer, default) or RunMode=Admin (fleet aggregation); same binary, config-selected. See docs/FLEET-DESIGN.md. |
| Backend | .NET 10 minimal API, EF Core 10, Npgsql, ASP.NET Core Identity, Serilog |
| Frontend | React 18 + TypeScript + Vite, Ant Design 5, TanStack Query, react-router |
| Database | TimescaleDB 2.17 on PostgreSQL 16 |
| Graphing | Grafana 11 (provisioned datasource + dashboards) |
| Container | Docker / Docker Compose; Traefik for routing |
| Auth | Cookie-based via ASP.NET Identity (SPA-friendly, 401/403 not redirects) |

Architecture

Containers, per customer

                ┌────────────────────────────────────────────────────┐
                │                    Traefik                         │
                │   host: <customer>.portal.example.com              │
                └────────────────────────────────────────────────────┘
        ┌─────────────────────────┬─────────────────────────┐
        ▼                         ▼                         ▼
┌───────────────────┐   ┌───────────────────┐   ┌───────────────────┐
│ <PREFIX>_portal   │   │ <PREFIX>_grafana  │   │<PREFIX>_timescale │
│ .NET API + SPA    │──▶│ Grafana 11        │──▶│ TimescaleDB + Pg16│
│ :8080             │   │ :3000             │   │ :5432             │
└───────────────────┘   └───────────────────┘   └───────────────────┘

<PREFIX> = the customer's 7-character ID lowercased (e.g. abc0001 for customer ABC0001), set via COMPOSE_PROJECT_NAME. Compose v2 rejects uppercase project names.
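
The naming rule can be sketched as a small helper. This is illustrative only: the function name `stackNames`, the alphanumeric check, and the default base domain are assumptions, not part of the portal code.

```typescript
// Hypothetical helper: derives Compose and Traefik names from a customer ID.
interface StackNames {
  projectName: string;  // value for COMPOSE_PROJECT_NAME
  containers: string[]; // container names produced by that prefix
  host: string;         // value for CUSTOMER_HOST
}

function stackNames(customerId: string, baseDomain = "portal.example.com"): StackNames {
  // Compose v2 rejects uppercase project names, so lowercase the ID first.
  const projectName = customerId.trim().toLowerCase();
  // Assumption: customer IDs are purely alphanumeric (e.g. ABC0001).
  if (!/^[a-z0-9]+$/.test(projectName)) {
    throw new Error(`Unexpected customer ID: ${customerId}`);
  }
  return {
    projectName,
    containers: ["portal", "grafana", "timescale"].map((s) => `${projectName}_${s}`),
    host: `${projectName}.${baseDomain}`,
  };
}

const names = stackNames("ABC0001");
console.log(names.projectName, names.containers.join(", "), names.host);
```

Deriving every name from the one lowercased ID keeps the Compose project, the container prefix, and the Traefik host in step.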

Backend layout

  • Combined container: a multi-stage Dockerfile builds the React SPA, builds the .NET app, then copies the SPA into wwwroot. One image, one process per customer.
  • Single AppDbContext in Client mode (Admin mode uses a separate AdminDbContext) with three schemas:
    • identity — ASP.NET Identity tables.
    • app — branding, municipalities, tariffs, periods.
    • monitoring — sites, devices, power measurements (hypertable).
  • Minimal API with endpoints grouped under Endpoints/*.cs. Services in Services/*.cs. Typed options in Configuration/.

Frontend layout

  • src/pages/ — page-level components (DashboardsPage, SettingsPage, etc.).
  • src/components/ — shared + feature components.
  • src/api/ — typed axios calls.
  • src/hooks/useAuth, useBranding — global state via Context.
  • RequireAuth / RequireRole — route guards.
  • AntD ConfigProvider is themed dynamically from BrandingProvider (white-labelling).

Configuration template

All configurable values are declared in src/Tau.Acuvim.Portal/appsettings.template.json — checked in, no secrets. It's the lowest-priority configuration source: everything overrides it.

Precedence (lowest → highest)

  1. appsettings.template.json — shippable defaults (always loaded).
  2. appsettings.json — runtime infra (Serilog, AllowedHosts).
  3. appsettings.{Environment}.json — per-environment overrides.
  4. appsettings.Local.json — gitignored, your local overrides.
  5. Environment variables — use __ as section separator (e.g. Authentication__DefaultAdminPassword=...). Production secrets go here.
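
The layering can be modelled as a deep merge in which later sources win key by key. The sketch below is a simplified TypeScript model of that behaviour, not the .NET configuration system itself: `mergeLayer` and `envToTree` are hypothetical names, keys are treated as case-sensitive here, and the sample values are made up.

```typescript
type ConfigValue = string | number | boolean | ConfigTree;
interface ConfigTree {
  [key: string]: ConfigValue;
}

function isTree(v: ConfigValue | undefined): v is ConfigTree {
  return typeof v === "object";
}

// Deep-merge `override` onto `base`; the later layer wins key by key.
function mergeLayer(base: ConfigTree, override: ConfigTree): ConfigTree {
  const out: ConfigTree = { ...base };
  for (const [key, value] of Object.entries(override)) {
    const existing = out[key];
    out[key] = isTree(existing) && isTree(value) ? mergeLayer(existing, value) : value;
  }
  return out;
}

// Turn Authentication__DefaultAdminPassword=... into { Authentication: { DefaultAdminPassword } }.
function envToTree(env: Record<string, string>): ConfigTree {
  const root: ConfigTree = {};
  for (const [name, value] of Object.entries(env)) {
    const segments = name.split("__");
    const leaf = segments.pop() ?? name;
    let node = root;
    for (const segment of segments) {
      const existing = node[segment];
      if (isTree(existing)) {
        node = existing;
      } else {
        const created: ConfigTree = {};
        node[segment] = created;
        node = created;
      }
    }
    node[leaf] = value;
  }
  return root;
}

const layers: ConfigTree[] = [
  // 1. appsettings.template.json (shippable defaults)
  { Authentication: { DefaultAdminPassword: "ChangeMe123!" }, Database: { MigrateOnStartup: true } },
  // 4. appsettings.Local.json (hypothetical local override)
  { Database: { MigrateOnStartup: false } },
  // 5. environment variables (highest priority)
  envToTree({ Authentication__DefaultAdminPassword: "from-env" }),
];
const effective = layers.reduce((acc, layer) => mergeLayer(acc, layer), {} as ConfigTree);
console.log(JSON.stringify(effective));
```

Sibling keys from lower layers survive unless a higher layer overrides them, which is why the template can stay the single place where every option is declared.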

Sections

| Section | Purpose |
|---|---|
| Application | Name, environment, public URL |
| Database | Provider, connection string, MigrateOnStartup, AutoProvisionLocalTimescaleDb |
| TimescaleDb | Host/port/db/user/password for auto-provision in dev |
| Grafana | Base URL, internal URL, path prefix, embed mode, dashboard list |
| WhiteLabel | App name, logo URL, colours, footer, logo storage path |
| Authentication | Cookie name, lockout, default admin email/password |
| Monitoring | Hypertable chunk interval, aggregate flag |

Database connection resolution

  1. If Database:ConnectionString is non-empty → use it.
  2. Else if Database:AutoProvisionLocalTimescaleDb=true AND env is not Production → build from the TimescaleDb:* block (Host/Port/Database/Username/Password).
  3. Otherwise → app refuses to start with a clear error.

AutoProvisionLocalTimescaleDb=true in Production is a hard failure — production must supply its own connection string via env var or secret.
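
The resolution rules above can be expressed as one function. This is a TypeScript sketch, not the portal's C# implementation: `resolveConnectionString` is a hypothetical name, the sample values are made up, and checking the Production guard before rule 1 is an assumption about ordering.

```typescript
interface DatabaseOptions {
  ConnectionString: string;
  AutoProvisionLocalTimescaleDb: boolean;
}

interface TimescaleDbOptions {
  Host: string;
  Port: number;
  Database: string;
  Username: string;
  Password: string;
}

function resolveConnectionString(
  db: DatabaseOptions,
  ts: TimescaleDbOptions,
  environment: string,
): string {
  const isProduction = environment === "Production";
  // Hard failure: auto-provisioning is never allowed in Production.
  // (Assumption: this check runs even when a connection string is set.)
  if (isProduction && db.AutoProvisionLocalTimescaleDb) {
    throw new Error("AutoProvisionLocalTimescaleDb=true is not allowed in Production");
  }
  // Rule 1: an explicit connection string always wins.
  if (db.ConnectionString.trim() !== "") {
    return db.ConnectionString;
  }
  // Rule 2: outside Production, build one from the TimescaleDb:* block.
  if (db.AutoProvisionLocalTimescaleDb) {
    return `Host=${ts.Host};Port=${ts.Port};Database=${ts.Database};Username=${ts.Username};Password=${ts.Password}`;
  }
  // Rule 3: nothing resolvable, refuse to start.
  throw new Error("No database connection string could be resolved");
}

// Hypothetical dev values.
const devTimescale: TimescaleDbOptions = {
  Host: "localhost", Port: 5433, Database: "portal_dev", Username: "dev_user", Password: "dev_pass",
};
console.log(
  resolveConnectionString({ ConnectionString: "", AutoProvisionLocalTimescaleDb: true }, devTimescale, "Development"),
);
```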


Local setup

Prerequisites

  • .NET 10 SDK
  • Node 22+ / npm
  • Docker Desktop
  • dotnet-ef tool: dotnet tool install --global dotnet-ef

First-time: generate the initial migrations

Two DbContext classes — one per RunMode — each with its own migration folder.

cd C:\AcuvimDev\Tau.Acuvim\portal

# Client (RunMode=Client, default) — identity + branding + monitoring + rates
dotnet ef migrations add InitialCreate `
  --context AppDbContext `
  --project src/Tau.Acuvim.Portal/Tau.Acuvim.Portal.csproj `
  --output-dir Migrations

# Admin (RunMode=Admin) — identity + branding + fleet
$env:Application__RunMode='Admin'
$env:Database__ConnectionString='Host=localhost;Database=stub;Username=u;Password=p'  # parsed only
dotnet ef migrations add InitialFleet `
  --context AdminDbContext `
  --project src/Tau.Acuvim.Portal/Tau.Acuvim.Portal.csproj `
  --output-dir Migrations/Admin
Remove-Item Env:Application__RunMode
Remove-Item Env:Database__ConnectionString

Commit both Migrations/ and Migrations/Admin/. MigrateAsync on startup applies whatever exists for the active context.

Option A: everything in Docker (combined container)

cd C:\AcuvimDev\Tau.Acuvim\portal
Copy-Item .env.example .env
docker compose up --build -d
docker compose ps

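Then open the portal at http://localhost:8080 and Grafana at http://localhost:3001 (see Accessing the app and Accessing Grafana).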

Stop + wipe:

docker compose down -v

Option B: backend + DB in Docker, frontend via Vite

Same docker compose up, then in another terminal:

cd C:\AcuvimDev\Tau.Acuvim\portal\frontend
npm install
npm run dev

Vite serves at http://localhost:5174 and proxies /api + /health to the .NET container on :8080.

Option C: everything local (no Docker for the app)

Postgres still needs Docker (or a local install):

docker compose up -d timescaledb
cd C:\AcuvimDev\Tau.Acuvim\portal\src\Tau.Acuvim.Portal
dotnet run     # listens on :8080
# in another terminal
cd C:\AcuvimDev\Tau.Acuvim\portal\frontend
npm run dev    # :5174

Docker Compose

Dev — docker-compose.yml

Three services: portal, timescaledb, grafana. Persistent named volumes (timescale-data, grafana-data, portal-keys, portal-branding). Healthcheck on Postgres; portal waits for healthy. Grafana ships with anonymous Viewer for easy local access; provisioned TimescaleDB datasource (uid: timescaledb) and any JSON dashboards under grafana/dashboards/.

Host port mappings: 8080→portal, 5433→timescaledb, 3001→grafana. Chosen to coexist with the console stack.

Prod — docker-compose.prod.yml

Same services, no host port mappings. Joins external traefik-public network. Per-customer Traefik labels (subdomain routing for portal, same-origin path-prefix routing for Grafana at /grafana). Grafana sub-path + GF_SERVER_ROOT_URL configured. All secrets via env vars.

Run:

docker network create traefik-public        # once on the host
docker compose -f docker-compose.prod.yml --env-file .env up -d

See OPERATIONS.md for the full per-customer deployment loop.


Database migrations

MigrateAsync runs on startup (controlled by Database:MigrateOnStartup, default true). Immediately after, TimescaleBootstrapper runs an idempotent block:

  1. CREATE EXTENSION IF NOT EXISTS timescaledb (defensive).
  2. SELECT create_hypertable('monitoring."PowerMeasurements"', 'Time', if_not_exists => TRUE, migrate_data => TRUE).
  3. SELECT set_chunk_time_interval('monitoring."PowerMeasurements"', INTERVAL '<MonitoringOptions.ChunkTimeInterval>').

Safe to re-run on every start.
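
To make the idempotent block concrete, the sketch below assembles the three statements from a chunk interval. It is an illustration in TypeScript rather than the actual TimescaleBootstrapper: the function name `bootstrapStatements` and the assumption that ChunkTimeInterval is a simple PostgreSQL interval literal such as "7 days" are not confirmed by the source.

```typescript
// Assumption: ChunkTimeInterval looks like "<number> <unit>", e.g. "7 days".
function bootstrapStatements(chunkTimeInterval: string): string[] {
  // Reject anything else so the value can be inlined into SQL safely.
  if (!/^\d+ (minute|hour|day|week)s?$/.test(chunkTimeInterval)) {
    throw new Error(`Unsupported chunk interval: ${chunkTimeInterval}`);
  }
  const table = `'monitoring."PowerMeasurements"'`;
  return [
    // No-op when the extension already exists.
    "CREATE EXTENSION IF NOT EXISTS timescaledb;",
    // if_not_exists makes this a no-op when the table is already a hypertable.
    `SELECT create_hypertable(${table}, 'Time', if_not_exists => TRUE, migrate_data => TRUE);`,
    // Setting the same interval again changes nothing.
    `SELECT set_chunk_time_interval(${table}, INTERVAL '${chunkTimeInterval}');`,
  ];
}

for (const statement of bootstrapStatements("7 days")) {
  console.log(statement);
}
```

Each statement is harmless when its effect is already in place, which is what makes the whole block safe to run on every start.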

Adding a new migration

When you change the entity model:

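cd C:\AcuvimDev\Tau.Acuvim\portal
# Two DbContext classes exist, so --context is required.
# For Admin-side changes use --context AdminDbContext and --output-dir Migrations/Admin
# (with the Admin env vars from the first-time setup above).
dotnet ef migrations add <DescriptiveName> `
  --context AppDbContext `
  --project src/Tau.Acuvim.Portal/Tau.Acuvim.Portal.csproj `
  --output-dir Migrations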

Commit. Next deploy applies it automatically.


Accessing the app

| Environment | URL |
|---|---|
| Local (Docker combined) | http://localhost:8080 |
| Local (Vite dev) | http://localhost:5174 |
| Production | https://<customer-host> (e.g. https://abc0001.portal.example.com) |

API base path: /api. Swagger UI in dev: /swagger.

Health endpoints

  • GET /health — liveness (process alive). Returns Healthy if the app responds.
  • GET /health/ready — readiness. Returns Healthy only if TimescaleDB answers.

Nav surface

| Page | Who sees it |
|---|---|
| Dashboard | Any authenticated user |
| Dashboards (embedded Grafana) | Any authenticated user |
| Sites | Admin only |
| Settings (Branding / Rates / Users / Grafana / App config) | Admin only |

Accessing Grafana

Local dev

  • Direct: http://localhost:3001. The anonymous Viewer can browse provisioned dashboards; sign in as Admin with the GRAFANA_ADMIN_PASSWORD value to edit.
  • Embedded: portal Dashboards page — iframe src points at the local Grafana base URL.

Production

  • Direct browser access to <customer-host>/grafana/* is gated by your chosen auth mode (see Security notes). Anonymous is off in the prod compose.
  • Embedded: portal Dashboards page — same-origin iframe via Traefik path prefix /grafana.

Provisioning

  • Datasource: grafana/provisioning/datasources/timescaledb.yml (uid: timescaledb).
  • Dashboard provider: grafana/provisioning/dashboards/dashboards.yml (auto-discovers JSON in grafana/dashboards/, refresh 30s).
  • Starter dashboard: grafana/dashboards/power-overview.json — active power + cumulative energy + latest-power stat, parameterised by a device template variable.

To add a dashboard:

  1. Drop the JSON into grafana/dashboards/.
  2. Add an entry to Grafana.Dashboards in appsettings.template.json (or override in appsettings.Local.json) with the same Uid. The portal's Dashboards page picks it up after a refresh.

Default local test credentials

Generated locally — change before publishing the stack to anyone.

  • Email: admin@example.com
  • Password: ChangeMe123!

Defined in appsettings.template.json → Authentication. The bootstrapper seeds this account only if no account with that email exists, and never overwrites a changed password.

Production guard: if ASPNETCORE_ENVIRONMENT=Production and the default password is still ChangeMe123!, the app refuses to start with an explicit error. Override Authentication__DefaultAdminPassword via env var before deploying.


Production deployment notes

For the per-customer deployment loop see OPERATIONS.md. The short version:

  1. One Compose project per customer. Set COMPOSE_PROJECT_NAME=abc0001 (lowercase form of the customer ID — Compose v2 rejects uppercase). Containers are named abc0001_portal, abc0001_grafana, abc0001_timescale.
  2. One subdomain per customer. Set CUSTOMER_HOST=abc0001.portal.example.com. Wildcard DNS + wildcard TLS cert via Traefik's resolver (certresolver=le).
  3. Decide your Grafana auth mode (see Security notes). The prod compose deliberately disables anonymous access and leaves the auth mode unset, so the embedded iframe shows a Grafana login page until you pick one.
  4. Set all secrets via env vars (not files):
    • POSTGRES_PASSWORD
    • GRAFANA_ADMIN_PASSWORD
    • Authentication__DefaultAdminPassword
  5. External traefik-public Docker network must exist (created once on the host running Traefik).
  6. Up the stack:
    docker compose -f docker-compose.prod.yml --env-file .env up -d
    
  7. Verify the three containers report healthy and https://<customer-host>/health/ready returns Healthy.

Security notes

What's protected by default

  • ASP.NET Core Identity with lockout (5 failed attempts → 15 min) and strong password requirements (8+ chars, upper + lower + digit).
  • Cookies are HttpOnly + SameSite=Lax + Secure in prod, scoped to the portal subdomain.
  • Admin-only endpoints are gated by an AdminOnly policy (RequireRole("Admin")). Enforced on the backend; the frontend only hides the nav entries.
  • Cannot delete your own account — backend block, not just UI.
  • GET /api/admin/config-overview is admin-only; the DTO never includes the connection string or any password. Redaction by construction, not filtering.
  • Branding logo upload rejects files >2 MB and extensions outside {png, jpg, jpeg, svg, webp}.
  • Anti-forgery stays enabled by default on cookie-authenticated endpoints; the logo upload explicitly opts out (multipart needs it disabled). Other admin endpoints accept JSON, and the SameSite=Lax cookie is not sent on cross-site POST, PUT, or DELETE requests, so a third-party page cannot issue authenticated state-changing calls.
  • Security headers (X-Content-Type-Options, X-Frame-Options: SAMEORIGIN, Referrer-Policy: strict-origin-when-cross-origin) on every response. HSTS in prod.
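
The logo upload rule lends itself to a client-side pre-check in the SPA, so users get feedback before the request is sent. The sketch below is hypothetical (`validateLogoUpload` is not an existing function). It reads "2 MB" as 2 MiB and compares extensions case-insensitively, both of which are assumptions. The backend check remains the authoritative one.

```typescript
const MAX_LOGO_BYTES = 2 * 1024 * 1024; // assumption: "2 MB" interpreted as 2 MiB
const ALLOWED_LOGO_EXTENSIONS = new Set(["png", "jpg", "jpeg", "svg", "webp"]);

// Returns null when the file passes, otherwise a human-readable reason.
function validateLogoUpload(fileName: string, sizeBytes: number): string | null {
  if (sizeBytes > MAX_LOGO_BYTES) {
    return "File is larger than 2 MB";
  }
  const dot = fileName.lastIndexOf(".");
  const extension = dot >= 0 ? fileName.slice(dot + 1).toLowerCase() : "";
  if (!ALLOWED_LOGO_EXTENSIONS.has(extension)) {
    return `Extension "${extension}" is not allowed`;
  }
  return null;
}

console.log(validateLogoUpload("brand.svg", 48_000));
console.log(validateLogoUpload("brand.gif", 48_000));
```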

Production refuse-to-start guards

  • App refuses to start in Production if Authentication:DefaultAdminPassword is still ChangeMe123!.
  • App refuses to start in Production if Database:AutoProvisionLocalTimescaleDb=true (you must supply an explicit connection string).
  • App refuses to start if no connection string can be resolved at all.
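
Taken together, the three guards amount to a pre-flight check. The TypeScript sketch below models them as a function that collects every violation. The names (`StartupInput`, `startupGuardErrors`) are hypothetical and the real checks live in the .NET startup path.

```typescript
interface StartupInput {
  environment: string;                    // e.g. value of ASPNETCORE_ENVIRONMENT
  defaultAdminPassword: string;           // Authentication:DefaultAdminPassword
  autoProvisionLocalTimescaleDb: boolean; // Database:AutoProvisionLocalTimescaleDb
  connectionString: string;               // Database:ConnectionString ("" when unset)
}

const TEMPLATE_DEFAULT_PASSWORD = "ChangeMe123!";

// Returns every guard violation; an empty array means the app may start.
function startupGuardErrors(input: StartupInput): string[] {
  const errors: string[] = [];
  const isProduction = input.environment === "Production";

  if (isProduction && input.defaultAdminPassword === TEMPLATE_DEFAULT_PASSWORD) {
    errors.push("Default admin password must be changed before running in Production.");
  }
  if (isProduction && input.autoProvisionLocalTimescaleDb) {
    errors.push("AutoProvisionLocalTimescaleDb=true is not allowed in Production.");
  }
  // Auto-provisioning only counts as a connection source outside Production.
  const canAutoProvision = input.autoProvisionLocalTimescaleDb && !isProduction;
  if (input.connectionString.trim() === "" && !canAutoProvision) {
    errors.push("No database connection string could be resolved.");
  }
  return errors;
}

console.log(
  startupGuardErrors({
    environment: "Production",
    defaultAdminPassword: "ChangeMe123!",
    autoProvisionLocalTimescaleDb: false,
    connectionString: "Host=db;Database=portal",
  }),
);
```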

Grafana embedding — three production auth options

The dev compose runs Grafana with anonymous Viewer (safe on localhost). The prod compose has anonymous off and leaves the auth mode unset on purpose — pick one before publishing:

| Option | What it does | Trade-off |
|---|---|---|
| (a) Traefik forwardAuth | Traefik middleware calls a portal /api/auth/check endpoint on every Grafana request; portal cookie required, else 401 | Zero changes to Grafana. Best when "any portal user = same dashboards." |
| (b) Grafana auth.proxy | GF_AUTH_PROXY_ENABLED=true; trust an X-WEBAUTH-USER header set by Traefik | Maps portal user → Grafana user, gets per-user folders/perms. Sanitise the header; never let a client set it directly. |
| (c) Service-account API key + render tokens | Portal mints short-lived render tokens; SPA embeds via ?auth_token=... | Most moving parts. Right when dashboards are stitched into custom UI per-panel rather than full Grafana. |

Until one is wired, prod-mode Grafana refuses anonymous access and the iframe shows a login page — the intended safe default.

Other considerations

  • Same-origin embed (prod path-prefix routing through Traefik) sidesteps third-party-cookie blockers that increasingly break cross-origin Grafana iframes.
  • Provisioned datasource is editable: false — admins cannot accidentally rewire Grafana from its UI.
  • Default password complexity is tunable in Program.cs → IdentityOptions. Lockout is tunable in the same block.
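  • TimescaleDB licensing: the stack pulls timescale/timescaledb:*-pg16. Hypertables are part of the Apache-2.0 core, but compression and continuous aggregates (both used by the Admin stack) are Timescale License (TSL) community features, and the TSL restricts offering TimescaleDB as a managed database service. Review the license terms before selling this as managed DBaaS.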

Testing

Backend unit tests under tests/Tau.Acuvim.Portal.Tests/ cover cost calculation, rate validation, connection-string resolution, day-of-week math:

cd C:\AcuvimDev\Tau.Acuvim\portal\tests\Tau.Acuvim.Portal.Tests
dotnet test

See TESTING.md for the full manual integration scenario, frontend test scaffolding recipe, and edge-case checklist.


Operations

For per-customer provisioning, secret rotation, backups, and health monitoring see OPERATIONS.md.

Admin / Fleet mode

A second deployment of the same image — RunMode=Admin, separate DB — aggregates data from all customer stacks for a fleet-wide operator view. See docs/FLEET-DESIGN.md for the full design.

Phase 15 (this release): the operator surface is live. Sign in to the Admin stack and:

  • Dashboard shows fleet headline — customer / active counts, today's measurement count and kWh imported, per-customer summary table with lag indicators. Auto-refreshes every 30s.
  • Customers lists registered customers; click a row to drill into one.
  • Customer detail page shows mirrored sites, mirrored devices, the 50 most recent measurements, and the last 20 ingest events (with rejection counts, batch sizes, time-spreads — useful when a firmware replay arrives and you want to see the wave).
  • grafana/dashboards-admin/ ships Fleet Overview + Customer Drilldown dashboards reading from the realtime fleet.hourly_per_device continuous aggregate. Mount this folder instead of grafana/dashboards/ on the Admin's Grafana container — see OPERATIONS.md.

Phase 14: the full push pipeline. Customer stacks with FleetIngest__Enabled=true run a FleetPushService background loop that batches sites, devices, and measurements (cursor by ReceivedAt — firmware buffer-and-replay back-fills get picked up automatically) and POSTs them to FleetIngest__Url with X-Customer-Token. Admin's /api/fleet/ingest upserts and writes an IngestEvents audit row per batch. Admin's FleetTimescaleBootstrapper makes fleet.PowerMeasurements a hypertable with compression-after-7-days and a realtime fleet.hourly_per_device continuous aggregate.
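
Why the cursor follows ReceivedAt rather than the measurement time is easiest to see in a small model. The TypeScript sketch below is a simplified stand-in for the batching step, not the FleetPushService code. The row shape, `nextBatch`, and `ingestHeaders` are hypothetical, and timestamps are plain epoch milliseconds.

```typescript
interface MeasurementRow {
  deviceId: string;
  timeMs: number;       // when the meter took the reading
  receivedAtMs: number; // when the customer stack stored it
}

interface Batch {
  rows: MeasurementRow[];
  nextCursor: number;
}

// Select up to batchSize rows received after the cursor, oldest first.
// Note: rows sharing the exact receivedAtMs of the last selected row would be
// skipped by the strict ">" comparison; a real implementation needs a tie-breaker.
function nextBatch(rows: MeasurementRow[], cursor: number, batchSize: number): Batch {
  const pending = rows
    .filter((row) => row.receivedAtMs > cursor)
    .sort((a, b) => a.receivedAtMs - b.receivedAtMs);
  const selected = pending.slice(0, batchSize);
  const last = selected[selected.length - 1];
  return { rows: selected, nextCursor: last ? last.receivedAtMs : cursor };
}

// Headers for the POST to FleetIngest__Url.
function ingestHeaders(token: string): Record<string, string> {
  return { "Content-Type": "application/json", "X-Customer-Token": token };
}

const stored: MeasurementRow[] = [
  { deviceId: "meter-1", timeMs: 1_000, receivedAtMs: 5_000 },
  // Firmware replay: an old reading that arrived late.
  { deviceId: "meter-1", timeMs: 100, receivedAtMs: 6_000 },
  { deviceId: "meter-2", timeMs: 2_000, receivedAtMs: 4_000 },
];
const batch = nextBatch(stored, 4_500, 5000);
console.log(batch.rows.length, batch.nextCursor);
```

A cursor on the measurement time would already be past the replayed reading and skip it. Ordering by arrival time means back-filled rows are always ahead of the cursor.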

Spin up an Admin stack:

docker exec <client-timescale> createdb -U power_user admin_fleet    # one-time

docker run -d --name admin-portal --network <existing-network> `
  -e Application__RunMode=Admin `
  -e Database__ConnectionString='Host=<host>;Port=5432;Database=admin_fleet;Username=power_user;Password=<secret>' `
  -e Authentication__DefaultAdminPassword=<rotate-from-template-default> `
  -p 8090:8080 `
  portal-dev-portal

Then sign in at http://localhost:8090 → Customers → register the customer → token shown once.

Enable push on the customer stack: add to that customer's .env:

Application__RunMode=Client
FleetIngest__Enabled=true
FleetIngest__Url=http://admin-portal:8080/api/fleet/ingest   # container DNS in-network, or https://admin-host
FleetIngest__Token=<token from Customers page>
FleetIngest__IntervalSeconds=60                              # default
FleetIngest__BatchSize=5000                                  # default

Restart the customer's portal container — the FleetPushService starts on its own. Verify the audit trail on the Admin side:

docker exec <admin-timescale> psql -U power_user -d admin_fleet -c `
  'SELECT \"BatchType\",\"RowsAccepted\",\"ReceivedAt\" FROM fleet.\"IngestEvents\" ORDER BY \"ReceivedAt\" DESC LIMIT 10;'