Tau.Acuvim/portal/README.md
Diseri Pearson e17921a122 Add portal: customer-facing white-labeled monitoring stack
New top-level portal/ project, peer to console/ and firmware/. Delivers a
.NET 10 + React 18 + TimescaleDB + Grafana stack, one container set per
customer behind Traefik. Built in 12 phases per FrontEndPrompt spec; no
changes to existing console or firmware.

Backend (src/Tau.Acuvim.Portal/):
- .NET 10 minimal API, Serilog, ASP.NET Identity (cookie auth, lockout).
- Single AppDbContext with identity / app / monitoring schemas.
- MigrateAsync + TimescaleBootstrapper (idempotent hypertable creation)
  + IdentityBootstrapper (seeded admin + branding) on startup.
- Pure CostCalculator + DB-backed RateService for tariffs (effective-dated,
  TOU periods, VAT, fixed charges, per-municipality timezone).
- BrandingService with logo upload to mounted volume.
- Time-series ingest + bucketed query services (time_bucket aggregates,
  ON CONFLICT for idempotent re-delivery).
- ConfigOverviewService with redaction-by-construction (passwords never in
  payload).
- DataProtection keys persisted to /data/keys volume for cookie survival
  across container restarts.

Frontend (frontend/):
- React 18 + TypeScript + Vite + Ant Design 5 + TanStack Query.
- BrandingProvider + ThemedRoot for live re-themed white-labelling.
- RequireAuth / RequireRole guards.
- Pages: Login, Dashboard, Dashboards (embedded Grafana), Sites (admin),
  Settings tabs (Branding / Rates / Users / Grafana / App config).

Infra:
- Dev (docker-compose.yml) and prod (docker-compose.prod.yml) compose
  files. Three services per customer; Traefik subdomain + same-origin
  /grafana path-prefix routing wired with labels.
- Grafana 11 with provisioned timescaledb datasource (uid pinned) and
  starter power-overview.json dashboard with device template variable.
- Compose project name documented as lowercase (Compose v2 requirement).

Tests (tests/Tau.Acuvim.Portal.Tests/):
- xUnit, 40 tests. Covers CostCalculator (period match, TZ, overlap,
  VAT, fixed), ConnectionStringResolver (all 4 precedence branches incl.
  Production refusal), TariffValidator, DayOfWeekFlag.
- All passing locally against .NET 10.

Docs:
- README.md (onboarding + 11 spec sections), OPERATIONS.md (per-customer
  provisioning, secret rotation, backup, troubleshooting), TESTING.md
  (manual integration scenarios, frontend test scaffolding recipe).

Production safety guards:
- Refuses to start if Authentication:DefaultAdminPassword is unchanged
  default in Production.
- Refuses to start if Database:AutoProvisionLocalTimescaleDb=true in
  Production.
- Prod Grafana ships with anonymous off and auth mode unset (three
  options documented in README Security) so iframe refuses to load
  until a deliberate prod auth choice is made.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 09:30:30 +02:00


# Tau Acuvim Portal
Customer-facing, white-labeled power monitoring portal. One stack per customer, deployed behind Traefik with the customer ID (lowercased — Docker Compose v2 requirement) as the container prefix: customer `ABC0001` produces `abc0001_portal`, `abc0001_grafana`, `abc0001_timescale`.
This project lives next to `console/` (internal management interface) and `firmware/` (ESP32) in the same repo. The three projects share no code; the portal stands alone.
---
## Contents
1. [Overview](#overview)
2. [Architecture](#architecture)
3. [Configuration template](#configuration-template)
4. [Local setup](#local-setup)
5. [Docker Compose](#docker-compose)
6. [Database migrations](#database-migrations)
7. [Accessing the app](#accessing-the-app)
8. [Accessing Grafana](#accessing-grafana)
9. [Default local test credentials](#default-local-test-credentials)
10. [Production deployment notes](#production-deployment-notes)
11. [Security notes](#security-notes)
12. [Testing](#testing)
13. [Operations](#operations)
---
## Overview
A customer signs in to their own branded portal, sees their meters' live + historical power readings via embedded Grafana dashboards, and (for admins) configures branding, municipality tariffs, users, sites, and devices. Each customer gets a fully isolated stack: their own database, their own Grafana, their own branding. Traefik routes `<customer>.portal.example.com` to the right containers.
### Tech stack
| Layer | Technology |
|---|---|
| Backend | .NET 10 minimal API, EF Core 10, Npgsql, ASP.NET Core Identity, Serilog |
| Frontend | React 18 + TypeScript + Vite, Ant Design 5, TanStack Query, react-router |
| Database | TimescaleDB 2.17 on PostgreSQL 16 |
| Graphing | Grafana 11 (provisioned datasource + dashboards) |
| Container | Docker / Docker Compose; Traefik for routing |
| Auth | Cookie-based via ASP.NET Identity (SPA-friendly: returns 401/403 instead of redirecting to a login page) |
---
## Architecture
### Containers, per customer
```
       ┌────────────────────────────────────────────────────┐
       │                      Traefik                       │
       │        host: <customer>.portal.example.com         │
       └────────────────────────────────────────────────────┘
          ┌───────────────────────┬───────────────────────┐
          ▼                       ▼                       ▼
┌───────────────────┐   ┌───────────────────┐   ┌───────────────────┐
│ <PREFIX>_portal   │   │ <PREFIX>_grafana  │   │<PREFIX>_timescale │
│ .NET API + SPA    │──▶│ Grafana 11        │──▶│ TimescaleDB + Pg16│
│ :8080             │   │ :3000             │   │ :5432             │
└───────────────────┘   └───────────────────┘   └───────────────────┘
```
`<PREFIX>` = the customer's 7-character ID **lowercased** (e.g. `abc0001` for customer `ABC0001`), set via `COMPOSE_PROJECT_NAME`. Compose v2 rejects uppercase project names.
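
As a rough illustration of the naming rule above, the sketch below derives the Compose project name and the three container names from a customer ID. `stackNames` is a hypothetical helper, not something the portal ships; the suffixes are taken from the diagram.

```typescript
// Hypothetical helper: derives per-customer names from a customer ID.
// Suffixes (_portal, _grafana, _timescale) follow the diagram above.
export interface StackNames {
  composeProjectName: string;
  containers: string[];
}

export function stackNames(customerId: string): StackNames {
  // Compose v2 rejects uppercase project names, so lowercase first.
  const prefix = customerId.trim().toLowerCase();

  // Compose project names may contain only lowercase letters, digits,
  // dashes and underscores, and must start with a letter or digit.
  if (!/^[a-z0-9][a-z0-9_-]*$/.test(prefix)) {
    throw new Error(`"${customerId}" cannot be used as a Compose project name`);
  }

  return {
    composeProjectName: prefix,
    containers: ["portal", "grafana", "timescale"].map((s) => `${prefix}_${s}`),
  };
}
```

For customer `ABC0001` this yields the project name `abc0001` and the containers `abc0001_portal`, `abc0001_grafana`, `abc0001_timescale`.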
### Backend layout
- **Combined container** — Dockerfile builds the React SPA, then a multi-stage .NET build, then copies the SPA into `wwwroot`. One image, one process per customer.
- **Single `AppDbContext`** with three schemas:
  - `identity`: ASP.NET Identity tables.
  - `app`: branding, municipalities, tariffs, periods.
  - `monitoring`: sites, devices, power measurements (hypertable).
- **Minimal API** with endpoints grouped under `Endpoints/*.cs`. Services in `Services/*.cs`. Typed options in `Configuration/`.
### Frontend layout
- `src/pages/` — page-level components (`DashboardsPage`, `SettingsPage`, etc.).
- `src/components/` — shared + feature components.
- `src/api/` — typed axios calls.
- `src/hooks/useAuth`, `useBranding` — global state via Context.
- `RequireAuth` / `RequireRole` — route guards.
- AntD `ConfigProvider` is themed dynamically from `BrandingProvider` (white-labelling).
---
## Configuration template
All configurable values are declared in `src/Tau.Acuvim.Portal/appsettings.template.json` — checked in, no secrets. It's the lowest-priority configuration source: everything overrides it.
### Precedence (lowest → highest)
1. `appsettings.template.json` — shippable defaults (always loaded).
2. `appsettings.json` — runtime infra (Serilog, AllowedHosts).
3. `appsettings.{Environment}.json` — per-environment overrides.
4. `appsettings.Local.json` — gitignored, your local overrides.
5. Environment variables — use `__` as section separator (e.g. `Authentication__DefaultAdminPassword=...`). Production secrets go here.
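
The layering above behaves like a deep merge in which later sources win key by key. The real loading is done by the .NET configuration system; the TypeScript below is only a sketch of the idea, and `mergeLayers` and `envToConfig` are hypothetical names. It also ignores details such as .NET's case-insensitive keys.

```typescript
type ConfigTree = { [key: string]: string | ConfigTree };

// Deep-merge `over` into `base`; values from `over` win.
function mergeTwo(base: ConfigTree, over: ConfigTree): ConfigTree {
  const out: ConfigTree = { ...base };
  for (const [key, value] of Object.entries(over)) {
    const existing = out[key];
    out[key] =
      typeof value === "object" && typeof existing === "object"
        ? mergeTwo(existing, value)
        : value;
  }
  return out;
}

// Layers are passed lowest priority first, matching the list above.
export function mergeLayers(...layers: ConfigTree[]): ConfigTree {
  return layers.reduce<ConfigTree>((acc, layer) => mergeTwo(acc, layer), {});
}

// Turns Authentication__DefaultAdminPassword=x into
// { Authentication: { DefaultAdminPassword: "x" } }.
export function envToConfig(env: Record<string, string>): ConfigTree {
  let tree: ConfigTree = {};
  for (const [name, value] of Object.entries(env)) {
    const nested = name
      .split("__")
      .reduceRight<string | ConfigTree>((acc, part) => ({ [part]: acc }), value);
    tree = mergeTwo(tree, nested as ConfigTree);
  }
  return tree;
}
```

Calling `mergeLayers(template, appsettings, perEnvironment, local, envToConfig(envVars))` would then give environment variables the final say while leaving untouched keys from the template in place.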
### Sections
| Section | Purpose |
|---|---|
| `Application` | Name, environment, public URL |
| `Database` | Provider, connection string, `MigrateOnStartup`, `AutoProvisionLocalTimescaleDb` |
| `TimescaleDb` | Host/port/db/user/password for auto-provision in dev |
| `Grafana` | Base URL, internal URL, path prefix, embed mode, dashboard list |
| `WhiteLabel` | App name, logo URL, colours, footer, logo storage path |
| `Authentication` | Cookie name, lockout, default admin email/password |
| `Monitoring` | Hypertable chunk interval, aggregate flag |
### Database connection resolution
1. If `Database:ConnectionString` is non-empty → use it.
2. Else if `Database:AutoProvisionLocalTimescaleDb=true` AND env is not `Production` → build from the `TimescaleDb:*` block (Host/Port/Database/Username/Password).
3. Otherwise → app refuses to start with a clear error.
`AutoProvisionLocalTimescaleDb=true` in Production is a hard failure — production must supply its own connection string via env var or secret.
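
The resolution rules can be sketched as a single function. This is an illustration in TypeScript rather than the portal's actual `ConnectionStringResolver`; the type and function names are hypothetical, and checking the Production guard before anything else is an assumption about ordering.

```typescript
export interface DatabaseSettings {
  connectionString: string;               // Database:ConnectionString ("" when unset)
  autoProvisionLocalTimescaleDb: boolean; // Database:AutoProvisionLocalTimescaleDb
}

export interface TimescaleSettings {
  host: string;
  port: number;
  database: string;
  username: string;
  password: string;
}

export function resolveConnectionString(
  db: DatabaseSettings,
  ts: TimescaleSettings,
  environment: string,
): string {
  // Assumed ordering: the Production guard is checked first.
  if (environment === "Production" && db.autoProvisionLocalTimescaleDb) {
    throw new Error("AutoProvisionLocalTimescaleDb=true is not allowed in Production");
  }
  // 1. An explicit connection string always wins.
  if (db.connectionString.trim() !== "") {
    return db.connectionString;
  }
  // 2. Outside Production, build one from the TimescaleDb:* block.
  if (db.autoProvisionLocalTimescaleDb) {
    return [
      `Host=${ts.host}`,
      `Port=${ts.port}`,
      `Database=${ts.database}`,
      `Username=${ts.username}`,
      `Password=${ts.password}`,
    ].join(";");
  }
  // 3. Nothing usable: refuse to start.
  throw new Error("No database connection string could be resolved");
}
```

Throwing instead of falling back to a default keeps a misconfigured deployment from silently connecting to the wrong database.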
---
## Local setup
### Prerequisites
- .NET 10 SDK
- Node 22+ / npm
- Docker Desktop
- `dotnet-ef` tool: `dotnet tool install --global dotnet-ef`
### First-time: generate the initial migration
Identity / branding / rates / monitoring entities are defined in code; the migration files themselves are generated artifacts. Run once:
```powershell
cd C:\AcuvimDev\Tau.Acuvim\portal
dotnet ef migrations add InitialCreate `
    --project src/Tau.Acuvim.Portal/Tau.Acuvim.Portal.csproj `
    --output-dir Migrations
```
Commit the resulting `Migrations/` folder. From then on, `MigrateAsync` on startup applies whatever exists — no manual step at deploy time.
### Option A: full stack in Docker (recommended)
```powershell
cd C:\AcuvimDev\Tau.Acuvim\portal
Copy-Item .env.example .env
docker compose up --build -d
docker compose ps
```
Then:
- Portal: http://localhost:8080
- Grafana: http://localhost:3001 (anonymous Viewer in dev)
- TimescaleDB: `localhost:5433` (user/db from `.env`)
Stop + wipe:
```powershell
docker compose down -v
```
### Option B: backend + DB in Docker, frontend via Vite
Same `docker compose up`, then in another terminal:
```powershell
cd C:\AcuvimDev\Tau.Acuvim\portal\frontend
npm install
npm run dev
```
Vite serves at http://localhost:5174 and proxies `/api` + `/health` to the .NET container on `:8080`.
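
For orientation, a dev-server block with this behaviour might look like the sketch below. The repo's actual `vite.config.ts` is not reproduced here; the object is written as plain data so it carries no dependency on the `vite` package, and `changeOrigin: true` is an assumption.

```typescript
// Shape of the `server` block that would be handed to Vite's
// defineConfig({ server: devServer }). Plain data, no vite import needed.
interface ProxyTarget {
  target: string;
  changeOrigin: boolean;
}

interface DevServerOptions {
  port: number;
  proxy: Record<string, ProxyTarget>;
}

// Host port of the .NET container in the dev compose file.
const backend = "http://localhost:8080";

export const devServer: DevServerOptions = {
  port: 5174,
  proxy: {
    // API calls and health checks go to the backend; everything else
    // is served by Vite itself.
    "/api": { target: backend, changeOrigin: true },
    "/health": { target: backend, changeOrigin: true },
  },
};
```

Proxying through Vite keeps browser requests same-origin in development, so the auth cookie behaves as it does behind Traefik.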
### Option C: everything local (no Docker for the app)
Postgres still needs Docker (or a local install):
```powershell
docker compose up -d timescaledb
cd C:\AcuvimDev\Tau.Acuvim\portal\src\Tau.Acuvim.Portal
dotnet run # listens on :8080
# in another terminal
cd C:\AcuvimDev\Tau.Acuvim\portal\frontend
npm run dev # :5174
```
---
## Docker Compose
### Dev — `docker-compose.yml`
Three services: `portal`, `timescaledb`, `grafana`. Persistent named volumes (`timescale-data`, `grafana-data`, `portal-keys`, `portal-branding`). Healthcheck on Postgres; portal waits for healthy. Grafana ships with anonymous Viewer access for easy local use, a provisioned `TimescaleDB` datasource (uid: `timescaledb`), and any JSON dashboards found under `grafana/dashboards/`.
Host port mappings: `8080→portal`, `5433→timescaledb`, `3001→grafana`. Chosen to coexist with the console stack.
### Prod — `docker-compose.prod.yml`
Same services, no host port mappings. Joins external `traefik-public` network. Per-customer Traefik labels (subdomain routing for portal, same-origin path-prefix routing for Grafana at `/grafana`). Grafana sub-path + `GF_SERVER_ROOT_URL` configured. All secrets via env vars.
Run:
```powershell
docker network create traefik-public # once on the host
docker compose -f docker-compose.prod.yml --env-file .env up -d
```
See [OPERATIONS.md](./OPERATIONS.md) for the full per-customer deployment loop.
---
## Database migrations
`MigrateAsync` runs on startup (controlled by `Database:MigrateOnStartup`, default `true`). Immediately after, `TimescaleBootstrapper` runs an idempotent block:
1. `CREATE EXTENSION IF NOT EXISTS timescaledb` (defensive).
2. `SELECT create_hypertable('monitoring."PowerMeasurements"', 'Time', if_not_exists => TRUE, migrate_data => TRUE)`.
3. `SELECT set_chunk_time_interval('monitoring."PowerMeasurements"', INTERVAL '<MonitoringOptions.ChunkTimeInterval>')`.
Safe to re-run on every start.
### Adding a new migration
When you change the entity model:
```powershell
cd C:\AcuvimDev\Tau.Acuvim\portal
dotnet ef migrations add <DescriptiveName> `
    --project src/Tau.Acuvim.Portal/Tau.Acuvim.Portal.csproj `
    --output-dir Migrations
```
Commit. Next deploy applies it automatically.
---
## Accessing the app
| Environment | URL |
|---|---|
| Local (Docker combined) | http://localhost:8080 |
| Local (Vite dev) | http://localhost:5174 |
| Production | `https://<customer-host>` (e.g. `https://abc0001.portal.example.com`) |
API base path: `/api`. Swagger UI in dev: `/swagger`.
### Health endpoints
- `GET /health` — liveness (process alive). Returns `Healthy` if the app responds.
- `GET /health/ready` — readiness. Returns `Healthy` only if TimescaleDB answers.
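
A deploy script or smoke test can poll the readiness endpoint until the stack is up. The sketch below is one way to do that; `waitForReady` is a hypothetical helper, and the fetch function is injected so the example does not depend on a particular runtime.

```typescript
// Minimal response shape; the standard fetch() Response satisfies it.
interface TextResponse {
  ok: boolean;
  text(): Promise<string>;
}

type Fetcher = (url: string) => Promise<TextResponse>;

export async function waitForReady(
  baseUrl: string,
  fetcher: Fetcher,
  attempts = 10,
  delayMs = 2000,
): Promise<boolean> {
  for (let i = 0; i < attempts; i++) {
    try {
      const res = await fetcher(`${baseUrl}/health/ready`);
      // Ready only when the endpoint answers 2xx with the body "Healthy".
      if (res.ok && (await res.text()).trim() === "Healthy") {
        return true;
      }
    } catch {
      // Connection refused while the container is still starting; retry.
    }
    if (i < attempts - 1) {
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
  return false;
}
```

In a browser or Node 18+ it could be called as `waitForReady("http://localhost:8080", (url) => fetch(url))`.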
### Nav surface
| Page | Who sees it |
|---|---|
| Dashboard | Any authenticated user |
| Dashboards (embedded Grafana) | Any authenticated user |
| Sites | Admin only |
| Settings (Branding / Rates / Users / Grafana / App config) | Admin only |
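
The table above amounts to a role-based filter over the nav items. The sketch below shows the idea; the route paths and the `visibleNav` name are hypothetical, and only the `Admin` role name comes from this document.

```typescript
interface NavItem {
  label: string;
  path: string; // hypothetical route, not taken from the repo
  adminOnly: boolean;
}

const NAV_ITEMS: NavItem[] = [
  { label: "Dashboard", path: "/", adminOnly: false },
  { label: "Dashboards", path: "/dashboards", adminOnly: false },
  { label: "Sites", path: "/sites", adminOnly: true },
  { label: "Settings", path: "/settings", adminOnly: true },
];

// Returns the entries an authenticated user with the given roles should see.
export function visibleNav(roles: string[]): NavItem[] {
  const isAdmin = roles.some((role) => role === "Admin");
  return NAV_ITEMS.filter((item) => isAdmin || !item.adminOnly);
}
```

Hiding entries is cosmetic only; access control still has to be enforced by the backend for every admin endpoint.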
---
## Accessing Grafana
### Local dev
- Direct: http://localhost:3001 — anonymous Viewer can browse provisioned dashboards. Admin/`GRAFANA_ADMIN_PASSWORD` for editing.
- Embedded: portal **Dashboards** page — iframe `src` points at the local Grafana base URL.
### Production
- Direct browser access to `<customer-host>/grafana/*` is gated by your chosen auth mode (see Security notes). Anonymous is **off** in the prod compose.
- Embedded: portal **Dashboards** page — same-origin iframe via Traefik path prefix `/grafana`.
### Provisioning
- Datasource: `grafana/provisioning/datasources/timescaledb.yml` (uid: `timescaledb`).
- Dashboard provider: `grafana/provisioning/dashboards/dashboards.yml` (auto-discovers JSON in `grafana/dashboards/`, refresh 30s).
- Starter dashboard: `grafana/dashboards/power-overview.json` — active power + cumulative energy + latest-power stat, parameterised by a `device` template variable.
To add a dashboard:
1. Drop the JSON into `grafana/dashboards/`.
2. Add an entry to `Grafana.Dashboards` in `appsettings.template.json` (or override in `appsettings.Local.json`) with the same `Uid`. The portal's Dashboards page picks it up after a refresh.
---
## Default local test credentials
These defaults are for local testing only; change them before exposing the stack to anyone.
- Email: `admin@example.com`
- Password: `ChangeMe123!`
Defined in `appsettings.template.json` → `Authentication`. The bootstrapper seeds this account only if no account with that email exists, and never overwrites a changed password.
**Production guard:** if `ASPNETCORE_ENVIRONMENT=Production` and the default password is still `ChangeMe123!`, the app refuses to start with an explicit error. Override `Authentication__DefaultAdminPassword` via env var before deploying.
---
## Production deployment notes
For the per-customer deployment loop see [OPERATIONS.md](./OPERATIONS.md). The short version:
1. **One Compose project per customer.** Set `COMPOSE_PROJECT_NAME=abc0001` (lowercase form of the customer ID — Compose v2 rejects uppercase). Containers are named `abc0001_portal`, `abc0001_grafana`, `abc0001_timescale`.
2. **One subdomain per customer.** Set `CUSTOMER_HOST=abc0001.portal.example.com`. Wildcard DNS + wildcard TLS cert via Traefik's resolver (`certresolver=le`).
3. **Decide your Grafana auth mode** (see Security notes). The prod compose deliberately ships with anonymous access **off** and no auth mode configured, so the embedded iframe shows Grafana's login page until you pick one.
4. **Set all secrets via env vars** (not files):
   - `POSTGRES_PASSWORD`
   - `GRAFANA_ADMIN_PASSWORD`
   - `Authentication__DefaultAdminPassword`
5. **External `traefik-public` Docker network must exist** (created once on the host running Traefik).
6. **Up the stack:**
   ```powershell
   docker compose -f docker-compose.prod.yml --env-file .env up -d
   ```
7. **Verify** the three containers report healthy and `https://<customer-host>/health/ready` returns `Healthy`.
---
## Security notes
### What's protected by default
- **ASP.NET Core Identity** with lockout (5 failed attempts → 15 min) and strong password requirements (8+ chars, upper + lower + digit).
- **Cookies are HttpOnly + SameSite=Lax + Secure in prod**, scoped to the portal subdomain.
- **Admin-only endpoints** are gated by an `AdminOnly` policy (`RequireRole("Admin")`). Enforcement happens on the backend; the frontend merely hides the nav entries.
- **Cannot delete your own account** — backend block, not just UI.
- **`GET /api/admin/config-overview`** is admin-only; the DTO never includes the connection string or any password. Redaction by construction, not filtering.
- **Branding logo upload** rejects files >2 MB and extensions outside `{png, jpg, jpeg, svg, webp}`.
- **Anti-forgery is left on by default** on cookie-authenticated endpoints; the logo upload explicitly opts out (multipart needs it disabled). Other admin endpoints accept JSON, and the auth cookie is `SameSite=Lax`, so browsers do not attach it to cross-site `POST`/`PUT`/`DELETE` requests, which blocks the usual CSRF vector for state-changing calls.
- **Security headers** (`X-Content-Type-Options`, `X-Frame-Options: SAMEORIGIN`, `Referrer-Policy: strict-origin-when-cross-origin`) on every response. HSTS in prod.
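
The logo upload rule from the list above can be mirrored client-side to give faster feedback before the request is sent. This is a sketch only: `logoUploadError` is a hypothetical helper, and treating 2 MB as 2 × 1024 × 1024 bytes and matching extensions case-insensitively are both assumptions. The server-side check remains the authority.

```typescript
const MAX_LOGO_BYTES = 2 * 1024 * 1024; // assumes binary megabytes
const ALLOWED_EXTENSIONS = ["png", "jpg", "jpeg", "svg", "webp"];

// Returns an error message, or null when the file passes both checks.
export function logoUploadError(fileName: string, sizeBytes: number): string | null {
  const dot = fileName.lastIndexOf(".");
  const ext = dot >= 0 ? fileName.slice(dot + 1).toLowerCase() : "";

  if (ALLOWED_EXTENSIONS.indexOf(ext) === -1) {
    return `Unsupported file type ".${ext}"; allowed: ${ALLOWED_EXTENSIONS.join(", ")}`;
  }
  if (sizeBytes > MAX_LOGO_BYTES) {
    return "Logo must be 2 MB or smaller";
  }
  return null;
}
```

In the SPA this could be called with `file.name` and `file.size` from the upload control before posting the multipart form.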
### Production refuse-to-start guards
- App refuses to start in `Production` if `Authentication:DefaultAdminPassword` is still `ChangeMe123!`.
- App refuses to start in `Production` if `Database:AutoProvisionLocalTimescaleDb=true` (you must supply an explicit connection string).
- App refuses to start if no connection string can be resolved at all.
### Grafana embedding — three production auth options
The dev compose runs Grafana with anonymous Viewer (safe on `localhost`). The prod compose has anonymous **off** and leaves the auth mode unset on purpose — pick one before publishing:
| Option | What it does | Trade-off |
|---|---|---|
| **(a) Traefik `forwardAuth`** | Traefik middleware calls a portal `/api/auth/check` endpoint on every Grafana request; portal cookie required, else 401 | Zero changes to Grafana. Best when "any portal user = same dashboards." |
| **(b) Grafana `auth.proxy`** | `GF_AUTH_PROXY_ENABLED=true`; trust an `X-WEBAUTH-USER` header set by Traefik | Maps portal user → Grafana user, gets per-user folders/perms. Sanitise the header — never let a client set it directly. |
| **(c) Service-account API key + render tokens** | Portal mints short-lived render tokens; SPA embeds via `?auth_token=...` | Most moving parts. Right when dashboards are stitched into custom UI per-panel rather than full Grafana. |
Until one is wired, prod-mode Grafana refuses anonymous access and the iframe shows a login page — the intended safe default.
### Other considerations
- Same-origin embed (prod path-prefix routing through Traefik) sidesteps third-party-cookie blockers that increasingly break cross-origin Grafana iframes.
- Provisioned datasource is `editable: false` — admins cannot accidentally rewire Grafana from its UI.
- Default password complexity is tunable in `Program.cs` → `IdentityOptions`. Lockout is tunable in the same block.
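- **TimescaleDB licensing**: we run the `timescale/timescaledb:*-pg16` image. Its core (including hypertables) is Apache-2.0, but the default image also bundles Timescale License (TSL) "community" features such as continuous aggregates, and the TSL restricts offering TimescaleDB as a managed database service. If you ever sell this as managed DBaaS, review the TSL first or stay on Apache-licensed features only.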
---
## Testing
Backend unit tests under `tests/Tau.Acuvim.Portal.Tests/` cover cost calculation, rate validation, connection-string resolution, day-of-week math:
```powershell
cd C:\AcuvimDev\Tau.Acuvim\portal\tests\Tau.Acuvim.Portal.Tests
dotnet test
```
See [TESTING.md](./TESTING.md) for the full manual integration scenario, frontend test scaffolding recipe, and edge-case checklist.
---
## Operations
For per-customer provisioning, secret rotation, backups, and health monitoring see [OPERATIONS.md](./OPERATIONS.md).