New top-level portal/ project, peer to console/ and firmware/. Delivers a .NET 10 + React 18 + TimescaleDB + Grafana stack, one container set per customer behind Traefik. Built in 12 phases per FrontEndPrompt spec; no changes to existing console or firmware.

Backend (src/Tau.Acuvim.Portal/):
- .NET 10 minimal API, Serilog, ASP.NET Identity (cookie auth, lockout).
- Single AppDbContext with identity / app / monitoring schemas.
- MigrateAsync + TimescaleBootstrapper (idempotent hypertable creation) + IdentityBootstrapper (seeded admin + branding) on startup.
- Pure CostCalculator + DB-backed RateService for tariffs (effective-dated, TOU periods, VAT, fixed charges, per-municipality timezone).
- BrandingService with logo upload to mounted volume.
- Time-series ingest + bucketed query services (time_bucket aggregates, ON CONFLICT for idempotent re-delivery).
- ConfigOverviewService with redaction-by-construction (passwords never in payload).
- DataProtection keys persisted to /data/keys volume for cookie survival across container restarts.

Frontend (frontend/):
- React 18 + TypeScript + Vite + Ant Design 5 + TanStack Query.
- BrandingProvider + ThemedRoot for live re-themed white-labelling.
- RequireAuth / RequireRole guards.
- Pages: Login, Dashboard, Dashboards (embedded Grafana), Sites (admin), Settings tabs (Branding / Rates / Users / Grafana / App config).

Infra:
- Dev (docker-compose.yml) and prod (docker-compose.prod.yml) compose files. Three services per customer; Traefik subdomain + same-origin /grafana path-prefix routing wired with labels.
- Grafana 11 with provisioned timescaledb datasource (uid pinned) and starter power-overview.json dashboard with device template variable.
- Compose project name documented as lowercase (Compose v2 requirement).

Tests (tests/Tau.Acuvim.Portal.Tests/):
- xUnit, 40 tests. Covers CostCalculator (period match, TZ, overlap, VAT, fixed), ConnectionStringResolver (all 4 precedence branches incl. Production refusal), TariffValidator, DayOfWeekFlag.
- All passing locally against .NET 10.

Docs:
- README.md (onboarding + 11 spec sections), OPERATIONS.md (per-customer provisioning, secret rotation, backup, troubleshooting), TESTING.md (manual integration scenarios, frontend test scaffolding recipe).

Production safety guards:
- Refuses to start if Authentication:DefaultAdminPassword is unchanged default in Production.
- Refuses to start if Database:AutoProvisionLocalTimescaleDb=true in Production.
- Prod Grafana ships with anonymous off and auth mode unset (three options documented in README Security) so iframe refuses to load until a deliberate prod auth choice is made.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Tau Acuvim Portal — Operations
Per-customer deployment loop. For background, architecture, and security model, read the README first.
Contents
- Prerequisites (per host)
- Provisioning a new customer
- Updating a customer's stack
- Rotating secrets
- Backup & restore
- Health & monitoring
- Troubleshooting
- Decommissioning a customer
Prerequisites (per host)
These exist once on the host running customer stacks; not per customer.
- Docker Engine (or Docker Desktop on Windows hosts).
- External Traefik instance, running on the same host and joined to a Docker network named traefik-public. Configured with:
  - Two entrypoints: web (80), websecure (443).
  - A certificate resolver named le (Let's Encrypt via DNS-01 or HTTP-01; a wildcard certificate can only be issued via DNS-01).
  - HTTP → HTTPS redirect.
  - Docker provider with exposedByDefault: false.
- Wildcard DNS + TLS cert for *.portal.example.com (or whatever your customer subdomain pattern is).
- The traefik-public Docker network exists:
  docker network create traefik-public # one-time
- The portal image is built (or pull-able from a registry):
  cd /path/to/portal
  docker compose -f docker-compose.prod.yml build
Provisioning a new customer
Goal: spin up an isolated stack for customer ABC0001 (Compose project abc0001 — lowercase required) at abc0001.portal.example.com.
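The mapping from customer ID to Compose project name and hostname can be scripted so the lowercase rule is never missed. This is a minimal sketch: project_name, customer_host, and BASE_DOMAIN are hypothetical names introduced here, and the default domain is the example one used in this document.

```shell
# Base domain for customer subdomains (assumption: the example domain from this doc).
BASE_DOMAIN="${BASE_DOMAIN:-portal.example.com}"

# project_name ID: print the Compose project name, which must be lowercase.
project_name() {
  printf '%s' "$1" | tr '[:upper:]' '[:lower:]'
}

# customer_host ID: print the customer's hostname under BASE_DOMAIN.
customer_host() {
  printf '%s.%s' "$(project_name "$1")" "$BASE_DOMAIN"
}
```

For example, `project_name ABC0001` prints abc0001 and `customer_host ABC0001` prints abc0001.portal.example.com.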
1. Create the customer directory
A common pattern: one directory per customer holding only an .env file (the compose files are shared from the repo). Adjust to your fleet-management tool of choice (Ansible, Portainer, Helm-on-K8s later).
/srv/portal/abc0001/
└── .env
2. Generate strong secrets
openssl rand -base64 32 # POSTGRES_PASSWORD
openssl rand -base64 32 # GRAFANA_ADMIN_PASSWORD
openssl rand -base64 32 # Authentication__DefaultAdminPassword
3. Fill in .env
COMPOSE_PROJECT_NAME=abc0001
CUSTOMER_HOST=abc0001.portal.example.com
POSTGRES_DB=power_monitoring
POSTGRES_USER=power_user
POSTGRES_PASSWORD=<from step 2>
Authentication__DefaultAdminEmail=admin@abc0001.example.com
Authentication__DefaultAdminPassword=<from step 2>
GRAFANA_ADMIN_PASSWORD=<from step 2>
Grafana__EmbedPathPrefix=/grafana
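Before bringing the stack up, it can help to confirm that no <from step 2> placeholder survived in .env. The following is a sketch only: check_env is a hypothetical helper, not something shipped with the portal, and its key list simply mirrors the keys shown in step 3.

```shell
# check_env FILE: return non-zero if a key from step 3 is missing, empty,
# or still set to a <placeholder> value.
check_env() {
  env_file="$1"
  rc=0
  for key in COMPOSE_PROJECT_NAME CUSTOMER_HOST POSTGRES_DB POSTGRES_USER \
             POSTGRES_PASSWORD Authentication__DefaultAdminEmail \
             Authentication__DefaultAdminPassword GRAFANA_ADMIN_PASSWORD \
             Grafana__EmbedPathPrefix; do
    # Use the last assignment if the key appears more than once.
    # cut -f2- keeps any "=" padding that base64 secrets end with.
    value="$(grep -E "^${key}=" "$env_file" | tail -n 1 | cut -d= -f2-)"
    case "$value" in
      ''|'<'*'>') echo "unset or placeholder: $key" >&2; rc=1 ;;
    esac
  done
  return "$rc"
}
```

Run it as `check_env /srv/portal/abc0001/.env` and only continue to step 5 when it exits with status 0.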
4. Decide Grafana auth mode
Anonymous is off in the prod compose by default. Pick one of the three options from the README's Security notes and wire it before exposing the stack to anyone:
- (a) Traefik forwardAuth → add the middleware to the traefik.http.routers.${COMPOSE_PROJECT_NAME}-grafana labels and implement /api/auth/check on the portal.
- (b) Grafana auth.proxy → set GF_AUTH_PROXY_ENABLED=true, GF_AUTH_PROXY_HEADER_NAME=X-WEBAUTH-USER env vars; ensure Traefik (or the portal) sets the header and that no client can.
- (c) Render tokens → minted by a portal endpoint; SPA appends ?auth_token=... to the iframe URL.
Without any of these, Grafana refuses anonymous access in prod (intended safe default — iframe will show a login page).
5. Bring it up
cd /srv/portal/abc0001
docker compose --env-file .env -f /path/to/portal/docker-compose.prod.yml up -d
6. Verify
# Wait for healthy
docker ps --filter "label=com.docker.compose.project=abc0001"
# Health checks
curl -fs https://abc0001.portal.example.com/health # → Healthy
curl -fs https://abc0001.portal.example.com/health/ready # → Healthy
# Migration + seed in the logs
docker logs abc0001_portal | grep -E "Applied migration|Seeded|hypertable"
# expect:
# Applied migration 'InitialCreate'
# TimescaleDB hypertable for monitoring.PowerMeasurements is ready
# Seeded default admin admin@abc0001.example.com
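The health checks above only pass once the containers have finished starting, so a scripted rollout usually needs to poll. A minimal retry helper might look like the sketch below; retry_until is a hypothetical name, and the attempt count and delay in the example are arbitrary.

```shell
# retry_until ATTEMPTS DELAY CMD...: run CMD until it succeeds,
# sleeping DELAY seconds between tries; fail after ATTEMPTS tries.
retry_until() {
  attempts="$1"; delay="$2"; shift 2
  n=1
  while ! "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      return 1
    fi
    n=$((n + 1))
    sleep "$delay"
  done
}

# Example (not executed here):
#   retry_until 30 2 curl -fs https://abc0001.portal.example.com/health/ready
```

Polling /health/ready rather than /health waits for the database as well as the process, which matches what the migration log lines depend on.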
7. First login + handover
- Sign in as Authentication__DefaultAdminEmail with the password from step 2.
- Settings → Users → create the customer's real admin account; toggle Admin on.
- Sign out, sign in as the customer admin, change the default admin password (or delete the default admin account if the customer admin is the only one needed).
- Settings → Branding → upload customer logo, apply colours.
- Settings → Rates → seed at least one municipality + tariff for cost calc.
- Sites → create the customer's sites/devices so the ingest pipeline knows where measurements belong.
Updating a customer's stack
Code-only update (no migrations, no compose changes)
docker compose --env-file .env -f /path/to/portal/docker-compose.prod.yml \
up -d --build portal
Brief downtime while the new container starts. The DB is untouched.
Update with new migrations
Same command — MigrateAsync on startup applies pending migrations before the app accepts traffic. Watch the logs:
docker logs -f abc0001_portal | grep -E "Applied migration|Failed|hypertable"
If a migration fails the container will exit; fix forward, push a corrected image, retry.
Compose changes (env vars, ports, labels)
Edit the customer's .env (or the central docker-compose.prod.yml) and:
docker compose --env-file .env -f /path/to/portal/docker-compose.prod.yml up -d
Compose recreates only the containers whose definition changed.
Rolling many customers
There's no built-in fan-out — pick your orchestrator (Ansible playbook, simple bash loop, Portainer stacks). Update one customer first, verify, then roll the rest.
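As an illustration of the "simple bash loop" option, the sketch below updates one canary customer first and stops at the first failure. It assumes the /srv/portal/<customer>/.env layout from the provisioning section; update_customer, roll_customers, and DRY_RUN are hypothetical names, and the verification step between the canary and the rest is left to the operator.

```shell
# update_customer DIR COMPOSE_FILE: rebuild and restart one customer's portal.
# With DRY_RUN=1 the command is printed instead of executed.
update_customer() {
  cust_dir="$1"; cust_compose="$2"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "docker compose --env-file $cust_dir/.env -f $cust_compose up -d --build portal"
  else
    docker compose --env-file "$cust_dir/.env" -f "$cust_compose" up -d --build portal
  fi
}

# roll_customers ROOT COMPOSE_FILE CANARY: update CANARY, then every other
# directory under ROOT that contains an .env file. Stops on the first failure.
roll_customers() {
  root="$1"; compose_file="$2"; canary="$3"
  update_customer "$root/$canary" "$compose_file" || return 1
  for dir in "$root"/*/; do
    dir="${dir%/}"
    if [ ! -f "$dir/.env" ] || [ "$dir" = "$root/$canary" ]; then
      continue
    fi
    update_customer "$dir" "$compose_file" || return 1
  done
}
```

A dry run such as `DRY_RUN=1 roll_customers /srv/portal /path/to/portal/docker-compose.prod.yml abc0001` prints the planned commands in order, which is a cheap way to review the fan-out before running it for real.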
Rotating secrets
Database password
# 1. Change the password inside Postgres
docker exec -it abc0001_timescale psql -U power_user -d power_monitoring \
-c "ALTER USER power_user WITH PASSWORD '<new>';"
# 2. Update .env (| as the sed delimiter, because base64 secrets can contain /)
sed -i 's|^POSTGRES_PASSWORD=.*|POSTGRES_PASSWORD=<new>|' .env
# 3. Recreate the portal + grafana to pick up new env vars
docker compose --env-file .env -f /path/to/portal/docker-compose.prod.yml \
up -d portal grafana
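Editing .env with sed means the new secret is interpreted as a sed replacement, so characters such as the delimiter or & can corrupt the value. One way around that is a small helper that treats the value as literal text. This is a sketch; set_env_var is a hypothetical name and not part of the portal tooling.

```shell
# set_env_var FILE KEY VALUE: replace KEY's line in FILE (or append it),
# writing VALUE literally. The file is rewritten via a temp file and mv.
set_env_var() {
  file="$1"; key="$2"; value="$3"
  tmp="$(mktemp "${file}.XXXXXX")" || return 1
  # Values are passed through ENVIRON because awk -v would process
  # backslash escapes in them.
  if KEY="$key" VALUE="$value" awk '
      BEGIN { k = ENVIRON["KEY"]; v = ENVIRON["VALUE"]; found = 0 }
      index($0, k "=") == 1 { print k "=" v; found = 1; next }
      { print }
      END { if (!found) print k "=" v }
    ' "$file" > "$tmp"; then
    mv "$tmp" "$file"
  else
    rm -f "$tmp"
    return 1
  fi
}
```

For example, `set_env_var .env POSTGRES_PASSWORD "$NEW_PASSWORD"` works for any value, including ones containing / or &. Because the temp file comes from mktemp, the rewritten .env ends up readable by its owner only.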
Grafana admin password
# | as the sed delimiter, because base64 secrets can contain /
sed -i 's|^GRAFANA_ADMIN_PASSWORD=.*|GRAFANA_ADMIN_PASSWORD=<new>|' .env
docker compose --env-file .env -f /path/to/portal/docker-compose.prod.yml \
up -d grafana
GF_SECURITY_ADMIN_PASSWORD is re-applied on container start.
Default admin password
Once the customer admin exists and has changed their own password, the default admin can be deleted from the Settings → Users UI. After that, Authentication__DefaultAdminPassword is only used if the row is re-seeded (which happens only when no account with that email exists).
Backup & restore
What to back up
| Volume | What's in it | Frequency |
|---|---|---|
| <PREFIX>_timescale-data | All customer data (Identity, branding, tariffs, sites, devices, measurements) | Daily, more for high-write customers |
| <PREFIX>_grafana-data | Grafana's internal SQLite (user prefs, plugin state). Dashboards re-provision from JSON so this is not authoritative. | Weekly is plenty |
| <PREFIX>_portal-branding | Uploaded logos | Daily |
| <PREFIX>_portal-keys | Data Protection key ring (cookie signing). Losing this invalidates all sessions but doesn't lose data. | Weekly |
Postgres dump
docker exec abc0001_timescale \
pg_dump -U power_user -d power_monitoring -F c -f /tmp/backup.dump
docker cp abc0001_timescale:/tmp/backup.dump ./abc0001-$(date +%Y%m%d).dump
docker exec abc0001_timescale rm /tmp/backup.dump
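pg_dump supports hypertables natively (as of PG12+), so the commands above produce a usable backup. Run the pg_dump that ships inside the TimescaleDB container, as shown, rather than a host-installed copy, so the client version matches the server.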
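A copied dump is worth a quick sanity check before it is trusted, especially ahead of a destructive step. Custom-format archives written by pg_dump -F c begin with the signature PGDMP, which allows a cheap check. This is a sketch; looks_like_pg_custom_dump is a hypothetical helper name.

```shell
# looks_like_pg_custom_dump FILE: succeed if FILE is non-empty and starts
# with the "PGDMP" signature of a pg_dump custom-format archive.
looks_like_pg_custom_dump() {
  [ -s "$1" ] && [ "$(head -c 5 "$1")" = "PGDMP" ]
}

# Example (not executed here):
#   looks_like_pg_custom_dump ./abc0001-20250101.dump || echo "bad dump" >&2
```

This only detects empty or obviously wrong files. Listing the archive's contents with pg_restore --list is a stronger check, since it has to read the table of contents.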
Volume snapshot
For non-DB volumes, the simplest option is a tar of the volume's mountpoint; alternatively, use your storage layer's snapshot facility (LVM, ZFS, EBS, etc.).
Restore
# Fresh DB
docker compose -f /path/to/portal/docker-compose.prod.yml --env-file .env down timescaledb
docker volume rm abc0001_timescale-data
docker compose -f /path/to/portal/docker-compose.prod.yml --env-file .env up -d timescaledb
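# Restore
# Note: Timescale's documentation wraps pg_restore in SELECT timescaledb_pre_restore();
# and SELECT timescaledb_post_restore(); add those psql calls if the restore
# reports hypertable or catalog errors.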
docker cp abc0001-YYYYMMDD.dump abc0001_timescale:/tmp/backup.dump
docker exec abc0001_timescale \
pg_restore -U power_user -d power_monitoring --clean --if-exists /tmp/backup.dump
docker exec abc0001_timescale rm /tmp/backup.dump
# Start everything
docker compose -f /path/to/portal/docker-compose.prod.yml --env-file .env up -d
The TimescaleBootstrapper is idempotent — it will not error on a restored hypertable.
Health & monitoring
Liveness / readiness
- GET /health: liveness. Use as Traefik / load-balancer health check.
- GET /health/ready: readiness (DB reachable). Use for orchestration "in service" decisions.
Logs
Serilog writes JSON to stdout; the Docker logging driver of your choice (json-file, journald, gelf to a central log store) picks it up.
docker logs abc0001_portal --tail 200 --follow
Notable lines:
- Database connection resolved via …: confirms how this container resolved its DB at startup.
- Applied migration '…': one per pending migration.
- TimescaleDB hypertable for monitoring.PowerMeasurements is ready: bootstrapper succeeded.
- Seeded default admin …: first start only; absence on subsequent starts is correct.
DB health from the host
docker exec abc0001_timescale pg_isready -U power_user -d power_monitoring
TimescaleDB chunks
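# chunks_detailed_size() reports sizes only; the time ranges come from timescaledb_information.chunks
docker exec -it abc0001_timescale psql -U power_user -d power_monitoring -c \
"SELECT c.chunk_name, c.range_start, c.range_end, s.total_bytes
FROM timescaledb_information.chunks c
JOIN chunks_detailed_size('monitoring.\"PowerMeasurements\"') s
  ON s.chunk_schema = c.chunk_schema AND s.chunk_name = c.chunk_name
WHERE c.hypertable_schema = 'monitoring' AND c.hypertable_name = 'PowerMeasurements'
ORDER BY c.range_start;"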
Troubleshooting
| Symptom | First check |
|---|---|
| Portal container restart-looping | docker logs <PREFIX>_portal — usually a missing env var (default-admin password in prod, missing Postgres password) or a migration failure. |
| /health/ready returns Unhealthy | Postgres container down, or wrong creds. docker logs <PREFIX>_timescale. |
| Grafana iframe loads but no charts | Datasource UID mismatch — confirm grafana/provisioning/datasources/timescaledb.yml has uid: timescaledb and the dashboard JSON references the same. |
| Grafana iframe shows login screen in prod | Expected if no auth mode is wired yet (anonymous off by default). Pick a mode (see README → Security). |
| Branded logo missing after restart | <PREFIX>_portal-branding volume not mounted, or filesystem perms wrong. The container runs as user app; volume must be writable by uid 1000. |
| Ingest returns accepted: 0, rejected: N | Devices don't exist for those externalIds. Create them via the Sites screen first. |
| Cookie auth seems random / sessions lost on restart | <PREFIX>_portal-keys volume not mounted — Data Protection re-keys on every start. |
| Hypertable error on startup | Pre-existing non-empty plain table being converted. migrate_data => TRUE should handle it; if not, restore from backup and check for manual monitoring."PowerMeasurements" schema changes. |
Decommissioning a customer
# Final backup
docker exec abc0001_timescale pg_dump -U power_user -d power_monitoring \
-F c -f /tmp/final.dump
docker cp abc0001_timescale:/tmp/final.dump ./abc0001-final-$(date +%Y%m%d).dump
# Stop and remove containers
docker compose --env-file .env -f /path/to/portal/docker-compose.prod.yml down
# Remove volumes (destroys data — confirm backup first)
docker volume rm \
abc0001_timescale-data \
abc0001_grafana-data \
abc0001_portal-branding \
abc0001_portal-keys
# Remove customer dir
rm -rf /srv/portal/abc0001
# Remove the DNS record + cert (manually or via your DNS automation)