A bundle of related portal work — picked up while ensuring per-customer
isolation actually works end-to-end and replacing the placeholder Client
landing page. Build green, full test suite 66/66.
Frontend — Client surface
- DashboardPage: replace placeholder with 4 KPI cards (kWh, current kW,
active devices, estimated cost), 24h active-power ECharts mini-chart,
per-device "today/range" table, and a date-range picker with shortcuts
(Today / 7d / 30d / This month / Custom). 30s auto-refresh.
- New Measurements page (/measurements, Client mode, any authenticated
user) with multi-select device filter, full date range incl. an
"All time" shortcut, server-paginated preview, and Excel export.
- "Export to Excel" buttons on: Client Dashboard summary, Client Dashboard
raw measurements, Admin fleet dashboard, Admin customer-detail Cost tab.
- DashboardsPage sidebar items: let the menu item grow and reset
line-height so the two-line title+description doesn't crush.
Frontend — Admin / user mgmt
- RestrictedAdmin role: admin who only sees their assigned customers.
New UserFormDrawer choice + CustomerAccessModal for granting/revoking
per-customer access; surfaced from the Users page.
Backend
- ClosedXML 0.104.2 + ExcelExportService (pure formatter; frozen header,
currency/kWh/kW/date number formats, AdjustToContents).
- DashboardSummaryService computes per-device totals + estimated cost
(hourly bucketing × site's municipality's active tariff, mirroring
FleetCostService for the Admin side).
- New endpoints:
GET /api/dashboard/summary[+/export.xlsx]
GET /api/measurements/raw[+/export.xlsx] (deviceIds, paginated)
GET /api/sites/devices (flat list w/ site name)
GET /api/fleet/dashboard/export.xlsx
GET /api/fleet/customers/{id}/cost/export.xlsx
GET /api/auth/check (cookie-only liveness)
- AdminCustomerAccess: per-user customer scoping for RestrictedAdmin via
Postgres-row-level filter — RlsContext (per-DI-scope state) +
CustomerFilterMiddleware (populates from claims after auth) +
fleet.* DbSets gain HasQueryFilter expressions. Bootstrappers
Elevate() to bypass the filter for trusted system code.
- Migration: 20260518095759_AddAdminCustomerAccess (mapping table,
composite PK on UserId+CustomerId).
Infra / templating (the "spin it up via the template" piece)
- docker-compose.prod.yml + docker-compose.yml: pass WhiteLabel__*,
Application__RunMode, FleetIngest__* through to the container as
${VAR:-default} substitutions. Previously these were silently dropped
in prod — a customer's .env settings for branding/fleet-push never
reached the running process. Latent bug, fixed.
- docker-compose.prod.yml: forwardAuth middleware labels on the
Grafana router pointing at /api/auth/check. Option (a) from the
README's three prod-auth modes — every Grafana request now gates on
a valid portal cookie. Anonymous stays off.
- .env.example rewritten with a Client section, optional FleetIngest
block, and an Admin variant block — annotated on what's required vs.
optional and where the seed-only-on-first-boot caveat applies.
- README "Grafana embedding" table: option (a) now marked active with
an inline note on how to switch modes later.
- OPERATIONS.md step 3 includes the white-label pre-brand .env snippet;
step 4 (formerly "decide Grafana auth mode") updated to reflect
that auth is wired by default.
Tests
- New BrandingSeedFromOptionsTests (5 tests) pins the env-var → IOptions
→ DB seed contract: first read seeds from options; subsequent reads
return the DB row (UI edits survive restarts); EnsureSeededAsync is
idempotent; UpdateAsync falls back to options for blanked fields.
- CustomerTokenGraceTests helper: pass the new RlsContext to
AdminDbContext (SetAll() so existing semantics hold).
Verified end-to-end
- Real Docker spin-up with WhiteLabel__* in a throwaway .env →
/api/branding returned all six fields verbatim (ApplicationName,
LogoUrl, three colors, FooterText).
- curl login → /api/dashboard/summary returned valid JSON →
/api/dashboard/summary/export.xlsx returned a 6.9 KB file the
`file` command identifies as "Microsoft Excel 2007+".
- /api/measurements/raw with and without deviceIds filter returned
correct paginated rows; /export.xlsx with filter produced a valid
7.1 KB xlsx with the meter count in the filename.
- Frontend tsc -b clean; backend dotnet build 0/0; xunit 66/66.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Tau Acuvim Portal — Operations
Per-customer deployment loop. For background, architecture, and security model, read the README first.
Contents
- Prerequisites (per host)
- Provisioning a new customer
- Updating a customer's stack
- Rotating secrets
- Backup & restore
- Health & monitoring
- Troubleshooting
- Decommissioning a customer
Prerequisites (per host)
These exist once on the host running customer stacks; not per customer.
- Docker Engine (or Docker Desktop on Windows hosts).
- External Traefik instance, running on the same host and joined to a Docker network named traefik-public. Configured with:
  - Two entrypoints: web (80), websecure (443).
  - A certificate resolver named le (Let's Encrypt via DNS-01 or HTTP-01).
  - HTTP → HTTPS redirect.
  - Docker provider with exposedByDefault: false.
- Wildcard DNS + TLS cert for *.portal.example.com (or whatever your customer subdomain pattern is).
- The traefik-public Docker network exists:
  docker network create traefik-public # one-time
- The portal image is built (or pull-able from a registry):
  cd /path/to/portal
  docker compose -f docker-compose.prod.yml build
Provisioning a new customer
Goal: spin up an isolated stack for customer ABC0001 (Compose project abc0001 — lowercase required) at abc0001.portal.example.com.
1. Create the customer directory
A common pattern: one directory per customer holding only an .env file (the compose files are shared from the repo). Adjust to your fleet-management tool of choice (Ansible, Portainer, Helm-on-K8s later).
/srv/portal/abc0001/
└── .env
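Deriving the project name from the customer code, rather than typing it, keeps the lowercase requirement from being missed. The sketch below assumes the /srv/portal layout shown above. `customer_dir` is a hypothetical helper, not part of the repo, and the owner-only directory mode is an added assumption.

```shell
# Hypothetical helper (not part of the repo): derive the lowercase Compose
# project name from a customer code and create the customer's directory.
customer_dir() {
  base_dir=$1
  customer_code=$2
  # Compose project names must be lowercase, so ABC0001 becomes abc0001.
  project=$(printf '%s' "$customer_code" | tr '[:upper:]' '[:lower:]')
  mkdir -p "$base_dir/$project" || return 1
  # Assumption: the directory will hold secrets in .env, so restrict it.
  chmod 700 "$base_dir/$project" || return 1
  printf '%s\n' "$base_dir/$project"
}
```

For example, `customer_dir /srv/portal ABC0001` creates and prints /srv/portal/abc0001.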
2. Generate strong secrets
openssl rand -base64 32 # POSTGRES_PASSWORD
openssl rand -base64 32 # GRAFANA_ADMIN_PASSWORD
openssl rand -base64 32 # Authentication__DefaultAdminPassword
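The three secrets can also be generated in one pass and emitted as ready-to-paste .env lines. This is a minimal sketch: `gen_secret` and `gen_env_secrets` are hypothetical names, and the /dev/urandom fallback is an assumption for hosts without openssl.

```shell
# Hypothetical helpers: generate the three step-2 secrets as .env lines.
gen_secret() {
  if command -v openssl >/dev/null 2>&1; then
    openssl rand -base64 32
  else
    # Fallback (assumption): 32 random bytes from the kernel, base64-encoded.
    head -c 32 /dev/urandom | base64
  fi
}

gen_env_secrets() {
  for key in POSTGRES_PASSWORD GRAFANA_ADMIN_PASSWORD \
             Authentication__DefaultAdminPassword; do
    # Each key gets its own independently generated secret.
    printf '%s=%s\n' "$key" "$(gen_secret)"
  done
}
```

Review the output of `gen_env_secrets` before merging it into the customer's .env, since the file already carries placeholder lines for these keys.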
3. Fill in .env
Copy .env.example to the customer's directory and fill in:
COMPOSE_PROJECT_NAME=abc0001
CUSTOMER_HOST=abc0001.portal.example.com
Application__RunMode=Client
POSTGRES_DB=power_monitoring
POSTGRES_USER=power_user
POSTGRES_PASSWORD=<from step 2>
Authentication__DefaultAdminEmail=admin@abc0001.example.com
Authentication__DefaultAdminPassword=<from step 2>
GRAFANA_ADMIN_PASSWORD=<from step 2>
Grafana__EmbedPathPrefix=/grafana
# Pre-brand the stack so the customer's first sign-in already shows their
# colours and name. Only applied on first boot; later changes are via the UI.
WhiteLabel__ApplicationName=Acme Corp Power Monitoring
WhiteLabel__PrimaryColor=#0c4a6e
WhiteLabel__SecondaryColor=#0e7490
WhiteLabel__AccentColor=#06b6d4
WhiteLabel__FooterText=© Acme Corp
See .env.example for the full annotated set including the optional FleetIngest__* block (added later, when you enable fleet aggregation for this customer).
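A quick pre-flight check can catch a key that was left empty or still holds a `<from step 2>` placeholder before the stack is started. The sketch below is hypothetical: the list of required keys is an assumption taken from the snippet in this step, so .env.example remains the authority on what is actually required.

```shell
# Hypothetical helper: report keys that are missing, empty, or still set
# to a "<...>" placeholder in a customer's .env file.
check_env() {
  env_file=$1
  status=0
  # Assumption: this key list mirrors the step-3 snippet, not .env.example.
  for key in COMPOSE_PROJECT_NAME CUSTOMER_HOST Application__RunMode \
             POSTGRES_DB POSTGRES_USER POSTGRES_PASSWORD \
             Authentication__DefaultAdminEmail \
             Authentication__DefaultAdminPassword \
             GRAFANA_ADMIN_PASSWORD; do
    # If a key appears more than once, inspect its last assignment.
    value=$(grep -E "^${key}=" "$env_file" | tail -n 1 | cut -d= -f2-)
    case $value in
      ''|'<'*'>') echo "missing or placeholder: $key"; status=1 ;;
    esac
  done
  return "$status"
}
```

`check_env /srv/portal/abc0001/.env` prints one line per problem key and returns non-zero if any were found, so it can gate the `up -d` in step 5.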
4. Grafana auth (already wired)
Production Grafana embedding uses Traefik forwardAuth → portal /api/auth/check, defined inline on the Grafana router in docker-compose.prod.yml. Every Grafana sub-request is gated on a valid portal cookie; anonymous is off. No per-customer action required.
To switch to a different mode (e.g. auth.proxy for per-user Grafana folders), see the README "Grafana embedding — production auth" section.
5. Bring it up
cd /srv/portal/abc0001
docker compose --env-file .env -f /path/to/portal/docker-compose.prod.yml up -d
6. Verify
# Wait for healthy
docker ps --filter "label=com.docker.compose.project=abc0001"
# Health checks
curl -fs https://abc0001.portal.example.com/health # → Healthy
curl -fs https://abc0001.portal.example.com/health/ready # → Healthy
# Migration + seed in the logs
docker logs abc0001_portal | grep -E "Applied migration|Seeded|hypertable"
# expect:
# Applied migration 'InitialCreate'
# TimescaleDB hypertable for monitoring.PowerMeasurements is ready
# Seeded default admin admin@abc0001.example.com
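When provisioning is scripted, the health checks above can be polled instead of run by hand. The following is a sketch only: `wait_for_health` is a hypothetical helper, and the default of 30 attempts 2 seconds apart is an arbitrary assumption.

```shell
# Hypothetical helper: poll a health URL until it answers with a 2xx
# status or the attempts run out.
wait_for_health() {
  url=$1
  attempts=${2:-30}   # assumption: 30 tries by default
  delay=${3:-2}       # assumption: 2 seconds between tries
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl fail on HTTP errors; -s keeps it quiet.
    if curl -fs --max-time 5 "$url" >/dev/null 2>&1; then
      echo "healthy after $i attempt(s): $url"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "not healthy after $attempts attempt(s): $url" >&2
  return 1
}
```

Typical use would be `wait_for_health https://abc0001.portal.example.com/health/ready` right after step 5, before checking the logs.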
7. First login + handover
- Sign in as Authentication__DefaultAdminEmail with the password from step 2.
- Settings → Users → create the customer's real admin account; toggle Admin on.
- Sign out, sign in as the customer admin, change the default admin password (or delete the default admin account if the customer admin is the only one needed).
- Settings → Branding → upload customer logo, apply colours.
- Settings → Rates → seed at least one municipality + tariff for cost calc.
- Sites → create the customer's sites/devices so the ingest pipeline knows where measurements belong.
Updating a customer's stack
Code-only update (no migrations, no compose changes)
docker compose --env-file .env -f /path/to/portal/docker-compose.prod.yml \
up -d --build portal
Brief downtime while the new container starts. The DB is untouched.
Update with new migrations
Same command — MigrateAsync on startup applies pending migrations before the app accepts traffic. Watch the logs:
docker logs -f abc0001_portal | grep -E "Applied migration|Failed|hypertable"
If a migration fails the container will exit; fix forward, push a corrected image, retry.
Compose changes (env vars, ports, labels)
Edit the customer's .env (or the central docker-compose.prod.yml) and:
docker compose --env-file .env -f /path/to/portal/docker-compose.prod.yml up -d
Compose recreates only the containers whose definition changed.
Rolling many customers
There's no built-in fan-out — pick your orchestrator (Ansible playbook, simple bash loop, Portainer stacks). Update one customer first, verify, then roll the rest.
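The "simple bash loop" option could look like the sketch below. It assumes the one-directory-per-customer layout from the provisioning section. `roll_customers` and the `DRY_RUN` switch are hypothetical names introduced here for illustration.

```shell
# Hypothetical fan-out: one Compose update per customer directory under $1.
# Set DRY_RUN=1 to print the commands instead of running them.
roll_customers() {
  base_dir=$1
  compose_file=$2
  for env_file in "$base_dir"/*/.env; do
    # Skip directories without a .env (and the unexpanded glob).
    [ -f "$env_file" ] || continue
    customer_dir=$(dirname "$env_file")
    if [ "${DRY_RUN:-0}" = 1 ]; then
      echo "docker compose --env-file $env_file -f $compose_file up -d"
    else
      # Stop at the first failure so a bad image is not rolled fleet-wide.
      (cd "$customer_dir" &&
        docker compose --env-file .env -f "$compose_file" up -d) || return 1
    fi
  done
}
```

The loop does not pick a canary for you: update and verify a single customer by hand first, then run `roll_customers /srv/portal /path/to/portal/docker-compose.prod.yml` for the rest.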
Rotating secrets
Database password
# 1. Change the password inside Postgres
docker exec -it abc0001_timescale psql -U power_user -d power_monitoring \
-c "ALTER USER power_user WITH PASSWORD '<new>';"
# 2. Update .env
sed -i 's|^POSTGRES_PASSWORD=.*|POSTGRES_PASSWORD=<new>|' .env # "|" delimiter: base64 secrets can contain "/"
# 3. Recreate the portal + grafana to pick up new env vars
docker compose --env-file .env -f /path/to/portal/docker-compose.prod.yml \
up -d portal grafana
Grafana admin password
sed -i 's|^GRAFANA_ADMIN_PASSWORD=.*|GRAFANA_ADMIN_PASSWORD=<new>|' .env # "|" delimiter: base64 secrets can contain "/"
docker compose --env-file .env -f /path/to/portal/docker-compose.prod.yml \
up -d grafana
GF_SECURITY_ADMIN_PASSWORD is re-applied on container start.
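Both rotations above change a single key in .env. Where the new value may contain characters that are awkward inside a sed expression, the edit can be done with a small function instead. This is a sketch under assumptions: `set_env_value` is a hypothetical helper, and it treats the file as plain KEY=value lines.

```shell
# Hypothetical helper: set KEY=value in an env file, replacing any existing
# assignment of that key or appending one if none exists.
set_env_value() {
  env_file=$1
  key=$2
  value=$3
  tmp=$(mktemp) || return 1
  # Pass key and value through the environment so awk does not interpret
  # backslashes or other special characters in them.
  if KEY="$key" VALUE="$value" awk '
      BEGIN { k = ENVIRON["KEY"]; v = ENVIRON["VALUE"]; found = 0 }
      index($0, k "=") == 1 { print k "=" v; found = 1; next }
      { print }
      END { if (!found) print k "=" v }
    ' "$env_file" > "$tmp"; then
    # Overwrite in place so the file keeps its owner and mode.
    cat "$tmp" > "$env_file"
    status=$?
  else
    status=1
  fi
  rm -f "$tmp"
  return "$status"
}
```

For example, `set_env_value .env GRAFANA_ADMIN_PASSWORD "$(openssl rand -base64 32)"` followed by the same `up -d grafana` as above.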
Default admin password
Once the customer admin exists and has changed their own password, the default admin can be deleted from the Settings → Users UI. After that, Authentication__DefaultAdminPassword is only used if the row is re-seeded (which happens only when no account with that email exists).
Backup & restore
What to back up
| Volume | What's in it | Frequency |
|---|---|---|
| <PREFIX>_timescale-data | All customer data (Identity, branding, tariffs, sites, devices, measurements) | Daily, more for high-write customers |
| <PREFIX>_grafana-data | Grafana's internal SQLite (user prefs, plugin state). Dashboards re-provision from JSON so this is not authoritative. | Weekly is plenty |
| <PREFIX>_portal-branding | Uploaded logos | Daily |
| <PREFIX>_portal-keys | Data Protection key ring (cookie signing). Losing this invalidates all sessions but doesn't lose data. | Weekly |
Postgres dump
docker exec abc0001_timescale \
pg_dump -U power_user -d power_monitoring -F c -f /tmp/backup.dump
docker cp abc0001_timescale:/tmp/backup.dump ./abc0001-$(date +%Y%m%d).dump
docker exec abc0001_timescale rm /tmp/backup.dump
For consistent hypertable backups, prefer the pg_dump that ships in the Timescale container, as the command above does; it supports hypertables natively as of PG12+.
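Daily dumps named with the `-$(date +%Y%m%d)` suffix shown above accumulate quickly. A retention step could be sketched as follows. `prune_dumps` is a hypothetical helper, and how many dumps to keep is a policy decision this document does not make.

```shell
# Hypothetical helper: keep only the newest N dated dumps for one project.
# Matches <project>-YYYYMMDD.dump only, so other files are left alone.
prune_dumps() {
  dir=$1
  project=$2
  keep=$3
  # YYYYMMDD sorts lexicographically, so a reverse sort puts newest first;
  # tail then selects everything after the first $keep entries.
  for f in $(ls "$dir" | grep -E "^${project}-[0-9]{8}\.dump$" |
             sort -r | tail -n "+$((keep + 1))"); do
    rm -f -- "$dir/$f"
  done
}
```

Because the pattern requires exactly eight digits after the project name, a file such as abc0001-final-20260103.dump is not matched and survives pruning.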
Volume snapshot
For non-DB volumes, simplest is a tar from the volume's mountpoint, or use your storage layer's snapshot facility (LVM, ZFS, EBS, etc.).
Restore
# Fresh DB
docker compose -f /path/to/portal/docker-compose.prod.yml --env-file .env down timescaledb
docker volume rm abc0001_timescale-data
docker compose -f /path/to/portal/docker-compose.prod.yml --env-file .env up -d timescaledb
# Restore
docker cp abc0001-YYYYMMDD.dump abc0001_timescale:/tmp/backup.dump
docker exec abc0001_timescale \
pg_restore -U power_user -d power_monitoring --clean --if-exists /tmp/backup.dump
docker exec abc0001_timescale rm /tmp/backup.dump
# Start everything
docker compose -f /path/to/portal/docker-compose.prod.yml --env-file .env up -d
The TimescaleBootstrapper is idempotent — it will not error on a restored hypertable.
Health & monitoring
Liveness / readiness
- GET /health: liveness. Use as Traefik / load-balancer health check.
- GET /health/ready: readiness (DB reachable). Use for orchestration "in service" decisions.
Logs
Serilog writes JSON to stdout; the Docker logging driver of your choice (json-file, journald, gelf to a central log store) picks it up.
docker logs abc0001_portal --tail 200 --follow
Notable lines:
- Database connection resolved via …: confirms how this container resolved its DB at startup.
- Applied migration '…': one per pending migration.
- TimescaleDB hypertable for monitoring.PowerMeasurements is ready: bootstrapper succeeded.
- Seeded default admin …: first start only; absence on subsequent starts is correct.
DB health from the host
docker exec abc0001_timescale pg_isready -U power_user -d power_monitoring
TimescaleDB chunks
docker exec -it abc0001_timescale psql -U power_user -d power_monitoring -c \
"SELECT chunk_name, range_start, range_end, total_bytes
FROM chunks_detailed_size('monitoring.\"PowerMeasurements\"');"
Troubleshooting
| Symptom | First check |
|---|---|
| Portal container restart-looping | docker logs <PREFIX>_portal — usually a missing env var (default-admin password in prod, missing Postgres password) or a migration failure. |
| /health/ready returns Unhealthy | Postgres container down, or wrong creds. docker logs <PREFIX>_timescale. |
| Grafana iframe loads but no charts | Datasource UID mismatch — confirm grafana/provisioning/datasources/timescaledb.yml has uid: timescaledb and the dashboard JSON references the same. |
| Grafana iframe shows login screen in prod | Expected if no auth mode is wired yet (anonymous off by default). Pick a mode (see README → Security). |
| Branded logo missing after restart | <PREFIX>_portal-branding volume not mounted, or filesystem perms wrong. The container runs as user app; volume must be writable by uid 1000. |
| Ingest returns accepted: 0, rejected: N | Devices don't exist for those externalIds. Create them via the Sites screen first. |
| Cookie auth seems random / sessions lost on restart | <PREFIX>_portal-keys volume not mounted — Data Protection re-keys on every start. |
| Hypertable error on startup | Pre-existing non-empty plain table being converted. migrate_data => TRUE should handle it; if not, restore from backup and check for manual monitoring."PowerMeasurements" schema changes. |
Decommissioning a customer
# Final backup
docker exec abc0001_timescale pg_dump -U power_user -d power_monitoring \
-F c -f /tmp/final.dump
docker cp abc0001_timescale:/tmp/final.dump ./abc0001-final-$(date +%Y%m%d).dump
# Stop and remove containers
docker compose --env-file .env -f /path/to/portal/docker-compose.prod.yml down
# Remove volumes (destroys data — confirm backup first)
docker volume rm \
abc0001_timescale-data \
abc0001_grafana-data \
abc0001_portal-branding \
abc0001_portal-keys
# Remove customer dir
rm -rf /srv/portal/abc0001
# DNS record + cert (manual or via your DNS automation)
# If using the fleet aggregator: also delete the customer from the Admin
# Customers page (UI Delete) or via psql against the central DB:
# DELETE FROM fleet."Customers" WHERE "Code" = 'ABC0001';
# (cascades to Sites, Devices, PowerMeasurements, IngestEvents)
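Since removing the volumes is irreversible, the "confirm backup first" step can be enforced in code rather than left to memory. The sketch below is hypothetical: `decommission_volumes` and its `DRY_RUN` switch are illustrative names, and the only check it makes is that the backup file exists and is non-empty.

```shell
# Hypothetical guard: refuse to remove a customer's volumes unless a
# non-empty final dump exists. Set DRY_RUN=1 to print instead of run.
decommission_volumes() {
  project=$1
  backup=$2
  if [ ! -s "$backup" ]; then
    echo "refusing: backup $backup is missing or empty" >&2
    return 1
  fi
  # The four volume suffixes listed in "What to back up".
  for suffix in timescale-data grafana-data portal-branding portal-keys; do
    if [ "${DRY_RUN:-0}" = 1 ]; then
      echo "docker volume rm ${project}_${suffix}"
    else
      docker volume rm "${project}_${suffix}" || return 1
    fi
  done
}
```

A non-empty file is only a weak signal. It does not show that the dump restores cleanly, so a test restore is still worth doing for customers whose data matters.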
Fleet aggregator (Admin stack)
For background and the full design see docs/FLEET-DESIGN.md. This section covers the day-to-day ops.
One-time: provisioning the Admin stack
# 1. Create a dedicated Postgres DB for the central fleet
docker exec <timescale-container> createdb -U power_user admin_fleet
# 2. Spin up the Admin portal (same image as a customer stack, different env)
docker run -d --name admin-portal --restart unless-stopped \
--network <shared-network> \
-e Application__RunMode=Admin \
-e ASPNETCORE_ENVIRONMENT=Production \
-e Application__PublicUrl=https://admin.portal.example.com \
-e Database__ConnectionString='Host=<host>;Port=5432;Database=admin_fleet;Username=power_user;Password=<secret>' \
-e Authentication__DefaultAdminEmail=ops@yourco.example \
-e Authentication__DefaultAdminPassword=<strong> \
-v admin-portal-keys:/data/keys \
-v admin-portal-branding:/data/branding \
tau-acuvim-portal:latest
# 3. (Optional) Spin up an Admin-side Grafana pointed at admin_fleet
docker run -d --name admin-grafana --restart unless-stopped \
--network <shared-network> \
-e GF_SECURITY_ADMIN_PASSWORD=<strong> \
-e GF_SECURITY_ALLOW_EMBEDDING=true \
-e GF_AUTH_ANONYMOUS_ENABLED=false \
-e POSTGRES_DB=admin_fleet \
-e POSTGRES_USER=power_user \
-e POSTGRES_PASSWORD=<secret> \
-v admin-grafana-data:/var/lib/grafana \
-v /srv/portal/grafana/provisioning:/etc/grafana/provisioning:ro \
-v /srv/portal/grafana/dashboards-admin:/var/lib/grafana/dashboards:ro \
grafana/grafana:11.4.0
Behind Traefik: add labels on admin-portal and admin-grafana mirroring the per-customer pattern, with Host(`admin.portal.example.com`) and (for Grafana) && PathPrefix(`/grafana`). Choose a Grafana auth mode from README Security (forwardAuth / auth.proxy / render tokens) before exposing.
Onboarding a new customer end-to-end
# A. Admin side — register and capture token (one-time per customer)
# 1. Sign in to https://admin.portal.example.com
# 2. Customers → "Register customer" → Code=ABC0001, Name=Acme Corp
# 3. Copy the token shown ONCE.
# B. Customer side — spin up their stack (per OPERATIONS "Provisioning a new customer")
# AND add to their .env:
cat >> /srv/portal/abc0001/.env <<EOF
Application__RunMode=Client
FleetIngest__Enabled=true
FleetIngest__Url=https://admin.portal.example.com/api/fleet/ingest
FleetIngest__Token=<token from step A.3>
FleetIngest__IntervalSeconds=60
FleetIngest__BatchSize=5000
EOF
#    Then restart the customer's portal so the push service starts.
docker compose -f /path/to/portal/docker-compose.prod.yml --env-file .env \
up -d portal
# C. Verify
# 1. In Admin UI → Customers, ABC0001 should show "Last push" advance within a minute.
# 2. Click the row → Customer detail → "Recent ingest" tab should list sites/devices
# batches (and measurements once any are ingested locally).
# 3. From the host:
docker exec <admin-timescale> psql -U power_user -d admin_fleet -c \
'SELECT "BatchType","RowsAccepted","ReceivedAt" FROM fleet."IngestEvents" ORDER BY "ReceivedAt" DESC LIMIT 10;'
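The `cat >>` in step B appends unconditionally, so running it twice leaves duplicate FleetIngest__* keys in .env. An idempotent variant might look like the sketch below. `append_fleet_block` is a hypothetical helper, and the interval and batch-size values are simply those shown in step B.

```shell
# Hypothetical helper: append the FleetIngest block to a customer's .env
# only if it is not already there.
append_fleet_block() {
  env_file=$1
  ingest_url=$2
  token=$3
  if grep -q '^FleetIngest__Enabled=' "$env_file"; then
    echo "FleetIngest block already present in $env_file" >&2
    return 0
  fi
  # Interval and batch size copied from the step-B example above.
  cat >> "$env_file" <<EOF
FleetIngest__Enabled=true
FleetIngest__Url=$ingest_url
FleetIngest__Token=$token
FleetIngest__IntervalSeconds=60
FleetIngest__BatchSize=5000
EOF
}
```

Because the function returns early when the block exists, it cannot be used to rotate a token. For that, edit the FleetIngest__Token line directly and restart the portal.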
Common ops
| What | Where |
|---|---|
| Rotate a customer's push token | Admin UI → Customers → row's "Rotate token" button. Update customer's .env and restart their portal. Brief push gap (until restart) is expected. |
| Disable a customer (stop accepting their data) | Admin UI → Customers → Edit → Active off. Ingest returns 401 immediately; data already in fleet.* is untouched. |
| Investigate "why hasn't ABC0001 shown up?" | Customer detail page → Recent ingest tab. Check for 401s, rejected rows, error messages. Or: SELECT * FROM fleet."IngestEvents" WHERE "CustomerId" = '<id>' ORDER BY "ReceivedAt" DESC; |
| Inspect compression | SELECT * FROM hypertable_compression_stats('fleet."PowerMeasurements"'); |
| Force a continuous aggregate refresh | CALL refresh_continuous_aggregate('fleet.hourly_per_device', NULL, NULL); |
| Decommission a customer from the fleet | Admin UI → Customers → Delete (cascades sites/devices/measurements/events). Customer's local stack is untouched; their portal will get 401s on push until they disable FleetIngest__Enabled or you re-register them. |
Backing up the central DB
Same pg_dump pattern as a customer DB (see above), targeting admin_fleet. Includes hypertable chunks; restore with pg_restore then run the Admin portal once to re-bootstrap the continuous aggregate refresh policy (FleetTimescaleBootstrapper is idempotent).