Tau.Acuvim

10 Commits 1 Branch 0 Tags 905 KiB

Author	SHA1	Message	Date
Diseri Pearson	3333202f3a	Dual-token rotation grace window (24h default) Token rotation used to be immediate cutover — push gap from when ops rotates to when the customer's .env is updated and portal restarted. Now the old token keeps working for 24h after rotation, so customer ops has a full workday to swap it in without dropping a single push tick. Backend - Customer entity gains PreviousTokenHash + PreviousTokenExpiresAt (both nullable). Non-unique index on PreviousTokenHash so the OR-lookup in FindByTokenAsync stays cheap. - CustomerService.RotateTokenAsync(id, graceWindow=null, ct): copies the existing TokenHash into PreviousTokenHash with PreviousTokenExpiresAt = now + graceWindow (default 24h, lifted to CustomerService.DefaultTokenGracePeriod), then issues a new current token. Second rotation overwrites the previous slot — at most one previous token is ever honoured. - CustomerService.FindByTokenAsync matches either current OR (previous AND PreviousTokenExpiresAt > now). IsActive=false still rejects both. - DTO exposes PreviousTokenExpiresAt so the UI can render the grace window status. - New EF migration AddPreviousTokenGraceWindow on AdminDbContext. Frontend - Customers table "Token" column shows an "Old token valid until …" orange tag with a tooltip whenever the grace window is active, plus the issue/rotation date as before. - TokenShownOnceModal mentions the 24h grace window so ops knows they have time to update .env without urgency. - Rotate-token popconfirm copy updated to reflect the new behavior. Tests (+5, 61/61 passing) - CustomerTokenGraceTests covers: create doesn't set previous; rotate moves current into previous slot with future expiry; zero grace window rejects original immediately; second rotation overwrites previous (original dies, first-rotation becomes the new previous); inactive customer rejects both current AND previous. 
Verified end-to-end on the dev host - Migration applied cleanly on the existing admin_fleet DB (existing DEV0001 customer got NULL previous columns, no data loss). - Created GRACE01 → got token1. - Rotated → got token2. PreviousTokenExpiresAt = +24h. Both token1 and token2 push successfully (200). - Rotated again → got token3. token1 push now returns 401 (gone). token2 push still 200 (now the previous). token3 push 200 (current). Docs - FLEET-DESIGN.md §6 rewritten — no longer "immediate cutover". - §11 "open seams" row for this feature marked as shipped. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-18 10:45:31 +02:00
Diseri Pearson	59c3f949d0	Admin customer detail: "Open Grafana drilldown" button Wires the existing customer-drilldown dashboard JSON to the customer detail page. Button opens ${Grafana.BaseUrl}/d/customer-drilldown in a new tab with var-customer=<customer-id> pre-filled, kiosk mode, light theme. - Fetches /api/grafana/config (cached 5min, reuses the existing TanStack query key so GrafanaInfoCard's cache is shared). - Button disabled with tooltip explaining when Grafana baseUrl isn't configured for the Admin stack (points to Settings → Grafana). - Customer id is URI-encoded before interpolation (defence in depth — it's a UUID, but encodeURIComponent costs nothing). - Dashboard UID hardcoded as 'customer-drilldown' to match the provisioned JSON. Renaming requires changing both together. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-18 10:37:48 +02:00
Diseri Pearson	aaa522058e	Settings → App config: surface RunMode + FleetIngest push state ConfigOverviewService now reports the runtime mode and (on Client only) the fleet-push configuration + live push-state per resource. The token is reduced to a boolean (TokenConfigured) in the DTO — never a string — so it cannot be accidentally serialised. Backend - FleetIngestInfoDto: { enabled, url, intervalSeconds, batchSize, batchMaxBytes, tokenConfigured }. No Token property at all. - FleetPushStateRowDto: { resourceType, lastCursor, lastSyncedAt, consecutiveFailures, lastError }. - ConfigOverviewDto gains RunMode (string), nullable FleetIngest + FleetPushState (null in Admin mode). - ConfigOverviewService becomes async + injects IServiceProvider so it can read AppDbContext in Client mode without that DbContext being required (GetService returns null in Admin mode, where it's not registered). - AdminConfigEndpoints awaits the async call. Tests (+3, 56/56 passing) - FleetIngestInfoDto has no Token property (reflection check). - Serialised DTO never contains the literal token value (string scan). - ConfigOverviewDto's FleetIngest + FleetPushState are nullable so Admin-mode payloads serialise them as absent rather than empty. Frontend - ConfigOverviewCard adds a Run mode row to the Application section (gold tag for Admin, blue for Client). - New "Fleet push (Client → Admin)" descriptions card (enabled, token configured, url, interval, batch sizes) — hidden in Admin mode. - "Push state per resource" table — resource, last cursor, last sync, consecutive failures (color-coded), last error. Verified end-to-end on the dev host - /api/admin/config-overview on the Client returns runMode=Client + fleetIngest={enabled,url,interval,batchSize,batchMaxBytes,tokenConfigured} + fleetPushState[3] rows (sites/devices/measurements, failures=0). - The 64-char dev token (from the running Client's .env) is verified absent from the response body via direct string search. 
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-18 10:33:52 +02:00
Diseri Pearson	c5787a7a7f	Phase 15: Admin operator surface + fleet dashboards + onboarding docs The Admin stack now has a usable operator UI for managing the fleet. End-to-end verified locally: Client pushes → Admin dashboard reflects the activity within the CA refresh window. Backend (Admin-only) - FleetQueryService: dashboard headline (totals, active count, today's measurements + kWh from the hourly_per_device CA) and per-customer detail (sites, devices, last 50 measurements, last 20 ingest events). - /api/fleet/dashboard and /api/fleet/customers/{id}/detail endpoints. - DTOs added; Program.cs wires the service + endpoints under RunMode=Admin. Frontend - DashboardPage now branches on RunMode — Admin renders the fleet headline (statistic cards + customer summary table with lag tags), Client keeps the existing placeholder. - AdminCustomerDetailPage drills into one customer: descriptions card + tabs for Recent ingest (with rejection counts, batch sizes, time-spread for visible firmware-replay waves), Recent measurements, Sites, Devices. - AdminCustomersPage rows are clickable → /admin/customers/:id (skips the click when target is a button/popover so action buttons still work). - App.tsx adds the /admin/customers/:id route, RequireRole-gated. Grafana - grafana/dashboards-admin/fleet-overview.json — 4 stat panels (active customers, total, last-24h samples, today's kWh) plus 2 time series (per-customer active power, per-customer hourly kWh). Reads from fleet.hourly_per_device CA. - grafana/dashboards-admin/customer-drilldown.json — parameterized by $customer (template variable querying fleet.Customers). Per-device active power, cumulative kWh, recent ingest events table. Docs - README: Phase 15 section describing the new admin UI surface + pointer to dashboard-admin folder. 
- OPERATIONS: new "Fleet aggregator (Admin stack)" section covering one-time provisioning (Admin portal + Admin Grafana), end-to-end customer-onboarding workflow (register on Admin → drop token in customer .env → restart → verify in UI/SQL), common ops (rotate token, disable, investigate, compression stats, force CA refresh, decommission), and Admin-DB backup notes. - README decommissioning note now mentions deleting from fleet.Customers if the customer was registered for aggregation. Verified end-to-end - Phase 14's Client + Admin stacks rebuilt with Phase 15 code. - /api/fleet/dashboard returns correct totals (1 customer, 1 active, measurements + kWh derived from CA). - /api/fleet/customers/{id}/detail returns sites, devices, recent measurements, recent ingest events. - Ingested a fresh measurement on Client → after CA refresh, totals in Admin dashboard advance correctly. - All 53 tests still passing. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-18 10:27:55 +02:00
Diseri Pearson	a92b4277ae	Phase 14: Push + ingest pipeline (end-to-end fleet aggregation) Customer-stack measurements now flow to the Admin-stack central DB via HTTPS POST, with firmware buffer-and-replay back-fills handled correctly. Client side (push) - monitoring.PowerMeasurements gains ReceivedAt (default NOW()) + index. Push selects WHERE ReceivedAt > LastCursor, so back-dated rows from offline-buffer replays are picked up automatically. - app.FleetPushState table holds per-resource cursors + backoff state. - FleetPushClient: HttpClient wrapper, X-Customer-Token header, X-Batch-Type, X-Push-Cursor. 413 returns retry-after halving signal. - FleetPushService: BackgroundService loop. Per tick: sites (full set), devices (full set), measurements (cursor-driven up to 3 batches). Exponential backoff per resource on failure (1m → 30m cap). Honors 429 Retry-After. Only registered when RunMode=Client AND FleetIngest__Enabled=true. Admin side (ingest) - /api/fleet/ingest: anonymous, X-Customer-Token authed against fleet.Customers via SHA-256 indexed lookup. 401 on bad token; 400 on bad batch type. - FleetIngestService dispatches by X-Batch-Type: sites/devices → upsert by (CustomerId, Id) with ON CONFLICT UPDATE measurements → bulk INSERT ON CONFLICT (Time, CustomerId, DeviceId) DO NOTHING (idempotent under re-delivery). - Updates fleet.Customers.FirstSeenAt/LastSeenAt on each successful batch. - Writes fleet.IngestEvents audit row per batch (accepted, rejected, bytes, client cursor, time-spread, error). - FleetTimescaleBootstrapper runs after MigrateAsync in Admin mode: CREATE EXTENSION timescaledb, create_hypertable on fleet.PowerMeasurements, chunk interval 7 days, compression with segmentby=(CustomerId,DeviceId) + compress_orderby "Time" DESC, compression policy 7 days, hourly_per_device continuous aggregate (realtime, materialized_only=false, 30-day start_offset so back-fills get materialized on next refresh tick). 
Wiring - docker-compose.yml threads Application__RunMode + FleetIngest__* from .env (defaults safely off) so a single dev host can run two stacks. - .env.example documents the new vars under their own section. Tests - FleetIngestValidationTests (2 new). 53/53 passing. Verified end-to-end on the dev host - Client (portal-dev_portal, RunMode=Client, FleetIngest__Enabled=true) pushes to Admin (portal-admin-test, RunMode=Admin, separate admin_fleet DB) via container DNS. - Customer registered on Admin (DEV0001), token captured, dropped into Client .env, Client restarted, push service started on schedule. - Ingested measurements (including a 2026-04-01 back-dated sample simulating firmware replay) all land in fleet.PowerMeasurements with the correct CustomerId. - Customer.FirstSeenAt/LastSeenAt update, IngestEvents records every batch (sites + devices per tick, measurements when cursor advances). - Hypertable confirmed via timescaledb_information.hypertables; hourly_per_device CA confirmed via timescaledb_information.continuous_aggregates. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-18 10:17:58 +02:00
Diseri Pearson	2c618b776b	Phase 13: RunMode flag + AdminDbContext + Customers registry Adds the plumbing for the fleet-aggregation feature without moving any data yet. Same portal binary now supports two modes selected via Application:RunMode (Client | Admin). Backend - New AdminDbContext (identity + branding shared via SharedSchemaConfiguration helper + fleet schema). AppDbContext keeps existing identity + branding + monitoring + rates; renamed implicitly the "Client" context. Only one is registered with DI per RunMode. - IWhiteLabelStore interface implemented by both contexts so BrandingService works in either mode. - Fleet entities: Customer, FleetSite, FleetDevice, FleetPowerMeasurement, IngestEvent (all in the new fleet schema). Migration in Migrations/Admin/. - CustomerService: 32-byte random token, SHA-256 hash stored, plaintext shown once on create + rotate. Token lookup is a single O(log N) indexed query. - RunModeGuards: refuses Admin without conn string; refuses Client+push without URL/token; refuses cross-DB pointing (Client at admin_fleet DB with fleet.Customers, or Admin at customer DB with monitoring.PowerMeasurements). - Endpoint maps now branch on RunMode: Client → sites/measurements/rates/admin-sites/admin-rates Admin → admin/customers Shared → auth, users, branding, grafana, admin-config, app/info, health - /api/app/info (anonymous) returns {runMode, applicationName, version} so the SPA can drive nav without re-fetching auth state. Frontend - AppInfoProvider + useAppInfo hook fetch /api/app/info once on load. - AdminCustomersPage with create / edit / rotate-token / delete. - TokenShownOnceModal: shows token once, copy-to-clipboard, "I've stored it" confirmation gate before closing. - AppLayout nav swaps Sites <-> Customers based on RunMode and shows a FLEET ADMIN tag in the header when in Admin mode. Tests - 11 new tests: CustomerTokenTests (5) + RunModeGuardsTests (6). - 51/51 passing locally. 
Verified - dotnet build + dotnet test clean (zero errors, one EF1002 warning suppressed in Phase 11 already). - Client mode docker rebuild: no regressions, /api/app/info returns Client, login works, /api/sites/ works. - Admin mode spun up on port 8090 against a fresh admin_fleet DB: /api/app/info returns Admin, customer ABC0001 registered, 64-char token returned, list shows the row. - Cross-DB guard: Client run against admin_fleet refuses with explicit "is pointed at a database that contains fleet.Customers" error. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-18 10:09:41 +02:00
Diseri Pearson	880525b306	Add Fleet ingest design doc (portal/docs/FLEET-DESIGN.md) Locked design for Admin / cross-customer aggregation feature. Implementation lands in phases 13-15. Key decisions captured: - Same portal binary, RunMode=Client|Admin config flag. - Two DbContext classes (ClientDbContext + AdminDbContext) to keep schemas cleanly separated and migrations sane. - Fleet ingest is opt-in (FleetIngest__Enabled=false works exactly as today, no data leaves customer stack). - Push by ReceivedAt, not Time, so firmware offline-buffer replays are picked up automatically. - Per-tick batch cap so a back-fill wave from one customer doesn't starve other customers' pushes. - SHA-256 token hash (not bcrypt) for the high-throughput ingest endpoint; tokens shown once on Admin Customers page. - Realtime continuous aggregates with wide start_offset so late back-fills materialize on the next refresh tick. - No retention policy. TimescaleDB compression on chunks older than 7 days handles long-term storage cost. - Open seams (tariff sync, RLS, GDPR delete, dual-token rotation, sharding) documented with v2 extension paths. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-18 09:55:07 +02:00
Diseri Pearson	e17921a122	Add portal: customer-facing white-labeled monitoring stack New top-level portal/ project, peer to console/ and firmware/. Delivers a .NET 10 + React 18 + TimescaleDB + Grafana stack, one container set per customer behind Traefik. Built in 12 phases per FrontEndPrompt spec; no changes to existing console or firmware. Backend (src/Tau.Acuvim.Portal/): - .NET 10 minimal API, Serilog, ASP.NET Identity (cookie auth, lockout). - Single AppDbContext with identity / app / monitoring schemas. - MigrateAsync + TimescaleBootstrapper (idempotent hypertable creation) + IdentityBootstrapper (seeded admin + branding) on startup. - Pure CostCalculator + DB-backed RateService for tariffs (effective-dated, TOU periods, VAT, fixed charges, per-municipality timezone). - BrandingService with logo upload to mounted volume. - Time-series ingest + bucketed query services (time_bucket aggregates, ON CONFLICT for idempotent re-delivery). - ConfigOverviewService with redaction-by-construction (passwords never in payload). - DataProtection keys persisted to /data/keys volume for cookie survival across container restarts. Frontend (frontend/): - React 18 + TypeScript + Vite + Ant Design 5 + TanStack Query. - BrandingProvider + ThemedRoot for live re-themed white-labelling. - RequireAuth / RequireRole guards. - Pages: Login, Dashboard, Dashboards (embedded Grafana), Sites (admin), Settings tabs (Branding / Rates / Users / Grafana / App config). Infra: - Dev (docker-compose.yml) and prod (docker-compose.prod.yml) compose files. Three services per customer; Traefik subdomain + same-origin /grafana path-prefix routing wired with labels. - Grafana 11 with provisioned timescaledb datasource (uid pinned) and starter power-overview.json dashboard with device template variable. - Compose project name documented as lowercase (Compose v2 requirement). Tests (tests/Tau.Acuvim.Portal.Tests/): - xUnit, 40 tests. 
Covers CostCalculator (period match, TZ, overlap, VAT, fixed), ConnectionStringResolver (all 4 precedence branches incl. Production refusal), TariffValidator, DayOfWeekFlag. - All passing locally against .NET 10. Docs: - README.md (onboarding + 11 spec sections), OPERATIONS.md (per-customer provisioning, secret rotation, backup, troubleshooting), TESTING.md (manual integration scenarios, frontend test scaffolding recipe). Production safety guards: - Refuses to start if Authentication:DefaultAdminPassword is unchanged default in Production. - Refuses to start if Database:AutoProvisionLocalTimescaleDb=true in Production. - Prod Grafana ships with anonymous off and auth mode unset (three options documented in README Security) so iframe refuses to load until a deliberate prod auth choice is made. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-18 09:30:30 +02:00
Renier Forster	99864d0a8b	Add CLAUDE.md project onboarding guide Build commands, architecture overview, library gotchas, and conventions for the firmware, backend, and frontend components. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-05-16 19:07:00 +02:00
Renier Forster	84a0668c54	Initial commit: Tau Acuvim IoT monitoring system Complete IoT monitoring platform for Acuvim II power meters via ESP32. Firmware (Phases 1-7): - ESP32-WROVER-B (TTGO T-Call v1.4) with RS485 Modbus RTU - WiFi STA+AP concurrent mode with GSM/GPRS failover - Transport abstraction layer with 4 priority modes - MQTT protocol with 20 commands, LWT, QoS, exponential backoff - SD card offline buffering with JSONL rotation and non-blocking drain - OTA firmware updates with dual partition rollback protection - Watchdog timer, crash loop detection, Acuvim health monitoring - Captive portal provisioning with AP mode Console backend (Phase 8): - .NET 10 minimal API with PostgreSQL + EF Core - JWT authentication, SignalR real-time updates - MQTTnet 5.x bridge service with health monitoring - Device, telemetry, firmware, alert, group management - Rate limiting, security headers, Swagger/OpenAPI Frontend (Phase 9): - React 18 + TypeScript + Vite with Ant Design 5 - ECharts telemetry visualization, TanStack Query - SignalR live updates, device management UI - Dashboard, fleet management, firmware deployment Testing & Production (Phase 10): - 28 firmware unit tests (Modbus, JSON, config, version) - 23 xUnit backend tests (device, telemetry, command, alert) - Docker Compose with nginx, TLS MQTT, PostgreSQL - Production deployment, commissioning, and troubleshooting docs Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-05-16 19:05:32 +02:00
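
The dual-token lookup rule described in commit 3333202f3a (current token OR unexpired previous token, with inactive customers rejecting both) can be sketched roughly as below. This is an illustrative TypeScript sketch only; the backend itself is .NET and its code is not shown in this log. The `Customer` shape, `rotateToken`, `findByToken`, and `DEFAULT_GRACE_MS` are hypothetical names that mirror the fields in the commit message, and the sketch takes already-hashed token strings rather than computing SHA-256 itself.

```typescript
// Hypothetical in-memory model; field names mirror the commit message
// (TokenHash, PreviousTokenHash, PreviousTokenExpiresAt, IsActive).
interface Customer {
  id: string;
  isActive: boolean;
  tokenHash: string;                    // hash of the current token
  previousTokenHash: string | null;     // hash of the token rotated out last
  previousTokenExpiresAt: Date | null;  // end of the grace window
}

// 24h default grace window, as stated in the commit message.
const DEFAULT_GRACE_MS = 24 * 60 * 60 * 1000;

// Move the current hash into the single "previous" slot and install a new one.
// A second rotation overwrites the slot, so at most one old token is honoured.
function rotateToken(
  c: Customer,
  newTokenHash: string,
  now: Date,
  graceMs: number = DEFAULT_GRACE_MS,
): void {
  c.previousTokenHash = c.tokenHash;
  c.previousTokenExpiresAt = new Date(now.getTime() + graceMs);
  c.tokenHash = newTokenHash;
}

// Match the current hash, or the previous hash while its expiry is still in
// the future. Inactive customers never match. A linear scan stands in for the
// indexed database lookup the commit describes.
function findByToken(
  customers: Customer[],
  tokenHash: string,
  now: Date,
): Customer | undefined {
  return customers.find(
    (c) =>
      c.isActive &&
      (c.tokenHash === tokenHash ||
        (c.previousTokenHash === tokenHash &&
          c.previousTokenExpiresAt !== null &&
          c.previousTokenExpiresAt.getTime() > now.getTime())),
  );
}
```

Because the expiry comparison is strict, a grace window of zero makes the rotated-out token fail immediately, which matches the "zero grace window rejects original immediately" case listed in the commit's tests.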