# Phase 5: SD Card Offline Buffering

## Objective

Add SD card storage as a last-resort data buffer when both WiFi and GSM are unavailable. Telemetry data is written to the SD card in a replay-friendly format and automatically drained to MQTT when connectivity is restored.

## Prerequisites

- Phase 4 complete (transport failover working)
- MicroSD card (FAT32 formatted, up to 32GB)
- SD card module or breakout (SPI interface)

## Deliverables

1. SPI-based SD card driver with hot-plug detection
2. Offline telemetry buffer (newline-delimited JSON files)
3. Automatic queue drain on connectivity restore
4. File rotation and cleanup (prevent SD from filling up)
5. SD card status in web UI and telemetry

---

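## 5.1 SD Card Hardware

### SPI Pin Mapping (from Phase 1)

```
SD_CS   = GPIO 15   # Chip Select
SD_MOSI = GPIO 2    # Master Out Slave In
SD_MISO = GPIO 14   # Master In Slave Out
SD_CLK  = GPIO 12   # Clock
```

**Important:** These pins are driven by the HSPI controller, which is reserved for the SD card. The internal flash sits on the ESP32's separate SPI0/SPI1 controllers, so SD traffic does not contend with flash access. Ensure that no other peripheral is assigned to HSPI or to these pins. GPIO 2, 12, and 15 are also ESP32 strapping pins, so check that the SD module's pull-ups do not disturb boot.

### SPI Initialization

```cpp
#include <SPI.h>
#include <SD.h>

SPIClass hspi(HSPI);

// Call once from setup(); the SD_* pin constants come from the Phase 1 mapping.
// The helper name initSdCard() is illustrative.
bool initSdCard() {
    hspi.begin(SD_CLK, SD_MISO, SD_MOSI, SD_CS);  // SCK, MISO, MOSI, SS
    return SD.begin(SD_CS, hspi);                 // false if no card could be mounted
}
```
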
## 5.2 SD Card Manager

### `sd_manager.h` / `sd_manager.cpp`

```cpp
class SdManager {
public:
    bool begin();                                   // Initialize SD card
    bool isAvailable();                             // Card present and mounted
    bool isHealthy();                               // Card mounted and writable
    void checkCard();                               // Hot-plug check (see 5.7)

    // Buffering
    bool bufferTelemetry(const AcuvimData& data);   // Write one record
    uint32_t getQueuedCount();                      // Records waiting to send
    bool hasQueuedData();                           // Any records to drain

    // Drain (replay to MQTT)
    bool drainNext(String& payload);                // Get next record for sending
    void confirmDrained(const String& filename, uint32_t position);
    void drainAll(MqttClient& mqtt);                // Drain entire queue
    void drainBatch(MqttClient& mqtt, uint8_t maxRecords);  // Drain a few records per loop (see 5.5)

    // Maintenance
    uint64_t getTotalSpace();                       // Total SD capacity
    uint64_t getUsedSpace();                        // Used space
    uint64_t getFreeSpace();                        // Available space
    uint32_t getFileCount();                        // Number of buffer files
    bool cleanup(uint32_t maxAgeDays = 30);         // Delete old files
    bool format();                                  // Format SD card (destructive)

private:
    bool mounted = false;
    String currentFileName;
    uint32_t recordsInCurrentFile = 0;

    void ensureDirectories();                       // Create the 5.3 directory layout if missing
    String generateFileName();
    bool rotateFile();
    bool shouldRotate();
};
```

## 5.3 File Format and Structure

### Directory Structure

```
SD Card Root/
├── /telemetry/
│   ├── 2026-05-16_001.jsonl   # Newline-delimited JSON
│   ├── 2026-05-16_002.jsonl
│   └── 2026-05-17_001.jsonl
├── /drain/                    # Files currently being drained
│   └── (moved here during drain, deleted after)
└── /logs/                     # Optional diagnostic logs
    └── boot.log
```

### File Naming Convention

- Format: `YYYY-MM-DD_NNN.jsonl`
- `NNN`: Sequential file number within the day (001, 002, ...)
- `.jsonl`: JSON Lines format (one JSON object per line)

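
The naming convention above could be produced by a helper along these lines. This is a minimal sketch in standard C++ so it can be compiled off-target; the function name `makeBufferFileName`, its integer date parameters, and the choice to include the `/telemetry/` prefix in the result are assumptions made for illustration, not part of the `SdManager` interface.

```cpp
#include <cstdio>
#include <string>

// Builds a buffer file path such as "/telemetry/2026-05-16_001.jsonl".
// Hypothetical helper: on the ESP32 the date would come from the RTC/NTP clock
// and the result would typically be an Arduino String.
std::string makeBufferFileName(int year, int month, int day, unsigned seq) {
    char name[48];
    // Zero-padded, fixed-width fields keep names sortable as plain text.
    std::snprintf(name, sizeof(name), "/telemetry/%04d-%02d-%02d_%03u.jsonl",
                  year, month, day, seq);
    return std::string(name);
}
```

Because every field is zero-padded to a fixed width, sorting the names as text also sorts them chronologically, which is what the drain logic relies on when it processes the oldest file first.
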
### Record Format

Each line is a complete, self-contained telemetry JSON (same format as MQTT payload from Phase 2):

```json
{"ts":1716000000,"dev":"ACV-AABBCCDDEEFF","v":{"a":230.1,"b":231.4,"c":229.8},"i":{"a":15.2,"b":14.8,"c":15.5},"p":{"total":10.5},"f":50.01,"e":{"imp_act":12345.6},"src":"sd"}
```

- Added `"src":"sd"` field to indicate this was buffered data (vs. real-time)
- Each line is independently parseable (no array wrapping)
- File can be read line by line without loading entire file into memory

### File Rotation

Rotate to a new file when any of the following is true:
- Current file exceeds 500KB (~2,000-3,000 records depending on content)
- Date changes (new day = new file)
- Current file has 5,000 records (hard limit, which only comes into play if records are much smaller than usual)

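
The three rotation conditions can be combined into a single predicate. The sketch below is standard C++ with hypothetical names (`BufferFileState`, `kMaxFileBytes`, `kMaxFileRecords`); it assumes that "500KB" means 500 × 1024 bytes and that the caller tracks the current file's date, size, and record count.

```cpp
#include <cstdint>
#include <string>

// State the caller is assumed to track for the file currently being written.
struct BufferFileState {
    std::string date;     // "YYYY-MM-DD" the file was created for
    uint32_t sizeBytes;   // current file size
    uint32_t records;     // records written so far
};

constexpr uint32_t kMaxFileBytes = 500 * 1024;  // assumption: KB = 1024 bytes
constexpr uint32_t kMaxFileRecords = 5000;

// True when a new file should be started before the next write.
bool shouldRotate(const BufferFileState& f, const std::string& today) {
    return f.sizeBytes > kMaxFileBytes      // size limit exceeded
        || f.records >= kMaxFileRecords     // hard record limit reached
        || f.date != today;                 // date changed
}
```

Checking the predicate before each write, as `bufferTelemetry()` does, means a file can overshoot the size limit by at most one record.
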
## 5.4 Write Strategy

### Buffering Flow

```
Acuvim poll cycle:
├── Transport connected?
│   ├── YES: Publish to MQTT directly
│   │   └── Also check: hasQueuedData()?
│   │       └── YES: drain queued records in background
│   └── NO: Buffer to SD card
│       └── SD available?
│           ├── YES: Write to current .jsonl file
│           └── NO: Data is lost (log warning)
└── Continue polling
```

### Write Implementation

```cpp
bool SdManager::bufferTelemetry(const AcuvimData& data) {
    if (!mounted) return false;

    // Start a new file on size, record-count, or date rollover (see 5.3)
    if (shouldRotate()) {
        rotateFile();
    }

    File file = SD.open(currentFileName, FILE_APPEND);
    if (!file) return false;

    // Serialize AcuvimData to compact JSON (same format as MQTT)
    StaticJsonDocument<512> doc;
    // ... populate doc from data ...
    doc["src"] = "sd";

    String line;
    serializeJson(doc, line);
    size_t written = file.println(line);   // println() appends "\r\n"
    file.close();                          // closing flushes the record to the card

    if (written < line.length() + 2) return false;  // short write (card full or removed)
    recordsInCurrentFile++;
    return true;
}
```

**Design notes:**
- Open file, write, close immediately, so each record is flushed to the card before the next poll and a power failure can affect at most the record being written
- Do NOT keep files open between writes (SD card can be removed)
- `FILE_APPEND` mode ensures data is added to end of file
- Each record occupies exactly one line, so a truncated final line left by a power loss mid-write can be detected and skipped during drain without affecting earlier records

## 5.5 Drain Strategy

### Drain Flow

When connectivity is restored, queued records are replayed to MQTT:

```
drainAll() logic:
1. List files in /telemetry/ sorted by name (oldest first)
2. For each file:
   a. Move file to /drain/ directory (prevents double-read)
   b. Open file, read line by line
   c. For each line:
      - Parse JSON
      - Publish to MQTT telemetry topic
      - Throttle: max 10 records/second (avoid MQTT flood)
   d. After all lines published: delete file from /drain/
   e. If MQTT disconnects mid-drain: stop, keep the partly drained file in /drain/, leave remaining files in /telemetry/
3. Log drain summary (records sent, time taken)
```

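
Step 1 relies on the file names sorting chronologically. A minimal sketch of that ordering in standard C++ follows; the function names `hasJsonlSuffix` and `drainOrder` are hypothetical, and on the device the name list would come from iterating the `/telemetry/` directory.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// True if the name ends in ".jsonl".
bool hasJsonlSuffix(const std::string& name) {
    const std::string suffix = ".jsonl";
    return name.size() >= suffix.size() &&
           name.compare(name.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Returns the buffer files oldest first, ignoring anything that is not a .jsonl file.
// Text order equals chronological order because YYYY-MM-DD_NNN is fixed-width and
// zero-padded (this breaks if NNN ever exceeds 999).
std::vector<std::string> drainOrder(std::vector<std::string> names) {
    names.erase(std::remove_if(names.begin(), names.end(),
                               [](const std::string& n) { return !hasJsonlSuffix(n); }),
                names.end());
    std::sort(names.begin(), names.end());
    return names;
}
```

Sorting by name avoids reading file timestamps, which may be unreliable if the device clock was not set when a file was created.
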
### Drain Throttling

- Max drain rate: 10 records per second
- Yield between publishes to allow main loop tasks (Modbus polling, MQTT keepalive)
- Drain runs as a background task in the main loop, not a blocking operation
- During drain, live telemetry continues to publish normally (live data takes priority)

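
One simple way to enforce the rate limit without blocking is to enforce a minimum spacing between publishes. The class below is a sketch under that assumption, not the project's implementation; `DrainThrottle` and `tryAcquire` are hypothetical names, and `nowMs` stands in for the value of `millis()` on the ESP32.

```cpp
#include <cstdint>

// Allows at most maxPerSecond publishes per second by requiring a minimum
// interval between them. Never blocks: the caller simply retries on a later loop.
class DrainThrottle {
public:
    explicit DrainThrottle(uint32_t maxPerSecond)
        : intervalMs_(maxPerSecond == 0 ? 1000u : 1000u / maxPerSecond) {}

    // Returns true if a record may be published at time nowMs.
    bool tryAcquire(uint32_t nowMs) {
        // Unsigned subtraction stays correct when the millisecond counter wraps.
        if (started_ && static_cast<uint32_t>(nowMs - lastMs_) < intervalMs_) {
            return false;
        }
        lastMs_ = nowMs;
        started_ = true;
        return true;
    }

private:
    uint32_t intervalMs_;
    uint32_t lastMs_ = 0;
    bool started_ = false;
};
```

With `DrainThrottle throttle(10);`, the drain code would call `throttle.tryAcquire(millis())` before each publish and return to the main loop when it yields `false`, which keeps Modbus polling and MQTT keepalive serviced.
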
### Drain Integration in Main Loop

```cpp
void loop() {
    mqtt.loop();
    transport.loop();

    // Normal telemetry polling
    if (pollTimer.elapsed()) {
        AcuvimData data;
        if (acuvim.readAll(data)) {
            if (transport.isConnected() && mqtt.isConnected()) {
                mqtt.publishTelemetry(data);
            } else if (sdManager.isAvailable()) {
                sdManager.bufferTelemetry(data);
            } else {
                // No transport and no SD card: the record is lost (see 5.4)
                Serial.println("Telemetry dropped: no transport, no SD card");
            }
        }
    }

    // Background drain (non-blocking, processes a few records per loop).
    // drainBatch() is expected to enforce the 10 records/second limit itself.
    if (transport.isConnected() && mqtt.isConnected() && sdManager.hasQueuedData()) {
        sdManager.drainBatch(mqtt, 5);  // Drain up to 5 records per loop iteration
    }
}
```

## 5.6 Storage Management

### Capacity Planning

| Poll Interval | Record Size | Records/Hour | MB/Day | Days to Fill 4 GB |
|---------------|-------------|--------------|--------|-------------------|
| 5 seconds     | ~250 bytes  | 720          | ~4.3   | ~900              |
| 10 seconds    | ~250 bytes  | 360          | ~2.2   | ~1,800            |
| 30 seconds    | ~250 bytes  | 120          | ~0.7   | ~5,500            |

At 5-second polling, a 4GB SD card holds approximately 2.5 years of data. With the default 30-day retention, buffered data at that rate stays below roughly 130 MB, so storage is not a concern.

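
The table rows follow from simple arithmetic, which the sketch below reproduces so other poll intervals or card sizes can be checked. It is standard C++ with hypothetical names (`BufferEstimate`, `estimateBuffer`), uses decimal megabytes, and ignores filesystem overhead; `pollSeconds` must be non-zero.

```cpp
#include <cstdint>

struct BufferEstimate {
    uint32_t recordsPerHour;
    double megabytesPerDay;   // decimal MB (10^6 bytes)
    double daysToFill;        // days until cardBytes is exhausted
};

// Estimates buffer growth for a given poll interval and record size.
BufferEstimate estimateBuffer(uint32_t pollSeconds, uint32_t recordBytes, double cardBytes) {
    BufferEstimate e;
    e.recordsPerHour = 3600 / pollSeconds;
    const double bytesPerDay = static_cast<double>(e.recordsPerHour) * 24.0 * recordBytes;
    e.megabytesPerDay = bytesPerDay / 1e6;
    e.daysToFill = cardBytes / bytesPerDay;
    return e;
}
```

For example, `estimateBuffer(5, 250, 4e9)` gives 720 records per hour, 4.32 MB per day, and about 926 days to fill the card, matching the first row of the table.
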
### Cleanup Policy

- Default: delete files older than 30 days
- Configurable via NVS (`sd_retention_days`)
- If SD card is >90% full: delete oldest files regardless of age
- Cleanup runs once per hour

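
The policy above amounts to choosing which files to delete on each hourly pass. The following standard C++ sketch shows one possible selection routine; `BufferFile` and `selectForCleanup` are hypothetical names, file ages are assumed to be derived from the date in the file name, and the sketch assumes deletion stops once usage is back at or below 90%.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

struct BufferFile {
    std::string name;     // e.g. "2026-05-16_001.jsonl"
    uint32_t ageDays;     // age derived from the date in the name
    uint64_t sizeBytes;
};

// Returns the names to delete: every file older than maxAgeDays, plus the oldest
// remaining files for as long as the card is more than 90% full.
std::vector<std::string> selectForCleanup(std::vector<BufferFile> files,
                                          uint32_t maxAgeDays,
                                          uint64_t usedBytes,
                                          uint64_t totalBytes) {
    // Oldest first: names sort chronologically.
    std::sort(files.begin(), files.end(),
              [](const BufferFile& a, const BufferFile& b) { return a.name < b.name; });

    std::vector<std::string> toDelete;
    for (const BufferFile& f : files) {
        const bool tooOld = f.ageDays > maxAgeDays;
        const bool overFull = usedBytes * 10 > totalBytes * 9;  // used > 90% of total
        if (!tooOld && !overFull) continue;
        toDelete.push_back(f.name);
        usedBytes = (usedBytes >= f.sizeBytes) ? usedBytes - f.sizeBytes : 0;
    }
    return toDelete;
}
```

Separating the selection from the actual `SD.remove()` calls keeps the policy testable on a PC and makes it easy to report `deleted_files` and `freed_bytes` afterwards.
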
### Error Handling

- SD card removed: set `mounted = false`, log warning, data lost until reinserted
- SD card full: delete oldest file, retry write
- Corrupt file during drain: move it to `/errors/` (a directory not shown in the 5.3 layout, so it must be created when first needed) and continue with the next file
- SD write failure: retry once, then skip and log

## 5.7 Hot-Plug Detection

Check SD card presence periodically (every 30 seconds):

```cpp
void SdManager::checkCard() {
    if (mounted && !SD.exists("/")) {
        // Card was removed: release the stale mount so that the next
        // SD.begin() performs a real remount instead of returning early
        SD.end();
        mounted = false;
        Serial.println("SD card removed");
    } else if (!mounted) {
        // Try to remount
        if (SD.begin(SD_CS, hspi)) {
            mounted = true;
            ensureDirectories();
            Serial.println("SD card inserted");
        }
    }
}
```

## 5.8 Devices Without SD Card

Similar to GSM (Phase 4), handle gracefully:

- On boot, attempt SD card initialization
- If no SD card: set `sd_available = false`
- No error logs for expected absence
- Telemetry flow: transport only (WiFi/GSM), data lost if no transport
- Web UI shows "No SD card" status

## 5.9 Web UI Updates

### Status Page Addition

```
┌──────────────────────────────────────┐
│ Storage                              │
│ ● SD Card: 4.0 GB (0.1% used)        │
│   Queued: 0 records                  │
│   Files: 3                           │
│   Retention: 30 days                 │
│                                      │
│ [Drain Now] [Cleanup]                │
└──────────────────────────────────────┘
```

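
The storage line in the mock-up can be derived from the byte counts that `/api/sd/status` reports. The sketch below is standard C++ with a hypothetical function name (`formatStorage`); it assumes the "GB" shown in the UI is computed by dividing by 1024³, which is what makes a 4,294,967,296-byte card display as 4.0 GB.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Formats the storage summary shown on the status page.
// Assumption: capacity is divided by 1024^3 and labelled "GB", as in the mock-up.
std::string formatStorage(uint64_t totalBytes, uint64_t usedBytes) {
    if (totalBytes == 0) return "No SD card";
    const double totalGb = static_cast<double>(totalBytes) / (1024.0 * 1024.0 * 1024.0);
    const double usedPct =
        100.0 * static_cast<double>(usedBytes) / static_cast<double>(totalBytes);
    char buf[48];
    std::snprintf(buf, sizeof(buf), "%.1f GB (%.1f%% used)", totalGb, usedPct);
    return std::string(buf);
}
```

With the example values from the status response (`total_bytes` 4294967296, `used_bytes` 4194304) this yields `4.0 GB (0.1% used)`.
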
### API Endpoints

```
GET /api/sd/status
Response:
{
  "available": true,
  "total_bytes": 4294967296,
  "used_bytes": 4194304,
  "free_bytes": 4290772992,
  "queued_records": 0,
  "file_count": 3,
  "retention_days": 30
}

POST /api/sd/drain
Response:
{
  "success": true,
  "message": "Drain started. 0 records queued."
}

POST /api/sd/cleanup
Response:
{
  "success": true,
  "deleted_files": 2,
  "freed_bytes": 1048576
}
```

## 5.10 Testing & Validation

| Test | Method | Pass Criteria |
|------|--------|---------------|
| SD card init | Insert formatted SD card | Mounted, directories created |
| Buffer write | Disable WiFi and GSM | Records written to .jsonl file |
| File format | Read .jsonl on PC | Valid JSON per line, parseable |
| File rotation | Write >500KB | New file created with incremented number |
| Date rotation | Cross midnight while buffering | New date-prefixed file |
| Drain on reconnect | Restore WiFi after buffering | Records published to MQTT in order |
| Drain throttle | Buffer 1000 records, reconnect | Drain rate <= 10/sec, main loop responsive |
| Live + drain | Reconnect during polling | Live telemetry published + drain in background |
| No SD card | Boot without SD card | Graceful, no errors, WiFi/GSM only |
| SD card removal | Remove SD during operation | Detected within 30s, no crash |
| SD card reinsert | Reinsert after removal | Remounted, buffering resumes |
| Cleanup | Create files > 30 days old | Old files deleted |
| SD full | Fill SD card | Oldest file deleted, write continues |
| Power loss during write | Kill power mid-write | At most last line lost, file intact |

## 5.11 Phase 5 Completion Criteria

- [ ] SD card initializes and creates directory structure
- [ ] Telemetry buffered to SD when no transport available
- [ ] JSONL format verified (parseable per line)
- [ ] File rotation works (size and date based)
- [ ] Automatic drain on connectivity restore
- [ ] Drain is non-blocking and throttled
- [ ] Live telemetry continues during drain
- [ ] Hot-plug detection (removal and reinsertion)
- [ ] Graceful handling when no SD card present
- [ ] Cleanup of old files works
- [ ] Storage stats visible in web UI
- [ ] SD status included in device status API

---

**Previous Phase:** [Phase 4 — GSM & Transport Failover](acuvim-spec-04.md)
**Next Phase:** [Phase 6 — OTA Firmware Updates](acuvim-spec-06.md)