From c515a6b4e1892d97a5266e6f4bbabaf7bb9b1b1f Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Torjus=20H=C3=A5kestad?=
Date: Thu, 5 Feb 2026 22:41:07 +0100
Subject: [PATCH 1/2] home-assistant: fix zigbee sensor battery reporting

WSDCGQ12LM sensors report battery: 0 due to firmware quirk. Override
battery calculation using voltage via homeassistant value_template.

Also adds zigbee_sensor_stale alert for detecting dead sensors regardless
of battery reporting accuracy (1 hour threshold).

Device configuration moved from external devices.yaml to inline NixOS
config for declarative management.

Co-Authored-By: Claude Opus 4.5
---
 .../plans/zigbee-sensor-battery-monitoring.md | 56 +++++++++++++++----
 services/home-assistant/default.nix           | 38 +++++++++++++
 services/monitoring/rules.yml                 |  8 +++
 3 files changed, 90 insertions(+), 12 deletions(-)

diff --git a/docs/plans/zigbee-sensor-battery-monitoring.md b/docs/plans/zigbee-sensor-battery-monitoring.md
index e5c72ee..97da549 100644
--- a/docs/plans/zigbee-sensor-battery-monitoring.md
+++ b/docs/plans/zigbee-sensor-battery-monitoring.md
@@ -5,11 +5,11 @@
 Three Aqara Zigbee temperature sensors report `battery: 0` in their MQTT payload, making the `hass_sensor_battery_percent` Prometheus metric useless for battery monitoring on these devices.
 
 Affected sensors:
-- **Temp Living Room** (`0x54ef441000a54d3c`) — area: living_room
-- **Temp Office** (`0x54ef441000a547bd`) — area: office
-- **temp_server** — area: server_room
+- **Temp Living Room** (`0x54ef441000a54d3c`) — WSDCGQ12LM
+- **Temp Office** (`0x54ef441000a547bd`) — WSDCGQ12LM
+- **temp_server** (`0x54ef441000a564b6`) — WSDCGQ12LM
 
-The **Temp Bedroom** sensor (`0x00124b0025495463`) is a different model and reports battery correctly (69% at time of investigation).
+The **Temp Bedroom** sensor (`0x00124b0025495463`) is a SONOFF SNZB-02 and reports battery correctly.
 
 ## Findings
 
@@ -17,15 +17,47 @@ The **Temp Bedroom** sensor (`0x00124b0025495463`) is a different model and repo
 - The Zigbee2MQTT payload includes a `voltage` field (e.g., `2707` = 2.707V), which indicates healthy battery levels (~40-60% for a CR2032 coin cell).
 - CR2032 voltage reference: ~3.0V fresh, ~2.7V mid-life, ~2.1V dead.
 - The `voltage` field is not exposed as a Prometheus metric — it exists only in the MQTT payload.
-- This is a known firmware quirk with some Aqara sensors that always report 0% battery.
+- This is a known firmware quirk with some Aqara WSDCGQ12LM sensors that always report 0% battery.
 
-## Possible Solutions
+## Implementation
 
-### 1. Expose voltage as a Prometheus metric
-Enable the voltage sensor entities in Home Assistant (they may exist but be disabled by default). The HA Prometheus integration would then export them automatically.
+### Solution 1: Calculate battery from voltage in Zigbee2MQTT (Implemented)
 
-### 2. Calculate battery from voltage in Zigbee2MQTT
-Override the battery calculation using the voltage field. Approximate formula: `(voltage - 2100) / (3000 - 2100) * 100`.
+Override the Home Assistant battery entity's `value_template` in Zigbee2MQTT device configuration to calculate battery percentage from voltage.
 
-### 3. Alert on sensor staleness instead
-Create a Prometheus alert based on `hass_last_updated_time_seconds` going stale (e.g., no temperature update in 1 hour). This detects dead sensors regardless of battery reporting accuracy.
+**Formula:** `(voltage - 2100) / 9` (maps 2100-3000mV to 0-100%)
+
+**Changes in `services/home-assistant/default.nix`:**
+- Device configuration moved from external `devices.yaml` to inline NixOS config
+- Three affected sensors have `homeassistant.sensor_battery.value_template` override
+
+**Expected battery values based on current voltages:**
+| Sensor | Voltage | Expected Battery |
+|--------|---------|------------------|
+| Temp Living Room | 2710 mV | ~68% |
+| Temp Office | 2658 mV | ~62% |
+| temp_server | 2765 mV | ~74% |
+
+### Solution 2: Alert on sensor staleness (Implemented)
+
+Added Prometheus alert `zigbee_sensor_stale` in `services/monitoring/rules.yml` that fires when a Zigbee temperature sensor hasn't updated in over 1 hour. This provides defense-in-depth for detecting dead sensors regardless of battery reporting accuracy.
+
+**Alert details:**
+- Expression: `(time() - hass_last_updated_time_seconds{entity=~"sensor\\.(0x[0-9a-f]+|temp_server)_temperature"}) > 3600`
+- Severity: warning
+- For: 5m
+
+## Post-Deployment Steps
+
+After deploying to ha1:
+
+1. Restart zigbee2mqtt service (automatic on NixOS rebuild)
+2. In Home Assistant, the battery entities may need to be re-discovered:
+   - Go to Settings → Devices & Services → MQTT
+   - The new `value_template` should take effect after entity re-discovery
+   - If not, try disabling and re-enabling the battery entities
+
+## Notes
+
+- Device configuration is now declarative in NixOS. Future device additions via Zigbee2MQTT frontend will need to be added to the NixOS config to persist.
+- The `devices.yaml` file on ha1 will be overwritten on service start but can be removed after confirming the new config works.
diff --git a/services/home-assistant/default.nix b/services/home-assistant/default.nix
index 14f4fce..352b74c 100644
--- a/services/home-assistant/default.nix
+++ b/services/home-assistant/default.nix
@@ -69,6 +69,44 @@
       frontend = true;
       permit_join = false;
       serial.port = "/dev/ttyUSB0";
+
+      # Inline device configuration (replaces devices.yaml)
+      # This allows declarative management and homeassistant overrides
+      devices = {
+        # Temperature sensors with battery fix
+        # WSDCGQ12LM sensors report battery: 0 due to firmware quirk
+        # Override battery calculation using voltage (mV): (voltage - 2100) / 9
+        "0x54ef441000a547bd" = {
+          friendly_name = "0x54ef441000a547bd";
+          homeassistant.sensor_battery.value_template = "{{ (((value_json.voltage | float) - 2100) / 9) | round(0) | int | min(100) | max(0) }}";
+        };
+        "0x54ef441000a54d3c" = {
+          friendly_name = "0x54ef441000a54d3c";
+          homeassistant.sensor_battery.value_template = "{{ (((value_json.voltage | float) - 2100) / 9) | round(0) | int | min(100) | max(0) }}";
+        };
+        "0x54ef441000a564b6" = {
+          friendly_name = "temp_server";
+          homeassistant.sensor_battery.value_template = "{{ (((value_json.voltage | float) - 2100) / 9) | round(0) | int | min(100) | max(0) }}";
+        };
+
+        # Other sensors
+        "0x00124b0025495463".friendly_name = "0x00124b0025495463"; # SONOFF temp sensor (battery works)
+        "0x54ef4410009ac117".friendly_name = "0x54ef4410009ac117"; # Water leak sensor
+
+        # Buttons
+        "0x54ef441000a1f907".friendly_name = "btn_livingroom";
+        "0x54ef441000a1ee71".friendly_name = "btn_bedroom";
+
+        # Philips Hue lights
+        "0x001788010d1b599a" = {
+          friendly_name = "0x001788010d1b599a";
+          transition = 5;
+        };
+        "0x001788010d253b99".friendly_name = "0x001788010d253b99";
+        "0x001788010e371aa4".friendly_name = "0x001788010e371aa4";
+        "0x001788010dc5f003".friendly_name = "0x001788010dc5f003";
+        "0x001788010dc35d06".friendly_name = "0x001788010dc35d06";
+      };
     };
   };
 }
diff --git a/services/monitoring/rules.yml b/services/monitoring/rules.yml
index 30d01eb..df9e0d6 100644
--- a/services/monitoring/rules.yml
+++ b/services/monitoring/rules.yml
@@ -226,6 +226,14 @@ groups:
         annotations:
           summary: "Mosquitto not running on {{ $labels.instance }}"
           description: "Mosquitto has been down on {{ $labels.instance }} more than 5 minutes."
+      - alert: zigbee_sensor_stale
+        expr: (time() - hass_last_updated_time_seconds{entity=~"sensor\\.(0x[0-9a-f]+|temp_server)_temperature"}) > 3600
+        for: 5m
+        labels:
+          severity: warning
+        annotations:
+          summary: "Zigbee sensor {{ $labels.friendly_name }} is stale"
+          description: "Zigbee temperature sensor {{ $labels.entity }} has not reported data for over 1 hour. The sensor may have a dead battery or connectivity issues."
   - name: smartctl_rules
     rules:
       - alert: smart_critical_warning
-- 
2.49.1

From 32968147b5f25722a875d3c5e8e98d6a3d6e832b Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Torjus=20H=C3=A5kestad?=
Date: Thu, 5 Feb 2026 22:49:49 +0100
Subject: [PATCH 2/2] docs: move zigbee battery plan to completed

Updated plan with:
- Full device inventory from ha1
- Backup verification details
- Branch and commit references

Co-Authored-By: Claude Opus 4.5
---
 .../zigbee-sensor-battery-monitoring.md | 46 +++++++++++++++++++
 1 file changed, 46 insertions(+)
 rename docs/plans/{ => completed}/zigbee-sensor-battery-monitoring.md (59%)

diff --git a/docs/plans/zigbee-sensor-battery-monitoring.md b/docs/plans/completed/zigbee-sensor-battery-monitoring.md
similarity index 59%
rename from docs/plans/zigbee-sensor-battery-monitoring.md
rename to docs/plans/completed/zigbee-sensor-battery-monitoring.md
index 97da549..4051710 100644
--- a/docs/plans/zigbee-sensor-battery-monitoring.md
+++ b/docs/plans/completed/zigbee-sensor-battery-monitoring.md
@@ -1,5 +1,9 @@
 # Zigbee Sensor Battery Monitoring
 
+**Status:** Completed
+**Branch:** `zigbee-battery-fix`
+**Commit:** `c515a6b home-assistant: fix zigbee sensor battery reporting`
+
 ## Problem
 
 Three Aqara Zigbee temperature sensors report `battery: 0` in
their MQTT payload, making the `hass_sensor_battery_percent` Prometheus metric useless for battery monitoring on these devices.
@@ -19,6 +23,25 @@ The **Temp Bedroom** sensor (`0x00124b0025495463`) is a SONOFF SNZB-02 and repor
 - The `voltage` field is not exposed as a Prometheus metric — it exists only in the MQTT payload.
 - This is a known firmware quirk with some Aqara WSDCGQ12LM sensors that always report 0% battery.
 
+## Device Inventory
+
+Full list of Zigbee devices on ha1 (12 total):
+
+| Device | IEEE Address | Model | Type |
+|--------|-------------|-------|------|
+| temp_server | 0x54ef441000a564b6 | WSDCGQ12LM | Temperature sensor (battery fix applied) |
+| (Temp Living Room) | 0x54ef441000a54d3c | WSDCGQ12LM | Temperature sensor (battery fix applied) |
+| (Temp Office) | 0x54ef441000a547bd | WSDCGQ12LM | Temperature sensor (battery fix applied) |
+| (Temp Bedroom) | 0x00124b0025495463 | SNZB-02 | Temperature sensor (battery works) |
+| (Water leak) | 0x54ef4410009ac117 | SJCGQ12LM | Water leak sensor |
+| btn_livingroom | 0x54ef441000a1f907 | WXKG13LM | Wireless mini switch |
+| btn_bedroom | 0x54ef441000a1ee71 | WXKG13LM | Wireless mini switch |
+| (Hue bulb) | 0x001788010dc35d06 | 9290024688 | Hue E27 1100lm (Router) |
+| (Hue bulb) | 0x001788010dc5f003 | 9290024688 | Hue E27 1100lm (Router) |
+| (Hue ceiling) | 0x001788010e371aa4 | 915005997301 | Hue Infuse medium (Router) |
+| (Hue ceiling) | 0x001788010d253b99 | 915005997301 | Hue Infuse medium (Router) |
+| (Hue wall) | 0x001788010d1b599a | 929003052901 | Hue Sana wall light (Router, transition=5) |
+
 ## Implementation
 
 ### Solution 1: Calculate battery from voltage in Zigbee2MQTT (Implemented)
@@ -30,6 +53,7 @@ Override the Home Assistant battery entity's `value_template` in Zigbee2MQTT dev
 **Changes in `services/home-assistant/default.nix`:**
 - Device configuration moved from external `devices.yaml` to inline NixOS config
 - Three affected sensors have
`homeassistant.sensor_battery.value_template` override
+- All 12 devices now declaratively managed
 
 **Expected battery values based on current voltages:**
 | Sensor | Voltage | Expected Battery |
@@ -47,6 +71,27 @@ Added Prometheus alert `zigbee_sensor_stale` in `services/monitoring/rules.yml`
 - Severity: warning
 - For: 5m
 
+## Pre-Deployment Verification
+
+### Backup Verification
+
+Before deployment, verified ha1 backup configuration and ran manual backup:
+
+**Backup paths:**
+- `/var/lib/hass` ✓
+- `/var/lib/zigbee2mqtt` ✓
+- `/var/lib/mosquitto` ✓
+
+**Manual backup (2026-02-05 22:45:23):**
+- Snapshot ID: `59704dfa`
+- Files: 77 total (0 new, 13 changed, 64 unmodified)
+- Data: 62.635 MiB processed, 6.928 MiB stored (compressed)
+
+### Other directories reviewed
+
+- `/var/lib/vault` — Contains AppRole credentials; not backed up (can be re-provisioned via Ansible)
+- `/var/lib/sops-nix` — Legacy; ha1 uses Vault now
+
 ## Post-Deployment Steps
 
 After deploying to ha1:
@@ -61,3 +106,4 @@ After deploying to ha1:
 
 - Device configuration is now declarative in NixOS. Future device additions via Zigbee2MQTT frontend will need to be added to the NixOS config to persist.
 - The `devices.yaml` file on ha1 will be overwritten on service start but can be removed after confirming the new config works.
+- The NixOS zigbee2mqtt module defaults to `devices = "devices.yaml"` but our explicit inline config overrides this.
-- 
2.49.1
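
Review note (not part of the patch series): a minimal Python sketch that sanity-checks the voltage-to-percentage mapping from PATCH 1/2. It mirrors the `(voltage - 2100) / 9` formula and the 0-100 clamp of the `value_template`, and recomputes the values in the plan's "Expected battery values" table. The function name `battery_percent` and the `readings` dictionary are hypothetical, introduced only for illustration; the real calculation is done by the Jinja template inside Home Assistant, whose rounding of exact halves may differ from Python's.

```python
def battery_percent(voltage_mv: float) -> int:
    """Map a CR2032 voltage in millivolts to a clamped 0-100 percentage.

    Mirrors the patch's formula: 2100 mV -> 0 %, 3000 mV -> 100 %.
    """
    raw = (voltage_mv - 2100) / 9
    # Clamp to the valid range, like `min(100) | max(0)` in the template.
    return max(0, min(100, round(raw)))


# Voltages taken from the plan's "Expected battery values" table.
readings = {
    "Temp Living Room": 2710,
    "Temp Office": 2658,
    "temp_server": 2765,
}

for name, millivolts in readings.items():
    print(f"{name}: {millivolts} mV -> {battery_percent(millivolts)}%")
```

Readings below 2100 mV or above 3000 mV are clamped rather than reported as negative or above 100, which matches the intent of the `min`/`max` filters in the template.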