From 678fd3d6def0b31690a0c09247ebe04807a320b3 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Torjus=20H=C3=A5kestad?=
Date: Thu, 5 Feb 2026 10:19:33 +0100
Subject: [PATCH] docs: add systemd-exporter findings to monitoring gaps plan

Co-Authored-By: Claude Opus 4.5
---
 docs/plans/monitoring-gaps.md | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/docs/plans/monitoring-gaps.md b/docs/plans/monitoring-gaps.md
index 296bc71..f07f883 100644
--- a/docs/plans/monitoring-gaps.md
+++ b/docs/plans/monitoring-gaps.md
@@ -79,6 +79,33 @@ These services have adequate alerting and/or scrape targets:
 | Nix Cache (Harmonia, build-flakes) | Via Caddy | 4 alerts |
 | CA (step-ca) | Yes (port 9000) | 4 certificate alerts |
 
+## Per-Service Resource Metrics (systemd-exporter)
+
+### Current State
+
+No per-service CPU, memory, or IO metrics are collected. The existing node-exporter systemd collector only provides unit state (active/inactive/failed), socket stats, and timer triggers. While systemd tracks per-unit resource usage via cgroups internally (visible in `systemctl status` and `systemd-cgtop`), this data is not exported to Prometheus.
+
+### Available Solution
+
+The `prometheus-systemd-exporter` package (v0.7.0) is available in nixpkgs with a ready-made NixOS module:
+
+```nix
+services.prometheus.exporters.systemd.enable = true;
+```
+
+**Options:** `enable`, `port`, `extraFlags`, `user`, `group`
+
+This exporter reads cgroup data and exposes per-unit metrics including:
+- CPU seconds consumed per service
+- Memory usage per service
+- Task/process counts per service
+- Restart counts
+- IO usage
+
+### Recommendation
+
+Enable on all hosts via the shared `system/` config (same pattern as node-exporter). Add a corresponding scrape job on monitoring01. This would give visibility into resource consumption per service across the fleet, useful for capacity planning and diagnosing noisy-neighbor issues on shared hosts.
+
 ## Suggested Priority
 
 1. **PostgreSQL** - Critical infrastructure, easy to add with existing nixpkgs module
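
Note on the systemd-exporter recommendation in this patch: the scrape job on monitoring01 could look roughly like the sketch below. It is an illustration under assumptions, not the repository's actual config. The job name and the `host01`/`host02` target names are hypothetical placeholders, and the sketch assumes every host keeps the exporter module's default port, which is read from the module option rather than hard-coded.

```nix
# Hypothetical module for monitoring01. Job name and host names are
# placeholders; substitute the fleet's real values.
{ config, ... }:
let
  # Assumption: all hosts leave the exporter on its module-default port, so
  # the value evaluated on monitoring01 matches the port used on every target.
  port = toString config.services.prometheus.exporters.systemd.port;
in
{
  services.prometheus.scrapeConfigs = [
    {
      job_name = "systemd-exporter";
      static_configs = [
        {
          targets = [
            "host01.example.internal:${port}"
            "host02.example.internal:${port}"
          ];
        }
      ];
    }
  ];
}
```

If a host overrides `port`, its target would need to be listed with that port explicitly. The exporter port also has to be reachable from monitoring01, presumably handled the same way as for node-exporter.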