monitoring-gaps-implementation #20
Summary
Implements monitoring coverage for services identified in the monitoring gaps audit. Adds Prometheus exporters, scrape targets, and alert rules for previously unmonitored services.
New Exporters & Scrape Targets
New Alert Rules
- postgres_down, postgres_exporter_down, postgres_high_connections
- authelia_down, lldap_down
- jellyfin_down
- openbao_down, openbao_sealed, openbao_scrape_down
- unbound_low_cache_hit_ratio (extended)
- nats_slow_consumers (extended)
Terraform Changes
- prometheus-metrics policy for OpenBao metrics access
- vault_token resource to auto-create the Prometheus scrape token
- hosts/monitoring01/openbao-token
Files Changed
- system/monitoring/metrics.nix - systemd-exporter on all hosts
- services/postgres/postgres.nix - postgres-exporter
- services/authelia/default.nix - Authelia telemetry
- services/ns/resolver.nix - unbound-exporter with remote-control
- services/nats/default.nix - NATS HTTP monitoring
- services/vault/default.nix - OpenBao telemetry
- services/monitoring/prometheus.nix - new scrape configs + vault secret
- services/monitoring/rules.yml - all new alert rules
- terraform/vault/policies.tf - new file for prometheus-metrics policy
- terraform/vault/secrets.tf - prometheus token secret
- docs/plans/monitoring-gaps.md → docs/plans/completed/
Deployment Notes
- Run tofu apply in terraform/vault/ first (already done)
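The ordering presumably matters because the OpenBao scrape job authenticates with the token that the Terraform run creates. As a rough sketch only, the rendered Prometheus job might look like the YAML below. The actual configuration is generated from services/monitoring/prometheus.nix and may differ. The job name, target host, and token file path are hypothetical placeholders, not values taken from this PR.

```yaml
# Sketch of a Prometheus scrape job for OpenBao metrics (not the generated config).
scrape_configs:
  - job_name: openbao                 # hypothetical job name
    scheme: https
    # OpenBao, like Vault, serves metrics from the sys/metrics API endpoint.
    metrics_path: /v1/sys/metrics
    params:
      format: ["prometheus"]          # request Prometheus text format
    authorization:
      type: Bearer
      # Hypothetical location where the scrape token is made available on the host.
      credentials_file: /run/secrets/openbao-token
    static_configs:
      - targets: ["openbao.example.internal:8200"]  # hypothetical host
```

If the token has not been created yet, scrapes against this endpoint would be expected to fail with an authorization error, which is what openbao_scrape_down is meant to surface.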