monitoring: close monitoring coverage gaps
Some checks failed
Run nix flake check / flake-check (push) Failing after 7m36s

Add exporters and scrape targets for services lacking monitoring:
- PostgreSQL: postgres-exporter on pgdb1
- Authelia: native telemetry metrics on auth01
- Unbound: unbound-exporter with remote-control on ns1/ns2
- NATS: HTTP monitoring endpoint on nats1
- OpenBao: telemetry config and Prometheus scrape with token auth
- Systemd: systemd-exporter on all hosts for per-service metrics
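
As an illustration of the exporter-plus-scrape-target pattern listed above, a host module for pgdb1 might look roughly like the following. This is a minimal sketch, not the file from this commit: the job name "postgres" and the use of runAsLocalSuperUser are assumptions, and the only fields of homelab.monitoring.scrapeTargets known from the diff are job_name and port.

```nix
# Hypothetical host module for pgdb1; a sketch, not the commit's actual file.
{ ... }:
{
  # Upstream NixOS module for postgres-exporter.
  services.prometheus.exporters.postgres = {
    enable = true;
    port = 9187; # set explicitly so the scrape target below matches
    runAsLocalSuperUser = true; # assumption: connect as the local postgres superuser
  };

  # Upstream NixOS module for systemd-exporter (per-unit metrics).
  services.prometheus.exporters.systemd = {
    enable = true;
    port = 9558;
  };

  # Repository-specific option; job_name "postgres" is an assumed name.
  homelab.monitoring.scrapeTargets = [
    { job_name = "postgres"; port = 9187; }
  ];
}
```

How the systemd-exporter on each host is registered as a scrape target is not visible in this excerpt, so the sketch leaves it out.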

Add alert rules for postgres, auth (authelia + lldap), jellyfin, and
vault (openbao), and extend the existing nats and unbound rules.
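
For a sense of what one such rule could look like, here is a sketch of a PostgreSQL availability alert expressed through the upstream services.prometheus.rules option. The alert name, the 5m hold time, and the severity label are assumptions; the repository may well keep its rules in a separate YAML file instead. pg_up is the gauge that postgres-exporter exposes for its last connection attempt.

```nix
# Sketch only: alert name, duration, and labels are illustrative assumptions.
{ ... }:
{
  # services.prometheus.rules takes a list of strings; JSON is valid YAML,
  # so builtins.toJSON lets the rule be written as a Nix attribute set.
  services.prometheus.rules = [
    (builtins.toJSON {
      groups = [{
        name = "postgres";
        rules = [{
          alert = "PostgresDown";
          expr = "pg_up == 0";
          for = "5m";
          labels.severity = "critical";
          annotations.summary = "PostgreSQL on {{ $labels.instance }} is not reachable by the exporter";
        }];
      }];
    })
  ];
}
```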

Add Terraform config for the Prometheus metrics policy and token. The
token is created via the vault_token resource and stored in KV, so no
manual token creation is needed.
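
On the Prometheus side, a token-authenticated scrape of OpenBao could be sketched as below. The target host name and the token file path are hypothetical placeholders, and how the token gets from KV onto the Prometheus host is not shown in this excerpt. The /v1/sys/metrics path with format=prometheus is the metrics endpoint OpenBao inherits from Vault.

```nix
# Sketch only: host name and token path are hypothetical placeholders.
{ ... }:
{
  services.prometheus.scrapeConfigs = [{
    job_name = "openbao";
    scheme = "https";
    metrics_path = "/v1/sys/metrics";
    # Ask for Prometheus text format instead of the default JSON.
    params.format = [ "prometheus" ];
    # File holding the token issued for the metrics policy (assumed location).
    bearer_token_file = "/run/secrets/prometheus-openbao-token";
    static_configs = [{
      targets = [ "vault01.example.internal:8200" ];
    }];
  }];
}
```

Reading the token from a file rather than inlining it keeps the secret out of the world-readable Nix store.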

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-05 21:42:38 +01:00
parent 41d4226812
commit 3cccfc0487
12 changed files with 217 additions and 0 deletions

@@ -1,5 +1,10 @@
 { config, ... }:
 {
+  homelab.monitoring.scrapeTargets = [{
+    job_name = "authelia";
+    port = 9959;
+  }];
   sops.secrets.authelia_ldap_password = {
     format = "yaml";
     sopsFile = ../../secrets/auth01/secrets.yaml;
@@ -45,6 +50,12 @@
       storageEncryptionKeyFile = config.sops.secrets.authelia_storage_encryption_key_file.path;
     };
     settings = {
+      telemetry = {
+        metrics = {
+          enabled = true;
+          address = "tcp://0.0.0.0:9959";
+        };
+      };
       access_control = {
         default_policy = "two_factor";
       };