dd1b64de27
monitoring: auto-generate Prometheus scrape targets from host configs
Run nix flake check / flake-check (pull_request) Successful in 2m49s
Run nix flake check / flake-check (push) Has been cancelled
Add homelab.monitoring NixOS options (enable, scrapeTargets) following
the same pattern as homelab.dns. Prometheus scrape configs are now
auto-generated from flake host configurations and external targets,
replacing hardcoded target lists.
Also cleans up alert rules: snake_case naming, fix zigbee2mqtt typo,
remove duplicate pushgateway alert, add for clauses to monitoring_rules,
remove hardcoded WireGuard public key, and add new alerts for
certificates, proxmox, caddy, smartctl temperature, filesystem
prediction, systemd state, file descriptors, and host reboots.
Fixes grafana scrape target port from 3100 to 3000.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-05 00:49:07 +01:00
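Note on dd1b64de27: the auto-generation it describes (per-host monitoring options collected into Prometheus scrape configs, plus external targets) can be illustrated with a minimal Python sketch. This is not the repository's Nix implementation. The function name, the dictionary layout, the `job_name`/`port`/`address` keys, and the host names are all hypothetical; only the `enable` and `scrapeTargets` option names and the grafana port 3000 come from the commit message.

```python
from collections import defaultdict


def build_scrape_configs(hosts, external_targets=()):
    """Group per-host and external targets by job into scrape configs.

    `hosts` maps a host name to its (hypothetical) config dictionary.
    Hosts without monitoring enabled are skipped.
    """
    jobs = defaultdict(list)

    # Collect targets declared by each host that has monitoring enabled.
    for hostname, cfg in hosts.items():
        monitoring = cfg.get("monitoring", {})
        if not monitoring.get("enable", False):
            continue
        for target in monitoring.get("scrapeTargets", []):
            jobs[target["job_name"]].append(f"{hostname}:{target['port']}")

    # External targets are not flake hosts, so they carry a full address.
    for target in external_targets:
        jobs[target["job_name"]].append(target["address"])

    # Sort jobs and targets so the generated config is deterministic.
    return [
        {"job_name": job, "static_configs": [{"targets": sorted(targets)}]}
        for job, targets in sorted(jobs.items())
    ]


# Hypothetical inputs: one enabled host, one disabled host, one external target.
example_hosts = {
    "host-a": {
        "monitoring": {
            "enable": True,
            "scrapeTargets": [{"job_name": "grafana", "port": 3000}],
        }
    },
    "host-b": {
        "monitoring": {
            "enable": False,
            "scrapeTargets": [{"job_name": "node", "port": 9100}],
        }
    },
}
example_external = [{"job_name": "node", "address": "external.example:9100"}]

scrape_configs = build_scrape_configs(example_hosts, example_external)
```

Deriving the list from host declarations means a new host only has to declare its own targets, so there is no separate hardcoded target list to keep in sync.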
adf70999b9
Fix scrape config
Run nix flake check / flake-check (push) Failing after 6m7s
Periodic flake update / flake-update (push) Successful in 3m13s
2025-06-01 02:41:54 +02:00
acb9e59775
Scrape nix-cache caddy
Run nix flake check / flake-check (push) Has been cancelled
2025-06-01 02:40:41 +02:00
14aa3a9340
Remove non-working timer rule
Run nix flake check / flake-check (push) Failing after 14m3s
Periodic flake update / flake-update (push) Successful in 3m9s
2025-05-29 10:15:40 +02:00
797f915939
Add monitoring rules for monitoring services
Run nix flake check / flake-check (push) Has been cancelled
2025-05-29 10:09:27 +02:00
3785b8047a
Fix alert name for build-flakes alert
Run nix flake check / flake-check (push) Failing after 10m34s
Periodic flake update / flake-update (push) Successful in 3m3s
2025-05-28 21:28:04 +02:00
fb1a36a846
Rework build-flakes alert rules
Run nix flake check / flake-check (push) Has been cancelled
2025-05-28 21:26:04 +02:00
77d1782f36
Set honor_labels for pushgw scrape
Run nix flake check / flake-check (push) Failing after 8m37s
2025-05-28 20:34:17 +02:00
5b06a95222
Add prometheus pushgateway
Run nix flake check / flake-check (push) Failing after 12m59s
2025-05-28 17:10:50 +02:00
5ce8f46394
Configure tempo otlp receiver endpoint
Run nix flake check / flake-check (push) Failing after 11m42s
Periodic flake update / flake-update (push) Successful in 4m6s
2025-05-24 22:10:01 +02:00
feff1d06eb
Configure tempo otlp receiver
Run nix flake check / flake-check (push) Has been cancelled
2025-05-24 22:08:36 +02:00
b75df7578f
Configure tempo wal storage
Run nix flake check / flake-check (push) Has been cancelled
2025-05-24 22:03:56 +02:00
4d88644417
Configure tempo storage
Run nix flake check / flake-check (push) Has been cancelled
2025-05-24 21:55:08 +02:00
d4137f79aa
Change tempo settings
Run nix flake check / flake-check (push) Has been cancelled
2025-05-24 21:32:19 +02:00
486320b0ec
Add tempo to monitoring
Run nix flake check / flake-check (push) Has been cancelled
2025-05-24 21:29:05 +02:00
6fc4d42d16
Fix alloy config
Run nix flake check / flake-check (push) Has been cancelled
2025-05-24 12:42:40 +02:00
ebcdefd0ca
Add alloy
Run nix flake check / flake-check (push) Has been cancelled
2025-05-24 12:40:39 +02:00
2dae23560d
Fix pyroscope ports attribute
Run nix flake check / flake-check (push) Has been cancelled
2025-05-24 12:01:30 +02:00
1988b36f03
Add pyroscope container to monitoring
Run nix flake check / flake-check (push) Has been cancelled
2025-05-24 12:00:02 +02:00
2a46da3761
Add labmon to scrape config
Run nix flake check / flake-check (push) Failing after 14m32s
2025-05-24 03:37:52 +02:00
4e870cda44
Scrape step-ca metrics
Run nix flake check / flake-check (push) Failing after 3m52s
Periodic flake update / flake-update (push) Successful in 2m42s
2025-05-23 09:28:52 +02:00
6e6d5098c5
Collect ghettoptt stats
Run nix flake check / flake-check (push) Failing after 11m48s
2025-05-22 14:55:32 +02:00
aa2cbcda60
Add home assistant to prometheus
Run nix flake check / flake-check (push) Failing after 15m18s
2025-05-19 11:21:46 +02:00
78efb084ec
Alertonotify hardening part 3
Run nix flake check / flake-check (push) Failing after 10m10s
Periodic flake update / flake-update (push) Successful in 4m12s
2025-05-18 15:24:58 +02:00
16042b08c0
Alertonotify hardening part 2
Run nix flake check / flake-check (push) Failing after 3m58s
2025-05-18 15:20:00 +02:00
8e0b97c9e0
Alertonotify hardening part 1
Run nix flake check / flake-check (push) Failing after 4m30s
2025-05-18 15:08:26 +02:00
fe2e87658a
Move prometheus roles to external file
Run nix flake check / flake-check (push) Failing after 3m7s
2025-05-18 14:54:09 +02:00
c07d96bbab
Add alert for wireguard handshake
Run nix flake check / flake-check (push) Failing after 3m17s
Periodic flake update / flake-update (push) Successful in 2m15s
2025-05-18 01:12:04 +02:00
bd58d07001
Monitor wireguard
Run nix flake check / flake-check (push) Failing after 3m32s
2025-05-18 00:59:55 +02:00
3797526000
Add some alerting rules for smartctl
Run nix flake check / flake-check (push) Has been cancelled
2025-05-18 00:51:02 +02:00
afa3cc3a57
Collect smartctl metrics from gunter
Run nix flake check / flake-check (push) Failing after 4m53s
2025-05-18 00:43:15 +02:00
08a0ddaf30
Increase prometheus retention to 30d
Run nix flake check / flake-check (push) Failing after 5m58s
Periodic flake update / flake-update (push) Successful in 4m7s
2025-05-12 23:22:31 +02:00
518e3a3ded
Fix flapping build-flakes alarm
Run nix flake check / flake-check (push) Failing after 6m57s
Periodic flake update / flake-update (push) Successful in 3m59s
2025-04-07 10:41:35 +02:00
0dbdee65c5
Add harmonia alerting rule
Run nix flake check / flake-check (push) Has been cancelled
2025-02-24 18:29:41 +01:00
b468e9d533
Improve alerttonotify service
Run nix flake check / flake-check (push) Failing after 2m56s
Periodic flake update / flake-update (push) Successful in 1m26s
2025-02-23 20:51:39 +01:00
874e30fb28
Tune cpu alarm
Run nix flake check / flake-check (push) Failing after 4m18s
2025-02-23 20:46:25 +01:00
db9bf38ab6
Fix alerttonotify service
Run nix flake check / flake-check (push) Failing after 26m40s
2025-02-23 18:16:13 +01:00
15e5ccb0ec
Change alertmanager repeat time
Run nix flake check / flake-check (push) Failing after 3m41s
2025-02-23 18:10:14 +01:00
b8d058d23e
Add alerting rules
Run nix flake check / flake-check (push) Failing after 8m51s
2025-02-12 20:34:22 +01:00
a5448c5fc1
Remove whitespace
Run nix flake check / flake-check (push) Failing after 24m42s
Periodic flake update / flake-update (push) Successful in 1m24s
2025-02-12 00:26:14 +01:00
f1ca20a387
Add some alerting rules
Run nix flake check / flake-check (push) Failing after 14m34s
2025-02-11 23:24:35 +01:00
f0bc29ac5e
Add nats host to monitoring
Run nix flake check / flake-check (push) Has been cancelled
2025-02-11 23:12:55 +01:00
539ff4eeac
Change cpu load alert
Run nix flake check / flake-check (push) Waiting to run
2025-02-11 23:07:56 +01:00
3b500a25a7
Enable alerttonotify service
Run nix flake check / flake-check (push) Has been cancelled
2025-02-11 22:34:41 +01:00
abb4cf58ea
Add alerttonotify to monitoring host
Run nix flake check / flake-check (push) Has been cancelled
2025-02-11 22:25:54 +01:00
6079852cc6
Add missing hosts to prometheus scrape job
Run nix flake check / flake-check (push) Failing after 6m22s
Periodic flake update / flake-update (push) Successful in 1m30s
2025-01-26 00:56:21 +01:00
26bf43bba5
Collect restic rest metrics
Run nix flake check / flake-check (push) Failing after 6m44s
Periodic flake update / flake-update (push) Successful in 1m29s
2025-01-24 23:43:02 +01:00
2824718e53
Collect alertmanager metrics
Run nix flake check / flake-check (push) Has been cancelled
2025-01-24 23:34:43 +01:00
25b2f1d1ee
Collect grafana metrics
2025-01-24 23:33:49 +01:00
f2b5bb6f2a
Collect loki metrics
2025-01-24 23:32:45 +01:00