Commit Graph

4 Commits

Author SHA1 Message Date
9bd48e0808 monitoring: explicitly list valid HTTP status codes
All checks were successful
Run nix flake check / flake-check (push) Successful in 2m6s
Empty valid_status_codes defaults to 2xx only, not "any".
Explicitly list common status codes (2xx, 3xx, 4xx, 5xx) so
services returning 400/401 like ha and nzbget pass the probe.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-09 22:41:47 +01:00
d1b0a5dc20 monitoring: accept any HTTP status in TLS probe
Some checks failed
Run nix flake check / flake-check (push) Has been cancelled
Only care about TLS handshake success for certificate monitoring.
Services like nzbget (401) and ha (400) return non-2xx but have
valid certificates.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-09 22:33:45 +01:00
4d32707130 monitoring: remove duplicate rules from blackbox.nix
All checks were successful
Run nix flake check / flake-check (push) Successful in 2m7s
The rules were already added to rules.yml but the blackbox.nix file
still had them, causing duplicate 'groups' key errors.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-09 22:28:42 +01:00
75e4fb61a5 monitoring: add blackbox exporter for TLS certificate monitoring
All checks were successful
Run nix flake check / flake-check (push) Successful in 2m6s
Add blackbox exporter to monitoring01 to probe TLS endpoints and alert
on expiring certificates. Monitors all ACME-managed certificates from
OpenBao PKI including Caddy auto-TLS services.

Alerts:
- tls_certificate_expiring_soon (< 7 days, warning)
- tls_certificate_expiring_critical (< 24h, critical)
- tls_probe_failed (connectivity issues)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-09 22:21:42 +01:00