Remove monitoring01 host configuration and unused service modules
(prometheus, grafana, loki, tempo, pyroscope). Migrate blackbox,
exportarr, and pve exporters to monitoring02 with scrape configs
moved to VictoriaMetrics. Update alert rules, terraform vault
policies/secrets, http-proxy entries, and documentation to reflect
the monitoring02 migration.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Update all LogQL examples, agent instructions, and scripts to use
the hostname label instead of host, matching the Prometheus label
naming convention. Also update pipe-to-loki and bootstrap scripts
to push hostname instead of host.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Move detailed Prometheus/Loki reference from CLAUDE.md to the
observability skill
- Add complete list of Prometheus jobs organized by category
- Add bootstrap log documentation with stages table
- Add kanidm01 to host labels table
- CLAUDE.md now references the skill instead of duplicating info
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Document the new hostname and host metadata labels available on all
Prometheus scrape targets:
- hostname: short hostname for easy filtering
- role: host role (dns, build-host, vault)
- tier: deployment tier (test for test VMs)
- dns_role: primary/secondary for DNS servers
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Merge Prometheus metrics and Loki logs into a unified troubleshooting
skill. Adds LogQL query patterns, label reference, and common service
units for log searching.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>