diff --git a/docs/plans/loki-improvements.md b/docs/plans/loki-improvements.md
index c1b58a0..68cbc2a 100644
--- a/docs/plans/loki-improvements.md
+++ b/docs/plans/loki-improvements.md
@@ -125,7 +125,25 @@ This enables queries like:
 - `{level=~"critical|error", tier="prod"}` - prod errors and criticals
 - `{level="warning", role="dns"}` - warnings from DNS servers
 
-### 6. Monitoring CNAME for Promtail Target
+### 6. Enable JSON Logging on Services
+
+**Problem:** Many services support structured JSON log output but may be using plain text by default. JSON logs are significantly easier to query in Loki - `| json` cleanly extracts all fields, whereas plain text requires fragile regex or pattern matching.
+
+**Recommendation:** Audit all configured services and enable JSON logging where supported. Candidates to check include:
+- Caddy (already JSON by default)
+- Prometheus / Alertmanager / Loki / Tempo
+- Grafana
+- NSD / Unbound
+- Home Assistant
+- NATS
+- Jellyfin
+- OpenBao (Vault)
+- Kanidm
+- Garage
+
+For each service, check whether it supports a JSON log format option and whether enabling it would break anything (e.g., log volume increase from verbose JSON, or dashboards that parse text format).
+
+### 7. Monitoring CNAME for Promtail Target
 
 **Problem:** Promtail hardcodes `monitoring01.home.2rjus.net:3100`. The VictoriaMetrics migration plan already addresses this by switching to a `monitoring` CNAME.
 
@@ -139,7 +157,8 @@ This enables queries like:
 | 2 | **Limits config** | Low | Medium | Do with retention - minimal additional effort |
 | 3 | **Promtail label fix** | Trivial | Low | Quick fix, do with other label changes |
 | 4 | **Journal priority → level** | Low-medium | Medium | Reliable level filtering across the fleet |
-| 5 | **Monitoring CNAME** | Low | Medium | Part of monitoring02 migration |
+| 5 | **JSON logging audit** | Low-medium | Medium | Audit services, enable JSON where supported |
+| 6 | **Monitoring CNAME** | Low | Medium | Part of monitoring02 migration |
 
 ## Implementation Steps