From c272ce6903846c25ae8acecd70e8445ad97c7115 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Torjus=20H=C3=A5kestad?=
Date: Mon, 9 Feb 2026 01:28:21 +0100
Subject: [PATCH] docs: document --debug flag and extraArgs module option

Add documentation for:
- --debug flag in Listener Flags table
- --heartbeat-interval flag (was missing)
- extraArgs NixOS module option
- New Troubleshooting section with debug logging examples and guidance
  for diagnosing metrics issues

Co-Authored-By: Claude Opus 4.5
---
 README.md | 54 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 54 insertions(+)

diff --git a/README.md b/README.md
index b5107c5..5553653 100644
--- a/README.md
+++ b/README.md
@@ -63,6 +63,8 @@ homelab-deploy listener \
 | `--discover-subject` | No | Discovery subject (default: `deploy.discover`) |
 | `--metrics-enabled` | No | Enable Prometheus metrics endpoint |
 | `--metrics-addr` | No | Metrics HTTP server address (default: `:9972`) |
+| `--heartbeat-interval` | No | Status update interval in seconds during deployment (default: 15) |
+| `--debug` | No | Enable debug logging for troubleshooting |
 
 #### Subject Templates
 
@@ -214,6 +216,7 @@ Add the module to your NixOS configuration:
 | `metrics.enable` | bool | `false` | Enable Prometheus metrics endpoint |
 | `metrics.address` | string | `":9972"` | Metrics HTTP server address |
 | `metrics.openFirewall` | bool | `false` | Open firewall for metrics port |
+| `extraArgs` | list of string | `[]` | Extra command line arguments (e.g., `["--debug"]`) |
 
 Default `deploySubjects`:
 ```nix
@@ -298,6 +301,57 @@ histogram_quantile(0.95, rate(homelab_deploy_deployment_duration_seconds_bucket[
 sum(homelab_deploy_deployment_in_progress)
 ```
 
+## Troubleshooting
+
+### Debug Logging
+
+Enable debug logging to diagnose issues with deployments or metrics:
+
+**CLI:**
+```bash
+homelab-deploy listener --debug \
+  --hostname myhost \
+  --tier prod \
+  --nats-url nats://nats.example.com:4222 \
+  --nkey-file /run/secrets/listener.nkey \
+  --flake-url git+https://git.example.com/user/nixos-configs.git \
+  --metrics-enabled
+```
+
+**NixOS module:**
+```nix
+services.homelab-deploy.listener = {
+  enable = true;
+  tier = "prod";
+  natsUrl = "nats://nats.example.com:4222";
+  nkeyFile = "/run/secrets/homelab-deploy-nkey";
+  flakeUrl = "git+https://git.example.com/user/nixos-configs.git";
+  metrics.enable = true;
+  extraArgs = [ "--debug" ];
+};
+```
+
+With debug logging enabled, the listener outputs detailed information about metrics recording:
+
+```json
+{"level":"DEBUG","msg":"recording deployment start metric","metrics_enabled":true}
+{"level":"DEBUG","msg":"recording deployment end metric (success)","action":"switch","success":true,"duration_seconds":120.5}
+```
+
+### Metrics Showing Zero
+
+If deployment metrics remain at zero after deployments:
+
+1. **Check metrics are enabled**: Verify `--metrics-enabled` is set and the metrics endpoint is accessible at `/metrics`
+
+2. **Enable debug logging**: Use `--debug` to confirm metrics recording is being called
+
+3. **Check deployment status**: Metrics are only recorded for deployments that complete (success or failure). Rejected requests (e.g., already running) increment the counter with `status="rejected"` but don't record duration
+
+4. **Check after restart**: After a successful `switch` deployment, the listener restarts. Metrics reset to zero in the new instance. The listener waits up to 60 seconds for a Prometheus scrape before restarting to capture the final metrics
+
+5. **Verify Prometheus scrape timing**: Ensure Prometheus scrapes frequently enough to capture metrics before the listener restarts
+
 ## Message Protocol
 
 ### Deploy Request