# nixos-exporter

A Prometheus exporter for NixOS-specific metrics. Exposes system state information that standard exporters don't cover: generation management, flake input freshness, and upgrade status.

## Installation

### As a flake

```nix
{
  inputs.nixos-exporter.url = "git+https://git.t-juice.club/torjus/nixos-exporter";

  outputs = { self, nixpkgs, nixos-exporter, ... }: {
    nixosConfigurations.myhost = nixpkgs.lib.nixosSystem {
      modules = [
        nixos-exporter.nixosModules.default
        {
          services.prometheus.exporters.nixos = {
            enable = true;
            flake = {
              enable = true;
              url = "github:myuser/myconfig";
            };
          };
        }
      ];
    };
  };
}
```

### Manual

```bash
nix build
./result/bin/nixos-exporter --listen=:9971
```

## CLI Flags

| Flag | Default | Description |
|------|---------|-------------|
| `--listen` | `:9971` | Address to listen on |
| `--collector.flake` | `false` | Enable flake collector |
| `--flake.url` | | Flake URL for revision comparison (required if flake collector enabled) |
| `--flake.check-interval` | `1h` | Interval between remote flake checks |
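
As a rough illustration of how these flags combine, the sketch below assembles a command line that enables the flake collector. The flag names come from the table above; the flake URL and the `30m` interval are placeholder values, and passing `--collector.flake` without an explicit value is an assumption about how the boolean flag is parsed.

```bash
#!/usr/bin/env bash
# Flag names are from the CLI flags table; the URL and interval are placeholders.
args=(
  --listen=:9971
  --collector.flake                    # assumed to work without "=true"
  --flake.url=github:myuser/myconfig
  --flake.check-interval=30m
)

# Print the command for review; replace `echo` with `exec` to start the exporter.
echo ./result/bin/nixos-exporter "${args[@]}"
```

Keeping the flags in an array makes it easy to add or drop the flake-related options as a group.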

## NixOS Module Options

```nix
services.prometheus.exporters.nixos = {
  enable = true;
  port = 9971;
  listenAddress = "0.0.0.0";
  openFirewall = false;

  flake = {
    enable = false;
    url = "";  # Required if flake.enable = true
    checkInterval = "1h";
  };
};
```

## Metrics

### Generation Metrics (always enabled)

| Metric | Type | Description |
|--------|------|-------------|
| `nixos_generation_count` | Gauge | Total number of system generations |
| `nixos_current_generation` | Gauge | Currently active generation number |
| `nixos_booted_generation` | Gauge | Generation that was booted |
| `nixos_generation_age_seconds` | Gauge | Age of current generation in seconds |
| `nixos_config_mismatch` | Gauge | 1 if booted generation differs from current |

### Flake Metrics (optional)

| Metric | Type | Labels | Description |
|--------|------|--------|-------------|
| `nixos_flake_input_age_seconds` | Gauge | `input` | Age of flake input in seconds |
| `nixos_flake_input_info` | Gauge | `input`, `rev`, `type` | Info gauge with revision and type labels |
| `nixos_flake_info` | Gauge | `current_rev`, `remote_rev` | Info gauge with current and remote flake revisions |
| `nixos_flake_revision_behind` | Gauge | | 1 if current system revision differs from remote latest |
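
To see which revision is deployed versus available, the labels of `nixos_flake_info` can be read directly from the exporter's text output. The following is a minimal sketch: `label_value` is a hypothetical helper, the sample line is made up for illustration (the real label order and sample value may differ), and the commented `curl` call assumes the exporter is listening on `localhost:9971`.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the value of label $1 from any
# nixos_flake_info line read on stdin (Prometheus text format).
label_value() {
  sed -n "s/^nixos_flake_info{.*$1=\"\([^\"]*\)\".*/\1/p"
}

# Illustrative sample line, not captured from a real exporter.
metrics='nixos_flake_info{current_rev="294a625",remote_rev="e576e3c"} 1'

current=$(printf '%s\n' "$metrics" | label_value current_rev)
remote=$(printf '%s\n' "$metrics" | label_value remote_rev)
echo "deployed=$current available=$remote"

# Against a live exporter (assumed address):
# curl -s http://localhost:9971/metrics | label_value remote_rev
```

Each label is extracted separately, so the helper does not depend on the order in which the labels appear.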

## Example Prometheus Alerts

```yaml
groups:
  - name: nixos
    rules:
      - alert: NixOSConfigStale
        expr: nixos_generation_age_seconds > 7 * 24 * 3600
        for: 1h
        labels:
          severity: warning
        annotations:
          summary: "NixOS config on {{ $labels.instance }} is over 7 days old"

      - alert: NixOSRebootRequired
        expr: nixos_config_mismatch == 1
        for: 24h
        labels:
          severity: info
        annotations:
          summary: "{{ $labels.instance }} needs reboot to apply config"

      - alert: NixpkgsInputStale
        expr: nixos_flake_input_age_seconds{input="nixpkgs"} > 30 * 24 * 3600
        for: 1d
        labels:
          severity: info
        annotations:
          summary: "nixpkgs input on {{ $labels.instance }} is over 30 days old"

      - alert: NixOSRevisionBehind
        expr: nixos_flake_revision_behind == 1
        for: 1h
        labels:
          severity: info
        annotations:
          summary: "{{ $labels.instance }} is behind remote flake revision"
```

## Security Considerations

- The `/metrics` endpoint exposes system state and revision information. Only expose it on internal networks.
- Runs as non-root user; only reads symlinks and files that are world-readable.
- When using the flake collector, the exporter executes `nix flake metadata` to fetch remote data.

## Known Limitations

- The `nixos_flake_info` and `nixos_flake_revision_behind` metrics rely on parsing the git hash from `/run/current-system/nixos-version`. The format of this file varies depending on NixOS configuration:
  - Standard format: `25.11.20260203.e576e3c`
  - Custom format: `1994-294a625`

  If your system uses a non-standard format that doesn't end with a git hash, the revision comparison may not work correctly.

- Flake input ages reflect the remote flake state. If the deployed system is behind, these will show newer timestamps than what's actually deployed.
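
To make the first limitation concrete, here is a minimal sketch of pulling a trailing hash out of a version string. It is not the exporter's actual parsing code: `extract_rev` is a hypothetical helper, and the rule it applies (take the text after the last `.` or `-` and accept it only if it is 7 to 40 lowercase hex characters) is an assumption chosen to fit the two formats listed above.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the trailing revision of a nixos-version
# string, or return non-zero if the last segment is not a hex hash.
extract_rev() {
  local version="$1"
  # Drop everything up to and including the last '.' or '-'.
  local rev="${version##*[.-]}"
  # Accept only 7 to 40 lowercase hex characters.
  if [[ "$rev" =~ ^[0-9a-f]{7,40}$ ]]; then
    printf '%s\n' "$rev"
  else
    return 1
  fi
}

extract_rev "25.11.20260203.e576e3c"   # prints e576e3c
extract_rev "1994-294a625"             # prints 294a625
extract_rev "25.11-custom" || echo "no revision found"
```

Under this rule, a version string whose last segment merely looks like hex (for example a bare date such as `20260203`) would be mistaken for a revision, which is one way a non-standard format can break the comparison.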

## License

MIT