feat: implement nixos-exporter
Prometheus exporter for NixOS-specific metrics including:

- Generation collector: count, current, booted, age, config mismatch
- Flake collector: input age, input info, revision behind

Includes NixOS module, flake packaging, and documentation.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
142
README.md
Normal file
@@ -0,0 +1,142 @@
# nixos-exporter

A Prometheus exporter for NixOS-specific metrics. Exposes system state information that standard exporters don't cover: generation management, flake input freshness, and upgrade status.

## Installation

### As a flake

```nix
{
  inputs.nixos-exporter.url = "git+https://git.t-juice.club/torjus/nixos-exporter";

  outputs = { self, nixpkgs, nixos-exporter, ... }: {
    nixosConfigurations.myhost = nixpkgs.lib.nixosSystem {
      modules = [
        nixos-exporter.nixosModules.default
        {
          services.prometheus.exporters.nixos = {
            enable = true;
            flake = {
              enable = true;
              url = "github:myuser/myconfig";
            };
          };
        }
      ];
    };
  };
}
```

### Manual

```bash
nix build
./result/bin/nixos-exporter --listen=:9971
```

## CLI Flags

| Flag | Default | Description |
|------|---------|-------------|
| `--listen` | `:9971` | Address to listen on |
| `--collector.flake` | `false` | Enable flake collector |
| `--flake.url` | | Flake URL for revision comparison (required if flake collector enabled) |
| `--flake.check-interval` | `1h` | Interval between remote flake checks |

## NixOS Module Options

```nix
services.prometheus.exporters.nixos = {
  enable = true;
  port = 9971;
  listenAddress = "0.0.0.0";
  openFirewall = false;

  flake = {
    enable = false;
    url = "";  # Required if flake.enable = true
    checkInterval = "1h";
  };
};
```

## Metrics

### Generation Metrics (always enabled)

| Metric | Type | Description |
|--------|------|-------------|
| `nixos_generation_count` | Gauge | Total number of system generations |
| `nixos_current_generation` | Gauge | Currently active generation number |
| `nixos_booted_generation` | Gauge | Generation that was booted |
| `nixos_generation_age_seconds` | Gauge | Age of current generation in seconds |
| `nixos_config_mismatch` | Gauge | 1 if booted generation differs from current |
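
As a rough illustration of where these values could come from, the sketch below counts generation links and compares the current and booted system symlinks. It is not the exporter's actual code: the `system-<N>-link` naming under `/nix/var/nix/profiles` and the `/run/current-system` and `/run/booted-system` paths are the usual NixOS layout, but the function names and the exact comparison are assumptions made for this example.

```python
import os
import re

# Assumed naming scheme for system generation links, e.g. "system-42-link".
GENERATION_LINK = re.compile(r"^system-(\d+)-link$")


def generation_numbers(profiles_dir="/nix/var/nix/profiles"):
    """Return the sorted generation numbers found in profiles_dir."""
    numbers = []
    for name in os.listdir(profiles_dir):
        match = GENERATION_LINK.match(name)
        if match:
            numbers.append(int(match.group(1)))
    return sorted(numbers)


def config_mismatch(current="/run/current-system", booted="/run/booted-system"):
    """Return 1 if the two symlinks resolve to different targets, else 0."""
    return int(os.path.realpath(current) != os.path.realpath(booted))
```

Under these assumptions, `len(generation_numbers())` would correspond to `nixos_generation_count`, and `config_mismatch()` to `nixos_config_mismatch`.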

### Flake Metrics (optional)

| Metric | Type | Labels | Description |
|--------|------|--------|-------------|
| `nixos_flake_input_age_seconds` | Gauge | `input` | Age of flake input in seconds |
| `nixos_flake_input_info` | Gauge | `input`, `rev`, `type` | Info gauge with revision and type labels |
| `nixos_flake_revision_behind` | Gauge | | 1 if current system revision differs from remote latest |
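
To make the input age metric concrete, here is a minimal sketch that derives per-input ages from the JSON printed by `nix flake metadata --json`, which lists each locked input under `locks.nodes` with a `lastModified` Unix timestamp. Whether the exporter computes ages exactly this way is an assumption; `flake_input_ages` is a hypothetical helper name.

```python
import time


def flake_input_ages(metadata, now=None):
    """Map each locked input name to its age in seconds."""
    now = time.time() if now is None else now
    nodes = metadata.get("locks", {}).get("nodes", {})
    ages = {}
    for name, node in nodes.items():
        locked = node.get("locked")
        # The root node has no "locked" entry; skip it and any
        # node that lacks a timestamp.
        if not locked or "lastModified" not in locked:
            continue
        ages[name] = now - locked["lastModified"]
    return ages
```

Each entry of the returned mapping would become one `nixos_flake_input_age_seconds{input="..."}` sample, with `rev` and `type` from the same `locked` object available as labels for `nixos_flake_input_info`.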

## Example Prometheus Alerts

```yaml
groups:
  - name: nixos
    rules:
      - alert: NixOSConfigStale
        expr: nixos_generation_age_seconds > 7 * 24 * 3600
        for: 1h
        labels:
          severity: warning
        annotations:
          summary: "NixOS config on {{ $labels.instance }} is over 7 days old"

      - alert: NixOSRebootRequired
        expr: nixos_config_mismatch == 1
        for: 24h
        labels:
          severity: info
        annotations:
          summary: "{{ $labels.instance }} needs reboot to apply config"

      - alert: NixpkgsInputStale
        expr: nixos_flake_input_age_seconds{input="nixpkgs"} > 30 * 24 * 3600
        for: 1d
        labels:
          severity: info
        annotations:
          summary: "nixpkgs input on {{ $labels.instance }} is over 30 days old"

      - alert: NixOSRevisionBehind
        expr: nixos_flake_revision_behind == 1
        for: 1h
        labels:
          severity: info
        annotations:
          summary: "{{ $labels.instance }} is behind remote flake revision"
```

## Security Considerations

- The `/metrics` endpoint exposes system state and revision information. Only expose it on internal networks.
- The exporter runs as a non-root user and only reads symlinks and world-readable files.
- When the flake collector is enabled, the exporter executes `nix flake metadata` to fetch remote data.

## Known Limitations

- The `nixos_flake_revision_behind` metric relies on parsing the git hash from `/run/current-system/nixos-version`. The format of this file varies depending on the NixOS configuration:
  - Standard format: `25.11.20260203.e576e3c`
  - Custom format: `1994-294a625`

  If your system uses a non-standard format that doesn't end with a git hash, the revision comparison may not work correctly.

- Flake input ages reflect the remote flake state rather than the deployed system. If the deployed system is behind, the reported timestamps will be newer than what is actually deployed.
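
The first limitation is easier to see with a small sketch of suffix-based hash extraction. This is an illustration under assumptions, not the exporter's parser: the regular expression, the 7 to 40 character length bound, and the helper names `extract_rev` and `revision_behind` are all made up for this example.

```python
import re

# Trailing run of 7 to 40 lowercase hex characters preceded by "." or "-".
_REV_SUFFIX = re.compile(r"[.-]([0-9a-f]{7,40})$")


def extract_rev(version):
    """Return the trailing hash-like suffix of a version string, or None."""
    match = _REV_SUFFIX.search(version.strip())
    return match.group(1) if match else None


def revision_behind(local_rev, remote_rev):
    """Return 1 if the full remote revision does not start with the local short hash."""
    return int(not remote_rev.startswith(local_rev))
```

With this pattern, both formats listed above yield their short hash, while a string such as `25.11pre-git` yields `None`. A version that ends in a purely numeric component of seven or more digits, for example a date, would be mistaken for a hash, which is the kind of mismatch the limitation describes.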

## License

MIT