feat: add builder mode for centralized Nix builds

Add a new "builder" capability to trigger Nix builds on a dedicated
build host via NATS messaging. This allows pre-building NixOS
configurations before deployment.

New components:
- Builder mode: subscribes to build.<repo>.* subjects, executes nix build
- Build CLI command: triggers builds with progress tracking
- MCP build tool: available with --enable-builds flag
- Builder metrics: tracks build success/failure per repo and host
- NixOS module: services.homelab-deploy.builder

The builder uses a YAML config file to define allowed repositories
with their URLs and default branches. Builds can target all hosts
or specific hosts, with real-time progress updates.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-10 22:03:14 +01:00
parent 277a49a666
commit 14f5b31faf
13 changed files with 1535 additions and 57 deletions

README.md

@@ -4,11 +4,12 @@ A message-based deployment system for NixOS configurations using NATS for messag
## Overview
The `homelab-deploy` binary provides three operational modes:
The `homelab-deploy` binary provides four operational modes:
1. **Listener mode** - Runs on each NixOS host as a systemd service, subscribing to NATS subjects and executing `nixos-rebuild` when deployment requests arrive
2. **MCP mode** - Runs as an MCP (Model Context Protocol) server, exposing deployment tools for AI assistants
3. **CLI mode** - Manual deployment commands for administrators
2. **Builder mode** - Runs on a dedicated build host, subscribing to NATS subjects and executing `nix build` to pre-build configurations
3. **MCP mode** - Runs as an MCP (Model Context Protocol) server, exposing deployment tools for AI assistants
4. **CLI mode** - Manual deployment and build commands for administrators
## Installation
@@ -128,6 +129,82 @@ homelab-deploy deploy prod-dns --nats-url ... --nkey-file ...
Alias lookup: `HOMELAB_DEPLOY_ALIAS_<NAME>` where name is uppercased and hyphens become underscores.
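The naming rule above can be sketched in a few lines. This is only an illustration in Python, not the tool's own code; the function name `alias_env_var` is hypothetical.

```python
def alias_env_var(alias: str) -> str:
    """Return the environment variable name consulted for an alias.

    Hypothetical helper: uppercases the alias, replaces hyphens with
    underscores, and prepends the documented HOMELAB_DEPLOY_ALIAS_ prefix.
    """
    return "HOMELAB_DEPLOY_ALIAS_" + alias.upper().replace("-", "_")


print(alias_env_var("prod-dns"))
```

For the alias `prod-dns` this yields `HOMELAB_DEPLOY_ALIAS_PROD_DNS`.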
### Builder Mode
Run on a dedicated build host to pre-build NixOS configurations:
```bash
homelab-deploy builder \
  --nats-url nats://nats.example.com:4222 \
  --nkey-file /run/secrets/builder.nkey \
  --config /etc/homelab-deploy/builder.yaml \
  --timeout 1800 \
  --metrics-enabled \
  --metrics-addr :9973
```
#### Builder Configuration File
The builder uses a YAML configuration file to define allowed repositories:
```yaml
repos:
  nixos-servers:
    url: "git+https://git.example.com/org/nixos-servers.git"
    default_branch: "master"
  homelab:
    url: "git+ssh://git@github.com/user/homelab.git"
    default_branch: "main"
```
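To illustrate how such a config acts as an allow-list, here is a minimal Python sketch. It assumes the YAML has already been parsed into a dictionary; the function name `resolve_repo` is hypothetical, and raising an error for unknown repositories is an assumption, not a description of the builder's actual behavior.

```python
# In-memory form of the example config above (parsing the YAML is out of scope).
CONFIG = {
    "repos": {
        "nixos-servers": {
            "url": "git+https://git.example.com/org/nixos-servers.git",
            "default_branch": "master",
        },
        "homelab": {
            "url": "git+ssh://git@github.com/user/homelab.git",
            "default_branch": "main",
        },
    }
}


def resolve_repo(config, repo, branch=None):
    """Return (url, branch) for a configured repo, falling back to its default branch."""
    repos = config.get("repos", {})
    if repo not in repos:
        # Assumption: a repo missing from the config is refused outright.
        raise ValueError(f"repository {repo!r} is not in the builder config")
    entry = repos[repo]
    return entry["url"], branch or entry["default_branch"]


print(resolve_repo(CONFIG, "nixos-servers"))
print(resolve_repo(CONFIG, "homelab", branch="feature-x"))
```

The first call falls back to `master`; the second keeps the explicitly requested `feature-x`.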
#### Builder Flags
| Flag | Required | Description |
|------|----------|-------------|
| `--nats-url` | Yes | NATS server URL |
| `--nkey-file` | Yes | Path to NKey seed file |
| `--config` | Yes | Path to builder configuration file |
| `--timeout` | No | Build timeout per host in seconds (default: 1800) |
| `--metrics-enabled` | No | Enable Prometheus metrics endpoint |
| `--metrics-addr` | No | Metrics HTTP server address (default: `:9973`) |
### Build Command
Trigger a build on the build server:
```bash
# Build all hosts in a repository
homelab-deploy build nixos-servers --all \
  --nats-url nats://nats.example.com:4222 \
  --nkey-file /run/secrets/deployer.nkey

# Build a specific host
homelab-deploy build nixos-servers myhost \
  --nats-url nats://nats.example.com:4222 \
  --nkey-file /run/secrets/deployer.nkey

# Build with a specific branch
homelab-deploy build nixos-servers --all --branch feature-x \
  --nats-url nats://nats.example.com:4222 \
  --nkey-file /run/secrets/deployer.nkey

# JSON output for scripting
homelab-deploy build nixos-servers --all --json \
  --nats-url nats://nats.example.com:4222 \
  --nkey-file /run/secrets/deployer.nkey
```
#### Build Flags
| Flag | Required | Env Var | Description |
|------|----------|---------|-------------|
| `--nats-url` | Yes | `HOMELAB_DEPLOY_NATS_URL` | NATS server URL |
| `--nkey-file` | Yes | `HOMELAB_DEPLOY_NKEY_FILE` | Path to NKey seed file |
| `--branch` | No | `HOMELAB_DEPLOY_BRANCH` | Git branch (uses repo default if not specified) |
| `--all` | No | - | Build all hosts in the repository |
| `--timeout` | No | `HOMELAB_DEPLOY_BUILD_TIMEOUT` | Response timeout in seconds (default: 3600) |
| `--json` | No | - | Output results as JSON |
### MCP Server Mode
Run as an MCP server for AI assistant integration:
@@ -144,6 +221,12 @@ homelab-deploy mcp \
  --nkey-file /run/secrets/mcp.nkey \
  --enable-admin \
  --admin-nkey-file /run/secrets/admin.nkey

# With build tool enabled
homelab-deploy mcp \
  --nats-url nats://nats.example.com:4222 \
  --nkey-file /run/secrets/mcp.nkey \
  --enable-builds
```
#### MCP Tools
@@ -153,6 +236,7 @@ homelab-deploy mcp \
| `deploy` | Deploy to test-tier hosts only |
| `deploy_admin` | Deploy to any tier (requires `--enable-admin`) |
| `list_hosts` | Discover available deployment targets |
| `build` | Trigger builds on the build server (requires `--enable-builds`) |
#### Tool Parameters
@@ -167,6 +251,12 @@ homelab-deploy mcp \
**list_hosts:**
- `tier` - Filter by tier (optional)
**build:**
- `repo` - Repository name (required, must match builder config)
- `target` - Target hostname (optional, defaults to all)
- `all` - Build all hosts (default if no target specified)
- `branch` - Git branch (uses repo default if not specified)
## NixOS Module
Add the module to your NixOS configuration:
@@ -224,6 +314,37 @@ Default `deploySubjects`:
]
```
### Builder Module Options
| Option | Type | Default | Description |
|--------|------|---------|-------------|
| `enable` | bool | `false` | Enable the builder service |
| `package` | package | from flake | Package to use |
| `natsUrl` | string | required | NATS server URL |
| `nkeyFile` | path | required | Path to NKey seed file |
| `configFile` | path | required | Path to builder configuration file |
| `timeout` | int | `1800` | Build timeout per host in seconds |
| `environment` | attrs | `{}` | Additional environment variables |
| `metrics.enable` | bool | `false` | Enable Prometheus metrics endpoint |
| `metrics.address` | string | `":9973"` | Metrics HTTP server address |
| `metrics.openFirewall` | bool | `false` | Open firewall for metrics port |
Example builder configuration:
```nix
services.homelab-deploy.builder = {
  enable = true;
  natsUrl = "nats://nats.example.com:4222";
  nkeyFile = "/run/secrets/homelab-deploy-builder-nkey";
  configFile = "/etc/homelab-deploy/builder.yaml";
  metrics = {
    enable = true;
    address = ":9973";
    openFirewall = true;
  };
};
```
## Prometheus Metrics
The listener can expose Prometheus metrics for monitoring deployment operations.
@@ -298,6 +419,24 @@ histogram_quantile(0.95, rate(homelab_deploy_deployment_duration_seconds_bucket[
sum(homelab_deploy_deployment_in_progress)
```
### Builder Metrics
When running in builder mode, additional metrics are available:
| Metric | Type | Labels | Description |
|--------|------|--------|-------------|
| `homelab_deploy_builds_total` | Counter | `repo`, `status` | Total builds processed |
| `homelab_deploy_build_host_total` | Counter | `repo`, `host`, `status` | Total host builds processed |
| `homelab_deploy_build_duration_seconds` | Histogram | `repo`, `host` | Build execution time per host |
| `homelab_deploy_build_last_timestamp` | Gauge | `repo` | Timestamp of last build attempt |
| `homelab_deploy_build_last_success_timestamp` | Gauge | `repo` | Timestamp of last successful build |
| `homelab_deploy_build_last_failure_timestamp` | Gauge | `repo` | Timestamp of last failed build |
**Label values:**
- `status`: `success`, `failure`
- `repo`: Repository name from config
- `host`: Host name being built
## Message Protocol
### Deploy Request
@@ -325,6 +464,37 @@ sum(homelab_deploy_deployment_in_progress)
**Error codes:** `invalid_revision`, `invalid_action`, `already_running`, `build_failed`, `timeout`
### Build Request
```json
{
  "repo": "nixos-servers",
  "target": "all",
  "branch": "main",
  "reply_to": "build.responses.abc123"
}
```
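A client could assemble this payload roughly as follows. This is a hedged Python sketch using only the standard library; `new_build_request` is a hypothetical name, and omitting `branch` when none is given is an assumption about how "use the repo default" is signalled on the wire.

```python
import json
import uuid


def new_build_request(repo, target="all", branch=None):
    """Build a request dict with a fresh, unique reply subject."""
    request = {
        "repo": repo,
        "target": target,
        # One response subject per request, following build.responses.<uuid>.
        "reply_to": f"build.responses.{uuid.uuid4()}",
    }
    if branch is not None:
        # Assumption: leaving the field out lets the builder pick the default branch.
        request["branch"] = branch
    return request


req = new_build_request("nixos-servers", branch="main")
print(json.dumps(req, indent=2))
```

The caller would subscribe to `req["reply_to"]` before publishing, so that no status update is missed.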
### Build Response
```json
{
  "status": "completed",
  "message": "built 5/5 hosts successfully",
  "results": [
    {"host": "host1", "success": true, "duration_seconds": 120.5},
    {"host": "host2", "success": true, "duration_seconds": 95.3}
  ],
  "total_duration_seconds": 450.2,
  "succeeded": 5,
  "failed": 0
}
```
```
**Status values:** `started`, `progress`, `completed`, `failed`, `rejected`
Progress updates include `host`, `host_success`, `hosts_completed`, and `hosts_total` fields.
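A consumer of these messages might distinguish intermediate updates from final ones as sketched below. The Python is illustrative only: `summarize_build_response` is a hypothetical helper, and treating `completed`, `failed`, and `rejected` as the terminal statuses is an assumption inferred from their names.

```python
import json

# Assumption: these statuses end a build; "started" and "progress" do not.
TERMINAL_STATUSES = {"completed", "failed", "rejected"}


def summarize_build_response(raw):
    """Decode one response message and report whether the build is finished."""
    resp = json.loads(raw)
    status = resp["status"]
    failed_hosts = [r["host"] for r in resp.get("results", []) if not r.get("success")]
    return {
        "done": status in TERMINAL_STATUSES,
        "ok": status == "completed" and resp.get("failed", 0) == 0,
        "failed_hosts": failed_hosts,
    }


progress = json.dumps({
    "status": "progress", "host": "host1", "host_success": True,
    "hosts_completed": 1, "hosts_total": 2,
})
final = json.dumps({
    "status": "completed",
    "results": [
        {"host": "host1", "success": True, "duration_seconds": 120.5},
        {"host": "host2", "success": True, "duration_seconds": 95.3},
    ],
    "succeeded": 2,
    "failed": 0,
})

print(summarize_build_response(progress))
print(summarize_build_response(final))
```

A client loop would keep reading from the reply subject until `done` is true or its own timeout expires.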
## NATS Authentication
All connections use NKey authentication. Generate keys with:
@@ -354,13 +524,22 @@ The deployment system uses the following NATS subject hierarchy:
- `deploy.prod.all` - Deploy to all production hosts
- `deploy.prod.role.dns` - Deploy to all DNS servers in production
### Build Subjects
| Subject Pattern | Purpose |
|-----------------|---------|
| `build.<repo>.*` | Build requests for a repository |
| `build.<repo>.all` | Build all hosts in a repository |
| `build.<repo>.<hostname>` | Build a specific host |
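
The subject for a request follows directly from the table. The Python sketch below is an illustration under assumptions: `build_subject` is a hypothetical name, and the token check reflects general NATS subject rules (tokens are separated by `.`, and `*` and `>` are wildcards) rather than any validation the tool is known to perform.

```python
def build_subject(repo, host=None):
    """Return build.<repo>.all, or build.<repo>.<hostname> when a host is given."""
    target = host if host else "all"
    for token in (repo, target):
        # Reject empty tokens and characters with special meaning in NATS subjects.
        if not token or any(c in token for c in ".*> \t"):
            raise ValueError(f"invalid subject token: {token!r}")
    return f"build.{repo}.{target}"


print(build_subject("nixos-servers"))
print(build_subject("nixos-servers", "myhost"))
```

A builder subscribed to `build.nixos-servers.*` would receive requests published to either subject.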
### Response Subjects
| Subject Pattern | Purpose |
|-----------------|---------|
| `deploy.responses.<uuid>` | Unique reply subject for each deployment request |
| `build.responses.<uuid>` | Unique reply subject for each build request |
Deployers create a unique response subject for each request and include it in the `reply_to` field. Listeners publish status updates to this subject.
Deployers and build clients create a unique response subject for each request and include it in the `reply_to` field. Listeners and builders publish status updates to this subject.
### Discovery Subject
@@ -451,7 +630,9 @@ authorization {
| Credential Type | Publish | Subscribe |
|-----------------|---------|-----------|
| Listener | `deploy.responses.>`, `deploy.discover` | Own subjects, `deploy.discover` |
| Builder | `build.responses.>` | `build.<repo>.*` for each configured repo |
| Test deployer | `deploy.test.>`, `deploy.discover` | `deploy.responses.>`, `deploy.discover` |
| Build client | `build.<repo>.*` | `build.responses.>` |
| Admin deployer | `deploy.>` | `deploy.>` |
### Generating NKeys