This repository has been archived on 2026-03-09. You can view files and clone it. You cannot open issues or pull requests or push a commit.
Torjus Håkestad 14f5b31faf feat: add builder mode for centralized Nix builds
Add a new "builder" capability to trigger Nix builds on a dedicated
build host via NATS messaging. This allows pre-building NixOS
configurations before deployment.

New components:
- Builder mode: subscribes to build.<repo>.* subjects, executes nix build
- Build CLI command: triggers builds with progress tracking
- MCP build tool: available with --enable-builds flag
- Builder metrics: tracks build success/failure per repo and host
- NixOS module: services.homelab-deploy.builder

The builder uses a YAML config file to define allowed repositories
with their URLs and default branches. Builds can target all hosts
or specific hosts, with real-time progress updates.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-10 22:03:14 +01:00


# homelab-deploy
A message-based deployment system for NixOS configurations, built on NATS. It deploys configurations across a fleet of hosts with support for tiered access control, role-based targeting, and AI assistant integration.
## Overview
The `homelab-deploy` binary provides four operational modes:
1. **Listener mode** - Runs on each NixOS host as a systemd service, subscribing to NATS subjects and executing `nixos-rebuild` when deployment requests arrive
2. **Builder mode** - Runs on a dedicated build host, subscribing to NATS subjects and executing `nix build` to pre-build configurations
3. **MCP mode** - Runs as an MCP (Model Context Protocol) server, exposing deployment tools for AI assistants
4. **CLI mode** - Manual deployment and build commands for administrators
## Installation
### Using Nix Flakes
```bash
# Run directly
nix run github:torjus/homelab-deploy -- --help
```
Or add it to your flake inputs:
```nix
{
  inputs.homelab-deploy.url = "github:torjus/homelab-deploy";
}
```
### Building from source
```bash
nix develop
go build ./cmd/homelab-deploy
```
## CLI Usage
### Listener Mode
Run on each NixOS host to listen for deployment requests:
```bash
homelab-deploy listener \
  --hostname myhost \
  --tier prod \
  --nats-url nats://nats.example.com:4222 \
  --nkey-file /run/secrets/listener.nkey \
  --flake-url git+https://git.example.com/user/nixos-configs.git \
  --role dns \
  --timeout 600
```
#### Listener Flags
| Flag | Required | Description |
|------|----------|-------------|
| `--hostname` | Yes | Hostname for this listener |
| `--tier` | Yes | Deployment tier (`test` or `prod`) |
| `--nats-url` | Yes | NATS server URL |
| `--nkey-file` | Yes | Path to NKey seed file |
| `--flake-url` | Yes | Git flake URL for nixos-rebuild |
| `--role` | No | Role for role-based targeting |
| `--timeout` | No | Deployment timeout in seconds (default: 600) |
| `--deploy-subject` | No | NATS subjects to subscribe to (repeatable) |
| `--discover-subject` | No | Discovery subject (default: `deploy.discover`) |
| `--metrics-enabled` | No | Enable Prometheus metrics endpoint |
| `--metrics-addr` | No | Metrics HTTP server address (default: `:9972`) |
#### Subject Templates
Deploy subjects support template variables that are expanded at startup:
- `<hostname>` - The listener's hostname
- `<tier>` - The listener's tier
- `<role>` - The listener's role (subjects with `<role>` are skipped if role is not set)
Default subjects:
```
deploy.<tier>.<hostname>
deploy.<tier>.all
deploy.<tier>.role.<role>
```
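The expansion rules above can be sketched in a few lines. This is an illustrative Python sketch, not the project's Go implementation; the names `DEFAULT_SUBJECTS` and `expand_subjects` are hypothetical.

```python
from typing import List, Optional

# The three default templates listed above.
DEFAULT_SUBJECTS = [
    "deploy.<tier>.<hostname>",
    "deploy.<tier>.all",
    "deploy.<tier>.role.<role>",
]


def expand_subjects(
    templates: List[str], hostname: str, tier: str, role: Optional[str] = None
) -> List[str]:
    """Expand <hostname>, <tier>, and <role> in each template.

    Templates that mention <role> are skipped when no role is set,
    matching the behaviour described in the list above.
    """
    expanded = []
    for template in templates:
        if "<role>" in template and not role:
            continue
        subject = template.replace("<hostname>", hostname).replace("<tier>", tier)
        if role:
            subject = subject.replace("<role>", role)
        expanded.append(subject)
    return expanded


if __name__ == "__main__":
    print(expand_subjects(DEFAULT_SUBJECTS, "myhost", "prod", "dns"))
    print(expand_subjects(DEFAULT_SUBJECTS, "myhost", "test"))
```

A host without a role ends up with two subscriptions instead of three, which is why `--role` can stay optional.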
### Deploy Command
Deploy to hosts via NATS:
```bash
# Deploy to a specific host
homelab-deploy deploy deploy.prod.myhost \
  --nats-url nats://nats.example.com:4222 \
  --nkey-file /run/secrets/deployer.nkey \
  --branch main \
  --action switch

# Deploy to all test hosts
homelab-deploy deploy deploy.test.all \
  --nats-url nats://nats.example.com:4222 \
  --nkey-file /run/secrets/deployer.nkey

# Deploy to all prod DNS servers
homelab-deploy deploy deploy.prod.role.dns \
  --nats-url nats://nats.example.com:4222 \
  --nkey-file /run/secrets/deployer.nkey
```
#### Deploy Flags
| Flag | Required | Env Var | Description |
|------|----------|---------|-------------|
| `--nats-url` | Yes | `HOMELAB_DEPLOY_NATS_URL` | NATS server URL |
| `--nkey-file` | Yes | `HOMELAB_DEPLOY_NKEY_FILE` | Path to NKey seed file |
| `--branch` | No | `HOMELAB_DEPLOY_BRANCH` | Git branch or commit (default: `master`) |
| `--action` | No | `HOMELAB_DEPLOY_ACTION` | nixos-rebuild action (default: `switch`) |
| `--timeout` | No | `HOMELAB_DEPLOY_TIMEOUT` | Response timeout in seconds (default: 900) |
#### Subject Aliases
Configure aliases via environment variables to simplify common deployments:
```bash
export HOMELAB_DEPLOY_ALIAS_TEST="deploy.test.all"
export HOMELAB_DEPLOY_ALIAS_PROD="deploy.prod.all"
export HOMELAB_DEPLOY_ALIAS_PROD_DNS="deploy.prod.role.dns"
# Now use short aliases
homelab-deploy deploy test --nats-url ... --nkey-file ...
homelab-deploy deploy prod-dns --nats-url ... --nkey-file ...
```
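An alias is looked up in the environment variable `HOMELAB_DEPLOY_ALIAS_<NAME>`, where `<NAME>` is the alias uppercased with hyphens replaced by underscores. For example, `prod-dns` is read from `HOMELAB_DEPLOY_ALIAS_PROD_DNS`.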
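
The lookup rule can be sketched as follows. This is a Python illustration rather than the project's code; `resolve_subject` is a hypothetical name, and falling back to the literal argument when no alias is defined is an assumption.

```python
import os
from typing import Mapping, Optional

ALIAS_PREFIX = "HOMELAB_DEPLOY_ALIAS_"


def resolve_subject(target: str, environ: Optional[Mapping[str, str]] = None) -> str:
    """Return the subject an alias points to.

    The alias is uppercased and hyphens become underscores before the
    environment lookup. Assumption: if no matching variable exists, the
    argument is treated as a literal NATS subject.
    """
    env = os.environ if environ is None else environ
    key = ALIAS_PREFIX + target.upper().replace("-", "_")
    return env.get(key, target)


if __name__ == "__main__":
    example_env = {"HOMELAB_DEPLOY_ALIAS_PROD_DNS": "deploy.prod.role.dns"}
    print(resolve_subject("prod-dns", example_env))
    print(resolve_subject("deploy.test.all", example_env))
```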
### Builder Mode
Run on a dedicated build host to pre-build NixOS configurations:
```bash
homelab-deploy builder \
  --nats-url nats://nats.example.com:4222 \
  --nkey-file /run/secrets/builder.nkey \
  --config /etc/homelab-deploy/builder.yaml \
  --timeout 1800 \
  --metrics-enabled \
  --metrics-addr :9973
```
#### Builder Configuration File
The builder uses a YAML configuration file to define allowed repositories:
```yaml
repos:
  nixos-servers:
    url: "git+https://git.example.com/org/nixos-servers.git"
    default_branch: "master"
  homelab:
    url: "git+ssh://git@github.com/user/homelab.git"
    default_branch: "main"
```
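To illustrate how such an allow-list might be consulted, the sketch below mirrors the YAML above as a plain Python dict (to avoid a YAML dependency) and resolves a repository name to a URL and branch. The function name `resolve_build` and the choice of `KeyError` for unknown repositories are assumptions, not the builder's actual behaviour.

```python
from typing import Optional, Tuple

# Same content as the YAML example, written as a dict for a dependency-free sketch.
BUILDER_CONFIG = {
    "repos": {
        "nixos-servers": {
            "url": "git+https://git.example.com/org/nixos-servers.git",
            "default_branch": "master",
        },
        "homelab": {
            "url": "git+ssh://git@github.com/user/homelab.git",
            "default_branch": "main",
        },
    }
}


def resolve_build(config: dict, repo: str, branch: Optional[str] = None) -> Tuple[str, str]:
    """Return (url, branch) for an allowed repository.

    Falls back to the repository's default_branch when no branch is given.
    Unknown repositories are refused (hypothetically, with a KeyError).
    """
    repos = config.get("repos", {})
    if repo not in repos:
        raise KeyError(f"repository {repo!r} is not in the builder config")
    entry = repos[repo]
    return entry["url"], branch or entry["default_branch"]


if __name__ == "__main__":
    print(resolve_build(BUILDER_CONFIG, "homelab"))
    print(resolve_build(BUILDER_CONFIG, "nixos-servers", "feature-x"))
```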
#### Builder Flags
| Flag | Required | Description |
|------|----------|-------------|
| `--nats-url` | Yes | NATS server URL |
| `--nkey-file` | Yes | Path to NKey seed file |
| `--config` | Yes | Path to builder configuration file |
| `--timeout` | No | Build timeout per host in seconds (default: 1800) |
| `--metrics-enabled` | No | Enable Prometheus metrics endpoint |
| `--metrics-addr` | No | Metrics HTTP server address (default: `:9973`) |
### Build Command
Trigger a build on the build server:
```bash
# Build all hosts in a repository
homelab-deploy build nixos-servers --all \
  --nats-url nats://nats.example.com:4222 \
  --nkey-file /run/secrets/deployer.nkey

# Build a specific host
homelab-deploy build nixos-servers myhost \
  --nats-url nats://nats.example.com:4222 \
  --nkey-file /run/secrets/deployer.nkey

# Build with a specific branch
homelab-deploy build nixos-servers --all --branch feature-x \
  --nats-url nats://nats.example.com:4222 \
  --nkey-file /run/secrets/deployer.nkey

# JSON output for scripting
homelab-deploy build nixos-servers --all --json \
  --nats-url nats://nats.example.com:4222 \
  --nkey-file /run/secrets/deployer.nkey
```
#### Build Flags
| Flag | Required | Env Var | Description |
|------|----------|---------|-------------|
| `--nats-url` | Yes | `HOMELAB_DEPLOY_NATS_URL` | NATS server URL |
| `--nkey-file` | Yes | `HOMELAB_DEPLOY_NKEY_FILE` | Path to NKey seed file |
| `--branch` | No | `HOMELAB_DEPLOY_BRANCH` | Git branch (uses repo default if not specified) |
| `--all` | No | - | Build all hosts in the repository |
| `--timeout` | No | `HOMELAB_DEPLOY_BUILD_TIMEOUT` | Response timeout in seconds (default: 3600) |
| `--json` | No | - | Output results as JSON |
### MCP Server Mode
Run as an MCP server for AI assistant integration:
```bash
# Test-tier only access
homelab-deploy mcp \
  --nats-url nats://nats.example.com:4222 \
  --nkey-file /run/secrets/mcp.nkey

# With admin access to all tiers
homelab-deploy mcp \
  --nats-url nats://nats.example.com:4222 \
  --nkey-file /run/secrets/mcp.nkey \
  --enable-admin \
  --admin-nkey-file /run/secrets/admin.nkey

# With build tool enabled
homelab-deploy mcp \
  --nats-url nats://nats.example.com:4222 \
  --nkey-file /run/secrets/mcp.nkey \
  --enable-builds
```
#### MCP Tools
| Tool | Description |
|------|-------------|
| `deploy` | Deploy to test-tier hosts only |
| `deploy_admin` | Deploy to any tier (requires `--enable-admin`) |
| `list_hosts` | Discover available deployment targets |
| `build` | Trigger builds on the build server (requires `--enable-builds`) |
#### Tool Parameters
**deploy / deploy_admin:**
- `hostname` - Target specific host
- `all` - Deploy to all hosts (in tier)
- `role` - Deploy to hosts with this role
- `branch` - Git branch or commit (default: `master`)
- `action` - `switch`, `boot`, `test`, or `dry-activate` (default: `switch`)
- `tier` - Required for `deploy_admin` only
**list_hosts:**
- `tier` - Filter by tier (optional)
**build:**
- `repo` - Repository name (required, must match builder config)
- `target` - Target hostname (optional, defaults to all)
- `all` - Build all hosts (default if no target specified)
- `branch` - Git branch (uses repo default if not specified)
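
The `hostname`, `all`, and `role` parameters correspond to the three deploy subject forms used elsewhere in this README. A rough Python sketch of that mapping follows; `deploy_subject` is a hypothetical helper, and requiring exactly one selector is an assumption rather than documented behaviour.

```python
from typing import Optional

VALID_TIERS = ("test", "prod")


def deploy_subject(
    tier: str,
    hostname: Optional[str] = None,
    all_hosts: bool = False,
    role: Optional[str] = None,
) -> str:
    """Map deploy tool parameters to a NATS subject.

    Assumption: exactly one of hostname, all_hosts, or role must be given.
    """
    if tier not in VALID_TIERS:
        raise ValueError(f"unknown tier {tier!r}")
    selectors = [hostname is not None, bool(all_hosts), role is not None]
    if sum(selectors) != 1:
        raise ValueError("specify exactly one of hostname, all, or role")
    if hostname is not None:
        return f"deploy.{tier}.{hostname}"
    if all_hosts:
        return f"deploy.{tier}.all"
    return f"deploy.{tier}.role.{role}"


if __name__ == "__main__":
    # The plain `deploy` tool is limited to the test tier.
    print(deploy_subject("test", all_hosts=True))
    # `deploy_admin` passes the tier explicitly.
    print(deploy_subject("prod", role="dns"))
```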
## NixOS Module
Add the module to your NixOS configuration:
```nix
{
  inputs.homelab-deploy.url = "github:torjus/homelab-deploy";

  outputs = { self, nixpkgs, homelab-deploy, ... }: {
    nixosConfigurations.myhost = nixpkgs.lib.nixosSystem {
      modules = [
        homelab-deploy.nixosModules.default
        {
          services.homelab-deploy.listener = {
            enable = true;
            tier = "prod";
            role = "dns";
            natsUrl = "nats://nats.example.com:4222";
            nkeyFile = "/run/secrets/homelab-deploy-nkey";
            flakeUrl = "git+https://git.example.com/user/nixos-configs.git";
          };
        }
      ];
    };
  };
}
```
### Module Options
| Option | Type | Default | Description |
|--------|------|---------|-------------|
| `enable` | bool | `false` | Enable the listener service |
| `package` | package | from flake | Package to use |
| `hostname` | string | `config.networking.hostName` | Hostname for subject templates |
| `tier` | enum | required | `"test"` or `"prod"` |
| `role` | string | `null` | Role for role-based targeting |
| `natsUrl` | string | required | NATS server URL |
| `nkeyFile` | path | required | Path to NKey seed file |
| `flakeUrl` | string | required | Git flake URL |
| `timeout` | int | `600` | Deployment timeout in seconds |
| `deploySubjects` | list of string | see below | Subjects to subscribe to |
| `discoverSubject` | string | `"deploy.discover"` | Discovery subject |
| `environment` | attrs | `{}` | Additional environment variables |
| `metrics.enable` | bool | `false` | Enable Prometheus metrics endpoint |
| `metrics.address` | string | `":9972"` | Metrics HTTP server address |
| `metrics.openFirewall` | bool | `false` | Open firewall for metrics port |
Default `deploySubjects`:
```nix
[
  "deploy.<tier>.<hostname>"
  "deploy.<tier>.all"
  "deploy.<tier>.role.<role>"
]
```
### Builder Module Options
| Option | Type | Default | Description |
|--------|------|---------|-------------|
| `enable` | bool | `false` | Enable the builder service |
| `package` | package | from flake | Package to use |
| `natsUrl` | string | required | NATS server URL |
| `nkeyFile` | path | required | Path to NKey seed file |
| `configFile` | path | required | Path to builder configuration file |
| `timeout` | int | `1800` | Build timeout per host in seconds |
| `environment` | attrs | `{}` | Additional environment variables |
| `metrics.enable` | bool | `false` | Enable Prometheus metrics endpoint |
| `metrics.address` | string | `":9973"` | Metrics HTTP server address |
| `metrics.openFirewall` | bool | `false` | Open firewall for metrics port |
Example builder configuration:
```nix
services.homelab-deploy.builder = {
  enable = true;
  natsUrl = "nats://nats.example.com:4222";
  nkeyFile = "/run/secrets/homelab-deploy-builder-nkey";
  configFile = "/etc/homelab-deploy/builder.yaml";
  metrics = {
    enable = true;
    address = ":9973";
    openFirewall = true;
  };
};
```
## Prometheus Metrics
The listener can expose Prometheus metrics for monitoring deployment operations.
### Enabling Metrics
**CLI:**
```bash
homelab-deploy listener \
  --hostname myhost \
  --tier prod \
  --nats-url nats://nats.example.com:4222 \
  --nkey-file /run/secrets/listener.nkey \
  --flake-url git+https://git.example.com/user/nixos-configs.git \
  --metrics-enabled \
  --metrics-addr :9972
```
**NixOS module:**
```nix
services.homelab-deploy.listener = {
  enable = true;
  tier = "prod";
  natsUrl = "nats://nats.example.com:4222";
  nkeyFile = "/run/secrets/homelab-deploy-nkey";
  flakeUrl = "git+https://git.example.com/user/nixos-configs.git";
  metrics = {
    enable = true;
    address = ":9972";
    openFirewall = true; # Optional: open firewall for Prometheus scraping
  };
};
```
### Available Metrics
| Metric | Type | Labels | Description |
|--------|------|--------|-------------|
| `homelab_deploy_deployments_total` | Counter | `status`, `action`, `error_code` | Total deployment requests processed |
| `homelab_deploy_deployment_duration_seconds` | Histogram | `action`, `success` | Deployment execution time |
| `homelab_deploy_deployment_in_progress` | Gauge | - | 1 if deployment running, 0 otherwise |
| `homelab_deploy_info` | Gauge | `hostname`, `tier`, `role`, `version` | Static instance metadata |
**Label values:**
- `status`: `completed`, `failed`, `rejected`
- `action`: `switch`, `boot`, `test`, `dry-activate`
- `error_code`: `invalid_action`, `invalid_revision`, `already_running`, `build_failed`, `timeout`, or empty
- `success`: `true`, `false`
### HTTP Endpoints
| Endpoint | Description |
|----------|-------------|
| `/metrics` | Prometheus metrics in text format |
| `/health` | Health check (returns `ok`) |
### Example Prometheus Queries
```promql
# Average deployment duration (last hour)
rate(homelab_deploy_deployment_duration_seconds_sum[1h]) /
rate(homelab_deploy_deployment_duration_seconds_count[1h])
# Deployment success rate (last 24 hours)
sum(rate(homelab_deploy_deployments_total{status="completed"}[24h])) /
sum(rate(homelab_deploy_deployments_total{status=~"completed|failed"}[24h]))
# 95th percentile deployment time
histogram_quantile(0.95, rate(homelab_deploy_deployment_duration_seconds_bucket[1h]))
# Currently running deployments across all hosts
sum(homelab_deploy_deployment_in_progress)
```
### Builder Metrics
When running in builder mode, additional metrics are available:
| Metric | Type | Labels | Description |
|--------|------|--------|-------------|
| `homelab_deploy_builds_total` | Counter | `repo`, `status` | Total builds processed |
| `homelab_deploy_build_host_total` | Counter | `repo`, `host`, `status` | Total host builds processed |
| `homelab_deploy_build_duration_seconds` | Histogram | `repo`, `host` | Build execution time per host |
| `homelab_deploy_build_last_timestamp` | Gauge | `repo` | Timestamp of last build attempt |
| `homelab_deploy_build_last_success_timestamp` | Gauge | `repo` | Timestamp of last successful build |
| `homelab_deploy_build_last_failure_timestamp` | Gauge | `repo` | Timestamp of last failed build |
**Label values:**
- `status`: `success`, `failure`
- `repo`: Repository name from config
- `host`: Host name being built
## Message Protocol
### Deploy Request
```json
{
  "action": "switch",
  "revision": "main",
  "reply_to": "deploy.responses.abc123"
}
```
### Deploy Response
```json
{
  "hostname": "myhost",
  "status": "completed",
  "error": null,
  "message": "Successfully switched to generation 42"
}
```
**Status values:** `accepted`, `rejected`, `started`, `completed`, `failed`
**Error codes:** `invalid_revision`, `invalid_action`, `already_running`, `build_failed`, `timeout`
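
A client that receives several status updates per request needs to know when to stop waiting. The Python sketch below parses a deploy response and reports whether it looks final. Treating `completed`, `failed`, and `rejected` as terminal, and reading `error` as one of the error codes above, are assumptions; `summarize_response` is a hypothetical name.

```python
import json
from typing import Tuple

# Assumption: these statuses end a deployment; `accepted` and `started` do not.
TERMINAL_STATUSES = {"completed", "failed", "rejected"}


def summarize_response(raw: str) -> Tuple[bool, str]:
    """Parse one deploy response and return (is_terminal, one-line summary)."""
    msg = json.loads(raw)
    status = msg["status"]
    line = f'{msg["hostname"]}: {status}'
    if msg.get("error"):
        line += f' ({msg["error"]})'
    if msg.get("message"):
        line += f' - {msg["message"]}'
    return status in TERMINAL_STATUSES, line


if __name__ == "__main__":
    example = json.dumps(
        {
            "hostname": "myhost",
            "status": "completed",
            "error": None,
            "message": "Successfully switched to generation 42",
        }
    )
    print(summarize_response(example))
```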
### Build Request
```json
{
  "repo": "nixos-servers",
  "target": "all",
  "branch": "main",
  "reply_to": "build.responses.abc123"
}
```
### Build Response
```json
{
  "status": "completed",
  "message": "built 5/5 hosts successfully",
  "results": [
    {"host": "host1", "success": true, "duration_seconds": 120.5},
    {"host": "host2", "success": true, "duration_seconds": 95.3}
  ],
  "total_duration_seconds": 450.2,
  "succeeded": 5,
  "failed": 0
}
```
**Status values:** `started`, `progress`, `completed`, `failed`, `rejected`
Progress updates include `host`, `host_success`, `hosts_completed`, and `hosts_total` fields.
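
For scripting, a final build response can be reduced to a one-line summary and a list of failed hosts. The following Python sketch uses only the fields shown in the example response above; the function names are hypothetical.

```python
import json
from typing import List


def failed_hosts(raw: str) -> List[str]:
    """Return the hosts whose per-host result reports success == false."""
    msg = json.loads(raw)
    return [r["host"] for r in msg.get("results", []) if not r["success"]]


def build_summary(raw: str) -> str:
    """Format status, success ratio, and total duration on one line."""
    msg = json.loads(raw)
    total = msg["succeeded"] + msg["failed"]
    duration = msg["total_duration_seconds"]
    return f'{msg["status"]}: {msg["succeeded"]}/{total} hosts built in {duration:.1f}s'


if __name__ == "__main__":
    # Same values as the example response above.
    example = json.dumps(
        {
            "status": "completed",
            "message": "built 5/5 hosts successfully",
            "results": [
                {"host": "host1", "success": True, "duration_seconds": 120.5},
                {"host": "host2", "success": True, "duration_seconds": 95.3},
            ],
            "total_duration_seconds": 450.2,
            "succeeded": 5,
            "failed": 0,
        }
    )
    print(build_summary(example))
    print(failed_hosts(example))
```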
## NATS Authentication
All connections use NKey authentication. Generate keys with:
```bash
nk -gen user -pubout
```
Configure appropriate publish/subscribe permissions in your NATS server for each credential type.
## NATS Subject Structure
The deployment system uses the following NATS subject hierarchy:
### Deploy Subjects
| Subject Pattern | Purpose |
|-----------------|---------|
| `deploy.<tier>.<hostname>` | Deploy to a specific host |
| `deploy.<tier>.all` | Deploy to all hosts in a tier |
| `deploy.<tier>.role.<role>` | Deploy to hosts with a specific role in a tier |
**Tier values:** `test`, `prod`
**Examples:**
- `deploy.test.myhost` - Deploy to myhost in test tier
- `deploy.prod.all` - Deploy to all production hosts
- `deploy.prod.role.dns` - Deploy to all DNS servers in production
### Build Subjects
| Subject Pattern | Purpose |
|-----------------|---------|
| `build.<repo>.*` | Build requests for a repository |
| `build.<repo>.all` | Build all hosts in a repository |
| `build.<repo>.<hostname>` | Build a specific host |
### Response Subjects
| Subject Pattern | Purpose |
|-----------------|---------|
| `deploy.responses.<uuid>` | Unique reply subject for each deployment request |
| `build.responses.<uuid>` | Unique reply subject for each build request |
Deployers and build clients create a unique response subject for each request and include it in the `reply_to` field. Listeners and builders publish status updates to this subject.
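
The request side of this pattern can be sketched in Python as below. The sketch only builds subjects and JSON payloads; publishing them requires a NATS client, which is not shown. Using a random UUID for `<uuid>` and omitting `branch` when it is unset are assumptions, and both function names are hypothetical.

```python
import json
import uuid
from typing import Optional, Tuple


def new_deploy_request(action: str = "switch", revision: str = "master") -> Tuple[str, str]:
    """Return (reply_subject, json_payload) for a deploy request."""
    reply_to = f"deploy.responses.{uuid.uuid4()}"
    payload = json.dumps({"action": action, "revision": revision, "reply_to": reply_to})
    return reply_to, payload


def new_build_request(
    repo: str, target: str = "all", branch: Optional[str] = None
) -> Tuple[str, str, str]:
    """Return (request_subject, reply_subject, json_payload) for a build request."""
    reply_to = f"build.responses.{uuid.uuid4()}"
    body = {"repo": repo, "target": target, "reply_to": reply_to}
    if branch:
        body["branch"] = branch  # assumption: left out so the builder uses the repo default
    return f"build.{repo}.{target}", reply_to, json.dumps(body)


if __name__ == "__main__":
    reply_subject, payload = new_deploy_request(revision="main")
    print(reply_subject)
    print(payload)
    print(new_build_request("nixos-servers", "myhost"))
```

A client would subscribe to the reply subject before publishing the request, so that no early status update is missed.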
### Discovery Subject
| Subject Pattern | Purpose |
|-----------------|---------|
| `deploy.discover` | Host discovery requests and responses |
Used by the `list_hosts` MCP tool and for discovering available deployment targets.
## Example NATS Configuration
Below is an example NATS server configuration implementing tiered authentication. This setup provides:
- **Listeners** - Each host has credentials to subscribe to its own subjects and publish responses
- **Test deployer** - Can deploy to test tier only (suitable for MCP without admin access)
- **Admin deployer** - Can deploy to all tiers (for CLI or MCP with admin access)
```conf
authorization {
  users = [
    # Listener for a test-tier host
    {
      nkey: "UTEST_HOST1_PUBLIC_KEY_HERE"
      permissions: {
        subscribe: [
          "deploy.test.testhost1"
          "deploy.test.all"
          "deploy.test.role.>"
          "deploy.discover"
        ]
        publish: [
          "deploy.responses.>"
          "deploy.discover"
        ]
      }
    }

    # Listener for a prod-tier host with 'dns' role
    {
      nkey: "UPROD_DNS1_PUBLIC_KEY_HERE"
      permissions: {
        subscribe: [
          "deploy.prod.dns1"
          "deploy.prod.all"
          "deploy.prod.role.dns"
          "deploy.discover"
        ]
        publish: [
          "deploy.responses.>"
          "deploy.discover"
        ]
      }
    }

    # Test-tier deployer (MCP without admin)
    {
      nkey: "UTEST_DEPLOYER_PUBLIC_KEY_HERE"
      permissions: {
        publish: [
          "deploy.test.>"
          "deploy.discover"
        ]
        subscribe: [
          "deploy.responses.>"
          "deploy.discover"
        ]
      }
    }

    # Admin deployer (full access to all tiers)
    {
      nkey: "UADMIN_DEPLOYER_PUBLIC_KEY_HERE"
      permissions: {
        publish: [
          "deploy.>"
        ]
        subscribe: [
          "deploy.>"
        ]
      }
    }
  ]
}
```
### Key Permission Patterns
| Credential Type | Publish | Subscribe |
|-----------------|---------|-----------|
| Listener | `deploy.responses.>`, `deploy.discover` | Own subjects, `deploy.discover` |
| Builder | `build.responses.>` | `build.<repo>.*` for each configured repo |
| Test deployer | `deploy.test.>`, `deploy.discover` | `deploy.responses.>`, `deploy.discover` |
| Build client | `build.<repo>.*` | `build.responses.>` |
| Admin deployer | `deploy.>` | `deploy.>` |
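
When reviewing a permission set, it helps to check which concrete subjects a pattern actually covers. In NATS, `*` matches exactly one token and `>` matches one or more trailing tokens. The Python sketch below reimplements that matching for offline checks; it is an illustration with hypothetical function names and not part of homelab-deploy or the NATS server.

```python
from typing import Iterable


def subject_matches(pattern: str, subject: str) -> bool:
    """Return True if a NATS subject pattern matches a concrete subject."""
    p_tokens = pattern.split(".")
    s_tokens = subject.split(".")
    for i, token in enumerate(p_tokens):
        if token == ">":
            # '>' is only valid as the last token and needs at least one token to match.
            return i == len(p_tokens) - 1 and len(s_tokens) > i
        if i >= len(s_tokens):
            return False
        if token != "*" and token != s_tokens[i]:
            return False
    return len(p_tokens) == len(s_tokens)


def is_allowed(patterns: Iterable[str], subject: str) -> bool:
    """Return True if any pattern in a permission list covers the subject."""
    return any(subject_matches(p, subject) for p in patterns)


if __name__ == "__main__":
    test_deployer_publish = ["deploy.test.>", "deploy.discover"]
    for subject in ("deploy.test.myhost", "deploy.prod.all", "deploy.discover"):
        print(subject, is_allowed(test_deployer_publish, subject))
```

Running it against the test deployer's publish list shows that `deploy.prod.all` is not covered, which is the property the tiered setup relies on.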
### Generating NKeys
```bash
# Generate a keypair (outputs public key, saves seed to file)
nk -gen user -pubout > mykey.pub
# The seed (private key) is printed to stderr - save it securely
# Or generate and save seed directly
nk -gen user > mykey.seed
nk -inkey mykey.seed -pubout # Get public key from seed
```
The public key (starting with `U`) goes in the NATS server config. The seed file (starting with `SU`) is used by homelab-deploy via `--nkey-file`.
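
Because the two values are easy to mix up, a small prefix check can catch a seed pasted into the server config before it is committed. This Python sketch relies only on the prefixes stated above (`U` for user public keys, `SU` for user seeds); it does not validate the key itself, and `classify_nkey` is a hypothetical helper.

```python
def classify_nkey(value: str) -> str:
    """Classify a user NKey string by its prefix only (no checksum validation)."""
    text = value.strip()
    if text.startswith("SU"):
        return "user seed"
    if text.startswith("U"):
        return "user public key"
    return "unknown"


def safe_for_server_config(value: str) -> bool:
    """Only public keys belong in the NATS server configuration."""
    return classify_nkey(value) == "user public key"


if __name__ == "__main__":
    # Placeholder strings, not real keys.
    print(classify_nkey("UTEST_HOST1_PUBLIC_KEY_HERE"))
    print(safe_for_server_config("SUEXAMPLE_SEED_PLACEHOLDER"))
```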
## License
MIT