docs: update design with configurable subjects and improved module

- Add configurable NATS subject patterns with template variables
  (<hostname>, <tier>, <role>) for multi-tenant setups
- Add deploy.discover subject for host discovery
- Simplify CLI to use direct subjects with optional aliases via
  HOMELAB_DEPLOY_ALIAS_* environment variables
- Clarify request/reply flow with UUID-based response subjects
- Expand NixOS module with hardening options, package option,
  and configurable deploy/discover subjects
- Switch CLI framework from cobra to urfave/cli/v3

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-07 03:52:01 +01:00
parent 1460cc533d
commit 1f23a6ddc9
2 changed files with 196 additions and 47 deletions

@@ -40,7 +40,7 @@ Subjects follow `deploy.<tier>.<target>`:
### Planned Package Structure ### Planned Package Structure
``` ```
cmd/homelab-deploy/main.go # CLI entrypoint with cobra subcommands cmd/homelab-deploy/main.go # CLI entrypoint with urfave/cli subcommands
internal/listener/ # Listener mode (NATS subscription, nixos-rebuild execution) internal/listener/ # Listener mode (NATS subscription, nixos-rebuild execution)
internal/mcp/ # MCP server mode internal/mcp/ # MCP server mode
internal/nats/ # NATS client wrapper internal/nats/ # NATS client wrapper

design.md (241 changes)

@@ -62,7 +62,20 @@ homelab-deploy listener \
--nkey-file /path/to/listener.nkey \ --nkey-file /path/to/listener.nkey \
--flake-url <git+https://...> \ --flake-url <git+https://...> \
[--role <role>] \ [--role <role>] \
[--timeout 600] [--timeout 600] \
[--deploy-subject <subject>]... \
[--discover-subject <subject>]
# --deploy-subject can be repeated; subject values may use template variables:
homelab-deploy listener \
--hostname ns1 \
--tier prod \
--role dns \
--deploy-subject "deploy.<tier>.<hostname>" \
--deploy-subject "deploy.<tier>.all" \
--deploy-subject "deploy.<tier>.role.<role>" \
--discover-subject "deploy.discover" \
...
# MCP server mode (for AI assistants) # MCP server mode (for AI assistants)
homelab-deploy mcp \ homelab-deploy mcp \
@@ -71,62 +84,113 @@ homelab-deploy mcp \
[--enable-admin --admin-nkey-file /path/to/admin.nkey] [--enable-admin --admin-nkey-file /path/to/admin.nkey]
# CLI commands for manual use # CLI commands for manual use
homelab-deploy deploy <hostname> \ # Deploy to a specific subject
homelab-deploy deploy <subject> \
--nats-url nats://server:4222 \ --nats-url nats://server:4222 \
--nkey-file /path/to/deployer.nkey \ --nkey-file /path/to/deployer.nkey \
[--branch <branch>] \ [--branch <branch>] \
[--action <switch|boot|test|dry-activate>] [--action <switch|boot|test|dry-activate>]
homelab-deploy deploy \ # Examples:
--tier <test|prod> \ homelab-deploy deploy deploy.prod.ns1 # Deploy to specific host
--all \ homelab-deploy deploy deploy.test.all # Deploy to all test hosts
--nats-url nats://server:4222 \ homelab-deploy deploy deploy.prod.role.dns # Deploy to all prod DNS hosts
--nkey-file /path/to/deployer.nkey \
[--branch <branch>] \
[--action <switch|boot|test|dry-activate>]
homelab-deploy deploy \ # Using aliases (configured via environment variables)
--tier <test|prod> \ homelab-deploy deploy test # Expands to configured subject
--role <role> \ homelab-deploy deploy prod-dns # Expands to configured subject
--nats-url nats://server:4222 \
--nkey-file /path/to/deployer.nkey \
[--branch <branch>] \
[--action <switch|boot|test|dry-activate>]
``` ```
### CLI Subject Aliases
The CLI supports subject aliases via environment variables. If the `<subject>` argument contains no dots, the CLI treats it as an alias name rather than a NATS subject and looks it up in the environment.
**Environment variable format:** `HOMELAB_DEPLOY_ALIAS_<NAME>=<subject>`
```bash
export HOMELAB_DEPLOY_ALIAS_TEST="deploy.test.all"
export HOMELAB_DEPLOY_ALIAS_PROD="deploy.prod.all"
export HOMELAB_DEPLOY_ALIAS_PROD_DNS="deploy.prod.role.dns"
# Now these work:
homelab-deploy deploy test # -> deploy.test.all
homelab-deploy deploy prod # -> deploy.prod.all
homelab-deploy deploy prod-dns # -> deploy.prod.role.dns
```
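Alias lookup is case-insensitive: the alias name is upper-cased and its hyphens are converted to underscores before the `HOMELAB_DEPLOY_ALIAS_` prefix is applied, so `prod-dns` resolves to the value of `HOMELAB_DEPLOY_ALIAS_PROD_DNS`.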
## NATS Subject Structure ## NATS Subject Structure
Subjects follow the pattern `deploy.<tier>.<target>`: Subjects follow the pattern `deploy.<tier>.<target>` by default, but are fully configurable:
| Subject Pattern | Description | | Subject Pattern | Description |
|-----------------|-------------| |-----------------|-------------|
| `deploy.<tier>.<hostname>` | Deploy to specific host (e.g., `deploy.prod.ns1`) | | `deploy.<tier>.<hostname>` | Deploy to specific host (e.g., `deploy.prod.ns1`) |
| `deploy.<tier>.all` | Deploy to all hosts in tier (e.g., `deploy.test.all`) | | `deploy.<tier>.all` | Deploy to all hosts in tier (e.g., `deploy.test.all`) |
| `deploy.<tier>.role.<role>` | Deploy to hosts with role in tier (e.g., `deploy.prod.role.dns`) | | `deploy.<tier>.role.<role>` | Deploy to hosts with role in tier (e.g., `deploy.prod.role.dns`) |
| `deploy.responses.<request-id>` | Response subject for request/reply pattern | | `deploy.responses.<uuid>` | Response subject for request/reply (UUID generated by CLI) |
| `deploy.discover` | Host discovery requests |
### Subject Customization
Listeners can configure custom subject patterns using template variables:
- `<hostname>` - The listener's hostname
- `<tier>` - The listener's tier (test/prod)
- `<role>` - The listener's role (if configured)
This allows prefixing subjects for multi-tenant setups (e.g., `homelab.deploy.<tier>.<hostname>`).
## Listener Mode ## Listener Mode
### Responsibilities ### Responsibilities
1. Connect to NATS using NKey authentication 1. Connect to NATS using NKey authentication
2. Subscribe to subjects based on hostname, tier, and role 2. Subscribe to configured deploy subjects (with template expansion)
3. Validate incoming deployment requests 3. Subscribe to discovery subject and respond with host metadata
4. Execute `nixos-rebuild` with the specified parameters 4. Validate incoming deployment requests
5. Report status back via NATS reply subject 5. Execute `nixos-rebuild` with the specified parameters
6. Report status back via NATS reply subject
### Subject Subscriptions ### Subject Subscriptions
A listener subscribes to multiple subjects based on its configuration: Listeners subscribe to a configurable list of subjects. The configuration uses template variables that are expanded at runtime:
- `deploy.<tier>.<hostname>` - Direct messages to this host ```yaml
- `deploy.<tier>.all` - Broadcast to all hosts in tier listener:
- `deploy.<tier>.role.<role>` - Broadcast to hosts with matching role (only if role is configured) hostname: ns1
tier: prod
role: dns
**Example:** A host with `hostname=ns1, tier=prod, role=dns` subscribes to: deploy_subjects:
- "deploy.<tier>.<hostname>"
- "deploy.<tier>.all"
- "deploy.<tier>.role.<role>"
discover_subject: "deploy.discover"
```
Template variables:
- `<hostname>` - Replaced with the configured hostname
- `<tier>` - Replaced with the configured tier
- `<role>` - Replaced with the configured role (a subject containing `<role>` is skipped if no role is configured)
**Example:** With the above configuration, the listener subscribes to:
- `deploy.prod.ns1` - `deploy.prod.ns1`
- `deploy.prod.all` - `deploy.prod.all`
- `deploy.prod.role.dns` - `deploy.prod.role.dns`
- `deploy.discover`
**Prefixed example:** For multi-tenant setups:
```yaml
listener:
hostname: ns1
tier: prod
deploy_subjects:
- "homelab.deploy.<tier>.<hostname>"
- "homelab.deploy.<tier>.all"
discover_subject: "homelab.deploy.discover"
```
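As a rough sketch of the template expansion described in this section, the listener could expand its configured subjects like this. The function name `expandSubjects` is hypothetical, and an empty string is assumed to stand for "no role configured".

```go
package main

import (
	"fmt"
	"strings"
)

// expandSubjects (hypothetical name) substitutes <hostname>, <tier> and <role>
// in each template. Templates that reference <role> are skipped when role is
// empty, which this sketch uses to represent an unset role.
func expandSubjects(templates []string, hostname, tier, role string) []string {
	replacer := strings.NewReplacer(
		"<hostname>", hostname,
		"<tier>", tier,
		"<role>", role,
	)
	var subjects []string
	for _, t := range templates {
		if role == "" && strings.Contains(t, "<role>") {
			continue // no role configured: drop role-based subjects
		}
		subjects = append(subjects, replacer.Replace(t))
	}
	return subjects
}

func main() {
	templates := []string{
		"deploy.<tier>.<hostname>",
		"deploy.<tier>.all",
		"deploy.<tier>.role.<role>",
	}
	// Host with a role: all three templates expand.
	fmt.Println(expandSubjects(templates, "ns1", "prod", "dns"))
	// Host without a role: the role template is skipped.
	fmt.Println(expandSubjects(templates, "ns1", "prod", ""))
}
```

Skipping role templates instead of expanding them to an empty token avoids subscribing to a malformed subject such as `deploy.prod.role.`.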
### Message Formats ### Message Formats
@@ -171,18 +235,21 @@ A listener subscribes to multiple subjects based on its configuration:
### Request/Reply Flow ### Request/Reply Flow
1. Deployer sends request with unique `reply_to` subject 1. CLI generates a UUID for the request (e.g., `550e8400-e29b-41d4-a716-446655440000`)
2. Deployer subscribes to the `reply_to` subject before sending 2. CLI subscribes to `deploy.responses.<uuid>`
3. Listener validates request: 3. CLI publishes deploy request to target subject with `reply_to: "deploy.responses.<uuid>"`
4. Listener validates request:
- Checks revision exists using `git ls-remote` - Checks revision exists using `git ls-remote`
- Checks no other deployment is running - Checks no other deployment is running
4. Listener sends immediate response: 5. Listener publishes response to the `reply_to` subject:
- `{"status": "rejected", ...}` if validation fails, or - `{"status": "rejected", ...}` if validation fails, or
- `{"status": "started", ...}` if deployment begins - `{"status": "started", ...}` if deployment begins
5. If started, listener executes nixos-rebuild 6. If started, listener executes nixos-rebuild
6. Listener sends final response: 7. Listener publishes final response to the same `reply_to` subject:
- `{"status": "completed", ...}` on success, or - `{"status": "completed", ...}` on success, or
- `{"status": "failed", ...}` on failure - `{"status": "failed", ...}` on failure
8. CLI receives responses and displays progress/results
9. CLI unsubscribes after receiving final status or timeout
### Deployment Execution ### Deployment Execution
@@ -354,18 +421,34 @@ The `list_hosts` tool needs to know available hosts. Options:
1. **Static configuration**: Read from a config file or environment variable 1. **Static configuration**: Read from a config file or environment variable
2. **NATS request**: Publish to a discovery subject and collect responses from listeners 2. **NATS request**: Publish to a discovery subject and collect responses from listeners
Recommend option 2: Listeners respond to `deploy.discover` with their metadata: Recommend option 2: Listeners subscribe to their configured `discover_subject` and respond with metadata.
**Discovery request:**
```json
{
"reply_to": "deploy.responses.discover-abc123"
}
```
**Discovery response:**
```json ```json
{ {
"hostname": "ns1", "hostname": "ns1",
"tier": "prod", "tier": "prod",
"role": "dns" "role": "dns",
"deploy_subjects": [
"deploy.prod.ns1",
"deploy.prod.all",
"deploy.prod.role.dns"
]
} }
``` ```
The response includes the expanded `deploy_subjects` so clients know exactly which subjects reach this host.
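A client could decode the discovery response into a small Go struct, sketched below. The type name `DiscoveryResponse` and the `omitempty` on `role` are assumptions; the design only states that the role is optional for a listener.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// DiscoveryResponse (hypothetical type name) mirrors the JSON fields of the
// discovery response shown above.
type DiscoveryResponse struct {
	Hostname       string   `json:"hostname"`
	Tier           string   `json:"tier"`
	Role           string   `json:"role,omitempty"` // assumed optional
	DeploySubjects []string `json:"deploy_subjects"`
}

// parseDiscovery decodes one raw discovery response.
func parseDiscovery(raw []byte) (DiscoveryResponse, error) {
	var resp DiscoveryResponse
	err := json.Unmarshal(raw, &resp)
	return resp, err
}

func main() {
	raw := []byte(`{
		"hostname": "ns1",
		"tier": "prod",
		"role": "dns",
		"deploy_subjects": [
			"deploy.prod.ns1",
			"deploy.prod.all",
			"deploy.prod.role.dns"
		]
	}`)

	resp, err := parseDiscovery(raw)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s (%s/%s) is reachable via %v\n",
		resp.Hostname, resp.Tier, resp.Role, resp.DeploySubjects)
}
```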
## NixOS Module ## NixOS Module
The NixOS module configures the listener as a systemd service. The NixOS module configures the listener as a systemd service with appropriate hardening.
### Module Options ### Module Options
@@ -374,9 +457,12 @@ The NixOS module configures the listener as a systemd service.
options.services.homelab-deploy.listener = { options.services.homelab-deploy.listener = {
enable = lib.mkEnableOption "homelab-deploy listener service"; enable = lib.mkEnableOption "homelab-deploy listener service";
package = lib.mkPackageOption pkgs "homelab-deploy" { };
hostname = lib.mkOption { hostname = lib.mkOption {
type = lib.types.str; type = lib.types.str;
description = "Hostname for this listener (used for NATS subject)"; default = config.networking.hostName;
description = "Hostname for this listener (used in subject templates)";
}; };
tier = lib.mkOption { tier = lib.mkOption {
@@ -393,16 +479,19 @@ The NixOS module configures the listener as a systemd service.
natsUrl = lib.mkOption { natsUrl = lib.mkOption {
type = lib.types.str; type = lib.types.str;
description = "NATS server URL"; description = "NATS server URL";
example = "nats://nats.example.com:4222";
}; };
nkeyFile = lib.mkOption { nkeyFile = lib.mkOption {
type = lib.types.path; type = lib.types.path;
description = "Path to NKey seed file for NATS authentication"; description = "Path to NKey seed file for NATS authentication";
example = "/run/secrets/homelab-deploy-nkey";
}; };
flakeUrl = lib.mkOption { flakeUrl = lib.mkOption {
type = lib.types.str; type = lib.types.str;
description = "Git flake URL for nixos-rebuild"; description = "Git flake URL for nixos-rebuild";
example = "git+https://git.example.com/user/nixos-configs.git";
}; };
timeout = lib.mkOption { timeout = lib.mkOption {
@@ -410,19 +499,79 @@ The NixOS module configures the listener as a systemd service.
default = 600; default = 600;
description = "Deployment timeout in seconds"; description = "Deployment timeout in seconds";
}; };
deploySubjects = lib.mkOption {
type = lib.types.listOf lib.types.str;
default = [
"deploy.<tier>.<hostname>"
"deploy.<tier>.all"
"deploy.<tier>.role.<role>"
];
description = ''
List of NATS subjects to subscribe to for deployment requests.
Template variables: <hostname>, <tier>, <role>
'';
};
discoverSubject = lib.mkOption {
type = lib.types.str;
default = "deploy.discover";
description = "NATS subject for host discovery requests";
};
environment = lib.mkOption {
type = lib.types.attrsOf lib.types.str;
default = { };
description = "Additional environment variables for the service";
example = { GIT_SSH_COMMAND = "ssh -i /run/secrets/deploy-key"; };
};
}; };
} }
``` ```
### Systemd Service ### Systemd Service
The module should create a systemd service with: The module creates a hardened systemd service:
- `Type=simple`
- `Restart=always` ```nix
- `RestartSec=10` systemd.services.homelab-deploy-listener = {
- Run as root (required for nixos-rebuild) description = "homelab-deploy listener";
- Proper ordering (after network-online.target) wantedBy = [ "multi-user.target" ];
- Resource limits if desired after = [ "network-online.target" ];
wants = [ "network-online.target" ];
environment = cfg.environment;
serviceConfig = {
Type = "simple";
ExecStart = "${cfg.package}/bin/homelab-deploy listener ...";
Restart = "always";
RestartSec = 10;
# Hardening (compatible with nixos-rebuild requirements)
NoNewPrivileges = false; # nixos-rebuild may need to spawn privileged processes
ProtectSystem = "false"; # nixos-rebuild modifies /nix/store and /run
ProtectHome = "read-only";
PrivateTmp = true;
PrivateDevices = true;
ProtectKernelTunables = true;
ProtectKernelModules = true;
ProtectControlGroups = true;
RestrictAddressFamilies = [ "AF_UNIX" "AF_INET" "AF_INET6" ];
RestrictNamespaces = false; # nix build uses namespaces
RestrictSUIDSGID = true;
LockPersonality = true;
MemoryDenyWriteExecute = false; # nix may need writable and executable memory mappings
SystemCallArchitectures = "native";
};
};
```
**Note:** Some hardening options are relaxed because `nixos-rebuild` requires:
- Write access to `/nix/store` for building
- Ability to activate system configurations
- Network access for fetching from git/cache
- Namespace support for nix sandbox builds
## NATS Authentication ## NATS Authentication
@@ -469,9 +618,9 @@ The flake.nix should provide:
### Go Dependencies ### Go Dependencies
Recommended libraries: Recommended libraries:
- `github.com/urfave/cli/v3` - CLI framework
- `github.com/nats-io/nats.go` - NATS client - `github.com/nats-io/nats.go` - NATS client
- `github.com/spf13/cobra` - CLI framework - `github.com/mark3labs/mcp-go` - MCP server implementation
- `github.com/mark3labs/mcp-go` - MCP server implementation (or similar)
- Standard library for JSON, logging, process execution - Standard library for JSON, logging, process execution
### Error Handling ### Error Handling