CLAUDE.md: update documentation from audit
Some checks failed
Run nix flake check / flake-check (push) Failing after 1s
- Fix OpenBao CLI name (bao, not vault)
- Add vault01, testvm01-03 to hosts list
- Document nixos-exporter and homelab-deploy flake inputs
- Add vault/ and actions-runner/ services
- Document homelab.host and homelab.deploy options
- Document automatic Vault credential provisioning via wrapped tokens
- Consolidate homelab module options into dedicated section

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
64
CLAUDE.md
@@ -65,7 +65,7 @@ Do not run `nix flake update`. Should only be done manually by user.
 nix develop
 ```

-The devshell provides: `ansible`, `tofu` (OpenTofu), `vault` (OpenBao CLI), `create-host`, and `homelab-deploy`.
+The devshell provides: `ansible`, `tofu` (OpenTofu), `bao` (OpenBao CLI), `create-host`, and `homelab-deploy`.

 **Important:** When suggesting commands that use devshell tools, always use `nix develop -c <command>` syntax rather than assuming the user is already in a devshell. For example:
 ```bash
@@ -286,9 +286,10 @@ The `current_rev` label contains the git commit hash of the deployed flake confi
 - `configuration.nix` - Host-specific settings (networking, hardware, users)
 - `/system/` - Shared system-level configurations applied to ALL hosts
 - Core modules: nix.nix, sshd.nix, sops.nix (legacy), vault-secrets.nix, acme.nix, autoupgrade.nix
+- Additional modules: motd.nix (dynamic MOTD), packages.nix (base packages), root-user.nix (root config), homelab-deploy.nix (NATS listener)
 - Monitoring: node-exporter and promtail on every host
 - `/modules/` - Custom NixOS modules
-- `homelab/` - Homelab-specific options (DNS automation, monitoring scrape targets)
+- `homelab/` - Homelab-specific options (see "Homelab Module Options" section below)
 - `/lib/` - Nix library functions
 - `dns-zone.nix` - DNS zone generation functions
 - `monitoring.nix` - Prometheus scrape target generation functions
@@ -296,6 +297,8 @@ The `current_rev` label contains the git commit hash of the deployed flake confi
 - `home-assistant/` - Home automation stack
 - `monitoring/` - Observability stack (Prometheus, Grafana, Loki, Tempo)
 - `ns/` - DNS services (authoritative, resolver, zone generation)
+- `vault/` - OpenBao (Vault) secrets server
+- `actions-runner/` - GitHub Actions runner
 - `http-proxy/`, `ca/`, `postgres/`, `nats/`, `jellyfin/`, etc.
 - `/secrets/` - SOPS-encrypted secrets with age encryption (legacy, only used by ca)
 - `/common/` - Shared configurations (e.g., VM guest agent)
@@ -329,25 +332,31 @@ All hosts automatically get:

 ### Active Hosts

-Production servers managed by `rebuild-all.sh`:
+Production servers:
 - `ns1`, `ns2` - Primary/secondary DNS servers (10.69.13.5/6)
 - `ca` - Internal Certificate Authority
+- `vault01` - OpenBao (Vault) secrets server
 - `ha1` - Home Assistant + Zigbee2MQTT + Mosquitto
 - `http-proxy` - Reverse proxy
 - `monitoring01` - Full observability stack (Prometheus, Grafana, Loki, Tempo, Pyroscope)
 - `jelly01` - Jellyfin media server
-- `nix-cache01` - Binary cache server
+- `nix-cache01` - Binary cache server + GitHub Actions runner
 - `pgdb1` - PostgreSQL database
 - `nats1` - NATS messaging server

-Template/test hosts:
-- `template1` - Base template for cloning new hosts
+Test/staging hosts:
+- `testvm01`, `testvm02`, `testvm03` - Test-tier VMs for branch testing and deployment validation
+
+Template hosts:
+- `template1`, `template2` - Base templates for cloning new hosts

 ### Flake Inputs

 - `nixpkgs` - NixOS 25.11 stable (primary)
 - `nixpkgs-unstable` - Unstable channel (available via overlay as `pkgs.unstable.<package>`)
 - `sops-nix` - Secrets management (legacy, only used by ca)
+- `nixos-exporter` - NixOS module for exposing flake revision metrics (used to verify deployments)
+- `homelab-deploy` - NATS-based remote deployment tool for test-tier hosts
 - Custom packages from git.t-juice.club:
 - `alerttonotify` - Alert routing
 - `labmon` - Lab monitoring
@@ -439,9 +448,21 @@ Example VM deployment includes:
 - Custom CPU/memory/disk sizing
 - VLAN tagging
 - QEMU guest agent
+- Automatic Vault credential provisioning via `vault_wrapped_token`

 OpenTofu outputs the VM's IP address after deployment for easy SSH access.

+**Automatic Vault Credential Provisioning:**
+
+VMs can receive Vault (OpenBao) credentials automatically during bootstrap:
+
+1. OpenTofu generates a wrapped token via `terraform/vault/` and stores it in the VM configuration
+2. Cloud-init passes `VAULT_WRAPPED_TOKEN` and `NIXOS_FLAKE_BRANCH` to the bootstrap script
+3. The bootstrap script unwraps the token to obtain AppRole credentials
+4. Credentials are written to `/var/lib/vault/approle/` before the NixOS rebuild
+
+This eliminates the need for manual `provision-approle.yml` playbook runs on new VMs. Bootstrap progress is logged to Loki with `job="bootstrap"` labels.
+
 #### Template Rebuilding and Terraform State

 When the Proxmox template is rebuilt (via `build-and-deploy-template.yml`), the template name may change. This would normally cause Terraform to want to recreate all existing VMs, but that's unnecessary since VMs are independent once cloned.
@@ -521,11 +542,7 @@ Prometheus scrape targets are automatically generated from host configurations,
 - **External targets**: Non-flake hosts defined in `/services/monitoring/external-targets.nix`
 - **Library**: `lib/monitoring.nix` provides `generateNodeExporterTargets` and `generateScrapeConfigs`

-Host monitoring options (`homelab.monitoring.*`):
-- `enable` (default: `true`) - Include host in Prometheus node-exporter scrape targets
-- `scrapeTargets` (default: `[]`) - Additional scrape targets exposed by this host (job_name, port, metrics_path, scheme, scrape_interval, honor_labels)
-
-Service modules declare their scrape targets directly (e.g., `services/ca/default.nix` declares step-ca on port 9000). The Prometheus config on monitoring01 auto-generates scrape configs from all hosts.
+Service modules declare their scrape targets directly via `homelab.monitoring.scrapeTargets` (e.g., `services/ca/default.nix` declares step-ca on port 9000). The Prometheus config on monitoring01 auto-generates scrape configs from all hosts. See "Homelab Module Options" section for available options.

 To add monitoring targets for non-NixOS hosts, edit `/services/monitoring/external-targets.nix`.

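Editor's illustration of the scrape-target declaration this hunk describes; it is not part of the commit. This is a minimal sketch, assuming `scrapeTargets` is a list of attribute sets. The field names `job_name` and `port` and the port 9000 come from the documentation text in this hunk; the `job_name` value is an assumption.

```nix
# Hypothetical excerpt from a service module such as services/ca/default.nix.
{ ... }:
{
  homelab.monitoring.scrapeTargets = [
    {
      job_name = "step-ca"; # assumed job name, not confirmed by the source
      port = 9000;          # step-ca port mentioned in the documentation
    }
  ];
}
```

Declaring the target next to the service keeps the Prometheus configuration on monitoring01 free of per-service entries, which is the point of the auto-generation described above.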
@@ -544,13 +561,30 @@ DNS zone entries are automatically generated from host configurations:
 - **External hosts**: Non-flake hosts defined in `/services/ns/external-hosts.nix`
 - **Serial number**: Uses `self.sourceInfo.lastModified` (git commit timestamp)

-Host DNS options (`homelab.dns.*`):
-- `enable` (default: `true`) - Include host in DNS zone generation
-- `cnames` (default: `[]`) - List of CNAME aliases pointing to this host
-
 Hosts are automatically excluded from DNS if:
 - `homelab.dns.enable = false` (e.g., template hosts)
 - No static IP configured (e.g., DHCP-only hosts)
 - Network interface is a VPN/tunnel (wg*, tun*, tap*)

 To add DNS entries for non-NixOS hosts, edit `/services/ns/external-hosts.nix`.
+
+### Homelab Module Options
+
+The `modules/homelab/` directory defines custom options used across hosts for automation and metadata.
+
+**Host options (`homelab.host.*`):**
+- `tier` - Deployment tier: `test` or `prod`. Test-tier hosts can receive remote deployments and have different credential access.
+- `priority` - Alerting priority: `high` or `low`. Controls alerting thresholds for the host.
+- `role` - Primary role designation (e.g., `dns`, `database`, `bastion`, `vault`)
+- `labels` - Free-form key-value metadata for host categorization
+
+**DNS options (`homelab.dns.*`):**
+- `enable` (default: `true`) - Include host in DNS zone generation
+- `cnames` (default: `[]`) - List of CNAME aliases pointing to this host
+
+**Monitoring options (`homelab.monitoring.*`):**
+- `enable` (default: `true`) - Include host in Prometheus node-exporter scrape targets
+- `scrapeTargets` (default: `[]`) - Additional scrape targets exposed by this host
+
+**Deploy options (`homelab.deploy.*`):**
+- `enable` (default: `false`) - Enable NATS-based remote deployment listener. When enabled, the host listens for deployment commands via NATS and can be targeted by the `homelab-deploy` MCP server.
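Editor's illustration of how the options documented in this hunk might be set together in one host's configuration; it is not part of the commit. This is a sketch under stated assumptions: that `tier` and `priority` accept the listed values as strings, that `labels` is an attribute set of strings, and that `cnames` takes short host names. The label key and the alias name are hypothetical.

```nix
# Hypothetical configuration.nix fragment for a test-tier host.
{ ... }:
{
  homelab.host = {
    tier = "test";                 # "test" or "prod"
    priority = "low";              # "high" or "low"
    role = "dns";                  # one of the example roles from the docs
    labels = { purpose = "lab"; }; # assumed shape: attribute set of strings
  };

  # Extra CNAME pointing at this host (alias name is made up).
  homelab.dns.cnames = [ "resolver-test" ];

  # Defaults to true; shown only to make the option visible.
  homelab.monitoring.enable = true;

  # Opt in to the NATS-based deployment listener (default is false).
  homelab.deploy.enable = true;
}
```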