bootstrap: implement automated VM bootstrap mechanism for Phase 3

Add systemd service that automatically bootstraps freshly deployed VMs with
their host-specific NixOS configuration from the flake repository.

Changes:
- hosts/template2/bootstrap.nix: New systemd oneshot service that:
  - Runs after cloud-init completes (ensures hostname is set)
  - Reads hostname from hostnamectl (set by cloud-init from Terraform)
  - Checks network connectivity via HTTPS (curl)
  - Runs nixos-rebuild boot with flake URL
  - Reboots on success, fails gracefully with clear errors on failure
- hosts/template2/configuration.nix: Configure cloud-init datasource
  - Changed from NoCloud to ConfigDrive (used by Proxmox)
  - Allows cloud-init to receive config from Proxmox
- hosts/template2/default.nix: Import bootstrap.nix module
- terraform/vms.tf: Add cloud-init disk to VMs
  - Configure disks.ide.ide2.cloudinit block
  - Removed invalid cloudinit_cdrom_storage parameter
  - Enables Proxmox to inject cloud-init configuration
- TODO.md: Mark Phase 3 as completed

This eliminates the manual nixos-rebuild step from the deployment workflow.
VMs now automatically pull and apply their configuration on first boot.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
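
For readers without the repository at hand, the service described in the commit message could look roughly like the sketch below. It is an illustration assembled from the bullet points above, not the committed `hosts/template2/bootstrap.nix`. The unit description, the log messages, the curl flags, the choice of URL for the connectivity check, and the `network-online.target` ordering are all assumptions.

```nix
# Hypothetical sketch of a first-boot bootstrap module; not the committed file.
{ config, pkgs, ... }:
{
  systemd.services.nixos-bootstrap = {
    description = "Apply host-specific NixOS configuration on first boot";
    wantedBy = [ "multi-user.target" ];

    # Run only after cloud-init has applied the hostname passed in from Terraform.
    after = [ "cloud-config.service" "network-online.target" ];
    requires = [ "cloud-config.service" ];
    wants = [ "network-online.target" ];

    # Tools the script calls; git is needed to fetch git+https flake URLs.
    path = [
      pkgs.systemd
      pkgs.curl
      pkgs.git
      config.nix.package
      pkgs.nixos-rebuild
    ];

    serviceConfig = {
      Type = "oneshot";
      RemainAfterExit = true;
    };

    script = ''
      set -euo pipefail

      hostname="$(hostnamectl hostname)"
      echo "nixos-bootstrap: bootstrapping host $hostname"

      # Assumed connectivity check: any successful HTTPS response counts.
      if ! curl --silent --fail --max-time 10 https://git.t-juice.club >/dev/null; then
        echo "nixos-bootstrap: cannot reach git.t-juice.club, not rebooting" >&2
        exit 1
      fi

      if nixos-rebuild boot --flake "git+https://git.t-juice.club/torjus/nixos-servers.git#$hostname"; then
        echo "nixos-bootstrap: rebuild succeeded, rebooting"
        systemctl reboot
      else
        echo "nixos-bootstrap: rebuild failed for $hostname, not rebooting" >&2
        exit 1
      fi
    '';
  };
}
```

Because the module is imported only by template2, a host configuration that does not import it will not contain the unit after the rebuild, which matches the "self-destructs after successful bootstrap" behaviour noted in TODO.md.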
53
TODO.md
@@ -105,32 +105,47 @@ create-host \
 
 ---
 
-### Phase 3: Bootstrap Mechanism
+### Phase 3: Bootstrap Mechanism ✅ COMPLETED
 
+**Status:** ✅ Fully implemented and tested
+**Completed:** 2025-02-01
+
 **Goal:** Get freshly deployed VM to apply its specific host configuration
 
-**Challenge:** Chicken-and-egg problem - VM needs to know its hostname and pull the right config
+**Implementation:** Systemd oneshot service that runs on first boot after cloud-init
 
-**Option A: Cloud-init bootstrap script**
-- [ ] Add cloud-init `runcmd` to template2 that:
-  - [ ] Reads hostname from cloud-init metadata
-  - [ ] Runs `nixos-rebuild boot --flake git+https://git.t-juice.club/torjus/nixos-servers.git#${hostname}`
-  - [ ] Reboots into the new configuration
-- [ ] Test cloud-init script execution on fresh VM
-- [ ] Handle failure cases (flake doesn't exist, network issues)
+**Approach taken:** Systemd service (variant of Option A)
+- Systemd service `nixos-bootstrap.service` runs on first boot
+- Depends on `cloud-config.service` to ensure hostname is set
+- Reads hostname from `hostnamectl` (set by cloud-init via Terraform)
+- Runs `nixos-rebuild boot --flake git+https://git.t-juice.club/torjus/nixos-servers.git#${hostname}`
+- Reboots into new configuration on success
+- Fails gracefully without reboot on errors (network issues, missing config)
+- Service self-destructs after successful bootstrap (not in new config)
 
-**Option B: Terraform provisioner**
-- [ ] Use OpenTofu's `remote-exec` provisioner
-- [ ] SSH into new VM after creation
-- [ ] Run `nixos-rebuild boot --flake <url>#<hostname>`
-- [ ] Trigger reboot via SSH
+**Tasks:**
+- [x] Create bootstrap service module in template2
+  - [x] systemd oneshot service with proper dependencies
+  - [x] Reads hostname from hostnamectl (cloud-init sets it)
+  - [x] Checks network connectivity via HTTPS (curl)
+  - [x] Runs nixos-rebuild boot with flake URL
+  - [x] Reboots on success, fails gracefully on error
+- [x] Configure cloud-init datasource
+  - [x] Use ConfigDrive datasource (Proxmox provider)
+  - [x] Add cloud-init disk to Terraform VMs (disks.ide.ide2.cloudinit)
+  - [x] Hostname passed via cloud-init user-data from Terraform
+- [x] Test bootstrap service execution on fresh VM
+- [x] Handle failure cases (flake doesn't exist, network issues)
+  - [x] Clear error messages in journald
+  - [x] No reboot on failure
+  - [x] System remains accessible for debugging
 
-**Option C: Two-stage deployment**
-- [ ] Deploy VM with template2 (minimal config)
-- [ ] Run Ansible playbook to bootstrap specific config
-- [ ] Similar to existing `run-upgrade.yml` pattern
+**Files:**
+- `hosts/template2/bootstrap.nix` - Bootstrap service definition
+- `hosts/template2/configuration.nix` - Cloud-init ConfigDrive datasource
+- `terraform/vms.tf` - Cloud-init disk configuration
 
-**Decision needed:** Which approach fits best? (Recommend Option A for automation)
+**Deliverable:** ✅ VMs automatically bootstrap and reboot into host-specific configuration on first boot
 
 ---
 