nixos-servers/docs/plans/openstack-nixos-image.md
Torjus Håkestad 5e92eb3220
docs: add plan for NixOS OpenStack image
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-24 00:42:19 +01:00

# NixOS OpenStack Image
## Overview
Build and upload a NixOS base image to the OpenStack cluster at work, enabling NixOS-based VPS instances to replace the current Debian+Podman setup. This image will serve as the foundation for multiple external services:
- **Forgejo** (replacing Gitea on docker2)
- **WireGuard gateway** (replacing docker2's tunnel role, feeding into the remote-access plan)
- Any future externally-hosted services
## Current State
- VPS hosting runs on an OpenStack cluster with a personal quota
- Current VPS (`docker2.t-juice.club`) runs Debian with Podman containers
- Homelab already has a working Proxmox image pipeline: `template2` builds via `nixos-rebuild build-image --image-variant proxmox`, deployed via Ansible
- nixpkgs has a built-in `openstack` image variant in the same `image.modules` system used for Proxmox
## Decisions
- **No cloud-init dependency** - SSH key baked into the image, no need for metadata service
- **No bootstrap script** - VPS deployments are infrequent; manual `nixos-rebuild` after first boot is fine
- **No Vault access** - secrets handled manually until WireGuard access is set up (see remote-access plan)
- **Separate from homelab services** - no logging/metrics integration initially; revisit after remote-access WireGuard is in place
- **Repo placement TBD** - keep in this flake for now for convenience, but external hosts may move to a separate flake later since they can't use most shared `system/` modules (no Vault, no internal DNS, no Promtail)
- **OpenStack CLI in devshell** - add `openstackclient` package; credentials (`clouds.yaml`) stay outside the repo
- **Parallel deployment** - new Forgejo instance runs alongside docker2 initially, then CNAME moves over
## Approach
Follow the same pattern as the Proxmox template (`hosts/template2`), but targeting OpenStack's qcow2 format.
### What nixpkgs provides
The `image.modules.openstack` module produces a qcow2 image with:
- `openstack-config.nix`: EC2 metadata fetcher, SSH enabled, GRUB bootloader, serial console, auto-growing root partition
- `qemu-guest.nix` profile (virtio drivers)
- ext4 root filesystem with `autoResize`
### What we need to customize
The stock OpenStack image pulls SSH keys and hostname from EC2-style metadata. Since we're baking the SSH key into the image, we need a simpler configuration:
- SSH authorized keys baked into the image
- Base packages (age, vim, wget, git)
- Nix substituters (`cache.nixos.org` only - internal cache not reachable)
- systemd-networkd with DHCP
- GRUB bootloader
- Firewall enabled (public-facing host)
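
A minimal sketch of how the list above might map onto NixOS options. The key string is a placeholder, the `root` login user is an assumption (this plan does not say which user receives the key), and matching on `Type = "ether"` assumes the instance has a single virtio NIC.

```nix
{ pkgs, ... }:
{
  # Placeholder public key; replace with the real one before building.
  users.users.root.openssh.authorizedKeys.keys = [
    "ssh-ed25519 AAAA...placeholder user@example"
  ];

  # Key-only SSH. The stock OpenStack module already enables sshd, so
  # this mostly makes the intent explicit.
  services.openssh.enable = true;
  services.openssh.settings.PasswordAuthentication = false;

  # Base packages named in the plan.
  environment.systemPackages = with pkgs; [ age vim wget git ];

  # cache.nixos.org is already the NixOS default; it is listed here only
  # to make clear that the internal cache is deliberately not added.
  nix.settings.substituters = [ "https://cache.nixos.org" ];

  # systemd-networkd with DHCP on any Ethernet interface (assumption:
  # one NIC, so a broad match is acceptable).
  networking.useNetworkd = true;
  networking.useDHCP = false;
  systemd.network.enable = true;
  systemd.network.networks."10-dhcp" = {
    matchConfig.Type = "ether";
    networkConfig.DHCP = "yes";
  };

  # GRUB is configured by the stock OpenStack module, so nothing is
  # overridden here.

  # Public-facing host: keep the firewall on. With default settings
  # services.openssh opens port 22 itself.
  networking.firewall.enable = true;
}
```
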
### Differences from template2
| Aspect | template2 (Proxmox) | openstack-template (OpenStack) |
|--------|---------------------|-------------------------------|
| Image format | VMA (`.vma.zst`) | qcow2 (`.qcow2`) |
| Image variant | `proxmox` | `openstack` |
| Cloud-init | ConfigDrive + NoCloud | Not used (SSH key baked in) |
| Nix cache | Internal + nixos.org | `cache.nixos.org` only |
| Vault | AppRole via wrapped token | None |
| Bootstrap | Automatic nixos-rebuild on first boot | Manual |
| Network | Internal DHCP | OpenStack DHCP |
| DNS | Internal ns1/ns2 | Public DNS |
| Firewall | Disabled (trusted network) | Enabled |
| System modules | Full `../../system` import | Minimal (sshd, packages only) |
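
The "System modules" row could translate into a `default.nix` along these lines. This is only a sketch: the file names under `system/` are hypothetical, since the plan names the modules (sshd, packages) but not their paths.

```nix
# hosts/openstack-template/default.nix (sketch)
{ ... }:
{
  imports = [
    ./configuration.nix
    ./hardware-configuration.nix

    # Hypothetical paths: only the sshd and packages modules are pulled
    # in, instead of importing ../../system as a whole like template2.
    ../../system/sshd.nix
    ../../system/packages.nix
  ];

  # Repo-specific options mentioned in the implementation steps. They
  # only evaluate if the modules that declare them are imported, which
  # a minimal import list like this one may not do.
  homelab.dns.enable = false;
  homelab.monitoring.enable = false;
}
```

If the `homelab.*` options are declared in modules that are no longer imported, those two lines can simply be dropped.
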
## Implementation Steps
### Phase 1: Build the image
1. Create `hosts/openstack-template/` with minimal configuration
   - `default.nix` - imports (only sshd and packages from `system/`, not the full set)
   - `configuration.nix` - base config: SSH key, DHCP, GRUB, base packages, firewall on
   - `hardware-configuration.nix` - qemu-guest profile with virtio drivers
   - Exclude from DNS and monitoring (`homelab.dns.enable = false`, `homelab.monitoring.enable = false`)
   - May need to override parts of `image.modules.openstack` to disable the EC2 metadata fetcher if it causes boot delays
2. Build with `nixos-rebuild build-image --image-variant openstack --flake .#openstack-template`
3. Verify the qcow2 image is produced in `result/`
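
For the build command in step 2 to resolve `.#openstack-template`, the flake needs a matching `nixosConfigurations` entry. A rough sketch follows; the `nixos-unstable` input and the `x86_64-linux` system are assumptions, and the real flake in this repo will have more inputs and hosts. The extra `packages` output is optional and relies on the `system.build.images.<variant>` attribute, which should be checked against the pinned nixpkgs revision.

```nix
{
  # Assumption: any nixpkgs revision recent enough to ship image.modules.
  inputs.nixpkgs.url = "github:NixOS/nixpkgs/nixos-unstable";

  outputs = { self, nixpkgs, ... }: {
    # Target of: nixos-rebuild build-image --image-variant openstack \
    #              --flake .#openstack-template
    nixosConfigurations.openstack-template = nixpkgs.lib.nixosSystem {
      system = "x86_64-linux";
      modules = [ ./hosts/openstack-template ];
    };

    # Optional convenience: expose the image as a flake package so that
    # `nix build .#openstack-image` produces the same result/ symlink.
    packages.x86_64-linux.openstack-image =
      self.nixosConfigurations.openstack-template.config.system.build.images.openstack;
  };
}
```
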
### Phase 2: Upload and test
1. Add `openstackclient` to the devshell
2. Upload image: `openstack image create --disk-format qcow2 --file result/<image>.qcow2 nixos-template`
3. Boot a test instance from the image
4. Verify that SSH access works, the instance gets its address via DHCP, and Nix builds succeed
5. Test a manual `nixos-rebuild switch --flake` against the instance
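
Step 1 could look roughly like the following devshell definition. It is written as a standalone flake for illustration; in practice the `openstackclient` package would be added to the existing devshell in this repo. The input name and system are assumptions.

```nix
{
  inputs.nixpkgs.url = "github:NixOS/nixpkgs/nixos-unstable";

  outputs = { nixpkgs, ... }:
    let
      system = "x86_64-linux";
      pkgs = nixpkgs.legacyPackages.${system};
    in
    {
      devShells.${system}.default = pkgs.mkShell {
        # Provides the `openstack` CLI used for `openstack image create`.
        # Credentials are read from clouds.yaml outside the repo.
        packages = [ pkgs.openstackclient ];
      };
    };
}
```
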
### Phase 3: Automation (optional, later)
Consider an Ansible playbook similar to `build-and-deploy-template.yml` for image builds + uploads. Low priority since this will be done rarely.
## Open Questions
- [ ] Should external VPS hosts eventually move to a separate flake? (Depends on how different they end up being from homelab hosts)
- [ ] Will the stock `openstack-config.nix` metadata fetcher cause boot delays/errors if the metadata service isn't reachable? May need to disable it.
- [ ] **Flavor selection** - investigate what flavors are available in the quota. The standard small flavors likely have insufficient root disk for a NixOS host (Nix store grows fast). Options:
  - Use a larger flavor with adequate root disk
  - Create a custom flavor (if permissions allow)
  - Cinder block storage is an option in theory, but was very slow last time it was tested - avoid if possible
- [ ] Consolidation opportunity - currently running multiple smaller VMs on OpenStack. Could a single larger NixOS VM replace several of them?
## Notes
- `nixos-rebuild build-image --image-variant openstack` uses the same `image.modules` system as Proxmox
- nixpkgs also has an `openstack-zfs` variant if ZFS root is ever wanted
- The stock OpenStack module imports `ec2-data.nix` and `amazon-init.nix` - these may need to be disabled or overridden if they cause issues without a metadata service
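
If the metadata-related units do turn out to cause delays, one possible way to switch them off is sketched below. The unit and option names (`openstack-init`, `apply-ec2-data`, `virtualisation.amazon-init.enable`) are taken from memory of the nixpkgs modules and should be verified against the pinned revision before relying on them. The host name is an assumption.

```nix
{ ... }:
{
  # amazon-init.nix: fetches and applies user-data on first boot.
  virtualisation.amazon-init.enable = false;

  # openstack-config.nix: oneshot unit that downloads metadata at boot.
  systemd.services.openstack-init.enable = false;

  # ec2-data.nix: applies the fetched SSH keys and host name.
  systemd.services.apply-ec2-data.enable = false;

  # The stock module leaves the host name empty so that metadata can
  # fill it in; without metadata it has to be set explicitly.
  networking.hostName = "openstack-template";
}
```

Disabling the units this way keeps the rest of the stock module (serial console, root partition growth, GRUB settings) in place.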