nixos-servers/docs/plans/openstack-nixos-image.md
Torjus Håkestad 5e92eb3220
docs: add plan for NixOS OpenStack image
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-24 00:42:19 +01:00

NixOS OpenStack Image

Overview

Build and upload a NixOS base image to the OpenStack cluster at work, enabling NixOS-based VPS instances to replace the current Debian+Podman setup. This image will serve as the foundation for multiple external services:

  • Forgejo (replacing Gitea on docker2)
  • WireGuard gateway (replacing docker2's tunnel role, feeding into the remote-access plan)
  • Any future externally-hosted services

Current State

  • VPS hosting runs on an OpenStack cluster with a personal quota
  • Current VPS (docker2.t-juice.club) runs Debian with Podman containers
  • Homelab already has a working Proxmox image pipeline: template2 is built with nixos-rebuild build-image --image-variant proxmox and deployed via Ansible
  • nixpkgs has a built-in openstack image variant in the same image.modules system used for Proxmox

Decisions

  • No cloud-init dependency - SSH key baked into the image, no need for metadata service
  • No bootstrap script - VPS deployments are infrequent; manual nixos-rebuild after first boot is fine
  • No Vault access - secrets handled manually until WireGuard access is set up (see remote-access plan)
  • Separate from homelab services - no logging/metrics integration initially; revisit after remote-access WireGuard is in place
  • Repo placement TBD - keep in this flake for now for convenience, but external hosts may move to a separate flake later since they can't use most shared system/ modules (no Vault, no internal DNS, no Promtail)
  • OpenStack CLI in devshell - add openstackclient package; credentials (clouds.yaml) stay outside the repo
  • Parallel deployment - new Forgejo instance runs alongside docker2 initially, then the CNAME is switched over to it

Approach

Follow the same pattern as the Proxmox template (hosts/template2), but targeting OpenStack's qcow2 format.

What nixpkgs provides

The image.modules.openstack module produces a qcow2 image with:

  • openstack-config.nix: EC2 metadata fetcher, SSH enabled, GRUB bootloader, serial console, auto-growing root partition
  • qemu-guest.nix profile (virtio drivers)
  • ext4 root filesystem with autoResize

What we need to customize

The stock OpenStack image pulls SSH keys and hostname from EC2-style metadata. Since we're baking the SSH key into the image, we need a simpler configuration:

  • SSH authorized keys baked into the image
  • Base packages (age, vim, wget, git)
  • Nix substituters (cache.nixos.org only - internal cache not reachable)
  • systemd-networkd with DHCP
  • GRUB bootloader
  • Firewall enabled (public-facing host)
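
The list above could translate into a module roughly like the following. This is a minimal sketch, not the final configuration: the file path follows the layout proposed in this plan, the use of the root user, the key string, and the network unit name 10-uplink are placeholders, and GRUB is left to the stock openstack variant on the assumption that it already configures the bootloader.

```nix
# hosts/openstack-template/configuration.nix (sketch under the assumptions above)
{ lib, pkgs, ... }:
{
  # SSH key baked into the image. The key string is a placeholder.
  services.openssh.enable = true;
  users.users.root.openssh.authorizedKeys.keys = [
    "ssh-ed25519 AAAA...placeholder user@example"
  ];

  # Base packages from the list above.
  environment.systemPackages = with pkgs; [ age vim wget git ];

  # Only the public cache; mkForce drops any internal cache that a shared
  # module might add, since it is not reachable from the VPS.
  nix.settings.substituters = lib.mkForce [ "https://cache.nixos.org" ];

  # systemd-networkd with DHCP on every ethernet interface.
  networking.useNetworkd = true;
  networking.useDHCP = false;
  systemd.network.networks."10-uplink" = {
    matchConfig.Type = "ether";
    networkConfig.DHCP = "yes";
  };

  # Public-facing host: keep the firewall on. The sshd module opens
  # port 22 by default, so no extra rule is added here.
  networking.firewall.enable = true;
}
```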

Differences from template2

| Aspect | template2 (Proxmox) | openstack-template (OpenStack) |
|---|---|---|
| Image format | VMA (.vma.zst) | qcow2 (.qcow2) |
| Image variant | proxmox | openstack |
| Cloud-init | ConfigDrive + NoCloud | Not used (SSH key baked in) |
| Nix cache | Internal + nixos.org | cache.nixos.org only |
| Vault | AppRole via wrapped token | None |
| Bootstrap | Automatic nixos-rebuild on first boot | Manual |
| Network | Internal DHCP | OpenStack DHCP |
| DNS | Internal ns1/ns2 | Public DNS |
| Firewall | Disabled (trusted network) | Enabled |
| System modules | Full ../../system import | Minimal (sshd, packages only) |

Implementation Steps

Phase 1: Build the image

  1. Create hosts/openstack-template/ with minimal configuration
    • default.nix - imports (only sshd and packages from system/, not the full set)
    • configuration.nix - base config: SSH key, DHCP, GRUB, base packages, firewall on
    • hardware-configuration.nix - qemu-guest profile with virtio drivers
    • Exclude from DNS and monitoring (homelab.dns.enable = false, homelab.monitoring.enable = false)
    • May need to override parts of image.modules.openstack to disable the EC2 metadata fetcher if it causes boot delays
  2. Build with nixos-rebuild build-image --image-variant openstack --flake .#openstack-template
  3. Verify the qcow2 image is produced in result/
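
The build command in step 2 expects a nixosConfigurations output named openstack-template. How this repo's flake actually wires up hosts is not described here, so the fragment below is only a sketch of the plain nixpkgs form; the system value and the input name nixpkgs are assumptions.

```nix
# flake.nix (fragment, inside `outputs`); assumes `nixpkgs` is a flake input
nixosConfigurations.openstack-template = nixpkgs.lib.nixosSystem {
  system = "x86_64-linux"; # assumed architecture of the OpenStack hosts
  modules = [
    # Resolves to hosts/openstack-template/default.nix from step 1.
    ./hosts/openstack-template
  ];
};
```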

Phase 2: Upload and test

  1. Add openstackclient to the devshell
  2. Upload image: openstack image create --disk-format qcow2 --file result/<image>.qcow2 nixos-template
  3. Boot a test instance from the image
  4. Verify that SSH access, DHCP networking, and Nix builds all work
  5. Test manual nixos-rebuild switch --flake against the instance
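
For step 1, adding the client to the devshell might look like the fragment below. This is a sketch that assumes a single x86_64-linux devshell defined directly in flake.nix; the existing devshell in this repo may be structured differently, in which case only the package entry carries over.

```nix
# flake.nix (fragment, inside `outputs`); assumes `nixpkgs` is a flake input
devShells.x86_64-linux.default =
  let
    pkgs = nixpkgs.legacyPackages.x86_64-linux;
  in
  pkgs.mkShell {
    # Provides the `openstack` CLI used in step 2.
    packages = [ pkgs.openstackclient ];
    # Credentials stay outside the repo: the CLI looks for clouds.yaml in
    # ~/.config/openstack/ and selects an entry via the OS_CLOUD variable.
  };
```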

Phase 3: Automation (optional, later)

Consider an Ansible playbook similar to build-and-deploy-template.yml for image builds + uploads. Low priority since this will be done rarely.

Open Questions

  • Should external VPS hosts eventually move to a separate flake? (Depends on how different they end up being from homelab hosts)
  • Will the stock openstack-config.nix metadata fetcher cause boot delays/errors if the metadata service isn't reachable? May need to disable it.
  • Flavor selection - investigate what flavors are available in the quota. The standard small flavors likely have insufficient root disk for a NixOS host (Nix store grows fast). Options:
    • Use a larger flavor with adequate root disk
    • Create a custom flavor (if permissions allow)
    • Cinder block storage is an option in theory, but was very slow last time it was tested - avoid if possible
  • Consolidation opportunity - currently running multiple smaller VMs on OpenStack. Could a single larger NixOS VM replace several of them?

Notes

  • nixos-rebuild build-image --image-variant openstack uses the same image.modules system as Proxmox
  • nixpkgs also has an openstack-zfs variant if ZFS root is ever wanted
  • The stock OpenStack module imports ec2-data.nix and amazon-init.nix - these may need to be disabled or overridden if they cause issues without a metadata service
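
If the metadata services do turn out to cause boot delays or errors, one way to switch them off is sketched below. The unit and option names (openstack-init, apply-ec2-data, virtualisation.amazon-init.enable) are taken from memory of the stock nixpkgs modules and should be checked against the pinned nixpkgs revision before relying on them; the hostname is a placeholder.

```nix
# Sketch: disable the metadata-driven services from the stock OpenStack module.
# Verify these names against the nixpkgs revision pinned in the flake.
{ lib, ... }:
{
  # Metadata fetcher defined in openstack-config.nix.
  systemd.services.openstack-init.enable = lib.mkForce false;

  # Applies SSH keys and hostname from fetched metadata (ec2-data.nix).
  systemd.services.apply-ec2-data.enable = lib.mkForce false;

  # Runs user-data as a NixOS configuration on boot (amazon-init.nix).
  virtualisation.amazon-init.enable = lib.mkForce false;

  # The stock module leaves the hostname for metadata to fill in,
  # so set one explicitly. Placeholder value.
  networking.hostName = "openstack-template";
}
```

Disabling the units rather than removing the imports keeps the rest of the stock module (serial console, root partition growth) intact.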