nixos-servers/docs/infrastructure.md
Torjus Håkestad 34a2f2ab50
docs: add infrastructure documentation
2026-02-02 17:36:55 +01:00

Homelab Infrastructure

This document describes the physical and virtual infrastructure components that support the NixOS-managed servers in this repository.

Overview

The homelab consists of several core infrastructure components:

  • Proxmox VE - Hypervisor hosting all NixOS VMs
  • TrueNAS - Network storage and backup target
  • Ubiquiti EdgeRouter - Primary router and gateway
  • Mikrotik Switch - Core network switching

All NixOS configurations in this repository run as VMs on Proxmox and rely on these underlying infrastructure components.

Network Topology

Subnets

VLAN numbers match the third octet of the subnet's IP addresses (for example, VLAN 13 is 10.69.13.x).

TODO: VLAN naming is currently inconsistent across router/switch/Proxmox configurations. Need to standardize VLAN names and update all device configs to use consistent naming.

  • 10.69.8.x - Kubernetes (no longer in use)
  • 10.69.12.x - Core services
  • 10.69.13.x - NixOS VMs and core services
  • 10.69.30.x - Client network 1
  • 10.69.31.x - Client network 2
  • 10.69.99.x - Management network
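
The third-octet convention above can be expressed as a small helper. This is a minimal sketch using Python's standard `ipaddress` module; the function names `vlan_for_ip` and `subnet_for_vlan` are hypothetical and not part of this repository. Note that the two client networks (10.69.30.x and 10.69.31.x) sit on untagged router ports according to the interface layout later in this document, so the "VLAN" returned for them is only the subnet number.

```python
import ipaddress

# The homelab address space, as used throughout this document.
HOMELAB = ipaddress.IPv4Network("10.69.0.0/16")


def vlan_for_ip(address: str) -> int:
    """Return the VLAN number implied by the third octet of a 10.69.x.y address."""
    ip = ipaddress.IPv4Address(address)
    if ip not in HOMELAB:
        raise ValueError(f"{address} is outside {HOMELAB}")
    # packed is the 4-byte big-endian form; index 2 is the third octet.
    return ip.packed[2]


def subnet_for_vlan(vlan: int) -> ipaddress.IPv4Network:
    """Return the /24 that corresponds to a VLAN number under this convention."""
    return ipaddress.IPv4Network(f"10.69.{vlan}.0/24")


print(vlan_for_ip("10.69.13.5"))  # ns1 -> 13
print(subnet_for_vlan(99))        # 10.69.99.0/24
```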

Core Network Services

  • Gateway: Web UI exposed on 10.69.10.1
  • DNS: ns1 (10.69.13.5), ns2 (10.69.13.6)
  • Primary DNS Domain: home.2rjus.net

Hardware Components

Proxmox Hypervisor

Purpose: Hosts all NixOS VMs defined in this repository

Hardware:

  • CPU: AMD Ryzen 9 3900X 12-Core Processor
  • RAM: 96GB (94 GiB)
  • Storage: 1TB NVMe SSD (nvme0n1)

Management:

  • Web UI: https://pve1.home.2rjus.net:8006
  • Cluster: Standalone
  • Version: Proxmox VE 8.4.16 (kernel 6.8.12-18-pve)

VM Provisioning:

  • Template VM: ID 9000 (built from hosts/template2)
  • See /terraform directory for automated VM deployment using OpenTofu

Storage:

  • ZFS pool: rpool on NVMe partition (nvme0n1p3)
    • Total capacity: ~900GB (232GB used, 667GB available)
    • Configuration: Single disk (no RAID)
    • Scrub status: Last scrub completed successfully with 0 errors

Networking:

  • Management interface: vmbr0 - 10.69.12.75/24 (VLAN 12 - Core services)
  • Physical interfaces: enp9s0 (primary), enp4s0 (unused)
  • VM bridges:
    • vmbr0 - Main bridge (bridged to enp9s0)
    • vmbr0v8 - VLAN 8 (Kubernetes - deprecated)
    • vmbr0v13 - VLAN 13 (NixOS VMs and core services)

TrueNAS

Purpose: Network storage, backup target, media storage

Hardware:

  • Model: Custom build
  • CPU: AMD Ryzen 5 5600G with Radeon Graphics
  • RAM: 32GB (31.2 GiB)
  • Disks:
    • 2x Kingston SA400S37 240GB SSD (boot pool, mirrored)
    • 2x Seagate ST16000NE000 16TB HDD (hdd-pool mirror-0)
    • 2x WD WD80EFBX 8TB HDD (hdd-pool mirror-1)
    • 2x Seagate ST8000VN004 8TB HDD (hdd-pool mirror-2)
    • 1x NVMe 2TB (nvme-pool, no redundancy)

Management:

  • Web UI: https://nas.home.2rjus.net (10.69.12.50)
  • Hostname: nas.home.2rjus.net
  • Version: TrueNAS-13.0-U6.1 (Core)

Networking:

  • Primary interface: mlxen0 - 10GbE (10Gbase-CX4) connected to sw1
  • IP: 10.69.12.50/24 (VLAN 12 - Core services)

ZFS Pools:

  • boot-pool: 206GB (mirrored SSDs) - 4% used
    • Mirror of 2x Kingston 240GB SSDs
    • Last scrub: No errors
  • hdd-pool: 29.1TB total (three striped 2-way mirrors, 28.4TB used, 658GB free) - 97% capacity
    • mirror-0: 2x 16TB Seagate ST16000NE000
    • mirror-1: 2x 8TB WD WD80EFBX
    • mirror-2: 2x 8TB Seagate ST8000VN004
    • Last scrub: No errors
  • nvme-pool: 1.81TB (single NVMe, 70.4GB used, 1.74TB free) - 3% capacity
    • Single NVMe drive, no redundancy
    • Last scrub: No errors
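
To make the capacity figures above easier to act on, the following sketch computes each pool's fill ratio and flags pools above a warning threshold. The used/total figures are copied from the list above as stated. The 80% threshold is an assumption (a commonly cited rule of thumb for ZFS pools), not something this document defines, and the names `POOLS_TB` and `pools_over_threshold` are hypothetical.

```python
# (used, total) per pool in TB, as stated in the pool list above.
POOLS_TB = {
    "hdd-pool": (28.4, 29.1),
    "nvme-pool": (0.0704, 1.81),
}

# Assumption: warn at 80% full. Adjust to taste; not taken from this document.
WARN_THRESHOLD = 0.80


def fill_ratio(used_tb: float, total_tb: float) -> float:
    """Return the fraction of the pool that is in use."""
    if total_tb <= 0:
        raise ValueError("total capacity must be positive")
    return used_tb / total_tb


def pools_over_threshold(pools, threshold=WARN_THRESHOLD):
    """Return the names of pools whose fill ratio is at or above the threshold."""
    return sorted(
        name
        for name, (used, total) in pools.items()
        if fill_ratio(used, total) >= threshold
    )


print(pools_over_threshold(POOLS_TB))  # ['hdd-pool']
```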

NFS Exports:

  • /mnt/hdd-pool/media - Media storage (exported to 10.69.0.0/16, used by Jellyfin)
  • /mnt/hdd-pool/virt/nfs-iso - ISO storage for Proxmox
  • /mnt/hdd-pool/virt/kube-prod-pvc - Kubernetes storage (deprecated)

Jails: TrueNAS runs several FreeBSD jails for media management and backups:

  • nzbget - Usenet downloader
  • restic-rest - Restic REST server for backups
  • radarr - Movie management
  • sonarr - TV show management

Ubiquiti EdgeRouter

Purpose: Primary router, gateway, firewall, inter-VLAN routing

Model: EdgeRouter X 5-Port

Hardware:

  • Serial: F09FC20E1A4C

Management:

  • SSH: ssh ubnt@10.69.10.1
  • Web UI: https://10.69.10.1
  • Version: EdgeOS v2.0.9-hotfix.6 (build 5574651, 12/30/22)

WAN Connection:

  • Interface: eth0
  • Public IP: 84.213.73.123/20
  • Gateway: 84.213.64.1

Interface Layout:

  • eth0: WAN (public IP)
  • eth1: 10.69.31.1/24 - Client network 2
  • eth2: Unused (down)
  • eth3: 10.69.30.1/24 - Client network 1
  • eth4: Trunk port to Mikrotik switch (carries all VLANs)
    • eth4.8: 10.69.8.1/24 - K8S (deprecated)
    • eth4.10: 10.69.10.1/24 - TRUSTED (management access)
    • eth4.12: 10.69.12.1/24 - SERVER (Proxmox, TrueNAS, core services)
    • eth4.13: 10.69.13.1/24 - SVC (NixOS VMs)
    • eth4.21: 10.69.21.1/24 - CLIENTS
    • eth4.22: 10.69.22.1/24 - WLAN (wireless clients)
    • eth4.23: 10.69.23.1/24 - IOT
    • eth4.99: 10.69.99.1/24 - MGMT (device management)

Routing:

  • Default route: 0.0.0.0/0 via 84.213.64.1 (WAN gateway)
  • Static route: 192.168.100.0/24 via eth0
  • All internal VLANs directly connected
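
As an illustration of how the router picks an outgoing interface, the sketch below performs a longest-prefix match over the routes listed above. The route table is transcribed from the interface layout and routing lists in this document; the WAN connected network 84.213.64.0/20 is derived from the public IP and prefix length. The `lookup` function is a hypothetical helper, not a model of how EdgeOS implements routing internally.

```python
import ipaddress

# (prefix, outgoing interface) pairs transcribed from this document.
ROUTES = [
    ("0.0.0.0/0", "eth0 via 84.213.64.1"),  # default route
    ("84.213.64.0/20", "eth0"),             # WAN connected network
    ("192.168.100.0/24", "eth0"),           # static route
    ("10.69.31.0/24", "eth1"),
    ("10.69.30.0/24", "eth3"),
    ("10.69.8.0/24", "eth4.8"),
    ("10.69.10.0/24", "eth4.10"),
    ("10.69.12.0/24", "eth4.12"),
    ("10.69.13.0/24", "eth4.13"),
    ("10.69.21.0/24", "eth4.21"),
    ("10.69.22.0/24", "eth4.22"),
    ("10.69.23.0/24", "eth4.23"),
    ("10.69.99.0/24", "eth4.99"),
]


def lookup(destination: str) -> str:
    """Return the outgoing interface for an IPv4 destination (longest prefix wins)."""
    ip = ipaddress.IPv4Address(destination)
    matches = []
    for prefix, interface in ROUTES:
        network = ipaddress.IPv4Network(prefix)
        if ip in network:
            matches.append((network.prefixlen, interface))
    # The default route matches every IPv4 address, so matches is never empty.
    return max(matches)[1]


print(lookup("10.69.12.50"))    # TrueNAS -> eth4.12
print(lookup("192.168.100.1"))  # static route -> eth0
print(lookup("1.1.1.1"))        # default route -> eth0 via 84.213.64.1
```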

DHCP Servers: Active DHCP pools on the following networks:

  • dhcp-8: VLAN 8 (K8S) - 91 addresses
  • dhcp-12: VLAN 12 (SERVER) - 51 addresses
  • dhcp-13: VLAN 13 (SVC) - 41 addresses
  • dhcp-21: VLAN 21 (CLIENTS) - 141 addresses
  • dhcp-22: VLAN 22 (WLAN) - 101 addresses
  • dhcp-23: VLAN 23 (IOT) - 191 addresses
  • dhcp-30: eth3 (Client network 1) - 101 addresses
  • dhcp-31: eth1 (Client network 2) - 21 addresses
  • dhcp-mgmt: VLAN 99 (MGMT) - 51 addresses

NAT/Firewall:

  • Masquerading on WAN interface (eth0)

Mikrotik Switch

Purpose: Core Layer 2/3 switching

Model: MikroTik CRS326-24G-2S+ (24x 1GbE + 2x 10GbE SFP+)

Hardware:

  • CPU: ARMv7 @ 800MHz
  • RAM: 512MB
  • Uptime: 21+ weeks

Management:

  • Hostname: sw1.home.2rjus.net
  • SSH access: ssh admin@sw1.home.2rjus.net (using gunter SSH key)
  • Management IP: 10.69.99.2/24 (VLAN 99)
  • Version: RouterOS 6.47.10 (long-term)

VLANs:

  • VLAN 8: Kubernetes (deprecated)
  • VLAN 12: SERVERS - Core services subnet
  • VLAN 13: SVC - Services subnet
  • VLAN 21: CLIENTS
  • VLAN 22: WLAN - Wireless network
  • VLAN 23: IOT
  • VLAN 99: MGMT - Management network

Port Layout (active ports):

  • ether1: Uplink to EdgeRouter (trunk, carries all VLANs)
  • ether11: virt-mini1 (VLAN 12 - SERVERS)
  • ether12: Home Assistant (VLAN 12 - SERVERS)
  • ether24: Wireless AP (VLAN 22 - WLAN)
  • sfp-sfpplus1: Media server/Jellyfin (VLAN 12) - 10Gbps, 7m copper DAC
  • sfp-sfpplus2: TrueNAS (VLAN 12) - 10Gbps, 1m copper DAC

Bridge Configuration:

  • All ports bridged to main bridge interface
  • Hardware offloading enabled
  • VLAN filtering enabled on bridge

Backup & Disaster Recovery

Backup Strategy

NixOS VMs:

  • Declarative configurations in this git repository
  • Secrets: SOPS-encrypted, backed up with repository
  • State/data: Only some hosts are currently backed up to the nas host; coverage should be improved and expanded to more hosts.

Proxmox:

  • VM backups: Not currently implemented

Critical Credentials:

TODO: Document this

  • OpenBao root token and unseal keys: [offline secure storage location]
  • Proxmox root password: [secure storage]
  • TrueNAS admin password: [secure storage]
  • Router admin credentials: [secure storage]

Disaster Recovery Procedures

Total Infrastructure Loss:

  1. Reinstall Proxmox from installation media
  2. Reinstall TrueNAS from installation media, then import the ZFS pools
  3. Restore network configuration on EdgeRouter and Mikrotik
  4. Rebuild NixOS VMs from this repository using Proxmox template
  5. Restore stateful data from TrueNAS backups
  6. Re-initialize OpenBao and restore from backup if needed

Individual VM Loss:

  1. Deploy new VM from template using OpenTofu (terraform/)
  2. Run nixos-rebuild with appropriate flake configuration
  3. Restore any stateful data from backups
  4. For vault01: follow re-provisioning steps in docs/vault/auto-unseal.md
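
Steps 1 and 2 above can be sketched as a dry-run helper that only prints the commands instead of running them. This is a sketch under several assumptions: that OpenTofu is invoked from the repository root with the global `-chdir` option, that the flake output name matches the hostname, and that the rebuild is pushed remotely with `--target-host`. The hostname `example01` and the function name `vm_rebuild_commands` are hypothetical. Steps 3 and 4 depend on the host and are not covered.

```python
import shlex


def vm_rebuild_commands(hostname: str, target: str) -> list[list[str]]:
    """Build the command lines for redeploying a single VM (not executed here)."""
    return [
        # Step 1: create the VM from the template via OpenTofu.
        ["tofu", "-chdir=terraform", "apply"],
        # Step 2: apply the host's NixOS configuration from the flake.
        # Assumption: flake output name == hostname, deployed over SSH.
        ["nixos-rebuild", "switch", "--flake", f".#{hostname}",
         "--target-host", target],
    ]


# Dry run: print shell-quoted commands for review before running them by hand.
for command in vm_rebuild_commands("example01", "root@example01.home.2rjus.net"):
    print(shlex.join(command))
```

Printing the commands first keeps the procedure reviewable; passing each list to `subprocess.run(command, check=True)` would execute them once the assumptions above have been checked against the actual repository layout.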

Network Device Failure:

  • EdgeRouter: [config backup location, restoration procedure]
  • Mikrotik: [config backup location, restoration procedure]

Future Additions

  • Additional Proxmox nodes for clustering
  • Proxmox Backup Server for VM backups
  • Additional TrueNAS for replication

Maintenance Notes

Proxmox Updates

  • Update schedule: manual
  • Pre-update checklist: none (yolo)

TrueNAS Updates

  • Update schedule: manual

Network Device Updates

  • EdgeRouter: manual
  • Mikrotik: manual

Monitoring

Infrastructure Monitoring:

TODO: Improve monitoring for physical hosts (proxmox, nas)

TODO: Improve monitoring for networking equipment

All NixOS VMs ship metrics to monitoring01 via node-exporter and logs via Promtail. See /services/monitoring/ for the observability stack configuration.
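
For orientation, node-exporter serves metrics in the Prometheus text exposition format. The sketch below parses a small sample of that format into (name, labels, value) tuples. The sample lines and their values are illustrative only and were not captured from any host in this homelab; `parse_metrics` is a hypothetical helper that handles the common cases and is not a complete parser for the format.

```python
import re

# Illustrative sample in the Prometheus text format (values are made up).
SAMPLE = """\
# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.42
node_filesystem_avail_bytes{device="/dev/sda1",mountpoint="/"} 1.5e+09
"""

# metric name, optional {labels}, value, optional integer timestamp
LINE_RE = re.compile(
    r"^([a-zA-Z_:][a-zA-Z0-9_:]*)(\{.*\})?\s+(\S+)(?:\s+-?\d+)?$"
)


def parse_metrics(text: str) -> list[tuple[str, str, float]]:
    """Parse metric samples, skipping comments, blank and unrecognized lines."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = LINE_RE.match(line)
        if match is None:
            continue
        name, labels, value = match.groups()
        samples.append((name, labels or "", float(value)))
    return samples


for sample in parse_metrics(SAMPLE):
    print(sample)
```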