nixos-servers/docs/plans/nixos-router.md
Torjus Håkestad 8a5aa1c4f5
plans: add media PC replacement plan, update router hardware candidates
New plan for replacing the media PC (i7-4770K/Ubuntu) with a NixOS mini PC
running Kodi. Router plan updated with specific AliExpress hardware options
and IDS/IPS considerations.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 23:54:29 +01:00

# NixOS Router — Replace EdgeRouter
Replace the aging Ubiquiti EdgeRouter (gw, 10.69.10.1) with a NixOS-based router.
The EdgeRouter is suspected to be a throughput bottleneck. A NixOS router integrates
naturally with the existing fleet: same config management, same monitoring pipeline,
same deployment workflow.
## Goals
- Eliminate the EdgeRouter throughput bottleneck
- Full integration with existing monitoring (node-exporter, promtail, Prometheus, Loki)
- Declarative firewall and routing config managed in the flake
- Inter-VLAN routing for all existing subnets
- DHCP server for client subnets
- NetFlow/traffic accounting for future ntopng integration
- Foundation for WireGuard remote access (see remote-access.md)
## Current Network Topology
**Subnets (known VLANs):**
| VLAN/Subnet | Purpose | Notable hosts |
|----------------|------------------|----------------------------------------|
| 10.69.10.0/24 | Gateway | gw (10.69.10.1) |
| 10.69.12.0/24 | Core services | nas, pve1, arr jails, restic |
| 10.69.13.0/24 | Infrastructure | All NixOS servers (static IPs) |
| 10.69.22.0/24 | WLAN | unifi-ctrl |
| 10.69.30.0/24 | Workstations | gunter |
| 10.69.31.0/24 | Media | media |
| 10.69.99.0/24 | Management | sw1 (MikroTik CRS326-24G-2S+) |
**DNS:** ns1 (10.69.13.5) and ns2 (10.69.13.6) handle all resolution. Upstream is
Cloudflare/Google over DoT via Unbound.
**Switch:** MikroTik CRS326-24G-2S+ — L2 switching with VLAN trunking. Capable of
L3 routing via RouterOS but not ideal for sustained routing throughput.
## Hardware
Needs a small x86 box with:
- At least 2 NICs (WAN + LAN trunk). Dual 2.5GbE preferred.
- Enough CPU for nftables NAT at line rate (any modern x86 is fine)
- 4-8 GB RAM (plenty for routing + DHCP + NetFlow accounting)
- Low power consumption, fanless preferred for always-on use
**Leading candidate:** [Topton Solid Mini PC](https://www.aliexpress.com/item/1005008981218625.html)
with Intel i3-N300 (8 E-cores), 2x10GbE SFP+ and 3x2.5GbE (~NOK 3000 barebones). The N300
leaves headroom for ntopng DPI and a possible Suricata IDS without being overkill.
### Hardware Alternatives
Domestic availability of firewall mini PCs is limited, so the box will likely be ordered from AliExpress.
Key things to verify:
- NIC chipset: Intel i225-V/i226-V preferred over Realtek for Linux driver support
- RAM/storage: some listings are barebones, check what's included
- Import duties: factor in ~25% on top of listing price
| Option | NICs | Notes | Price |
|--------|------|-------|-------|
| [Topton Solid Firewall Router](https://www.aliexpress.com/item/1005008059819023.html) | 2x10GbE SFP+, 4x2.5GbE | No RAM/SSD, only Intel N150 available currently | ~NOK 2500 |
| [Topton Solid Mini PC](https://www.aliexpress.com/item/1005008981218625.html) | 2x10GbE SFP+, 3x2.5GbE | No RAM/SSD, only Intel i3-N300 available currently | ~NOK 3000 |
| [MINISFORUM MS-01](https://www.aliexpress.com/item/1005007308262492.html) | 2x10GbE SFP+, 2x2.5GbE | No RAM/SSD, i5-12600H | ~NOK 4500 |
The LAN port would carry a VLAN trunk to the MikroTik switch, with sub-interfaces
for each VLAN. WAN port connects to the ISP uplink.
## NixOS Configuration
### Stability Policy
The router is treated differently from the rest of the fleet:
- **No auto-upgrade** — `system.autoUpgrade.enable = false`
- **No homelab-deploy listener** — `homelab.deploy.enable = false`
- **Manual updates only** — update every few months, test-build first
- **Use `nixos-rebuild boot`** — changes take effect on next deliberate reboot
- **Tier: prod, priority: high** — alerts treated with highest priority
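
As a rough sketch, the first two points could look like this in the router's host config. `system.autoUpgrade.enable` is a standard NixOS option. `homelab.deploy.enable` is the fleet's own option named in this plan, and it is assumed here to be a plain boolean. The file path is hypothetical.

```nix
# hosts/router01/stability.nix (hypothetical path)
{ ... }:
{
  # No unattended upgrades on the router.
  system.autoUpgrade.enable = false;

  # Fleet-specific option from this plan; assumed to be a simple on/off switch.
  homelab.deploy.enable = false;
}
```

A manual update would then be something like `nixos-rebuild boot --flake .#router01` followed by a deliberate reboot, assuming the host keeps the `router01` name used elsewhere in this plan.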
### Core Services
**Routing & NAT:**
- `systemd-networkd` for all interface config (consistent with rest of fleet)
- VLAN sub-interfaces on the LAN trunk (one per subnet)
- `networking.nftables` for stateful firewall and NAT
- IP forwarding enabled (`net.ipv4.ip_forward = 1`, set through `boot.kernel.sysctl` on NixOS)
- Masquerade outbound traffic on WAN interface
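
A minimal sketch of how these pieces might fit together for a single VLAN. The interface names `wan0` and `lan0`, the VLAN ID 13, and the gateway address 10.69.13.1 are all assumptions for illustration; the plan does not fix VLAN IDs or per-VLAN gateway addresses yet. WAN addressing is left out because the uplink type is still an open question.

```nix
{ ... }:
{
  # Let systemd-networkd manage all interfaces.
  networking.useNetworkd = true;

  # Route packets between interfaces.
  boot.kernel.sysctl."net.ipv4.ip_forward" = 1;

  # VLAN sub-interface definition (ID 13 is an assumed mapping for 10.69.13.0/24).
  systemd.network.netdevs."20-vlan13" = {
    netdevConfig = {
      Kind = "vlan";
      Name = "vlan13";
    };
    vlanConfig.Id = 13;
  };

  # The LAN trunk carries only tagged traffic, so it gets no address itself.
  systemd.network.networks."10-lan0" = {
    matchConfig.Name = "lan0"; # assumed interface name
    vlan = [ "vlan13" ];
    networkConfig.LinkLocalAddressing = "no";
  };

  # Gateway address for the subnet lives on the VLAN sub-interface.
  systemd.network.networks."20-vlan13" = {
    matchConfig.Name = "vlan13";
    address = [ "10.69.13.1/24" ]; # assumed gateway address
  };

  # Masquerade everything leaving through the WAN interface.
  networking.nftables.enable = true;
  networking.nftables.tables.router-nat = {
    family = "ip";
    content = ''
      chain postrouting {
        type nat hook postrouting priority srcnat; policy accept;
        oifname "wan0" masquerade
      }
    '';
  };
}
```

The remaining subnets would repeat the `netdevs`/`networks` pair, which suggests generating them from a single list of VLANs rather than writing each by hand.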
**DHCP:**
- Kea or dnsmasq for DHCP on client subnets (WLAN, workstations, media)
- Infrastructure subnet (10.69.13.0/24) stays static — no DHCP needed
- Static leases for known devices
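
If Kea is chosen, a DHCPv4 setup for the workstation subnet might look roughly like the sketch below. The interface name `vlan30`, the pool range, the gateway address 10.69.30.1, and the reservation (a documentation-range MAC and a made-up address for `gunter`) are placeholders, not values from the current network. The DNS servers are ns1 and ns2 from the topology section.

```nix
{ ... }:
{
  services.kea.dhcp4 = {
    enable = true;
    settings = {
      # Only answer on the client-facing VLAN interface (assumed name).
      interfaces-config.interfaces = [ "vlan30" ];

      # Keep leases on disk so they survive restarts.
      lease-database = {
        type = "memfile";
        persist = true;
        name = "/var/lib/kea/dhcp4.leases";
      };

      valid-lifetime = 3600;

      subnet4 = [
        {
          id = 30;
          subnet = "10.69.30.0/24";
          pools = [ { pool = "10.69.30.100 - 10.69.30.200"; } ]; # assumed range
          option-data = [
            { name = "routers"; data = "10.69.30.1"; } # assumed gateway
            { name = "domain-name-servers"; data = "10.69.13.5, 10.69.13.6"; }
          ];
          # Static lease for a known device; MAC and IP are placeholders.
          reservations = [
            {
              hw-address = "00:00:5e:00:53:01";
              ip-address = "10.69.30.10";
              hostname = "gunter";
            }
          ];
        }
      ];
    };
  };
}
```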
**Firewall (nftables):**
- Default deny between VLANs
- Explicit allow rules for known cross-VLAN traffic:
  - All subnets → ns1/ns2 (DNS)
  - All subnets → monitoring01 (metrics/logs)
  - Infrastructure → all (management access)
  - Workstations → media, core services
- NAT masquerade on WAN
- Rate limiting on WAN-facing services
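
The inter-VLAN policy could be expressed as a forward chain along these lines. This is a sketch under assumptions: `wan0` is a placeholder name for the WAN interface, the table name is arbitrary, and the monitoring01 rule is omitted because its address is not given in this plan. Rate limiting is not covered here.

```nix
{ ... }:
{
  networking.nftables.enable = true;
  networking.nftables.tables.router-filter = {
    family = "inet";
    content = ''
      chain forward {
        # Default deny for anything routed through the box.
        type filter hook forward priority filter; policy drop;

        # Allow replies to connections that were permitted below.
        ct state established,related accept
        ct state invalid drop

        # All subnets -> ns1/ns2 (DNS)
        ip daddr { 10.69.13.5, 10.69.13.6 } meta l4proto { tcp, udp } th dport 53 accept

        # Infrastructure -> all (management access)
        ip saddr 10.69.13.0/24 accept

        # Workstations -> media, core services
        ip saddr 10.69.30.0/24 ip daddr { 10.69.31.0/24, 10.69.12.0/24 } accept

        # Any internal network -> internet (assumed; needed once the policy is drop)
        iifname != "wan0" oifname "wan0" accept
      }
    '';
  };
}
```

Keeping the rules in a dedicated table means they sit alongside whatever the NixOS firewall module generates for the router's own input traffic, rather than replacing it.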
**Traffic Accounting:**
- nftables flow accounting or softflowd for NetFlow export
- Export to future ntopng instance (see new-services.md)
**IDS/IPS (future consideration):**
- Suricata for inline intrusion detection/prevention on the WAN interface
- Signature-based threat detection, protocol anomaly detection
- CPU-intensive, but expected to be feasible on the N300 at typical home internet speeds (500 Mbps to 1 Gbps)
- Not a day-one requirement, but the hardware should support it
### Monitoring Integration
Since this is a NixOS host in the flake, it gets the standard monitoring stack for free:
- node-exporter for system metrics (CPU, memory, NIC throughput per interface)
- promtail shipping logs to Loki
- Prometheus scrape target auto-registration
- Alertmanager alerts for host-down, high CPU, etc.
Additional router-specific monitoring:
- Per-VLAN interface traffic metrics via node-exporter (automatic for all interfaces)
- NAT connection tracking table size
- WAN uplink status and throughput
- DHCP lease metrics (if Kea is chosen, a Prometheus exporter is available for it)
This is a significant advantage over the EdgeRouter — full observability through
the existing Grafana dashboards and Loki log search, debuggable via the monitoring
MCP tools.
### WireGuard Integration
The remote access plan (remote-access.md) currently proposes a separate `extgw01`
gateway host. With a NixOS router, there's a decision to make:
**Option A:** WireGuard terminates on the router itself. Simplest topology — the
router is already the gateway, so VPN traffic doesn't need extra hops or firewall
rules. But adds complexity to the router, which should stay simple.
**Option B:** Keep extgw01 as a separate host (original plan). Router just routes
traffic to it. Better separation of concerns, router stays minimal.
Recommendation: Start with option B (keep it separate). The router should do routing
and nothing else. WireGuard can move to the router later if extgw01 feels redundant.
## Migration Plan
### Phase 1: Build and lab test
- Acquire hardware
- Create host config in the flake (routing, NAT, DHCP, firewall)
- Test-build on workstation: `nix build .#nixosConfigurations.router01.config.system.build.toplevel`
- Lab test with a temporary setup if possible (two NICs, isolated VLAN)
### Phase 2: Prepare cutover
- Pre-configure the MikroTik switch trunk port for the new router
- Document current EdgeRouter config (port forwarding, NAT rules, DHCP leases)
- Replicate all rules in the NixOS config
- Verify DNS, DHCP, and inter-VLAN routing work in test
### Phase 3: Cutover
- Schedule a maintenance window (brief downtime expected)
- Swap WAN cable from EdgeRouter to new router
- Swap LAN trunk from EdgeRouter to new router
- Verify connectivity from each VLAN
- Verify internet access, DNS resolution, inter-VLAN routing
- Monitor via Prometheus/Loki (immediately available since it's a fleet host)
### Phase 4: Decommission EdgeRouter
- Keep EdgeRouter available as fallback for a few weeks
- Remove `gw` entry from external-hosts.nix, replace with flake-managed host
- Update any references to 10.69.10.1 if the router IP changes
## Open Questions
- **Router IP:** Keep 10.69.10.1 or move to a different address? Each VLAN
sub-interface needs an IP (the gateway address for that subnet).
- **ISP uplink:** What type of WAN connection? PPPoE, DHCP, static IP?
- **Port forwarding:** What ports are currently forwarded on the EdgeRouter?
These need to be replicated in nftables.
- **DHCP scope:** Which subnets currently get DHCP from the EdgeRouter vs
other sources (UniFi controller for WLAN?)?
- **UPnP/NAT-PMP:** Needed for any devices? (gaming consoles, etc.)
- **Hardware preference:** Fanless mini PC budget and preferred vendor?