From f779f49c200d7fb1850e1f968b1878ba363b7c60 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Torjus=20H=C3=A5kestad?=
Date: Sat, 31 Jan 2026 10:56:21 +0100
Subject: [PATCH] vibecoding: add CLAUDE.md

---
 CLAUDE.md | 186 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 186 insertions(+)
 create mode 100644 CLAUDE.md

diff --git a/CLAUDE.md b/CLAUDE.md
new file mode 100644
index 0000000..dd02cd7
--- /dev/null
+++ b/CLAUDE.md
@@ -0,0 +1,186 @@
+# CLAUDE.md
+
+This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
+
+## Repository Overview
+
+This is a Nix Flake-based NixOS configuration repository for managing a homelab infrastructure consisting of 16 server configurations. The repository uses a modular architecture with shared system configurations, reusable service modules, and per-host customization.
+
+## Common Commands
+
+### Building Configurations
+
+```bash
+# List all available configurations
+nix flake show
+
+# Build a specific host configuration locally (without deploying)
+nixos-rebuild build --flake .#
+
+# Build and check a configuration
+nix build .#nixosConfigurations..config.system.build.toplevel
+```
+
+### Deployment
+
+Do not automatically deploy changes. Deployments are usually done by updating the master branch, and then triggering the auto update on the specific host.
+
+### Flake Management
+
+```bash
+# Check flake for errors
+nix flake check
+```
+Do not run `nix flake update`. This should only be done manually by the user.
+
+### Development Environment
+
+```bash
+# Enter development shell (provides ansible, python3)
+nix develop
+```
+
+### Secrets Management
+
+Secrets are handled by sops. Do not edit any `.sops.yaml` or any file within `secrets/`. Ask the user to modify if necessary.
+
+## Architecture
+
+### Directory Structure
+
+- `/flake.nix` - Central flake defining all 16 NixOS configurations
+- `/hosts//` - Per-host configurations
+  - `default.nix` - Entry point, imports configuration.nix and services
+  - `configuration.nix` - Host-specific settings (networking, hardware, users)
+- `/system/` - Shared system-level configurations applied to ALL hosts
+  - Core modules: nix.nix, sshd.nix, sops.nix, acme.nix, autoupgrade.nix
+  - Monitoring: node-exporter and promtail on every host
+- `/services/` - Reusable service modules, selectively imported by hosts
+  - `home-assistant/` - Home automation stack
+  - `monitoring/` - Observability stack (Prometheus, Grafana, Loki, Tempo)
+  - `ns/` - DNS services (authoritative, resolver)
+  - `http-proxy/`, `ca/`, `postgres/`, `nats/`, `jellyfin/`, etc.
+- `/secrets/` - SOPS-encrypted secrets with age encryption
+- `/common/` - Shared configurations (e.g., VM guest agent)
+- `/playbooks/` - Ansible playbooks for fleet management
+- `/.sops.yaml` - SOPS configuration with age keys for all servers
+
+### Configuration Inheritance
+
+Each host follows this import pattern:
+```
+hosts//default.nix
+  └─> configuration.nix (host-specific)
+        ├─> ../../system (ALL shared system configs - applied to every host)
+        ├─> ../../services/ (selective service imports)
+        └─> ../../common/vm (if VM)
+```
+
+All hosts automatically get:
+- Nix binary cache (nix-cache.home.2rjus.net)
+- SSH with root login enabled
+- SOPS secrets management with auto-generated age keys
+- Internal ACME CA integration (ca.home.2rjus.net)
+- Daily auto-upgrades with auto-reboot
+- Prometheus node-exporter + Promtail (logs to monitoring01)
+- Custom root CA trust
+
+### Active Hosts
+
+Production servers managed by `rebuild-all.sh`:
+- `ns1`, `ns2` - Primary/secondary DNS servers (10.69.13.5/6)
+- `ca` - Internal Certificate Authority
+- `ha1` - Home Assistant + Zigbee2MQTT + Mosquitto
+- `http-proxy` - Reverse proxy
+- `monitoring01` - Full observability stack (Prometheus, Grafana, Loki, Tempo, Pyroscope)
+- `jelly01` - Jellyfin media server
+- `nix-cache01` - Binary cache server
+- `pgdb1` - PostgreSQL database
+- `nats1` - NATS messaging server
+- `auth01` - Authentication service
+
+Template/test hosts:
+- `template1` - Base template for cloning new hosts
+- `nixos-test1` - Test environment
+
+### Flake Inputs
+
+- `nixpkgs` - NixOS 25.11 stable (primary)
+- `nixpkgs-unstable` - Unstable channel (available via overlay as `pkgs.unstable.`)
+- `sops-nix` - Secrets management
+- Custom packages from git.t-juice.club:
+  - `backup-helper` - Backup automation module
+  - `alerttonotify` - Alert routing
+  - `labmon` - Lab monitoring
+
+### Network Architecture
+
+- Domain: `home.2rjus.net`
+- Infrastructure subnet: `10.69.13.x`
+- DNS: ns1/ns2 provide authoritative DNS with primary-secondary setup
+- Internal CA for ACME certificates (no Let's Encrypt)
+- Centralized monitoring at monitoring01
+- Static networking via systemd-networkd
+
+### Secrets Management
+
+- Uses SOPS with age encryption
+- Each server has a unique age key in `.sops.yaml`
+- Keys auto-generated at `/var/lib/sops-nix/key.txt` on first boot
+- Shared secrets: `/secrets/secrets.yaml`
+- Per-host secrets: `/secrets//`
+- All production servers can decrypt shared secrets; host-specific secrets require specific host keys
+
+### Auto-Upgrade System
+
+All hosts pull updates daily from:
+```
+git+https://git.t-juice.club/torjus/nixos-servers.git
+```
+
+Configured in `/system/autoupgrade.nix`:
+- Random delay to avoid simultaneous upgrades
+- Auto-reboot after successful upgrade
+- Systemd service: `nixos-upgrade.service`
+
+### Adding a New Host
+
+1. Create `/hosts//` directory
+2. Copy structure from `template1` or similar host
+3. Add host entry to `flake.nix` nixosConfigurations
+4. Add hostname to DNS zone files. Merge to master. Run auto-upgrade on DNS servers.
+5. User clones template host
+6. User runs `prepare-host.sh` on the new host. This deletes files that should be regenerated, such as SSH host keys and machine-id. It also creates a new age key and prints the public key.
+7. This key is then added to `.sops.yaml`
+8. Create `/secrets//` if needed
+9. Configure networking (static IP, DNS servers)
+10. Commit changes and merge to master.
+11. Deploy by running `nixos-rebuild boot --flake URL#` on the host.
+
+### Important Patterns
+
+**Overlay usage**: Access unstable packages via `pkgs.unstable.` (defined in flake.nix overlay-unstable)
+
+**Service composition**: Services in `/services/` are designed to be imported by multiple hosts. Keep them modular and reusable.
+
+**Hardware configuration reuse**: Multiple hosts share `/hosts/template/hardware-configuration.nix` for VM instances.
+
+**State version**: All hosts use stateVersion `"23.11"` - do not change this on existing hosts.
+
+**Firewall**: Disabled on most hosts (trusted network). Enable selectively in host configuration if needed.
+
+### Monitoring Stack
+
+All hosts ship metrics and logs to `monitoring01`:
+- **Metrics**: Prometheus scrapes node-exporter from all hosts
+- **Logs**: Promtail ships logs to Loki on monitoring01
+- **Access**: Grafana at monitoring01 for visualization
+- **Tracing**: Tempo for distributed tracing
+- **Profiling**: Pyroscope for continuous profiling
+
+### DNS Architecture
+
+- `ns1` (10.69.13.5) - Primary authoritative DNS + resolver
+- `ns2` (10.69.13.6) - Secondary authoritative DNS (AXFR from ns1)
+- Zone files managed in `/services/ns/`
+- All hosts point to ns1/ns2 for DNS resolution
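
The DNS Architecture section describes ns2 pulling zones from ns1 via AXFR. One way to check that the secondary is keeping up is to compare the SOA serial each server reports. The following is a minimal sketch and is not part of the repository: it assumes `dig` is installed and that the zone is named after the domain `home.2rjus.net`, and the function names are hypothetical.

```shell
# Sketch only: helper names are hypothetical, not taken from the repository.

# Extract the serial (third field) from one line of `dig +short SOA` output,
# which has the form: MNAME RNAME SERIAL REFRESH RETRY EXPIRE MINIMUM
soa_serial() {
  printf '%s\n' "$1" | awk '{ print $3 }'
}

# Ask one server for the zone's SOA record and print its serial.
# $1 = zone name (assumed), $2 = server address
query_serial() {
  soa_serial "$(dig +short SOA "$1" "@$2")"
}

# Succeed only when both serials are non-empty and identical.
serials_match() {
  [ -n "$1" ] && [ "$1" = "$2" ]
}

# Example usage (performs real DNS queries, so it is left commented out):
#   primary=$(query_serial home.2rjus.net 10.69.13.5)
#   secondary=$(query_serial home.2rjus.net 10.69.13.6)
#   serials_match "$primary" "$secondary" || echo "ns2 is behind ns1" >&2
```

An empty serial is treated as a mismatch so that an unreachable server is not reported as in sync.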