Commit Graph

73 Commits

Author SHA1 Message Date
35e62dafbc media1: add NixOS media PC configuration
GMKtec G3 (Intel N100) replacing the old Ubuntu media PC on VLAN 31.
Hyprland compositor with Kodi on workspace 1 and Firefox on workspace 2,
greetd auto-login, PipeWire audio, VA-API hardware decode, and NFS
mount for media from NAS.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 19:09:23 +01:00
e81ebb0e75 flake: migrate homelab-deploy input to code.t-juice.club
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 19:40:55 +01:00
01b53e323b flake: migrate nixos-exporter input to code.t-juice.club
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 19:34:31 +01:00
2d73627a2a flake: migrate alerttonotify input to code.t-juice.club
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 19:25:07 +01:00
a27e2ec213 nrec-nixos02: add Pocket ID with Caddy reverse proxy
Some checks failed
Run nix flake check / flake-check (push) Has been cancelled
Run nix flake check / flake-check (pull_request) Has been cancelled
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-08 18:11:49 +01:00
adc267bd95 nrec-nixos01: add host configuration with Caddy web server
Some checks failed
Run nix flake check / flake-check (push) Failing after 9m20s
Run nix flake check / flake-check (pull_request) Failing after 3m58s
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-08 14:10:05 +01:00
7ffe2d71d6 openstack-template: add minimal NixOS image for OpenStack
Adds a new host configuration for building qcow2 images targeting
OpenStack (NREC). Uses a nixos user with SSH key and sudo instead
of root login, firewall enabled, and no internal services.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-08 13:56:55 +01:00
dd9ba42eb5 devshell: add openstack cli client
Some checks failed
Run nix flake check / flake-check (push) Failing after 4m16s
2026-03-08 13:31:54 +01:00
4a83363ee5 hosts: add pn01 and pn02 (ASUS PN51 mini PCs)
Some checks failed
Run nix flake check / flake-check (push) Failing after 5m33s
Add two ASUS PN51 hosts on VLAN 12 for stability testing.
pn01 at 10.69.12.60, pn02 at 10.69.12.61, both test-tier compute role.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 23:37:14 +01:00
4f593126c0 monitoring01: remove host and migrate services to monitoring02
Some checks failed
Run nix flake check / flake-check (push) Failing after 3m15s
Run nix flake check / flake-check (pull_request) Failing after 3m8s
Remove monitoring01 host configuration and unused service modules
(prometheus, grafana, loki, tempo, pyroscope). Migrate blackbox,
exportarr, and pve exporters to monitoring02 with scrape configs
moved to VictoriaMetrics. Update alert rules, terraform vault
policies/secrets, http-proxy entries, and documentation to reflect
the monitoring02 migration.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 21:50:20 +01:00
b2b6ab4799 garage01: add Garage S3 service with Caddy HTTPS proxy
Configure Garage object storage on garage01 with S3 API, Vault secrets
for RPC secret and admin token, and Caddy reverse proxy for HTTPS access
at s3.home.2rjus.net via internal ACME CA. Includes flake entry, VM
definition, and Vault policy for the host.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-13 21:24:25 +01:00
75210805d5 nix-cache01: decommission and remove all references
Some checks failed
Run nix flake check / flake-check (push) Has been cancelled
Removed:
- hosts/nix-cache01/ directory
- services/nix-cache/build-flakes.{nix,sh} (replaced by NATS builder)
- Vault secret and AppRole for nix-cache01
- Old signing key variable from terraform
- Old trusted public key from system/nix.nix

Updated:
- flake.nix: removed nixosConfiguration
- README.md: nix-cache01 -> nix-cache02
- Monitoring rules: removed build-flakes alerts, updated harmonia to nix-cache02
- Simplified proxy.nix (no longer needs hostname conditional)

nix-cache02 is now the sole binary cache host.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-10 23:40:51 +01:00
2d9ca2a73f hosts: add nix-cache02 build host
Some checks failed
Run nix flake check / flake-check (push) Failing after 16m26s
New build host to replace nix-cache01 with:
- 8 CPU cores, 16GB RAM, 200GB disk
- Static IP 10.69.13.25

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-10 21:53:29 +01:00
6e08ba9720 ansible: restructure with dynamic inventory from flake
- Move playbooks/ to ansible/playbooks/
- Add dynamic inventory script that extracts hosts from flake
  - Groups by tier (tier_test, tier_prod) and role (role_dns, etc.)
  - Reads homelab.host.* options for metadata
- Add static inventory for non-flake hosts (Proxmox)
- Add ansible.cfg with inventory path and SSH optimizations
- Add group_vars/all.yml for common variables
- Add restart-service.yml playbook for restarting systemd services
- Update provision-approle.yml with single-host safeguard
- Add ANSIBLE_CONFIG to devshell for automatic inventory discovery
- Add ansible = "false" label to template2 to exclude from inventory
- Update CLAUDE.md to reference ansible/README.md for details

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-09 21:41:29 +01:00
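
The tier/role grouping described in this commit could look roughly like the sketch below. It is an illustration only, not the repository's actual inventory script: `build_inventory`, the shape of the `hosts` dictionary, and the example host values are assumptions. Only the `tier_*` / `role_*` group naming and the `ansible = "false"` exclusion label come from the commit message. The output uses Ansible's standard dynamic-inventory JSON layout with a `_meta.hostvars` section.

```python
import json


def build_inventory(hosts):
    """Build an Ansible dynamic-inventory dict from host metadata.

    `hosts` maps hostname -> {"tier": ..., "role": ..., "labels": {...}},
    a hypothetical stand-in for values read from homelab.host.* options.
    """
    inventory = {"_meta": {"hostvars": {}}}
    for name, meta in sorted(hosts.items()):
        # Hosts labelled ansible = "false" are excluded from the inventory.
        if meta.get("labels", {}).get("ansible") == "false":
            continue
        inventory["_meta"]["hostvars"][name] = {
            "tier": meta.get("tier"),
            "role": meta.get("role"),
        }
        for prefix in ("tier", "role"):
            value = meta.get(prefix)
            if not value:
                continue
            # Assumption: hyphens become underscores so group names stay
            # valid Ansible identifiers (e.g. role_build_host).
            group = f"{prefix}_{value}".replace("-", "_")
            inventory.setdefault(group, {"hosts": []})["hosts"].append(name)
    return inventory


# Illustrative data only; the real tiers and roles live in the flake.
example_hosts = {
    "ns1": {"tier": "prod", "role": "dns", "labels": {}},
    "testvm01": {"tier": "test", "role": None, "labels": {}},
    "template2": {"tier": "test", "role": None, "labels": {"ansible": "false"}},
}

if __name__ == "__main__":
    print(json.dumps(build_inventory(example_hosts), indent=2))
```

Ansible invokes an inventory script with `--list` and reads this JSON from standard output; argument handling is omitted here for brevity.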
60c04a2052 nixos-exporter: enable NATS cache sharing
Some checks failed
Run nix flake check / flake-check (pull_request) Successful in 2m17s
Run nix flake check / flake-check (push) Failing after 5m16s
When one host fetches the latest flake revision, it publishes to NATS
and all other hosts receive the update immediately. This reduces
redundant nix flake metadata calls across the fleet.

- Add nkeys to devshell for key generation
- Add nixos-exporter user to NATS HOMELAB account
- Add Vault secret for NKey storage
- Configure all hosts to use NATS for revision sharing
- Update nixos-exporter input to version with NATS support

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-08 23:57:28 +01:00
0b977808ca hosts: add monitoring02 configuration
Some checks failed
Run nix flake check / flake-check (push) Has been cancelled
New test-tier host for monitoring stack expansion with:
- Static IP 10.69.13.24
- 4 CPU cores, 4GB RAM, 20GB disk
- Vault integration and NATS-based deployment enabled

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-08 19:19:38 +01:00
54b6e37420 flake: add kanidm to devshell
Add kanidm_1_8 CLI for administering the Kanidm server.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-08 15:12:19 +01:00
ca0e3fd629 kanidm01: add kanidm authentication server
Some checks failed
Run nix flake check / flake-check (push) Failing after 1s
- New test-tier VM at 10.69.13.23 with role=auth
- Kanidm 1.8 server with HTTPS (443) and LDAPS (636)
- ACME certificate from internal CA (auth.home.2rjus.net)
- Provisioned groups: admins, users, ssh-users
- Provisioned user: torjus
- Daily backups at 22:00 (7 versions)
- Prometheus monitoring scrape target

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-08 00:13:59 +01:00
94feae82a0 ns1: recreate with OpenTofu workflow
Some checks failed
Run nix flake check / flake-check (push) Failing after 1s
Old VM had incorrect hardware-configuration.nix with hardcoded UUIDs
that didn't match actual disk layout, causing boot failure (emergency mode).

Recreated using template2-based configuration for OpenTofu provisioning.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-07 23:18:08 +01:00
8ec2a083bd pgdb1: decommission postgresql host
Remove pgdb1 host configuration and postgres service module.
The only consumer (Open WebUI on gunter) has migrated to local PostgreSQL.

Removed:
- hosts/pgdb1/ - host configuration
- services/postgres/ - service module (only used by pgdb1)
- postgres_rules from monitoring rules
- rebuild-all.sh (obsolete script)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-07 22:54:50 +01:00
536daee4c7 ns2: migrate to OpenTofu management
Some checks failed
Run nix flake check / flake-check (push) Failing after 1s
- Remove hosts/template/ (legacy template1) and give each legacy host
  its own hardware-configuration.nix copy
- Recreate ns2 using create-host with template2 base
- Add secondary DNS services (NSD + Unbound resolver)
- Configure Vault policy for shared DNS secrets
- Fix create-host IP uniqueness validator to check CIDR notation
  (prevents false positives from DNS resolver entries)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-07 19:28:35 +01:00
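
The validator fix in this commit (matching on CIDR notation so that bare resolver addresses no longer count as conflicts) might be sketched as follows. This is an assumption about the approach, not the tool's actual code: `ip_in_use` and the sample configuration strings are hypothetical.

```python
import re


def ip_in_use(ip_with_prefix, config_text):
    """Return True if config_text assigns the address in CIDR form.

    A bare occurrence of the address (for example in a DNS resolver
    list) does not count, because it carries no "/<prefix>" suffix.
    """
    address = ip_with_prefix.split("/")[0]
    # Require "/<prefix length>" after the address, and refuse matches
    # that are only the tail of a longer address.
    pattern = re.compile(
        r"(?<![\d.])" + re.escape(address) + r"/\d{1,2}(?!\d)"
    )
    return bool(pattern.search(config_text))


# Hypothetical snippets of existing host configuration.
resolver_only = 'nameservers = [ "10.69.13.50" "10.69.13.51" ];'
assigned = 'address = "10.69.13.50/24";'

if __name__ == "__main__":
    print(ip_in_use("10.69.13.50/24", resolver_only))
    print(ip_in_use("10.69.13.50/24", assigned))
```

Matching only the CIDR form means a resolver entry that mentions an address is not mistaken for a host that owns it.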
aedccbd9a0 flake: remove sops-nix (no longer used)
Some checks failed
Run nix flake check / flake-check (push) Failing after 1s
All secrets are now managed by OpenBao (Vault). Remove the legacy
sops-nix infrastructure that is no longer in use.

Removed:
- sops-nix flake input
- system/sops.nix module
- .sops.yaml configuration file
- Age key generation from template prepare-host scripts

Updated:
- flake.nix - removed sops-nix references from all hosts
- flake.lock - removed sops-nix input
- scripts/create-host/ - removed sops references
- CLAUDE.md - removed SOPS documentation

Note: secrets/ directory should be manually removed by the user.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-07 18:46:24 +01:00
bdc6057689 hosts: decommission ca host and remove labmon
Some checks failed
Run nix flake check / flake-check (push) Failing after 1s
Remove the step-ca host and labmon flake input now that ACME has been
migrated to OpenBao PKI.

Removed:
- hosts/ca/ - step-ca host configuration
- services/ca/ - step-ca service module
- labmon flake input and module (no longer used)

Updated:
- flake.nix - removed ca host and labmon references
- flake.lock - removed labmon input
- rebuild-all.sh - removed ca from host list
- CLAUDE.md - updated documentation

Note: secrets/ca/ should be manually removed by the user.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-07 18:41:49 +01:00
7bc465b414 hosts: add testvm01, testvm02, testvm03 test hosts
Some checks failed
Run nix flake check / flake-check (push) Failing after 1s
Three permanent test hosts for validating deployment and bootstrapping
workflow. Each host configured with:
- Static IP (10.69.13.20-22/24)
- Vault AppRole integration
- Bootstrap from deploy-test-hosts branch

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-07 13:34:16 +01:00
8d7bc50108 hosts: remove testvm01
Some checks failed
Run nix flake check / flake-check (push) Failing after 1s
Test host no longer needed.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-07 12:58:24 +01:00
03e70ac094 hosts: remove vaulttest01
Test host no longer needed.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-07 12:55:38 +01:00
13c3897e86 flake: update homelab-deploy, add to devShell
Update homelab-deploy to include bugfix. Add CLI to devShell for
easier testing and deployment operations.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-07 06:54:42 +01:00
ad8570f8db homelab-deploy: add NATS-based deployment system
Some checks failed
Run nix flake check / flake-check (push) Failing after 3m45s
Add homelab-deploy flake input and NixOS module for message-based
deployments across the fleet. Configure DEPLOY account in NATS with
tiered access control (listener, test-deployer, admin-deployer).
Enable listener on vaulttest01 as initial test host.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-07 05:22:06 +01:00
12bf0683f5 modules: add homelab.host for host metadata
Add a shared `homelab.host` module that provides host metadata for
multiple consumers:
- tier: deployment tier (test/prod) for future homelab-deploy service
- priority: alerting priority (high/low) for Prometheus label filtering
- role: primary role of the host (dns, database, monitoring, etc.)
- labels: free-form labels for additional metadata

Host configurations updated with appropriate values:
- ns1, ns2: role=dns with dns_role labels
- nix-cache01: priority=low, role=build-host
- vault01: role=vault
- jump: role=bastion
- template, template2, testvm01, vaulttest01: tier=test, priority=low

The module is now imported via commonModules in flake.nix, making it
available to all hosts including minimal configurations like template2.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-07 02:49:58 +01:00
2034004280 flake: update nixos-exporter and set configurationRevision
Some checks failed
Run nix flake check / flake-check (push) Failing after 4m33s
- Update nixos-exporter to 0.2.3
- Set system.configurationRevision for all hosts so the exporter
  can report the flake's git revision

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-07 01:06:47 +01:00
97ff774d3f monitoring: add nixos-exporter to all hosts
All checks were successful
Run nix flake check / flake-check (push) Successful in 3m16s
Run nix flake check / flake-check (pull_request) Successful in 3m14s
Add nixos-exporter prometheus exporter to track NixOS generation metrics
and flake revision status across all hosts.

Changes:
- Add nixos-exporter flake input
- Add commonModules list in flake.nix for modules shared by all hosts
- Enable nixos-exporter in system/monitoring/metrics.nix
- Configure Prometheus to scrape nixos-exporter on all hosts

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-06 23:55:29 +01:00
59e1962d75 auth01: decommission host and remove authelia/lldap services
Some checks failed
Run nix flake check / flake-check (pull_request) Successful in 2m5s
Run nix flake check / flake-check (push) Failing after 18m1s
Remove auth01 host configuration and associated services in preparation
for new auth stack with different provisioning system.

Removed:
- hosts/auth01/ - host configuration
- services/authelia/ - authelia service module
- services/lldap/ - lldap service module
- secrets/auth01/ - sops secrets
- Reverse proxy entries for auth and lldap
- Monitoring alert rules for authelia and lldap
- SOPS configuration for auth01

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-05 23:35:45 +01:00
0ef63ad874 hosts: remove decommissioned media1, ns3, ns4, nixos-test1
Some checks failed
Run nix flake check / flake-check (push) Failing after 4m47s
Run nix flake check / flake-check (pull_request) Successful in 3m20s
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-05 01:36:57 +01:00
d25fc99e1d backup: migrate to native services.restic.backups
Some checks failed
Run nix flake check / flake-check (push) Has been cancelled
Run nix flake check / flake-check (pull_request) Successful in 4m0s
Replace custom backup-helper flake input with NixOS native
services.restic.backups module for ha1, monitoring01, and nixos-test1.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-04 00:41:40 +01:00
01d4812280 vault: implement bootstrap integration
Some checks failed
Run nix flake check / flake-check (push) Successful in 2m31s
Run nix flake check / flake-check (pull_request) Failing after 14m16s
2026-02-03 01:10:36 +01:00
4133eafc4e flake: add openbao to devshell
Some checks failed
Run nix flake check / flake-check (push) Failing after 18m52s
2026-02-01 22:16:52 +01:00
6d64e53586 hosts: add vault01 host
All checks were successful
Run nix flake check / flake-check (push) Successful in 2m20s
2026-02-01 20:08:48 +01:00
9908286062 scripts: fix create-host flake.nix insertion point
Some checks failed
Run nix flake check / flake-check (pull_request) Successful in 2m12s
Run nix flake check / flake-check (push) Failing after 8m24s
Fix bug where new hosts were added outside of nixosConfigurations block
instead of inside it.

Issues fixed:
1. Pattern was looking for "packages =" but actual text is "packages = forAllSystems"
2. Replacement was putting new entry AFTER closing brace instead of BEFORE
3. testvm01 was at top-level flake output instead of in nixosConfigurations

Changes:
- Update pattern to match "packages = forAllSystems"
- Put new entry BEFORE the closing brace of nixosConfigurations
- Move testvm01 to correct location inside nixosConfigurations block

Result: nix flake show now correctly shows testvm01 as NixOS configuration

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-01 17:41:04 +01:00
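
The insertion logic described in this fix can be illustrated with a small sketch. It assumes, as the commit message implies, that the `nixosConfigurations` block closes immediately before the line containing `packages = forAllSystems`. The function name `insert_host_entry`, the `mkHost` helper, and the sample flake text are all hypothetical.

```python
def insert_host_entry(flake_text, entry, marker="packages = forAllSystems"):
    """Insert `entry` before the closing brace that precedes `marker`."""
    marker_pos = flake_text.find(marker)
    if marker_pos == -1:
        raise ValueError(f"marker {marker!r} not found in flake text")
    # Assumption: the last "};" before the marker closes nixosConfigurations.
    close_pos = flake_text.rfind("};", 0, marker_pos)
    if close_pos == -1:
        raise ValueError("no closing brace found before marker")
    # Insert at the start of the closing-brace line so the new entry
    # lands inside the block rather than after it.
    line_start = flake_text.rfind("\n", 0, close_pos) + 1
    return flake_text[:line_start] + entry + flake_text[line_start:]


# Simplified stand-in for flake.nix; mkHost is a made-up helper.
sample_flake = (
    "{\n"
    "  outputs = { self, nixpkgs }: {\n"
    "    nixosConfigurations = {\n"
    '      ns1 = mkHost "ns1";\n'
    "    };\n"
    "    packages = forAllSystems (system: { });\n"
    "  };\n"
    "}\n"
)
new_entry = '      testvm01 = mkHost "testvm01";\n'

if __name__ == "__main__":
    print(insert_host_entry(sample_flake, new_entry))
```

String matching like this is fragile if the layout of `flake.nix` changes, so raising an error when the marker is missing is safer than inserting at a guessed position.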
7fe0aa0f54 test: add testvm01 for pipeline testing
2026-02-01 17:41:04 +01:00
408554b477 scripts: add create-host tool for automated host configuration generation
Some checks failed
Run nix flake check / flake-check (push) Failing after 1m50s
Run nix flake check / flake-check (pull_request) Failing after 1m49s
Implements Phase 2 of the automated deployment pipeline.

This commit adds a Python CLI tool that automates the creation of NixOS host
configurations, eliminating manual boilerplate and reducing errors.

Features:
- Python CLI using typer framework with rich terminal UI
- Comprehensive validation (hostname format/uniqueness, IP subnet/uniqueness)
- Jinja2 templates for NixOS configurations
- Automatic updates to flake.nix and terraform/vms.tf
- Support for both static IP and DHCP configurations
- Dry-run mode for safe previews
- Packaged as Nix derivation and added to devShell

Usage:
  create-host --hostname myhost --ip 10.69.13.50/24

The tool generates:
- hosts/<hostname>/default.nix
- hosts/<hostname>/configuration.nix
- Updates flake.nix with new nixosConfigurations entry
- Updates terraform/vms.tf with new VM definition

All generated configurations include full system imports (monitoring, SOPS,
autoupgrade, etc.) and are validated with nix flake check and tofu validate.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-01 02:27:57 +01:00
3a464bc323 proxmox: add VM automation with OpenTofu and Ansible
Add automated workflow for building and deploying NixOS VMs on Proxmox including template2 host configuration, Ansible playbook for image building/deployment, and OpenTofu configuration for VM provisioning with cloud-init.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-31 21:54:08 +01:00
7f72a72043 flake: add opentofu to devshell
Some checks failed
Run nix flake check / flake-check (push) Failing after 17m5s
2026-01-31 16:12:49 +01:00
f2963a150b flake: stable to 25.11
Some checks failed
Run nix flake check / flake-check (push) Failing after 3m44s
2025-12-06 10:45:14 +01:00
ccd9bbf4da Remove incus hosts
Some checks failed
Run nix flake check / flake-check (push) Failing after 14m57s
Periodic flake update / flake-update (push) Successful in 3m35s
2025-07-07 21:30:04 +02:00
6fda081dc8 Add labmon to monitoring01
Some checks failed
Run nix flake check / flake-check (push) Has been cancelled
2025-05-24 03:27:59 +02:00
5e9aff0590 Update stable to 25.05
2025-05-23 00:54:13 +02:00
cba1821f3b Add lldap to auth01 host
2025-04-01 22:23:59 +02:00
abb4cf58ea Add alerttonotify to monitoring host
Some checks failed
Run nix flake check / flake-check (push) Has been cancelled
2025-02-11 22:25:54 +01:00
c43e2aa063 Add nats server
Some checks failed
Run nix flake check / flake-check (push) Failing after 17m6s
Periodic flake update / flake-update (push) Successful in 1m28s
2025-02-08 00:26:53 +01:00
002f934c70 Add ansible and playbook to trigger upgrade
Some checks failed
Run nix flake check / flake-check (push) Failing after 27m26s
Periodic flake update / flake-update (push) Successful in 1m24s
2025-02-07 00:28:05 +01:00