Commit Graph

15 Commits

Author SHA1 Message Date
979040aaf7 vault01: enable homelab-deploy listener
Some checks failed
Run nix flake check / flake-check (push) Failing after 1s
Enable vault.enable and homelab.deploy.enable on vault01 so it can
receive NATS-based remote deployments. Vault fetches secrets from
itself using AppRole after auto-unseal.

Add systemd ordering to ensure vault-secret services wait for openbao
to be unsealed before attempting to fetch secrets.

Also adds vault01 AppRole entry to Terraform.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-07 17:55:09 +01:00
38348c5980 vault: add homelab-deploy policy to generated hosts
Some checks failed
Run nix flake check / flake-check (push) Failing after 1s
The homelab-deploy listener requires access to shared/homelab-deploy/*
secrets. Update hosts-generated.tf and the generator script to include
this policy automatically.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-07 14:05:42 +01:00
7bc465b414 hosts: add testvm01, testvm02, testvm03 test hosts
Some checks failed
Run nix flake check / flake-check (push) Failing after 1s
Three permanent test hosts for validating deployment and bootstrapping
workflow. Each host configured with:
- Static IP (10.69.13.20-22/24)
- Vault AppRole integration
- Bootstrap from deploy-test-hosts branch

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-07 13:34:16 +01:00
03e70ac094 hosts: remove vaulttest01
Test host no longer needed.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-07 12:55:38 +01:00
c214f8543c homelab: add deploy.enable option with assertion
All checks were successful
Run nix flake check / flake-check (push) Successful in 2m6s
Run nix flake check / flake-check (pull_request) Successful in 2m7s
- Add homelab.deploy.enable option (requires vault.enable)
- Create shared homelab-deploy Vault policy for all hosts
- Enable homelab.deploy on all vault-enabled hosts

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-07 06:54:42 +01:00
7933127d77 system: enable homelab-deploy listener for all vault hosts
Add system/homelab-deploy.nix module that automatically enables the
listener on all hosts with vault.enable=true. Uses homelab.host.tier
and homelab.host.role for NATS subject subscriptions.

- Add homelab-deploy access to all host AppRole policies
- Remove manual listener config from vaulttest01 (now handled by system module)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-07 06:54:42 +01:00
ad8570f8db homelab-deploy: add NATS-based deployment system
Some checks failed
Run nix flake check / flake-check (push) Failing after 3m45s
Add homelab-deploy flake input and NixOS module for message-based
deployments across the fleet. Configure DEPLOY account in NATS with
tiered access control (listener, test-deployer, admin-deployer).
Enable listener on vaulttest01 as initial test host.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-07 05:22:06 +01:00
e9857afc11 monitoring: use AppRole token for OpenBao metrics scraping
All checks were successful
Run nix flake check / flake-check (push) Successful in 2m12s
Run nix flake check / flake-check (pull_request) Successful in 2m19s
Instead of creating a long-lived Vault token in Terraform (which gets
invalidated when Terraform recreates it), monitoring01 now uses its
existing AppRole credentials to fetch a fresh token for Prometheus.

Changes:
- Add prometheus-metrics policy to monitoring01's AppRole
- Remove vault_token.prometheus_metrics resource from Terraform
- Remove openbao-token KV secret from Terraform
- Add systemd service to fetch AppRole token on boot
- Add systemd timer to refresh token every 30 minutes

This ensures Prometheus always has a valid token without depending on
Terraform state or manual intervention.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-05 23:51:11 +01:00
3cccfc0487 monitoring: implement monitoring gaps coverage
Some checks failed
Run nix flake check / flake-check (push) Failing after 7m36s
Add exporters and scrape targets for services lacking monitoring:
- PostgreSQL: postgres-exporter on pgdb1
- Authelia: native telemetry metrics on auth01
- Unbound: unbound-exporter with remote-control on ns1/ns2
- NATS: HTTP monitoring endpoint on nats1
- OpenBao: telemetry config and Prometheus scrape with token auth
- Systemd: systemd-exporter on all hosts for per-service metrics

Add alert rules for postgres, auth (authelia + lldap), jellyfin,
vault (openbao), plus extend existing nats and unbound rules.

Add Terraform config for Prometheus metrics policy and token. The
token is created via vault_token resource and stored in KV, so no
manual token creation is needed.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-05 21:44:13 +01:00
ccb1c3fe2e terraform: auto-generate backup password instead of manual
All checks were successful
Run nix flake check / flake-check (push) Successful in 2m19s
Remove backup_helper_secret variable and switch shared/backup/password
to auto_generate. New password will be added alongside existing restic
repository key.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-05 18:58:39 +01:00
0700033c0a secrets: migrate all hosts from sops to OpenBao vault
Replace sops-nix secrets with OpenBao vault secrets across all hosts.
Hardcode root password hash, add extractKey option to vault-secrets
module, update Terraform with secrets/policies for all hosts, and
create AppRole provisioning playbook.

Hosts migrated: ha1, monitoring01, ns1, ns2, http-proxy, nix-cache01
Wave 1 hosts (nats1, jelly01, pgdb1) get AppRole policies only.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-05 18:43:09 +01:00
7ae474fd3e pki: add new vault root ca to pki 2026-02-03 06:53:59 +01:00
01d4812280 vault: implement bootstrap integration
Some checks failed
Run nix flake check / flake-check (push) Successful in 2m31s
Run nix flake check / flake-check (pull_request) Failing after 14m16s
2026-02-03 01:10:36 +01:00
3f2f91aedd terraform: add vault pki management to terraform
Some checks failed
Run nix flake check / flake-check (push) Has been cancelled
2026-02-01 23:23:03 +01:00
5d513fd5af terraform: add vault secret managment to terraform 2026-02-01 23:07:47 +01:00