3 Commits

Author SHA1 Message Date
3f94f7ee95 docs: update pgdb1 decommission progress
Some checks failed
Run nix flake check / flake-check (push) Failing after 1s
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-07 22:55:55 +01:00
b7e398c9a7 terraform: remove pgdb1 vault approle
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-07 22:55:39 +01:00
8ec2a083bd pgdb1: decommission postgresql host
Remove pgdb1 host configuration and postgres service module.
The only consumer (Open WebUI on gunter) has migrated to local PostgreSQL.

Removed:
- hosts/pgdb1/ - host configuration
- services/postgres/ - service module (only used by pgdb1)
- postgres_rules from monitoring rules
- rebuild-all.sh (obsolete script)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-07 22:54:50 +01:00
12 changed files with 123 additions and 213 deletions

View File

@@ -13,7 +13,6 @@ NixOS Flake-based configuration repository for a homelab infrastructure. All hos
| `monitoring01` | Prometheus, Grafana, Loki, Tempo, Pyroscope | | `monitoring01` | Prometheus, Grafana, Loki, Tempo, Pyroscope |
| `jelly01` | Jellyfin media server | | `jelly01` | Jellyfin media server |
| `nix-cache01` | Nix binary cache | | `nix-cache01` | Nix binary cache |
| `pgdb1` | PostgreSQL |
| `nats1` | NATS messaging | | `nats1` | NATS messaging |
| `vault01` | OpenBao (Vault) secrets management | | `vault01` | OpenBao (Vault) secrets management |
| `template1`, `template2` | VM templates for cloning new hosts | | `template1`, `template2` | VM templates for cloning new hosts |

View File

@@ -163,17 +163,19 @@ Host was already removed from flake.nix and VM destroyed. Configuration cleaned
Host configuration, services, and VM already removed. Host configuration, services, and VM already removed.
### pgdb1 ### pgdb1 (in progress)
Only consumer was Open WebUI on gunter, which is being migrated to use local PostgreSQL. Only consumer was Open WebUI on gunter, which has been migrated to use local PostgreSQL.
1. Verify Open WebUI on gunter is using local PostgreSQL (not pgdb1) 1. ~~Verify Open WebUI on gunter is using local PostgreSQL (not pgdb1)~~
2. Remove host configuration from `hosts/pgdb1/` 2. ~~Remove host configuration from `hosts/pgdb1/`~~
3. Remove `services/postgres/` (only used by pgdb1) 3. ~~Remove `services/postgres/` (only used by pgdb1)~~
4. Remove from `flake.nix` 4. ~~Remove from `flake.nix`~~
5. Remove Vault AppRole from `terraform/vault/approle.tf` 5. ~~Remove Vault AppRole from `terraform/vault/approle.tf`~~
6. Destroy the VM in Proxmox 6. Destroy the VM in Proxmox
7. Commit cleanup 7. ~~Commit cleanup~~
See `docs/plans/pgdb1-decommission.md` for detailed plan.
## Phase 5: Decommission ca Host ✓ COMPLETE ## Phase 5: Decommission ca Host ✓ COMPLETE

View File

@@ -0,0 +1,113 @@
# pgdb1 Decommissioning Plan
## Overview
Decommission the pgdb1 PostgreSQL server. The only consumer was Open WebUI on gunter, which has been migrated to use a local PostgreSQL instance.
## Pre-flight Verification
Before proceeding, verify that gunter is no longer using pgdb1:
1. Check Open WebUI on gunter is configured for local PostgreSQL (not 10.69.13.16)
2. Optionally: Check pgdb1 for recent connection activity:
```bash
ssh pgdb1 'sudo -u postgres psql -c "SELECT * FROM pg_stat_activity WHERE datname IS NOT NULL;"'
```
## Files to Remove
### Host Configuration
- `hosts/pgdb1/default.nix`
- `hosts/pgdb1/configuration.nix`
- `hosts/pgdb1/hardware-configuration.nix`
- `hosts/pgdb1/` (directory)
### Service Module
- `services/postgres/postgres.nix`
- `services/postgres/default.nix`
- `services/postgres/` (directory)
Note: This service module is only used by pgdb1, so it can be removed entirely.
### Flake Entry
Remove from `flake.nix` (lines 131-138):
```nix
pgdb1 = nixpkgs.lib.nixosSystem {
inherit system;
specialArgs = {
inherit inputs self;
};
modules = commonModules ++ [
./hosts/pgdb1
];
};
```
### Vault AppRole
Remove from `terraform/vault/approle.tf` (lines 69-73):
```hcl
"pgdb1" = {
paths = [
"secret/data/hosts/pgdb1/*",
]
}
```
### Monitoring Rules
Remove from `services/monitoring/rules.yml` the `postgres_down` alert (lines 359-365):
```yaml
- name: postgres_rules
rules:
- alert: postgres_down
expr: node_systemd_unit_state{instance="pgdb1.home.2rjus.net:9100", name="postgresql.service", state="active"} == 0
for: 5m
labels:
severity: critical
```
### Utility Scripts
Delete `rebuild-all.sh` entirely (obsolete script).
## Execution Steps
### Phase 1: Verification
- [ ] Confirm Open WebUI on gunter uses local PostgreSQL
- [ ] Verify no active connections to pgdb1
### Phase 2: Code Cleanup
- [ ] Create feature branch: `git checkout -b decommission-pgdb1`
- [ ] Remove `hosts/pgdb1/` directory
- [ ] Remove `services/postgres/` directory
- [ ] Remove pgdb1 entry from `flake.nix`
- [ ] Remove postgres alert from `services/monitoring/rules.yml`
- [ ] Delete `rebuild-all.sh` (obsolete)
- [ ] Run `nix flake check` to verify no broken references
- [ ] Commit changes
### Phase 3: Terraform Cleanup
- [ ] Remove pgdb1 from `terraform/vault/approle.tf`
- [ ] Run `tofu plan` in `terraform/vault/` to preview changes
- [ ] Run `tofu apply` to remove the AppRole
- [ ] Commit terraform changes
### Phase 4: Infrastructure Cleanup
- [ ] Shut down pgdb1 VM in Proxmox
- [ ] Delete the VM from Proxmox
- [ ] (Optional) Remove any DNS entries if not auto-generated
### Phase 5: Finalize
- [ ] Merge feature branch to master
- [ ] Trigger auto-upgrade on DNS servers (ns1, ns2) to remove DNS entry
- [ ] Move this plan to `docs/plans/completed/`
## Rollback
If issues arise after decommissioning:
1. The VM can be recreated from template using the git history
2. Database data would need to be restored from backup (if any exists)
## Notes
- pgdb1 IP: 10.69.13.16
- The postgres service allowed connections from gunter (10.69.30.105)
- No restic backup was configured for this host

View File

@@ -128,15 +128,6 @@
./hosts/nix-cache01 ./hosts/nix-cache01
]; ];
}; };
pgdb1 = nixpkgs.lib.nixosSystem {
inherit system;
specialArgs = {
inherit inputs self;
};
modules = commonModules ++ [
./hosts/pgdb1
];
};
nats1 = nixpkgs.lib.nixosSystem { nats1 = nixpkgs.lib.nixosSystem {
inherit system; inherit system;
specialArgs = { specialArgs = {

View File

@@ -1,66 +0,0 @@
{
pkgs,
...
}:
{
imports = [
./hardware-configuration.nix
../../system
../../common/vm
];
nixpkgs.config.allowUnfree = true;
# Use the systemd-boot EFI boot loader.
boot.loader.grub = {
enable = true;
device = "/dev/sda";
configurationLimit = 3;
};
networking.hostName = "pgdb1";
networking.domain = "home.2rjus.net";
networking.useNetworkd = true;
networking.useDHCP = false;
services.resolved.enable = true;
networking.nameservers = [
"10.69.13.5"
"10.69.13.6"
];
systemd.network.enable = true;
systemd.network.networks."ens18" = {
matchConfig.Name = "ens18";
address = [
"10.69.13.16/24"
];
routes = [
{ Gateway = "10.69.13.1"; }
];
linkConfig.RequiredForOnline = "routable";
};
time.timeZone = "Europe/Oslo";
nix.settings.experimental-features = [
"nix-command"
"flakes"
];
nix.settings.tarball-ttl = 0;
environment.systemPackages = with pkgs; [
vim
wget
git
];
# Open ports in the firewall.
# networking.firewall.allowedTCPPorts = [ ... ];
# networking.firewall.allowedUDPPorts = [ ... ];
# Or disable the firewall altogether.
networking.firewall.enable = false;
vault.enable = true;
homelab.deploy.enable = true;
system.stateVersion = "23.11"; # Did you read the comment?
}

View File

@@ -1,7 +0,0 @@
{ ... }:
{
imports = [
./configuration.nix
../../services/postgres
];
}

View File

@@ -1,42 +0,0 @@
{
config,
lib,
pkgs,
modulesPath,
...
}:
{
imports = [
(modulesPath + "/profiles/qemu-guest.nix")
];
boot.initrd.availableKernelModules = [
"ata_piix"
"uhci_hcd"
"virtio_pci"
"virtio_scsi"
"sd_mod"
"sr_mod"
];
boot.initrd.kernelModules = [ "dm-snapshot" ];
boot.kernelModules = [
"ptp_kvm"
];
boot.extraModulePackages = [ ];
fileSystems."/" = {
device = "/dev/disk/by-label/root";
fsType = "xfs";
};
swapDevices = [ { device = "/dev/disk/by-label/swap"; } ];
# Enables DHCP on each ethernet and wireless interface. In case of scripted networking
# (the default) this is the recommended approach. When using systemd-networkd it's
# still possible to use this option, but it's recommended to use it in conjunction
# with explicit per-interface declarations with `networking.interfaces.<interface>.useDHCP`.
networking.useDHCP = lib.mkDefault true;
# networking.interfaces.ens18.useDHCP = lib.mkDefault true;
nixpkgs.hostPlatform = lib.mkDefault "x86_64-linux";
}

View File

@@ -1,19 +0,0 @@
#!/usr/bin/env bash
set -euo pipefail
# array of hosts
HOSTS=(
"ns1"
"ns2"
"ha1"
"http-proxy"
"jelly01"
"monitoring01"
"nix-cache01"
"pgdb1"
)
for host in "${HOSTS[@]}"; do
echo "Rebuilding $host"
nixos-rebuild boot --flake .#${host} --target-host root@${host}
done

View File

@@ -356,32 +356,6 @@ groups:
annotations: annotations:
summary: "Proxmox VM {{ $labels.id }} is stopped" summary: "Proxmox VM {{ $labels.id }} is stopped"
description: "Proxmox VM {{ $labels.id }} ({{ $labels.name }}) has onboot=1 but is stopped." description: "Proxmox VM {{ $labels.id }} ({{ $labels.name }}) has onboot=1 but is stopped."
- name: postgres_rules
rules:
- alert: postgres_down
expr: node_systemd_unit_state{instance="pgdb1.home.2rjus.net:9100", name="postgresql.service", state="active"} == 0
for: 5m
labels:
severity: critical
annotations:
summary: "PostgreSQL not running on {{ $labels.instance }}"
description: "PostgreSQL has been down on {{ $labels.instance }} more than 5 minutes."
- alert: postgres_exporter_down
expr: up{job="postgres"} == 0
for: 5m
labels:
severity: warning
annotations:
summary: "PostgreSQL exporter down on {{ $labels.instance }}"
description: "Cannot scrape PostgreSQL metrics from {{ $labels.instance }}."
- alert: postgres_high_connections
expr: pg_stat_activity_count / pg_settings_max_connections > 0.8
for: 5m
labels:
severity: warning
annotations:
summary: "PostgreSQL connection pool near exhaustion on {{ $labels.instance }}"
description: "PostgreSQL is using over 80% of max_connections on {{ $labels.instance }}."
- name: jellyfin_rules - name: jellyfin_rules
rules: rules:
- alert: jellyfin_down - alert: jellyfin_down

View File

@@ -1,6 +0,0 @@
{ ... }:
{
imports = [
./postgres.nix
];
}

View File

@@ -1,23 +0,0 @@
{ pkgs, ... }:
{
homelab.monitoring.scrapeTargets = [{
job_name = "postgres";
port = 9187;
}];
services.prometheus.exporters.postgres = {
enable = true;
runAsLocalSuperUser = true; # Use peer auth as postgres user
};
services.postgresql = {
enable = true;
enableJIT = true;
enableTCPIP = true;
extensions = ps: with ps; [ pgvector ];
authentication = ''
# Allow access to everything from gunter
host all all 10.69.30.105/32 scram-sha-256
'';
};
}

View File

@@ -66,12 +66,6 @@ locals {
] ]
} }
"pgdb1" = {
paths = [
"secret/data/hosts/pgdb1/*",
]
}
# Wave 3: DNS servers # Wave 3: DNS servers
"ns1" = { "ns1" = {
paths = [ paths = [