nixos-servers/docs/plans/completed/pgdb1-decommission.md
Torjus Håkestad 5d3d93b280
docs: move completed plans to completed folder
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-13 21:08:17 +01:00

pgdb1 Decommissioning Plan

Overview

Decommission the pgdb1 PostgreSQL server. The only consumer was Open WebUI on gunter, which has been migrated to use a local PostgreSQL instance.

Pre-flight Verification

Before proceeding, verify that gunter is no longer using pgdb1:

  1. Check that Open WebUI on gunter is configured for local PostgreSQL (not 10.69.13.16)
  2. Optionally, check pgdb1 for current connection activity (pg_stat_activity lists only sessions open at the time of the query):
    ssh pgdb1 'sudo -u postgres psql -c "SELECT * FROM pg_stat_activity WHERE datname IS NOT NULL;"'
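
The optional check can be narrowed to sessions coming from gunter. A minimal sketch, assuming gunter connects from 10.69.30.105 (the address listed under Notes) and that the same ssh and sudo access as in the command above is available; the function names are hypothetical and not part of the repository:

```shell
# Build a query that counts sessions originating from one client address.
# pg_stat_activity.client_addr is NULL for local (Unix socket) sessions,
# so only network connections from that address are counted.
activity_query() {
  printf "SELECT count(*) FROM pg_stat_activity WHERE client_addr = '%s';" "$1"
}

# Hypothetical wrapper: send the query to psql on pgdb1 over SSH.
# -A (unaligned) and -t (tuples only) make psql print just the number.
count_sessions_from() {
  activity_query "$1" | ssh pgdb1 'sudo -u postgres psql -At'
}
```

Running `count_sessions_from 10.69.30.105` should print 0 if gunter holds no open connection at that moment. It does not prove that gunter never reconnects later, so the configuration check in step 1 remains the primary verification.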
    

Files to Remove

Host Configuration

  • hosts/pgdb1/default.nix
  • hosts/pgdb1/configuration.nix
  • hosts/pgdb1/hardware-configuration.nix
  • hosts/pgdb1/ (directory)

Service Module

  • services/postgres/postgres.nix
  • services/postgres/default.nix
  • services/postgres/ (directory)

Note: This service module is only used by pgdb1, so it can be removed entirely.
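
The claim that no other host imports the module can be checked before deleting it. A small sketch, assuming it is run from the repository root; `find_references` is a hypothetical helper name:

```shell
# List files under a directory that mention a pattern, skipping .git.
# Prints nothing (and still exits 0) when there are no references.
find_references() {
  grep -rl --exclude-dir=.git -e "$1" "$2" || true
}
```

For example, `find_references 'services/postgres' .` lists every file that mentions the module path. Hits under hosts/pgdb1/ and in this plan are expected; any other hit would need a closer look before removal.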

Flake Entry

Remove from flake.nix (lines 131-138):

pgdb1 = nixpkgs.lib.nixosSystem {
  inherit system;
  specialArgs = {
    inherit inputs self;
  };
  modules = commonModules ++ [
    ./hosts/pgdb1
  ];
};

Vault AppRole

Remove from terraform/vault/approle.tf (lines 69-73):

"pgdb1" = {
  paths = [
    "secret/data/hosts/pgdb1/*",
  ]
}

Monitoring Rules

Remove the postgres_rules group, including its postgres_down alert, from services/monitoring/rules.yml (lines 359-365):

- name: postgres_rules
  rules:
    - alert: postgres_down
      expr: node_systemd_unit_state{instance="pgdb1.home.2rjus.net:9100", name="postgresql.service", state="active"} == 0
      for: 5m
      labels:
        severity: critical

Utility Scripts

Delete rebuild-all.sh entirely (obsolete script).

Execution Steps

Phase 1: Verification

  • Confirm Open WebUI on gunter uses local PostgreSQL
  • Verify no active connections to pgdb1

Phase 2: Code Cleanup

  • Create feature branch: git checkout -b decommission-pgdb1
  • Remove hosts/pgdb1/ directory
  • Remove services/postgres/ directory
  • Remove pgdb1 entry from flake.nix
  • Remove postgres alert from services/monitoring/rules.yml
  • Delete rebuild-all.sh (obsolete)
  • Run nix flake check to verify no broken references
  • Commit changes
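
The file removals in this phase can be sketched as a script. This is an illustration only: the `run` wrapper, the function names, and the `DRY_RUN` variable are hypothetical, and it assumes rebuild-all.sh is tracked by git at the repository root. The edits to flake.nix and services/monitoring/rules.yml are manual and are not automated here.

```shell
# Print each command instead of running it unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# Branch creation and file removals from Phase 2.
remove_pgdb1_files() {
  run git checkout -b decommission-pgdb1
  run git rm -r hosts/pgdb1 services/postgres
  run git rm rebuild-all.sh
}

# Run after the manual edits to flake.nix and rules.yml.
check_flake() {
  run nix flake check
}
```

By default the functions only print what they would do; `DRY_RUN=0 remove_pgdb1_files` executes the commands.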

Phase 3: Terraform Cleanup

  • Remove pgdb1 from terraform/vault/approle.tf
  • Run tofu plan in terraform/vault/ to preview changes
  • Run tofu apply to remove the AppRole
  • Commit terraform changes
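
One way to make the preview and the apply refer to the same set of changes is to save the plan to a file. A sketch, assuming it is run from terraform/vault/; the function names and the plan file name pgdb1.tfplan are hypothetical:

```shell
# Map the exit status of `tofu plan -detailed-exitcode` to a message.
# 0 = no changes, 1 = error, 2 = changes pending.
describe_plan_status() {
  case "$1" in
    0) echo "no changes" ;;
    2) echo "changes pending" ;;
    *) echo "plan failed" ;;
  esac
}

# Write the plan to a file so that a later apply uses exactly this plan.
preview_plan() {
  status=0
  tofu plan -detailed-exitcode -out=pgdb1.tfplan || status=$?
  describe_plan_status "$status"
  return "$status"
}
```

After reviewing the output of `preview_plan` (it should show only the removal of the pgdb1 AppRole resources), `tofu apply pgdb1.tfplan` applies the saved plan.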

Phase 4: Infrastructure Cleanup

  • Shut down pgdb1 VM in Proxmox
  • Delete the VM from Proxmox
  • (Optional) Remove any DNS entries if not auto-generated

Phase 5: Finalize

  • Merge feature branch to master
  • Trigger auto-upgrade on DNS servers (ns1, ns2) to remove DNS entry
  • Move this plan to docs/plans/completed/

Rollback

If issues arise after decommissioning:

  1. The VM can be recreated from the template, and the host configuration restored from git history
  2. Database data can only be restored from a backup; no restic backup was configured for this host (see Notes)
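
Recovering the removed configuration from git history could look like the sketch below. `restore_deleted_path` is a hypothetical helper; it assumes it is run from the repository root and that the path was deleted in an earlier commit.

```shell
# Restore a deleted path from the commit just before it was removed.
restore_deleted_path() {
  path="$1"
  # Most recent commit that touched the path; for a removed path this is
  # the commit that deleted it.
  commit=$(git rev-list -n 1 HEAD -- "$path") || return 1
  [ -n "$commit" ] || { echo "no history for $path" >&2; return 1; }
  # Check the path out of the parent of that commit.
  git checkout "${commit}^" -- "$path"
}
```

For example, `restore_deleted_path hosts/pgdb1` followed by `restore_deleted_path services/postgres` brings both directories back into the working tree and index. The flake.nix entry, the Vault AppRole, and the monitoring rule would still have to be re-added separately.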

Notes

  • pgdb1 IP: 10.69.13.16
  • The postgres service allowed connections from gunter (10.69.30.105)
  • No restic backup was configured for this host