docs: update migration plan for monitoring01 and pgdb1 completion

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 22:26:23 +01:00
parent 65acf13e6f
commit b218b4f8bc


@@ -20,9 +20,9 @@ Hosts to migrate:
 | http-proxy | Stateless | Reverse proxy, recreate |
 | nats1 | Stateless | Messaging, recreate |
 | ha1 | Stateful | Home Assistant + Zigbee2MQTT + Mosquitto |
-| monitoring01 | Stateful | Prometheus, Grafana, Loki |
+| ~~monitoring01~~ | ~~Decommission~~ | ✓ Complete — replaced by monitoring02 (VictoriaMetrics) |
 | jelly01 | Stateful | Jellyfin metadata, watch history, config |
-| pgdb1 | Decommission | Only used by Open WebUI on gunter, migrating to local postgres |
+| ~~pgdb1~~ | ~~Decommission~~ | ✓ Complete |
 | ~~jump~~ | ~~Decommission~~ | ✓ Complete |
 | ~~auth01~~ | ~~Decommission~~ | ✓ Complete |
 | ~~ca~~ | ~~Deferred~~ | ✓ Complete |
@@ -31,10 +31,12 @@ Hosts to migrate:
 Before migrating any stateful host, ensure restic backups are in place and verified.
-### 1a. Expand monitoring01 Grafana Backup
+### ~~1a. Expand monitoring01 Grafana Backup~~ ✓ N/A
-The existing backup only covers `/var/lib/grafana/plugins` and a sqlite dump of `grafana.db`.
-Expand to back up all of `/var/lib/grafana/` to capture config directory and any other state.
+~~The existing backup only covers `/var/lib/grafana/plugins` and a sqlite dump of `grafana.db`.
+Expand to back up all of `/var/lib/grafana/` to capture config directory and any other state.~~
+No longer needed — monitoring01 decommissioned, replaced by monitoring02 with declarative Grafana dashboards.
 ### 1b. Add Jellyfin Backup to jelly01
@@ -94,15 +96,17 @@ For each stateful host, the procedure is:
 7. Start services and verify functionality
 8. Decommission the old VM
-### 3a. monitoring01
+### 3a. monitoring01 ✓ COMPLETE
-1. Run final Grafana backup
-2. Provision new monitoring01 via OpenTofu
-3. After bootstrap, restore `/var/lib/grafana/` from restic
-4. Restart Grafana, verify dashboards and datasources are intact
-5. Prometheus and Loki start fresh with empty data (acceptable)
-6. Verify all scrape targets are being collected
-7. Decommission old VM
+~~1. Run final Grafana backup~~
+~~2. Provision new monitoring01 via OpenTofu~~
+~~3. After bootstrap, restore `/var/lib/grafana/` from restic~~
+~~4. Restart Grafana, verify dashboards and datasources are intact~~
+~~5. Prometheus and Loki start fresh with empty data (acceptable)~~
+~~6. Verify all scrape targets are being collected~~
+~~7. Decommission old VM~~
+Replaced by monitoring02 with VictoriaMetrics, standalone Loki and Grafana modules. Host configuration, old service modules, and terraform resources removed.
 ### 3b. jelly01
@@ -163,19 +167,19 @@ Host was already removed from flake.nix and VM destroyed. Configuration cleaned
 Host configuration, services, and VM already removed.
-### pgdb1 (in progress)
+### pgdb1 ✓ COMPLETE
-Only consumer was Open WebUI on gunter, which has been migrated to use local PostgreSQL.
+~~Only consumer was Open WebUI on gunter, which has been migrated to use local PostgreSQL.~~
-1. ~~Verify Open WebUI on gunter is using local PostgreSQL (not pgdb1)~~
-2. ~~Remove host configuration from `hosts/pgdb1/`~~
-3. ~~Remove `services/postgres/` (only used by pgdb1)~~
-4. ~~Remove from `flake.nix`~~
-5. ~~Remove Vault AppRole from `terraform/vault/approle.tf`~~
-6. Destroy the VM in Proxmox
-7. ~~Commit cleanup~~
+~~1. Verify Open WebUI on gunter is using local PostgreSQL (not pgdb1)~~
+~~2. Remove host configuration from `hosts/pgdb1/`~~
+~~3. Remove `services/postgres/` (only used by pgdb1)~~
+~~4. Remove from `flake.nix`~~
+~~5. Remove Vault AppRole from `terraform/vault/approle.tf`~~
+~~6. Destroy the VM in Proxmox~~
+~~7. Commit cleanup~~
-See `docs/plans/pgdb1-decommission.md` for detailed plan.
+Host configuration, services, terraform resources, and VM removed. See `docs/plans/pgdb1-decommission.md` for detailed plan.
 ## Phase 5: Decommission ca Host ✓ COMPLETE