TODO: Automated Host Deployment Pipeline
Vision
Automate the entire process of creating, configuring, and deploying new NixOS hosts on Proxmox from a single command or script.
Desired workflow:
./scripts/create-host.sh --hostname myhost --ip 10.69.13.50
# Script creates config, deploys VM, bootstraps NixOS, and you're ready to go
Current manual workflow (from CLAUDE.md):
- Create `/hosts/<hostname>/` directory structure
- Add host to `flake.nix`
- Add DNS entries
- Clone template VM manually
- Run `prepare-host.sh` on new VM
- Add generated age key to `.sops.yaml`
- Configure networking
- Commit and push
- Run `nixos-rebuild boot --flake URL#<hostname>` on host
The Plan
Phase 1: Parameterized OpenTofu Deployments ✅ COMPLETED
Status: Fully implemented and tested
Implementation:
- Locals-based structure using `for_each` pattern for multiple VM deployments
- All VM parameters configurable with smart defaults (CPU, memory, disk, IP, storage, etc.)
- Automatic DHCP vs static IP detection based on `ip` field presence
- Dynamic outputs showing deployed VM IPs and specifications
- Successfully tested deploying multiple VMs simultaneously
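The locals-based `for_each` pattern can be sketched roughly as follows. This is an illustrative HCL fragment, not the real `terraform/vms.tf`: the `proxmox_vm_qemu.vm` resource name and the `ip`-field detection come from this document, but variable names and provider attributes are assumptions.

```hcl
locals {
  vms = {
    "test01" = { ip = "10.69.13.50/24", cores = 4, memory = 4096 }
    "test02" = {} # no `ip` key -> falls back to DHCP
  }
}

resource "proxmox_vm_qemu" "vm" {
  for_each = local.vms

  name   = each.key
  cores  = lookup(each.value, "cores", var.default_cores)
  memory = lookup(each.value, "memory", var.default_memory)

  # Automatic DHCP vs static IP detection based on `ip` field presence
  ipconfig0 = contains(keys(each.value), "ip") ? "ip=${each.value.ip},gw=${var.gateway}" : "ip=dhcp"
}
```

Adding a VM is then a one-line change to the `local.vms` map, and `tofu apply` reconciles the rest.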
Tasks:
- Create module/template structure in terraform for repeatable VM deployments
- Parameterize VM configuration (hostname, CPU, memory, disk, IP)
- Support both DHCP and static IP configuration via cloud-init
- Test deploying multiple VMs from same template
Deliverable: ✅ Can deploy multiple VMs with custom parameters via OpenTofu in a single `tofu apply`
Files:
- `terraform/vms.tf` - VM definitions using locals map
- `terraform/outputs.tf` - Dynamic outputs for all VMs
- `terraform/variables.tf` - Configurable defaults
- `terraform/README.md` - Complete documentation
Phase 2: Host Configuration Generator ✅ COMPLETED
Status: ✅ Fully implemented and tested
Completed: 2025-02-01
Enhanced: 2025-02-01 (added `--force` flag)
Goal: Automate creation of host configuration files
Implementation:
- Python CLI tool packaged as Nix derivation
- Available as `create-host` command in devShell
- Rich terminal UI with configuration previews
- Comprehensive validation (hostname format/uniqueness, IP subnet/uniqueness)
- Jinja2 templates for NixOS configurations
- Automatic updates to flake.nix and terraform/vms.tf
- `--force` flag for regenerating existing configurations (useful for testing)
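The validation step amounts to a few checks like the sketch below. Function names, the subnet argument, and the error-list interface are illustrative assumptions; the real checks live in the `create-host` package.

```python
import ipaddress
import re

# RFC 1123-style label: alphanumerics and hyphens, no leading/trailing hyphen.
HOSTNAME_RE = re.compile(r"^[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?$")

def validate_hostname(hostname: str, existing: set[str]) -> list[str]:
    """Return a list of validation errors (empty list means valid)."""
    errors = []
    if not HOSTNAME_RE.match(hostname):
        errors.append(f"invalid hostname: {hostname!r}")
    if hostname in existing:
        errors.append(f"hostname already in use: {hostname!r}")
    return errors

def validate_ip(cidr: str, subnet: str, used: set[str]) -> list[str]:
    """Validate an 'address/prefix' string against the homelab subnet."""
    try:
        iface = ipaddress.ip_interface(cidr)
    except ValueError:
        return [f"invalid CIDR: {cidr!r}"]
    errors = []
    if iface.ip not in ipaddress.ip_network(subnet):
        errors.append(f"{iface.ip} not in subnet {subnet}")
    if str(iface.ip) in used:
        errors.append(f"IP already in use: {iface.ip}")
    return errors
```

With `--force`, the uniqueness checks would simply be skipped while the format checks still run.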
Tasks:
- Create Python CLI with typer framework
  - Takes parameters: hostname, IP, CPU cores, memory, disk size
  - Generates `/hosts/<hostname>/` directory structure
  - Creates `configuration.nix` with proper hostname and networking
  - Generates `default.nix` with standard imports
  - References shared `hardware-configuration.nix` from template
- Add host entry to `flake.nix` programmatically
  - Text-based manipulation (regex insertion)
  - Inserts new nixosConfiguration entry
  - Maintains proper formatting
- Generate corresponding OpenTofu configuration
  - Adds VM definition to `terraform/vms.tf`
  - Uses parameters from CLI input
  - Supports both static IP and DHCP modes
- Package as Nix derivation with templates
- Add to flake packages and devShell
- Implement dry-run mode
- Write comprehensive README
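The `flake.nix` manipulation is plain text insertion; a minimal sketch of the idea follows. The anchor-comment approach, the `mkSystem` helper, and the exact entry template are illustrative assumptions — `scripts/create-host/manipulators.py` holds the real regex-based logic.

```python
# Hypothetical entry template; the real generated entry will look different.
HOST_TEMPLATE = """      {hostname} = mkSystem {{
        hostname = "{hostname}";
      }};
"""

def insert_host(flake_text: str, hostname: str) -> str:
    """Insert a nixosConfiguration entry above a known anchor comment."""
    anchor = "      # create-host: insert above"
    if f"{hostname} = " in flake_text:
        raise ValueError(f"{hostname} already present in flake.nix")
    entry = HOST_TEMPLATE.format(hostname=hostname)
    # Insert before the anchor so entries stay in insertion order.
    return flake_text.replace(anchor, entry + anchor, 1)
```

The duplicate check is what `--force` would bypass, updating the existing entry in place instead of raising.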
Usage:
# In nix develop shell
create-host \
--hostname test01 \
--ip 10.69.13.50/24 \ # optional, omit for DHCP
--cpu 4 \ # optional, default 2
--memory 4096 \ # optional, default 2048
--disk 50G \ # optional, default 20G
--dry-run # optional preview mode
Files:
- `scripts/create-host/` - Complete Python package with Nix derivation
- `scripts/create-host/README.md` - Full documentation and examples
Deliverable: ✅ Tool generates all config files for a new host, validated with Nix and Terraform
Phase 3: Bootstrap Mechanism ✅ COMPLETED
Status: ✅ Fully implemented and tested
Completed: 2025-02-01
Enhanced: 2025-02-01 (added branch support for testing)
Goal: Get freshly deployed VM to apply its specific host configuration
Implementation: Systemd oneshot service that runs on first boot after cloud-init
Approach taken: Systemd service (variant of Option A)
- Systemd service `nixos-bootstrap.service` runs on first boot
- Depends on `cloud-config.service` to ensure hostname is set
- Reads hostname from `hostnamectl` (set by cloud-init via Terraform)
- Supports custom git branch via `NIXOS_FLAKE_BRANCH` environment variable
- Runs `nixos-rebuild boot --flake git+https://git.t-juice.club/torjus/nixos-servers.git?ref=$BRANCH#${hostname}`
- Reboots into new configuration on success
- Fails gracefully without reboot on errors (network issues, missing config)
- Service self-destructs after successful bootstrap (not in new config)
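In outline, the service looks something like the fragment below. This is a simplified sketch — `hosts/template2/bootstrap.nix` is the authority, and the self-destruct step and error handling are elided here.

```nix
{ pkgs, ... }: {
  systemd.services.nixos-bootstrap = {
    description = "Apply host-specific flake configuration on first boot";
    wantedBy = [ "multi-user.target" ];
    after = [ "network-online.target" "cloud-config.service" ];
    wants = [ "network-online.target" ];
    path = [ pkgs.curl pkgs.nixos-rebuild ];
    serviceConfig.Type = "oneshot";
    script = ''
      hostname="$(hostnamectl hostname)"
      branch="''${NIXOS_FLAKE_BRANCH:-master}"
      # Check connectivity first; failing here leaves the VM up for debugging
      curl -fsS https://git.t-juice.club >/dev/null
      nixos-rebuild boot --flake "git+https://git.t-juice.club/torjus/nixos-servers.git?ref=''${branch}#''${hostname}"
      reboot
    '';
  };
}
```

Because the unit only reboots after a successful `nixos-rebuild boot`, any failure leaves the template system running and reachable over SSH.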
Tasks:
- Create bootstrap service module in template2
  - systemd oneshot service with proper dependencies
  - Reads hostname from hostnamectl (cloud-init sets it)
  - Checks network connectivity via HTTPS (curl)
  - Runs nixos-rebuild boot with flake URL
  - Reboots on success, fails gracefully on error
- Configure cloud-init datasource
  - Use ConfigDrive datasource (Proxmox provider)
  - Add cloud-init disk to Terraform VMs (`disks.ide.ide2.cloudinit`)
  - Hostname passed via cloud-init user-data from Terraform
- Test bootstrap service execution on fresh VM
- Handle failure cases (flake doesn't exist, network issues)
  - Clear error messages in journald
  - No reboot on failure
  - System remains accessible for debugging
Files:
- `hosts/template2/bootstrap.nix` - Bootstrap service definition
- `hosts/template2/configuration.nix` - Cloud-init ConfigDrive datasource
- `terraform/vms.tf` - Cloud-init disk configuration
Deliverable: ✅ VMs automatically bootstrap and reboot into host-specific configuration on first boot
Phase 4: Secrets Management with OpenBao (Vault)
Status: 🚧 Phases 4a, 4b & 4d Complete, 4c In Progress
Challenge: Current sops-nix approach has chicken-and-egg problem with age keys
Current workflow:
- VM boots, generates age key at `/var/lib/sops-nix/key.txt`
- User runs `prepare-host.sh`, which prints the public key
- User manually adds public key to `.sops.yaml`
- User commits, pushes
- VM can now decrypt secrets
Selected approach: Migrate to OpenBao (Vault fork) for centralized secrets management
Why OpenBao instead of HashiCorp Vault:
- HashiCorp Vault switched to BSL (Business Source License), unavailable in NixOS cache
- OpenBao is the community fork maintaining the pre-BSL MPL 2.0 license
- API-compatible with Vault, uses same Terraform provider
- Maintains all Vault features we need
Benefits:
- Industry-standard secrets management (Vault-compatible experience)
- Eliminates manual age key distribution step
- Secrets-as-code via OpenTofu (infrastructure-as-code aligned)
- Centralized PKI management with ACME support (ready to replace step-ca)
- Automatic secret rotation capabilities
- Audit logging for all secret access (not yet enabled)
- AppRole authentication enables automated bootstrap
Current Architecture:
vault01.home.2rjus.net (10.69.13.19)
├─ KV Secrets Engine (ready to replace sops-nix)
│ ├─ secret/hosts/{hostname}/*
│ ├─ secret/services/{service}/*
│ └─ secret/shared/{category}/*
├─ PKI Engine (ready to replace step-ca for TLS)
│ ├─ Root CA (EC P-384, 10 year)
│ ├─ Intermediate CA (EC P-384, 5 year)
│ └─ ACME endpoint enabled
├─ SSH CA Engine (TODO: Phase 4c)
└─ AppRole Auth (per-host authentication configured)
↓
[✅ Phase 4d] New hosts authenticate on first boot
[✅ Phase 4d] Fetch secrets via Vault API
No manual key distribution needed
Completed:
- ✅ Phase 4a: OpenBao server with TPM2 auto-unseal
- ✅ Phase 4b: Infrastructure-as-code (secrets, policies, AppRoles, PKI)
- ✅ Phase 4d: Bootstrap integration for automated secrets access
Next Steps:
- Phase 4c: Migrate from step-ca to OpenBao PKI
Phase 4a: Vault Server Setup ✅ COMPLETED
Status: ✅ Fully implemented and tested
Completed: 2026-02-02
Goal: Deploy and configure Vault server with auto-unseal
Implementation:
- Used OpenBao (Vault fork) instead of HashiCorp Vault due to BSL licensing concerns
- TPM2-based auto-unseal using systemd's native `LoadCredentialEncrypted`
- Self-signed bootstrap TLS certificates (avoiding circular dependency with step-ca)
- File-based storage backend at `/var/lib/openbao`
- Unix socket + TCP listener (0.0.0.0:8200) configuration
Tasks:
- Create `hosts/vault01/` configuration
  - Basic NixOS configuration (hostname: vault01, IP: 10.69.13.19/24)
  - Created reusable `services/vault` module
  - Firewall not needed (trusted network)
  - Already in flake.nix, deployed via terraform
- Implement auto-unseal mechanism
  - TPM2-based auto-unseal (preferred option)
  - systemd `LoadCredentialEncrypted` with TPM2 binding
  - `writeShellApplication` script with proper runtime dependencies
  - Reads multiple unseal keys (one per line) until unsealed
  - Auto-unseals on service start via `ExecStartPost`
- Initial Vault setup
  - Initialized OpenBao with Shamir secret sharing (5 keys, threshold 3)
  - File storage backend
  - Self-signed TLS certificates via LoadCredential
- Deploy to infrastructure
  - DNS entry added for vault01.home.2rjus.net
  - VM deployed via terraform
  - Verified OpenBao running and auto-unsealing
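The "feed keys until unsealed" loop amounts to the logic below. This is a sketch against the Vault-compatible `/v1/sys/unseal` API; the real implementation is a `writeShellApplication` script, and `submit_key` here is an injected stand-in for the HTTP call.

```python
def unseal(submit_key, keys):
    """Feed unseal key shares one at a time until the server reports sealed=False.

    `submit_key` posts one share to /v1/sys/unseal and returns the JSON
    status; with Shamir 5 keys / threshold 3, only 3 shares are consumed.
    """
    for key in keys:
        status = submit_key(key)
        if not status.get("sealed", True):
            return True
    return False  # ran out of shares without reaching the threshold
```

Running this from `ExecStartPost` means a plain `systemctl start` brings OpenBao all the way to an unsealed state.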
Changes from Original Plan:
- Used OpenBao instead of HashiCorp Vault (licensing)
- Used systemd's native TPM2 support instead of tpm2-tools directly
- Skipped audit logging (can be enabled later)
- Used self-signed certs initially (will migrate to OpenBao PKI later)
Deliverable: ✅ Running OpenBao server that auto-unseals on boot using TPM2
Documentation:
- `/services/vault/README.md` - Service module overview
- `/docs/vault/auto-unseal.md` - Complete TPM2 auto-unseal setup guide
Phase 4b: Vault-as-Code with OpenTofu ✅ COMPLETED
Status: ✅ Fully implemented and tested
Completed: 2026-02-02
Goal: Manage all Vault configuration (secrets structure, policies, roles) as code
Implementation:
- Complete Terraform/OpenTofu configuration in `terraform/vault/`
- Locals-based pattern (similar to `vms.tf`) for declaring secrets and policies
- Auto-generation of secrets using `random_password` provider
- Three-tier secrets path hierarchy: `hosts/`, `services/`, `shared/`
- PKI infrastructure with Elliptic Curve certificates (P-384 for CAs, P-256 for leaf certs)
- ACME support enabled on intermediate CA
Tasks:
- Set up Vault Terraform provider
  - Created `terraform/vault/` directory
  - Configured Vault provider (uses HashiCorp provider, compatible with OpenBao)
  - Credentials in terraform.tfvars (gitignored)
  - terraform.tfvars.example for reference
- Enable and configure secrets engines
  - KV v2 engine at `secret/`
  - Three-tier path structure:
    - `secret/hosts/{hostname}/*` - Host-specific secrets
    - `secret/services/{service}/*` - Service-wide secrets
    - `secret/shared/{category}/*` - Shared secrets (SMTP, backups, etc.)
- Define policies as code
  - Policies auto-generated from `locals.host_policies`
  - Per-host policies with read/list on designated paths
  - Principle of least privilege enforced
- Set up AppRole authentication
  - AppRole backend enabled at `approle/`
  - Roles auto-generated per host from `locals.host_policies`
  - Token TTL: 1 hour, max 24 hours
  - Policies bound to roles
- Implement secrets-as-code patterns
  - Auto-generated secrets using `random_password` provider
  - Manual secrets supported via variables in terraform.tfvars
  - Secret structure versioned in .tf files
  - Secret values excluded from git
- Set up PKI infrastructure
  - Root CA (10 year TTL, EC P-384)
  - Intermediate CA (5 year TTL, EC P-384)
  - PKI role for `*.home.2rjus.net` (30 day max TTL, EC P-256)
  - ACME enabled on intermediate CA
  - Support for static certificate issuance via Terraform
  - CRL, OCSP, and issuing certificate URLs configured
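The locals-driven policy and AppRole generation looks roughly like this. The hostnames, paths, and policy naming scheme are illustrative; `terraform/vault/approle.tf` and `secrets.tf` are the authority.

```hcl
locals {
  # hostname -> KV paths that host may read (entries here are examples)
  host_policies = {
    "ns1"     = ["secret/data/hosts/ns1/*", "secret/data/shared/dns/*"]
    "vault01" = ["secret/data/hosts/vault01/*"]
  }
}

resource "vault_policy" "host" {
  for_each = local.host_policies
  name     = "host-${each.key}"
  policy = join("\n", [
    for p in each.value : "path \"${p}\" { capabilities = [\"read\", \"list\"] }"
  ])
}

resource "vault_approle_auth_backend_role" "host" {
  for_each       = local.host_policies
  backend        = "approle"
  role_name      = each.key
  token_policies = [vault_policy.host[each.key].name]
  token_ttl      = 3600  # 1 hour
  token_max_ttl  = 86400 # max 24 hours
}
```

Adding a host to `local.host_policies` yields its policy and AppRole in one `tofu apply`, which is what lets `create-host` wire up new hosts automatically in Phase 4d.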
Changes from Original Plan:
- Used Elliptic Curve instead of RSA for all certificates (better performance, smaller keys)
- Implemented PKI infrastructure in Phase 4b instead of Phase 4c (more logical grouping)
- ACME support configured immediately (ready for migration from step-ca)
- Did not migrate existing sops-nix secrets yet (deferred to gradual migration)
Files:
- `terraform/vault/main.tf` - Provider configuration
- `terraform/vault/variables.tf` - Variable definitions
- `terraform/vault/approle.tf` - AppRole authentication (locals-based pattern)
- `terraform/vault/pki.tf` - PKI infrastructure with EC certificates
- `terraform/vault/secrets.tf` - KV secrets engine (auto-generation support)
- `terraform/vault/README.md` - Complete documentation and usage examples
- `terraform/vault/terraform.tfvars.example` - Example credentials
Deliverable: ✅ All secrets, policies, AppRoles, and PKI managed as OpenTofu code in terraform/vault/
Documentation:
- `/terraform/vault/README.md` - Comprehensive guide covering:
  - Setup and deployment
  - AppRole usage and host access patterns
  - PKI certificate issuance (ACME, static, manual)
  - Secrets management patterns
  - ACME configuration and troubleshooting
Phase 4c: PKI Migration (Replace step-ca)
Goal: Migrate hosts from step-ca to OpenBao PKI for TLS certificates
Note: PKI infrastructure already set up in Phase 4b (root CA, intermediate CA, ACME support)
Tasks:
- Set up OpenBao PKI engines (completed in Phase 4b)
  - Root CA (`pki/` mount, 10 year TTL, EC P-384)
  - Intermediate CA (`pki_int/` mount, 5 year TTL, EC P-384)
  - Signed intermediate with root CA
  - Configured CRL, OCSP, and issuing certificate URLs
- Enable ACME support (completed in Phase 4b)
  - Enabled ACME on intermediate CA
  - Created PKI role for `*.home.2rjus.net`
  - Set certificate TTLs (30 day max) and allowed domains
  - ACME directory: `https://vault01.home.2rjus.net:8200/v1/pki_int/acme/directory`
- Download and distribute root CA certificate
  - Export root CA: `bao read -field=certificate pki/cert/ca > homelab-root-ca.crt`
  - Add to NixOS trust store on all hosts via `security.pki.certificateFiles`
  - Deploy via auto-upgrade
- Test certificate issuance
  - Issue test certificate using ACME client (lego/certbot)
  - Or issue static certificate via OpenBao CLI
  - Verify certificate chain and trust
- Migrate vault01's own certificate
  - Issue new certificate from OpenBao PKI (self-issued)
  - Replace self-signed bootstrap certificate
  - Update service configuration
- Migrate hosts from step-ca to OpenBao
  - Update `system/acme.nix` to use OpenBao ACME endpoint
  - Change server to `https://vault01.home.2rjus.net:8200/v1/pki_int/acme/directory`
  - Test on one host (non-critical service)
  - Roll out to all hosts via auto-upgrade
- Configure SSH CA in OpenBao (optional, future work)
  - Enable SSH secrets engine (`ssh/` mount)
  - Generate SSH signing keys
  - Create roles for host and user certificates
  - Configure TTLs and allowed principals
  - Distribute SSH CA public key to all hosts
  - Update sshd_config to trust OpenBao CA
- Decommission step-ca
  - Verify all ACME services migrated and working
  - Stop step-ca service on ca host
  - Archive step-ca configuration for backup
  - Update documentation
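The per-host NixOS side of the migration is small; a hedged sketch, assuming the exported CA file sits next to the config and that `system/acme.nix` uses the standard upstream options (its actual layout may differ):

```nix
{
  # Trust the OpenBao root CA everywhere (file exported as in the task above)
  security.pki.certificateFiles = [ ./homelab-root-ca.crt ];

  # Point shared ACME defaults at the OpenBao directory instead of step-ca
  security.acme.defaults.server =
    "https://vault01.home.2rjus.net:8200/v1/pki_int/acme/directory";
}
```

Rolling this out via auto-upgrade means each host starts renewing against OpenBao on its next certificate renewal, with no per-host manual steps.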
Deliverable: All TLS certificates issued by OpenBao PKI, step-ca retired
Phase 4d: Bootstrap Integration ✅ COMPLETED (2026-02-02)
Goal: New hosts automatically authenticate to Vault on first boot, no manual steps
Tasks:
- Update create-host tool
  - Generate wrapped token (24h TTL, single-use) for new host
  - Add host-specific policy to Vault (via terraform/vault/hosts-generated.tf)
  - Store wrapped token in terraform/vms.tf for cloud-init injection
  - Add `--regenerate-token` flag to regenerate only the token without overwriting config
- Update template2 for Vault authentication
  - Reads wrapped token from cloud-init (/run/cloud-init-env)
  - Unwraps token to get role_id + secret_id
  - Stores AppRole credentials in /var/lib/vault/approle/ (persistent)
  - Graceful fallback if Vault unavailable during bootstrap
- Create NixOS Vault secrets module (system/vault-secrets.nix)
  - Runtime secret fetching (services fetch on start, not at nixos-rebuild time)
  - Secrets cached in /var/lib/vault/cache/ for fallback when Vault unreachable
  - Secrets written to /run/secrets/ (tmpfs, cleared on reboot)
  - Fresh authentication per service start (no token renewal needed)
  - Optional periodic rotation with systemd timers
  - Critical service protection (no auto-restart for DNS, CA, Vault itself)
- Create vault-fetch helper script
  - Standalone tool for fetching secrets from Vault
  - Authenticates using AppRole credentials
  - Writes individual files per secret key
  - Handles caching and fallback logic
- Update bootstrap service (hosts/template2/bootstrap.nix)
  - Unwraps Vault token on first boot
  - Stores persistent AppRole credentials
  - Continues with nixos-rebuild
  - Services fetch secrets when they start
- Update terraform cloud-init (terraform/cloud-init.tf)
  - Inject VAULT_ADDR and VAULT_WRAPPED_TOKEN via write_files
  - Write to /run/cloud-init-env (tmpfs, cleaned on reboot)
  - Fixed YAML indentation issues (write_files at top level)
  - Support flake_branch alongside vault credentials
- Test complete flow
  - Created vaulttest01 test host
  - Verified bootstrap with Vault integration
  - Verified service secret fetching
  - Tested cache fallback when Vault unreachable
  - Tested wrapped token single-use (second bootstrap fails as expected)
  - Confirmed zero manual steps required
Implementation Details:
Wrapped Token Security:
- Single-use tokens prevent reuse if leaked
- 24h TTL limits exposure window
- Safe to commit to git (expired/used tokens useless)
- Regenerate with `create-host --hostname X --regenerate-token`
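The unwrap step itself is a single call to the standard Vault-compatible `sys/wrapping/unwrap` endpoint. A sketch of what the bootstrap does (the injectable `opener` parameter is for illustration/testing; the real code is shell in `bootstrap.nix`):

```python
import json
import urllib.request

def unwrap(vault_addr: str, wrapped_token: str, opener=urllib.request.urlopen):
    """Exchange a single-use wrapped token for AppRole credentials.

    POSTs to /v1/sys/wrapping/unwrap using the wrapped token itself as
    X-Vault-Token; a second call with the same token fails by design,
    which is what makes a committed-but-used token harmless.
    """
    req = urllib.request.Request(
        f"{vault_addr}/v1/sys/wrapping/unwrap",
        method="POST",
        headers={"X-Vault-Token": wrapped_token},
    )
    with opener(req) as resp:
        body = json.load(resp)
    data = body["data"]
    return data["role_id"], data["secret_id"]
```

The returned `role_id`/`secret_id` pair is what gets persisted to `/var/lib/vault/approle/` for later service logins.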
Secret Fetching:
- Runtime (not build-time) keeps secrets out of Nix store
- Cache fallback enables service availability when Vault down
- Fresh authentication per service start (no renewal complexity)
- Individual files per secret key for easy consumption
Bootstrap Flow:
1. create-host --hostname myhost --ip 10.69.13.x/24
↓ Generates wrapped token, updates terraform
2. tofu apply (deploys VM with cloud-init)
↓ Cloud-init writes wrapped token to /run/cloud-init-env
3. nixos-bootstrap.service runs:
↓ Unwraps token → gets role_id + secret_id
↓ Stores in /var/lib/vault/approle/ (persistent)
↓ Runs nixos-rebuild boot
4. Service starts → fetches secrets from Vault
↓ Uses stored AppRole credentials
↓ Caches secrets for fallback
5. Done - zero manual intervention
Files Created:
- `scripts/vault-fetch/` - Secret fetching helper (Nix package)
- `system/vault-secrets.nix` - NixOS module for declarative Vault secrets
- `scripts/create-host/vault_helper.py` - Vault API integration
- `terraform/vault/hosts-generated.tf` - Auto-generated host policies
- `docs/vault-bootstrap-implementation.md` - Architecture documentation
- `docs/vault-bootstrap-testing.md` - Testing guide
Configuration:
- Vault address: `https://vault01.home.2rjus.net:8200` (configurable)
- All defaults remain configurable via environment variables or NixOS options
Next Steps:
- Gradually migrate existing services from sops-nix to Vault
- Add CNAME for vault.home.2rjus.net → vault01.home.2rjus.net
- Phase 4c: Migrate from step-ca to OpenBao PKI (future)
Deliverable: ✅ Fully automated secrets access from first boot, zero manual steps
Phase 5: DNS Automation
Goal: Automatically generate DNS entries from host configurations
Approach: Leverage Nix to generate zone file entries from flake host configurations
Since most hosts use static IPs defined in their NixOS configurations, we can extract this information and automatically generate A records. This keeps DNS in sync with the actual host configs.
Tasks:
- Add optional CNAME field to host configurations
  - Add `networking.cnames = [ "alias1" "alias2" ]` or similar option
  - Document in host configuration template
- Create Nix function to extract DNS records from all hosts
  - Parse each host's `networking.hostName` and IP configuration
  - Collect any defined CNAMEs
  - Generate zone file fragment with A and CNAME records
- Integrate auto-generated records into zone files
  - Keep manual entries separate (for non-flake hosts/services)
  - Include generated fragment in main zone file
  - Add comments showing which records are auto-generated
- Update zone file serial number automatically
- Test zone file validity after generation
- Either:
  - Automatically trigger DNS server reload (Ansible)
  - Or document manual step: merge to master, run upgrade on ns1/ns2
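The rendering and serial-bump logic is simple enough to sketch. Shown in Python for clarity (the plan calls for a Nix function); column widths, the comment marker, and the `YYYYMMDDnn` serial convention are assumptions.

```python
import datetime

def zone_fragment(hosts: dict[str, str], cnames: dict[str, str]) -> str:
    """Render A and CNAME records for the auto-generated include file."""
    lines = ["; --- auto-generated from flake host configs, do not edit ---"]
    for name, ip in sorted(hosts.items()):
        lines.append(f"{name:<16} IN A     {ip}")
    for alias, target in sorted(cnames.items()):
        lines.append(f"{alias:<16} IN CNAME {target}")
    return "\n".join(lines) + "\n"

def next_serial(current: int, today: datetime.date) -> int:
    """Bump a YYYYMMDDnn serial, incrementing the counter within one day."""
    base = int(today.strftime("%Y%m%d")) * 100
    return base if current < base else current + 1
```

Keeping the generated records in a separate include file is what lets manual entries for non-flake hosts survive regeneration.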
Deliverable: DNS A records and CNAMEs automatically generated from host configs
Phase 6: Integration Script
Goal: Single command to create and deploy a new host
Tasks:
- Create `scripts/create-host.sh` master script that orchestrates:
  - Prompts for: hostname, IP (or DHCP), CPU, memory, disk
  - Validates inputs (IP not in use, hostname unique, etc.)
  - Calls host config generator (Phase 2)
  - Generates OpenTofu config (Phase 2)
  - Handles secrets (Phase 4)
  - Updates DNS (Phase 5)
  - Commits all changes to git
  - Runs `tofu apply` to deploy VM
  - Waits for bootstrap to complete (Phase 3)
  - Prints success message with IP and SSH command
- Add `--dry-run` flag to preview changes
- Add `--interactive` mode vs `--batch` mode
- Error handling and rollback on failures
Deliverable: `./scripts/create-host.sh --hostname myhost --ip 10.69.13.50` creates a fully working host
Phase 7: Testing & Documentation
Status: 🚧 In Progress (testing improvements completed)
Testing Improvements Implemented (2025-02-01):
The pipeline now supports efficient testing without polluting master branch:
1. `--force` Flag for create-host
- Re-run `create-host` to regenerate existing configurations
- Updates existing entries in flake.nix and terraform/vms.tf (no duplicates)
- Skips uniqueness validation checks
- Useful for iterating on configuration templates during testing
2. Branch Support for Bootstrap
- Bootstrap service reads `NIXOS_FLAKE_BRANCH` environment variable
- Defaults to `master` if not set
- Allows testing pipeline changes on feature branches
- Cloud-init passes branch via `/etc/environment`
3. Cloud-init Disk for Branch Configuration
- Terraform generates custom cloud-init snippets for test VMs
- Set `flake_branch` field in VM definition to use non-master branch
- Production VMs omit this field and use master (default)
- Files automatically uploaded to Proxmox via SSH
Testing Workflow:
# 1. Create test branch
git checkout -b test-pipeline
# 2. Generate or update host config
create-host --hostname testvm01 --ip 10.69.13.100/24
# 3. Edit terraform/vms.tf to add test VM with branch
# vms = {
# "testvm01" = {
# ip = "10.69.13.100/24"
# flake_branch = "test-pipeline" # Bootstrap from this branch
# }
# }
# 4. Commit and push test branch
git add -A && git commit -m "test: add testvm01"
git push origin test-pipeline
# 5. Deploy VM
cd terraform && tofu apply
# 6. Watch bootstrap (VM fetches from test-pipeline branch)
ssh root@10.69.13.100
journalctl -fu nixos-bootstrap.service
# 7. Iterate: modify templates and regenerate with --force
cd .. && create-host --hostname testvm01 --ip 10.69.13.100/24 --force
git commit -am "test: update config" && git push
# Redeploy to test fresh bootstrap
cd terraform
tofu destroy -target=proxmox_vm_qemu.vm[\"testvm01\"] && tofu apply
# 8. Clean up when done: squash commits, merge to master, remove test VM
Files:
- `scripts/create-host/create_host.py` - Added --force parameter
- `scripts/create-host/manipulators.py` - Update vs insert logic
- `hosts/template2/bootstrap.nix` - Branch support via environment variable
- `terraform/vms.tf` - flake_branch field support
- `terraform/cloud-init.tf` - Custom cloud-init disk generation
- `terraform/variables.tf` - proxmox_host variable for SSH uploads
Remaining Tasks:
- Test full pipeline end-to-end on feature branch
- Update CLAUDE.md with testing workflow
- Add troubleshooting section
- Create examples for common scenarios (DHCP host, static IP host, etc.)
Open Questions
- Bootstrap method: ✅ Resolved - systemd oneshot service fed by cloud-init (Phase 3)
- Secrets handling: ✅ Resolved - wrapped-token injection via cloud-init (Phase 4d)
- DNS automation: Auto-commit or manual merge?
- Git workflow: Auto-push changes or leave for user review?
- Template selection: Single template2 or multiple templates for different host types?
- Networking: ✅ Resolved - both static IP and DHCP supported from the start (Phase 1)
- Error recovery: What happens if bootstrap fails? Manual intervention or retry? (Partially addressed: bootstrap fails without reboot and leaves the VM accessible)
Implementation Order
Recommended sequence:
- Phase 1: Parameterize OpenTofu (foundation for testing)
- Phase 3: Bootstrap mechanism (core automation)
- Phase 2: Config generator (automate the boilerplate)
- Phase 4: Secrets (solves biggest chicken-and-egg)
- Phase 5: DNS (nice-to-have automation)
- Phase 6: Integration script (ties it all together)
- Phase 7: Testing & docs
Success Criteria
When complete, creating a new host should:
- Take < 5 minutes of human time
- Require minimal user input (hostname, IP, basic specs)
- Result in a fully configured, secret-enabled, DNS-registered host
- Be reproducible and documented
- Handle common errors gracefully
Notes
- Keep incremental commits at each phase
- Test each phase independently before moving to next
- Maintain backward compatibility with manual workflow
- Document any manual steps that can't be automated