docs: add new service candidates and NixOS router plans
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
145
docs/plans/new-services.md
Normal file
@@ -0,0 +1,145 @@
# New Service Candidates

Ideas for additional services to deploy in the homelab. These lean more enterprise/obscure
than the typical self-hosted fare.

## Litestream

Continuous SQLite replication to S3-compatible storage. Streams WAL changes in near-real-time,
providing point-in-time recovery without scheduled backup jobs.

**Why:** Several services use SQLite (Home Assistant, potentially others). Litestream would
give continuous backup to Garage S3 with minimal resource overhead and near-zero configuration.
Replaces cron-based backup scripts with a small daemon per database.

**Integration points:**
- Garage S3 as replication target (already deployed)
- Home Assistant SQLite database is the primary candidate
- Could also cover any future SQLite-backed services

**Complexity:** Low. Single Go binary, minimal config (source DB path + S3 endpoint).
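
A minimal sketch of what the "source DB path + S3 endpoint" config could look like through the
NixOS `services.litestream` module. The database path, bucket name, endpoint URL, region, and
secrets file path are all assumptions for illustration, and the replica keys should be checked
against the Litestream version actually packaged before use.

```nix
{
  services.litestream = {
    enable = true;

    # Assumed path: an env file providing LITESTREAM_ACCESS_KEY_ID and
    # LITESTREAM_SECRET_ACCESS_KEY for the Garage bucket.
    environmentFile = "/run/secrets/litestream.env";

    settings.dbs = [
      {
        # Assumed location of the Home Assistant database.
        path = "/var/lib/hass/home-assistant_v2.db";
        replicas = [
          {
            type = "s3";
            bucket = "litestream";                       # hypothetical bucket
            path = "home-assistant";                     # key prefix inside the bucket
            endpoint = "https://s3.example.internal";    # hypothetical Garage endpoint
            region = "garage";                           # assumed Garage region name
            force-path-style = true;
          }
        ];
      }
    ];
  };
}
```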

**NixOS packaging:** Available in nixpkgs as `litestream`.

---

## ntopng

Deep network traffic analysis and flow monitoring. Provides real-time visibility into bandwidth
usage, protocol distribution, top talkers, and anomaly detection via a web UI.

**Why:** We have host-level metrics (node-exporter) and logs (Loki) but no network-level
visibility. ntopng would show traffic patterns across the infrastructure: NFS throughput to
the NAS, DNS query volume, inter-host traffic, and bandwidth anomalies. Useful for capacity
planning and debugging network issues.

**Integration points:**
- Could export metrics to Prometheus via its built-in exporter
- Web UI behind http-proxy with Kanidm OIDC (if supported) or Pomerium
- NetFlow/sFlow from managed switches (if available)
- Passive traffic capture on a mirror port or the monitoring host itself

**Complexity:** Medium. Needs a network tap or mirror port for full visibility, or can run
in host-local mode. May need a dedicated interface or VLAN mirror.
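
As a starting point, host-local mode could be sketched with the NixOS `services.ntopng` module
as below. The interface name is a placeholder and the port is only an example value. A mirror
port setup would presumably list the dedicated capture interface instead.

```nix
{
  services.ntopng = {
    enable = true;

    # Placeholder interface name: capture only this host's own traffic.
    interfaces = [ "eth0" ];

    # Web UI port, to be fronted by http-proxy or Pomerium.
    httpPort = 3000;
  };
}
```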

**NixOS packaging:** Available in nixpkgs as `ntopng`.

---

## Renovate

Automated dependency update bot that understands Nix flakes natively. Creates branches/PRs
to bump flake inputs on a configurable schedule.

**Why:** Currently `nix flake update` is manual. Renovate can automatically propose updates
to individual flake inputs (nixpkgs, homelab-deploy, nixos-exporter, etc.), group related
updates, and respect schedules. More granular than updating everything at once: it can bump
nixpkgs weekly but hold back other inputs, auto-merge patch-level changes, etc.

**Integration points:**
- Runs against git.t-juice.club repositories
- Understands `flake.lock` format natively
- Could target both `nixos-servers` and `nixos` repos
- Update branches would be validated by homelab-deploy builder

**Complexity:** Medium. Needs git forge integration (Gitea/Forgejo API). Self-hosted runner
mode available. Configuration via `renovate.json` in each repo.
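
A rough sketch of the self-hosted runner using the NixOS `services.renovate` module. The
repository owner (`example-owner`), the token path, and the weekly schedule are assumptions.
Whether the endpoint needs an explicit API path should be checked against the Renovate
documentation for the Gitea platform. Per-repo behaviour (grouping, auto-merge) would still
live in each repo's `renovate.json`.

```nix
{
  services.renovate = {
    enable = true;

    # systemd calendar expression; "weekly" is only an example cadence.
    schedule = "weekly";

    # Assumed path to a file holding the forge API token.
    credentials.RENOVATE_TOKEN = "/run/secrets/renovate-token";

    settings = {
      platform = "gitea";
      endpoint = "https://git.t-juice.club";
      # Hypothetical owner; the repo names come from the plan above.
      repositories = [
        "example-owner/nixos-servers"
        "example-owner/nixos"
      ];
    };
  };
}
```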

**NixOS packaging:** Available in nixpkgs as `renovate`.

---

## Pomerium

Identity-aware reverse proxy implementing zero-trust access. Every request is authenticated
and authorized based on identity, device, and context, not just network location.

**Why:** Currently Caddy terminates TLS but doesn't enforce authentication on most services.
Pomerium would put Kanidm OIDC authentication in front of every internal service, with
per-route authorization policies (e.g., "only admins can access Prometheus," "require re-auth
for Vault UI"). Directly addresses the security hardening plan's goals.

**Integration points:**
- Kanidm as OIDC identity provider (already deployed)
- Could replace or sit in front of Caddy for internal services
- Per-route policies based on Kanidm groups (admins, users, ssh-users)
- Centralizes access logging and audit trail

**Complexity:** Medium-high. Needs careful integration with existing Caddy reverse proxy.
Decision needed on whether Pomerium replaces Caddy or works alongside it (Pomerium for
auth, Caddy for TLS termination and routing, or Pomerium handles everything).
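
To make the per-route policy idea concrete, here is an incomplete sketch of the "only admins
can access Prometheus" case using the NixOS `services.pomerium` module. All hostnames, the
client ID, the secrets file path, and the upstream address are hypothetical. The exact value
of the groups claim that Kanidm emits is an assumption and would need checking. TLS
certificates and the remaining required Pomerium settings are omitted.

```nix
{
  services.pomerium = {
    enable = true;

    # Assumed env file carrying secrets such as IDP_CLIENT_SECRET,
    # SHARED_SECRET and COOKIE_SECRET.
    secretsFile = "/run/secrets/pomerium.env";

    settings = {
      authenticate_service_url = "https://auth.example.internal";

      # Kanidm as a generic OIDC provider; issuer URL and client ID are hypothetical.
      idp_provider = "oidc";
      idp_provider_url = "https://idm.example.internal/oauth2/openid/pomerium";
      idp_client_id = "pomerium";

      routes = [
        {
          from = "https://prometheus.example.internal";
          to = "http://127.0.0.1:9090";
          # Allow only members of the admins group (claim value is an assumption).
          policy = [
            { allow."or" = [ { "claim/groups" = "admins"; } ]; }
          ];
        }
      ];
    };
  };
}
```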

**NixOS packaging:** Available in nixpkgs as `pomerium`.

---

## Apache Guacamole

Clientless remote desktop and SSH gateway. Provides browser-based access to hosts via
RDP, VNC, SSH, and Telnet with no client software required. Supports session recording
and playback.

**Why:** Provides an alternative remote access path that doesn't require VPN software or
SSH keys on the client device. Useful for accessing hosts from untrusted machines (phone,
borrowed laptop) or providing temporary access to others. Session recording gives an audit
trail. Could complement the WireGuard remote access plan rather than replace it.

**Integration points:**
- Kanidm for authentication (OIDC or LDAP)
- Behind http-proxy or Pomerium for TLS
- SSH access to all hosts in the fleet
- Session recordings could be stored on Garage S3
- Could serve as the "emergency access" path when VPN is unavailable

**Complexity:** Medium. Two components: guacd (a native proxy daemon written in C) and a
Java web application. Typically needs PostgreSQL for connection/user storage (already
available). Docker is the common deployment method but native packaging exists.
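
A minimal sketch of the native (non-Docker) route, assuming the NixOS `services.guacamole-server`
and `services.guacamole-client` modules. It only wires the web application to a local guacd.
The PostgreSQL backend and Kanidm authentication need additional extensions and settings that
are not shown here.

```nix
{
  # guacd: the proxy daemon that speaks RDP/VNC/SSH to the target hosts.
  services.guacamole-server = {
    enable = true;
    host = "127.0.0.1";
    port = 4822;
  };

  # Web application; these settings end up in guacamole.properties.
  services.guacamole-client = {
    enable = true;
    settings = {
      guacd-hostname = "127.0.0.1";
      guacd-port = 4822;
    };
  };
}
```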

**NixOS packaging:** Available in nixpkgs as `guacamole-server` and `guacamole-client`.

---

## CrowdSec

Collaborative intrusion prevention system with crowd-sourced threat intelligence.
Parses logs to detect attack patterns, applies remediation (firewall bans, CAPTCHA),
and shares/receives threat signals from a global community network.

**Why:** Goes beyond fail2ban with behavioral detection, crowd-sourced IP reputation,
and a scenario-based engine. Fits the security hardening plan. The community blocklist
means we benefit from threat intelligence gathered across thousands of deployments.
Could parse SSH logs, HTTP access logs, and other service logs to detect and block
malicious activity.

**Integration points:**
- Could consume logs from Loki or directly from journald/log files
- Firewall bouncer for iptables/nftables remediation
- Caddy bouncer for HTTP-level blocking
- Prometheus metrics exporter for alert integration
- Scenarios available for SSH brute force, HTTP scanning, and more
- Feeds into existing alerting pipeline (Alertmanager -> alerttonotify)

**Complexity:** Medium. Agent (log parser + decision engine) on each host or centralized.
Bouncers (enforcement) on edge hosts. Free community tier includes threat intel access.

**NixOS packaging:** Available in nixpkgs as `crowdsec`.