docs: update auth-system-replacement plan with progress

- Mark completed implementation steps
- Document deployed kanidm01 configuration
- Record UID/GID range decision (65,536-69,999)
- Add verified working items (WebUI, LDAP, certs)
- Update next steps and resolved questions

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-08 00:45:03 +01:00
parent 538c2ad097
commit 93dbb45802


@@ -2,7 +2,7 @@
## Overview
Replace the current auth01 setup (LLDAP + Authelia) with a modern, unified authentication solution. The current setup is not in active use, making this a good time to evaluate alternatives.
Deploy a modern, unified authentication solution for the homelab. It provides central user management, SSO for web services, and consistent UID/GID mapping for NAS permissions.
## Goals
@@ -11,66 +11,9 @@ Replace the current auth01 setup (LLDAP + Authelia) with a modern, unified authe
3. **UID/GID consistency** - Proper POSIX attributes for NAS share permissions
4. **OIDC provider** - Single sign-on for homelab web services (Grafana, etc.)
## Options Evaluated
## Solution: Kanidm
### OpenLDAP (raw)
- **NixOS Support:** Good (`services.openldap` with `declarativeContents`)
- **Pros:** Most widely supported, very flexible
- **Cons:** LDIF format is painful, schema management is complex, no built-in OIDC, requires SSSD on each client
- **Verdict:** Doesn't address LDAP complexity concerns
### LLDAP + Authelia (current)
- **NixOS Support:** Both have good modules
- **Pros:** Already configured, lightweight, nice web UIs
- **Cons:** Two services to manage, limited POSIX attribute support in LLDAP, requires SSSD on every client host
- **Verdict:** Workable but has friction for NAS/UID goals
### FreeIPA
- **NixOS Support:** None
- **Pros:** Full enterprise solution (LDAP + Kerberos + DNS + CA)
- **Cons:** Extremely heavy, wants to own DNS, designed for Red Hat ecosystems, massive overkill for homelab
- **Verdict:** Overkill, no NixOS support
### Keycloak
- **NixOS Support:** None
- **Pros:** Good OIDC/SAML, nice UI
- **Cons:** Primarily an identity broker not a user directory, poor POSIX support, heavy (Java)
- **Verdict:** Wrong tool for Linux user management
### Authentik
- **NixOS Support:** None (would need Docker)
- **Pros:** All-in-one with LDAP outpost and OIDC, modern UI
- **Cons:** Heavy stack (Python + PostgreSQL + Redis), LDAP is a separate component
- **Verdict:** Would work but requires Docker and is heavy
### Kanidm
- **NixOS Support:** Excellent - first-class module with PAM/NSS integration
- **Pros:**
- Native PAM/NSS module (no SSSD needed)
- Built-in OIDC provider
- Optional LDAP interface for legacy services
- Declarative provisioning via NixOS (users, groups, OAuth2 clients)
- Modern, written in Rust
- Single service handles everything
- **Cons:** Newer project, smaller community than LDAP
- **Verdict:** Best fit for requirements
### Pocket-ID
- **NixOS Support:** Unknown
- **Pros:** Very lightweight, passkey-first
- **Cons:** No LDAP, no PAM/NSS integration - purely OIDC for web apps
- **Verdict:** Doesn't solve Linux user management goal
## Recommendation: Kanidm
Kanidm is the recommended solution for the following reasons:
Kanidm was chosen for the following reasons:
| Requirement | Kanidm Support |
|-------------|----------------|
@@ -82,42 +25,10 @@ Kanidm is the recommended solution for the following reasons:
| Simplicity | Modern API, LDAP optional |
| NixOS integration | First-class |
### Key NixOS Features
### Configuration Files
**Server configuration:**
```nix
services.kanidm.enableServer = true;
services.kanidm.serverSettings = {
domain = "home.2rjus.net";
origin = "https://auth.home.2rjus.net";
ldapbindaddress = "0.0.0.0:636"; # Optional LDAP interface
};
```
**Declarative user provisioning:**
```nix
services.kanidm.provision.enable = true;
services.kanidm.provision.persons.torjus = {
displayName = "Torjus";
groups = [ "admins" "nas-users" ];
};
```
**Declarative OAuth2 clients:**
```nix
services.kanidm.provision.systems.oauth2.grafana = {
displayName = "Grafana";
originUrl = "https://grafana.home.2rjus.net/login/generic_oauth";
originLanding = "https://grafana.home.2rjus.net";
};
```
**Client host configuration (add to system/):**
```nix
services.kanidm.enableClient = true;
services.kanidm.enablePam = true;
services.kanidm.clientSettings.uri = "https://auth.home.2rjus.net";
```
- **Host configuration:** `hosts/kanidm01/`
- **Service module:** `services/kanidm/default.nix`
## NAS Integration
@@ -148,42 +59,79 @@ This future migration path is a strong argument for Kanidm over LDAP-only soluti
## Implementation Steps
1. **Create Kanidm service module** in `services/kanidm/`
- Server configuration
- TLS via internal ACME
- Vault secrets for admin passwords
1. **Create kanidm01 host and service module**
- Host: `kanidm01.home.2rjus.net` (10.69.13.23, test tier)
- Service module: `services/kanidm/`
- TLS via internal ACME (`auth.home.2rjus.net`)
- Vault integration for idm_admin password
- LDAPS on port 636
2. **Configure declarative provisioning**
- Define initial users and groups
- Set up POSIX attributes (UID/GID ranges)
2. **Configure declarative provisioning**
- Groups: `admins`, `users`, `ssh-users`
- User: `torjus` (member of all groups)
- POSIX attributes enabled (UID/GID range 65,536-69,999)
3. **Add OIDC clients** for homelab services
- Grafana
- Other services as needed
4. **Create client module** in `system/` for PAM/NSS
- Enable on all hosts that need central auth
- Configure trusted CA
5. **Test NAS integration**
3. **Test NAS integration** (in progress)
- ✅ LDAP interface verified working
- Configure TrueNAS LDAP client to connect to Kanidm
- Verify UID/GID mapping works with NFS shares
6. **Migrate auth01**
- Remove LLDAP and Authelia services
- Deploy Kanidm
- Update DNS CNAMEs if needed
4. **Add OIDC clients** for homelab services
- Grafana
- Other services as needed
7. **Documentation**
5. **Create client module** in `system/` for PAM/NSS
- Enable on all hosts that need central auth
- Configure trusted CA
6. **Documentation**
- User management procedures
- Adding new OAuth2 clients
- Troubleshooting PAM/NSS issues
## Open Questions
## Progress
- What UID/GID range should be reserved for Kanidm-managed users?
- Which hosts should have PAM/NSS enabled initially?
- What OAuth2 clients are needed at launch?
### Completed (2026-02-08)
**Kanidm server deployed on kanidm01 (test tier):**
- Host: `kanidm01.home.2rjus.net` (10.69.13.23)
- WebUI: `https://auth.home.2rjus.net`
- LDAPS: port 636
- Valid certificate from internal CA
**Configuration:**
- Kanidm 1.8 with secret provisioning support
- Daily backups at 22:00 (7 versions retained)
- Vault integration for idm_admin password
- Prometheus monitoring scrape target configured
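The backup settings listed above might look roughly like this in the service module. This is a sketch rather than the deployed configuration: it assumes the NixOS module's `online_backup` server settings are used, and the backup path is a placeholder that this plan does not state.

```nix
{
  services.kanidm.serverSettings.online_backup = {
    # Hypothetical location; the real path is not recorded in this plan.
    path = "/var/lib/kanidm/backups";
    # Cron-style schedule: every day at 22:00.
    schedule = "00 22 * * *";
    # Retain the seven most recent backups.
    versions = 7;
  };
}
```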
**Provisioned entities:**
- Groups: `admins`, `users`, `ssh-users`
- User: `torjus` (member of all groups, POSIX enabled with GID 65536)
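A minimal sketch of how these entities could be declared with the NixOS provisioning options, assuming they are managed through `services.kanidm.provision`. The display name is taken from the earlier example in this plan. POSIX attributes are left out because the plan does not say how they were enabled.

```nix
{
  services.kanidm.provision = {
    enable = true;

    # Groups with default settings; membership is declared on the person.
    groups = {
      admins = { };
      users = { };
      ssh-users = { };
    };

    persons.torjus = {
      displayName = "Torjus";
      # Member of all three provisioned groups.
      groups = [ "admins" "users" "ssh-users" ];
    };
  };
}
```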
**Verified working:**
- WebUI login with idm_admin
- LDAP bind and search with POSIX-enabled user
- LDAPS with valid internal CA certificate
### UID/GID Range (Resolved)
**Range: 65,536 - 69,999** (manually allocated)
- Users: 65,536 - 67,999 (2,464 IDs)
- Groups: 68,000 - 69,999 (2,000 IDs)
Rationale:
- Starts at Kanidm's recommended minimum (65,536)
- Well above NixOS system users (typically <1000)
- Avoids Podman/container issues with very high GIDs
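As a quick sanity check on the split, the following standalone Nix expression computes the size of each sub-range and confirms they do not overlap. It is only an illustration; the attribute names are made up for this example and do not correspond to any module options.

```nix
let
  # Inclusive ID ranges from the decision above.
  users = { first = 65536; last = 67999; };
  groups = { first = 68000; last = 69999; };

  # Number of IDs in an inclusive range.
  size = r: r.last - r.first + 1;
in
{
  userIds = size users;     # 2464
  groupIds = size groups;   # 2000
  # True would mean the user range runs into the group range.
  overlap = users.last >= groups.first;  # false
}
```

Saved to a file (for example a hypothetical `ranges.nix`), it can be evaluated with `nix-instantiate --eval --strict ranges.nix`.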
### Next Steps
1. Deploy to monitoring01 to enable Prometheus scraping
2. Configure TrueNAS LDAP client for NAS integration testing
3. Add OAuth2 clients (Grafana first)
4. Create PAM/NSS client module for other hosts
## References