Files
nixos-servers/docs/plans/auth-system-replacement.md
Torjus Håkestad 02270a0e4a
Some checks failed
Run nix flake check / flake-check (pull_request) Successful in 2m7s
Run nix flake check / flake-check (push) Failing after 16m31s
docs: update plans with Grafana OIDC progress
- auth-system-replacement.md: Mark OAuth2 client (Grafana) as completed,
  document key findings (PKCE, attribute paths, user requirements)
- monitoring-migration-victoriametrics.md: Note Grafana deployment on
  monitoring02 with Kanidm OIDC as test instance

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-08 20:28:10 +01:00

184 lines
6.5 KiB
Markdown

# Authentication System Replacement Plan
## Overview
Deploy a modern, unified authentication solution for the homelab. Provides central user management, SSO for web services, and consistent UID/GID mapping for NAS permissions.
## Goals
1. **Central user database** - Manage users across all homelab hosts from a single source
2. **Linux PAM/NSS integration** - Users can SSH into hosts using central credentials
3. **UID/GID consistency** - Proper POSIX attributes for NAS share permissions
4. **OIDC provider** - Single sign-on for homelab web services (Grafana, etc.)
## Solution: Kanidm
Kanidm was chosen for the following reasons:
| Requirement | Kanidm Support |
|-------------|----------------|
| Central user database | Native |
| Linux PAM/NSS (host login) | Native NixOS module |
| UID/GID for NAS | POSIX attributes supported |
| OIDC for services | Built-in |
| Declarative config | Excellent NixOS provisioning |
| Simplicity | Modern API, LDAP optional |
| NixOS integration | First-class |
### Configuration Files
- **Host configuration:** `hosts/kanidm01/`
- **Service module:** `services/kanidm/default.nix`
## NAS Integration
### Current: TrueNAS CORE (FreeBSD)
TrueNAS CORE has a built-in LDAP client. Kanidm's read-only LDAP interface will work for NFS share permissions:
- **NFS shares**: Only need consistent UID/GID mapping - Kanidm's LDAP provides this
- **No SMB requirement**: SMB would need Samba schema attributes (deprecated in TrueNAS 13.0+), but we're NFS-only
Configuration approach:
1. Enable Kanidm's LDAP interface (`ldapbindaddress = "0.0.0.0:636"`)
2. Import internal CA certificate into TrueNAS
3. Configure TrueNAS LDAP client with Kanidm's Base DN and bind credentials
4. Users/groups appear in TrueNAS permission dropdowns
Note: Kanidm's LDAP is read-only and uses LDAPS only (no StartTLS). This is fine for our use case.
### Future: NixOS NAS
When the NAS is migrated to NixOS, it becomes a first-class citizen:
- Native Kanidm PAM/NSS integration (same as other hosts)
- No LDAP compatibility layer needed
- Full integration with the rest of the homelab
This future migration path is a strong argument for Kanidm over LDAP-only solutions.
## Implementation Steps
1. **Create kanidm01 host and service module**
- Host: `kanidm01.home.2rjus.net` (10.69.13.23, test tier)
- Service module: `services/kanidm/`
- TLS via internal ACME (`auth.home.2rjus.net`)
- Vault integration for idm_admin password
- LDAPS on port 636
2. **Configure provisioning**
- Groups provisioned declaratively: `admins`, `users`, `ssh-users`
- Users managed imperatively via CLI (allows setting POSIX passwords in one step)
- POSIX attributes enabled (UID/GID range 65,536-69,999)
3. **Test NAS integration** (in progress)
- ✅ LDAP interface verified working
- Configure TrueNAS LDAP client to connect to Kanidm
- Verify UID/GID mapping works with NFS shares
4. **Add OIDC clients** for homelab services
- Grafana
- Other services as needed
5. **Create client module** in `system/` for PAM/NSS ✅
- Module: `system/kanidm-client.nix`
- `homelab.kanidm.enable = true` enables PAM/NSS
- Short usernames (not SPN format)
- Home directory symlinks via `home_alias`
- Enabled on test tier: testvm01, testvm02, testvm03
6. **Documentation**
- `docs/user-management.md` - CLI workflows, troubleshooting
- User/group creation procedures verified working
## Progress
### Completed (2026-02-08)
**Kanidm server deployed on kanidm01 (test tier):**
- Host: `kanidm01.home.2rjus.net` (10.69.13.23)
- WebUI: `https://auth.home.2rjus.net`
- LDAPS: port 636
- Valid certificate from internal CA
**Configuration:**
- Kanidm 1.8 with secret provisioning support
- Daily backups at 22:00 (7 versions retained)
- Vault integration for idm_admin password
- Prometheus monitoring scrape target configured
**Provisioned entities:**
- Groups: `admins`, `users`, `ssh-users` (declarative)
- Users managed via CLI (imperative)
**Verified working:**
- WebUI login with idm_admin
- LDAP bind and search with POSIX-enabled user
- LDAPS with valid internal CA certificate
### Completed (2026-02-08) - PAM/NSS Client
**Client module deployed (`system/kanidm-client.nix`):**
- `homelab.kanidm.enable = true` enables PAM/NSS integration
- Connects to auth.home.2rjus.net
- Short usernames (`torjus` instead of `torjus@home.2rjus.net`)
- Home directory symlinks (`/home/torjus` → UUID-based dir)
- Login restricted to `ssh-users` group
**Enabled on test tier:**
- testvm01, testvm02, testvm03
**Verified working:**
- User/group resolution via `getent`
- SSH login with Kanidm unix passwords
- Home directory creation with symlinks
- Imperative user/group creation via CLI
**Documentation:**
- `docs/user-management.md` with full CLI workflows
- Password requirements (min 10 chars)
- Troubleshooting guide (nscd, cache invalidation)
### UID/GID Range (Resolved)
**Range: 65,536 - 69,999** (manually allocated)
- Users: 65,536 - 67,999 (up to ~2500 users)
- Groups: 68,000 - 69,999 (up to ~2000 groups)
Rationale:
- Starts at Kanidm's recommended minimum (65,536)
- Well above NixOS system users (typically <1000)
- Avoids Podman/container issues with very high GIDs
### Completed (2026-02-08) - OAuth2/OIDC for Grafana
**OAuth2 client deployed for Grafana on monitoring02:**
- Client ID: `grafana`
- Redirect URL: `https://grafana-test.home.2rjus.net/login/generic_oauth`
- Scope maps: `openid`, `profile`, `email`, `groups` for `users` group
- Role mapping: `admins` group → Grafana Admin, others → Viewer
**Configuration locations:**
- Kanidm OAuth2 client: `services/kanidm/default.nix`
- Grafana OIDC config: `services/grafana/default.nix`
- Vault secret: `services/grafana/oauth2-client-secret`
**Key findings:**
- PKCE is required by Kanidm - enable `use_pkce = true` in Grafana
- Must set `email_attribute_path`, `login_attribute_path`, `name_attribute_path` to extract from userinfo
- Users need: primary credential (password + TOTP for MFA), membership in `users` group, email address set
- Unix password is separate from primary credential (web login requires primary credential)
### Next Steps
1. Enable PAM/NSS on production hosts (after test tier validation)
2. Configure TrueNAS LDAP client for NAS integration testing
3. Add OAuth2 clients for other services as needed
## References
- [Kanidm Documentation](https://kanidm.github.io/kanidm/stable/)
- [NixOS Kanidm Module](https://search.nixos.org/options?query=services.kanidm)
- [Kanidm PAM/NSS Integration](https://kanidm.github.io/kanidm/stable/pam_and_nsswitch.html)