Compare commits: 3042803c4d...jellyfin-m (6 commits)

| SHA1 |
|---|
| 16ef202530 |
| 5f3508a6d4 |
| 2ca2509083 |
| 58702bd10b |
| c9f47acb01 |
| 09ce018fb2 |
````diff
@@ -39,23 +39,17 @@ Expand storage capacity for the main hdd-pool. Since we need to add disks anyway
 - nzbget: NixOS service or OCI container
 - NFS exports: `services.nfs.server`
 
-### Filesystem: BTRFS RAID1
+### Filesystem: Keep ZFS
 
-**Decision**: Migrate from ZFS to BTRFS with RAID1
+**Decision**: Keep existing ZFS pool, import on NixOS
 
 **Rationale**:
-- **In-kernel**: No out-of-tree module issues like ZFS
-- **Flexible expansion**: Add individual disks, not required to buy pairs
-- **Mixed disk sizes**: Better handling than ZFS multi-vdev approach
-- **RAID level conversion**: Can convert between RAID levels in place
-- Built-in checksumming, snapshots, compression (zstd)
-- NixOS has good BTRFS support
+- **No data migration needed**: Existing ZFS pool can be imported directly on NixOS
+- **Proven reliability**: Pool has been running reliably on TrueNAS
+- **NixOS ZFS support**: Well-supported, declarative configuration via `boot.zfs` and `services.zfs`
+- **BTRFS RAID5/6 unreliable**: Research showed BTRFS RAID5/6 write hole is still unresolved
+- **BTRFS RAID1 wasteful**: With mixed disk sizes, RAID1 wastes significant capacity vs ZFS mirrors
+- Checksumming, snapshots, compression (lz4/zstd) all available
 
-**BTRFS RAID1 notes**:
-- "RAID1" means 2 copies of all data
-- Distributes across all available devices
-- With 6+ disks, provides redundancy + capacity scaling
-- RAID5/6 avoided (known issues), RAID1/10 are stable
-
 ### Hardware: Keep Existing + Add Disks
 
````
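The declarative ZFS configuration mentioned in the new rationale can be sketched as a NixOS host module. This is a hypothetical fragment, not the actual host config: `hosts/nas1/` is the name used in this plan, and the `hostId` value is a placeholder that must be generated per machine.

```nix
# hosts/nas1/storage.nix - hypothetical sketch of the ZFS import
{ config, lib, pkgs, ... }:

{
  # ZFS refuses to import pools without a stable 8-hex-digit host ID.
  networking.hostId = "8425e349"; # placeholder; generate per machine

  boot.supportedFilesystems = [ "zfs" ];

  # Import the existing TrueNAS pool at boot; root stays off ZFS.
  boot.zfs.extraPools = [ "hdd-pool" ];

  # Declarative maintenance: periodic scrub and TRIM.
  services.zfs.autoScrub.enable = true;
  services.zfs.trim.enable = true;
}
```

`boot.zfs.extraPools` imports pools that hold no filesystems needed for boot, which fits the plan of keeping the root on an mdadm mirror.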
````diff
@@ -69,83 +63,94 @@ Expand storage capacity for the main hdd-pool. Since we need to add disks anyway
 
 **Storage architecture**:
 
-**Bulk storage** (BTRFS RAID1 on HDDs):
+**hdd-pool** (ZFS mirrors):
-- Current: 6x HDDs (2x16TB + 2x8TB + 2x8TB)
+- Current: 3 mirror vdevs (2x16TB + 2x8TB + 2x8TB) = 32TB usable
-- Add: 2x new HDDs (size TBD)
+- Add: mirror-3 with 2x 24TB = +24TB usable
+- Total after expansion: ~56TB usable
 - Use: Media, downloads, backups, non-critical data
-- Risk tolerance: High (data mostly replaceable)
-
-**Critical data** (small volume):
-- Use 2x 240GB SSDs in mirror (BTRFS or ZFS)
-- Or use 2TB NVMe for critical data
-- Risk tolerance: Low (data important but small)
 
 ### Disk Purchase Decision
 
-**Options under consideration**:
+**Decision**: 2x 24TB drives (ordered, arriving 2026-02-21)
 
-**Option A: 2x 16TB drives**
-- Matches largest current drives
-- Enables potential future RAID5 if desired (6x 16TB array)
-- More conservative capacity increase
-
-**Option B: 2x 20-24TB drives**
-- Larger capacity headroom
-- Better $/TB ratio typically
-- Future-proofs better
-
-**Initial purchase**: 2 drives (chassis has space for 2 more without modifications)
-
 ## Migration Strategy
 
 ### High-Level Plan
 
-1. **Preparation**:
+1. **Expand ZFS pool** (on TrueNAS):
-   - Purchase 2x new HDDs (16TB or 20-24TB)
+   - Install 2x 24TB drives (may need new drive trays - order from abroad if needed)
-   - Create NixOS configuration for new storage host
+   - If chassis space is limited, temporarily replace the two oldest 8TB drives (da0/ada4)
-   - Set up bare metal NixOS installation
+   - Add as mirror-3 vdev to hdd-pool
+   - Verify pool health and resilver completes
+   - Check SMART data on old 8TB drives (all healthy as of 2026-02-20, no reallocated sectors)
+   - Burn-in: at minimum short + long SMART test before adding to pool
 
-2. **Initial BTRFS pool**:
+2. **Prepare NixOS configuration**:
-   - Install 2 new disks
+   - Create host configuration (`hosts/nas1/` or similar)
-   - Create BTRFS filesystem in RAID1
+   - Configure ZFS pool import (`boot.zfs.extraPools`)
-   - Mount and test NFS exports
+   - Set up services: radarr, sonarr, nzbget, restic-rest, NFS
+   - Configure monitoring (node-exporter, promtail, smartctl-exporter)
 
-3. **Data migration**:
+3. **Install NixOS**:
-   - Copy data from TrueNAS ZFS pool to new BTRFS pool over 10GbE
+   - `zpool export hdd-pool` on TrueNAS before shutdown (clean export)
-   - Verify data integrity
+   - Wipe TrueNAS boot-pool SSDs, set up as mdadm RAID1 for NixOS root
+   - Install NixOS on mdadm mirror (keeps boot path ZFS-independent)
+   - Import hdd-pool via `boot.zfs.extraPools`
+   - Verify all datasets mount correctly
 
-4. **Expand pool**:
+4. **Service migration**:
-   - As old ZFS pool is emptied, wipe drives and add to BTRFS pool
+   - Configure NixOS services to use ZFS dataset paths
-   - Pool grows incrementally: 2 → 4 → 6 → 8 disks
+   - Update NFS exports
-   - BTRFS rebalances data across new devices
+   - Test from consuming hosts
 
-5. **Service migration**:
+5. **Cutover**:
-   - Set up radarr/sonarr/nzbget/restic as NixOS services
+   - Update DNS/client mounts if IP changes
-   - Update NFS client mounts on consuming hosts
+   - Verify monitoring integration
-
-6. **Cutover**:
-   - Point consumers to new NAS host
    - Decommission TrueNAS
-   - Repurpose hardware or keep as spare
+
+### Post-Expansion: Vdev Rebalancing
+
+ZFS has no built-in rebalance command. After adding the new 24TB vdev, ZFS will
+write new data preferentially to it (most free space), leaving old vdevs packed
+at ~97%. This is suboptimal but not urgent once overall pool usage drops to ~50%.
+
+To gradually rebalance, rewrite files in place so ZFS redistributes blocks across
+all vdevs proportional to free space:
+
+```bash
+# Rewrite files individually (spreads blocks across all vdevs);
+# cp -a preserves ownership, permissions and timestamps
+find /pool/dataset -type f -exec sh -c '
+  for f; do cp -a "$f" "$f.rebal" && mv "$f.rebal" "$f"; done
+' _ {} +
+```
+
+Avoid `zfs send/recv` for large datasets (e.g. 20TB) as this would concentrate
+data on the emptiest vdev rather than spreading it evenly.
+
+**Recommendation**: Do this after NixOS migration is stable. Not urgent - the pool
+will function fine with uneven distribution, just slightly suboptimal for performance.
 
 ### Migration Advantages
 
-- **Low risk**: New pool created independently, old data remains intact during migration
-- **Incremental**: Can add old disks one at a time as space allows
-- **Flexible**: BTRFS handles mixed disk sizes gracefully
-- **Reversible**: Keep TrueNAS running until fully validated
+- **No data migration**: ZFS pool imported directly, no copying terabytes of data
+- **Low risk**: Pool expansion done on stable TrueNAS before OS swap
+- **Reversible**: Can boot back to TrueNAS if NixOS has issues (ZFS pool is OS-independent)
+- **Quick cutover**: Once NixOS config is ready, the OS swap is fast
 
 ## Next Steps
 
-1. Decide on disk size (16TB vs 20-24TB)
-2. Purchase disks
-3. Design NixOS host configuration (`hosts/nas1/`)
-4. Plan detailed migration timeline
-5. Document NFS export mapping (current → new)
+1. ~~Decide on disk size~~ - 2x 24TB ordered
+2. Install drives and add mirror vdev to ZFS pool
+3. Check SMART data on 8TB drives - decide whether to keep or retire
+4. Design NixOS host configuration (`hosts/nas1/`)
+5. Document NFS export mapping (current -> new)
+6. Plan NixOS installation and cutover
 
 ## Open Questions
 
-- [ ] Final decision on disk size?
 - [ ] Hostname for new NAS host? (nas1? storage1?)
-- [ ] IP address allocation (keep 10.69.12.50 or new IP?)
+- [ ] IP address/subnet: NAS and Proxmox are both on 10GbE to the same switch but different subnets, forcing traffic through the router (bottleneck). Move to same subnet during migration.
-- [ ] Timeline/maintenance window for migration?
+- [x] Boot drive: Reuse TrueNAS boot-pool SSDs as mdadm RAID1 for NixOS root (no ZFS on boot path)
+- [ ] Retire old 8TB drives? (SMART looks healthy, keep unless chassis space is needed)
+- [ ] Drive trays: do new 24TB drives fit, or order trays from abroad?
+- [ ] Timeline/maintenance window for NixOS swap?
````
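The capacity figures in this hunk can be sanity-checked: each mirror vdev contributes the size of one of its (equal-sized) disks. A quick shell sketch of the arithmetic, no ZFS tooling required:

```shell
# Usable capacity of a pool of mirror vdevs = one disk per mirror.
# Current vdevs: 2x16TB, 2x8TB, 2x8TB; new vdev: 2x24TB.
total=0
for mirror_tb in 16 8 8 24; do
  total=$((total + mirror_tb))
done
echo "${total}TB usable"   # prints "56TB usable"
```

This matches the "~56TB usable" total stated after adding mirror-3.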
````diff
@@ -61,7 +61,42 @@
 			mode 644
 		}
 	}
-	reverse_proxy http://jelly01.home.2rjus.net:8096
+	header Content-Type text/html
+	respond <<HTML
+	<!DOCTYPE html>
+	<html>
+	<head>
+	<title>Jellyfin - Maintenance</title>
+	<style>
+	body {
+		background: #101020;
+		color: #ddd;
+		font-family: sans-serif;
+		display: flex;
+		justify-content: center;
+		align-items: center;
+		min-height: 100vh;
+		margin: 0;
+		text-align: center;
+	}
+	.container { max-width: 500px; }
+	.disk { font-size: 80px; animation: spin 3s linear infinite; display: inline-block; }
+	@keyframes spin { from { transform: rotate(0deg); } to { transform: rotate(360deg); } }
+	h1 { color: #00a4dc; }
+	p { font-size: 1.2em; line-height: 1.6; }
+	</style>
+	</head>
+	<body>
+	<div class="container">
+	<div class="disk">💿</div>
+	<h1>Jellyfin is taking a nap</h1>
+	<p>The NAS is getting shiny new hard drives.<br>
+	Jellyfin will be back once the disks stop spinning up.</p>
+	<p style="color:#666;font-size:0.9em;">In the meantime, maybe go outside?</p>
+	</div>
+	</body>
+	</html>
+	HTML 200
 }
 http://http-proxy.home.2rjus.net/metrics {
 	log {
````
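The pattern in this hunk, swapping a `reverse_proxy` for a static heredoc response during maintenance, reduces to a minimal sketch. The site name below is hypothetical; the syntax is Caddy's `respond <<MARKER ... MARKER <status>` heredoc form used above:

```caddyfile
example.home.2rjus.net {
	# During maintenance: disable the proxy and serve a static page
	# reverse_proxy http://backend:8096
	header Content-Type text/html
	respond <<HTML
	<h1>Down for maintenance</h1>
	HTML 200
}
```

The original returns status 200; returning 503 instead would signal to clients and monitors that the outage is temporary, at the cost of some players showing an error instead of the page.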
````diff
@@ -67,13 +67,13 @@ groups:
       summary: "Promtail service not running on {{ $labels.instance }}"
       description: "The promtail service has not been active on {{ $labels.instance }} for 5 minutes."
   - alert: filesystem_filling_up
-    expr: predict_linear(node_filesystem_free_bytes{mountpoint="/"}[6h], 24*3600) < 0
+    expr: predict_linear(node_filesystem_free_bytes{mountpoint="/"}[24h], 24*3600) < 0
     for: 1h
     labels:
       severity: warning
     annotations:
       summary: "Filesystem predicted to fill within 24h on {{ $labels.instance }}"
-      description: "Based on the last 6h trend, the root filesystem on {{ $labels.instance }} is predicted to run out of space within 24 hours."
+      description: "Based on the last 24h trend, the root filesystem on {{ $labels.instance }} is predicted to run out of space within 24 hours."
   - alert: systemd_not_running
     expr: node_systemd_system_running == 0
     for: 10m
````
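The change above widens the `predict_linear` lookback from 6h to 24h, so short download bursts no longer dominate the trend. Roughly, `predict_linear(v[W], t)` fits a least-squares line over window `W` and extrapolates `t` seconds past the newest sample; a toy Python illustration of that semantics (not the Prometheus implementation):

```python
def predict_linear(samples, horizon):
    """Least-squares fit over (t, value) samples, extrapolated `horizon`
    seconds past the newest sample (toy model of PromQL's predict_linear)."""
    n = len(samples)
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    slope = sum((t - mean_t) * (v - mean_v) for t, v in samples) / sum(
        (t - mean_t) ** 2 for t, _ in samples
    )
    intercept = mean_v - slope * mean_t
    last_t = samples[-1][0]
    return slope * (last_t + horizon) + intercept

# Disk losing 1 GB/hour with 10 GB free: empty in ~10 h, so the 24 h
# prediction goes negative and the alert condition (< 0) fires.
samples = [(h * 3600, (10 - h) * 1e9) for h in range(7)]  # 6 h of samples
print(predict_linear(samples, 24 * 3600) < 0)  # True
```

With a longer lookback window the fitted slope averages over more history, which is exactly why the alert becomes less twitchy after this change.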