pn51: document BIOS tweaks, second pn02 freeze, amdgpu blacklist
Some checks failed
Run nix flake check / flake-check (push) Has been cancelled
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@@ -71,6 +71,22 @@ Two ASUS PN51-E1 mini PCs (Ryzen 7 5700U) purchased years ago but shelved due to
 - **Conclusion**: TSC is genuinely unstable on the PN51-E1 platform. HPET is the correct clocksource.
 - For virtualization (Incus), this means guest VMs will use HPET-backed timing. Performance impact is minimal for typical server workloads (DNS, monitoring, light services) but would matter for latency-sensitive applications.

+### 2026-02-22: BIOS Tweaks (Both Units)
+
+- Disabled ErP Ready on both units (an EU power-efficiency mode that aggressively cuts power at idle)
+- Disabled WiFi and Bluetooth in BIOS on both units
+- **TSC still unstable** after these changes: same ~3.8 ms skew on both units
+- ErP/power states are therefore not the cause of the TSC issue
+
+### 2026-02-22: pn02 Second Freeze
+
+- pn02 froze again ~5.5 hours after boot (at idle, not under load)
+- All Prometheus targets went down simultaneously, the same hard-freeze pattern as before
+- The last log entry was normal nix-daemon activity, with no warnings or errors logged before the crash
+- pn02 survived the 1h stress test earlier but froze at idle later, which points away from a thermal cause
+- pn01 remains stable throughout
+- **Action**: Blacklisted the `amdgpu` kernel module on pn02 (`boot.blacklistedKernelModules = [ "amdgpu" ]`) to eliminate GPU/PSP firmware interactions as a cause. The unit now has no console output, but it is still managed via SSH.
+
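The blacklist action recorded above could look like the following in a per-host NixOS module. This is a minimal sketch: only the `boot.blacklistedKernelModules` line comes from the notes, and the idea that pn02 has its own module file is an assumption.

```nix
# Hypothetical per-host module for pn02 (file name and layout are assumptions).
{ ... }:
{
  # Stop the amdgpu driver from being auto-loaded at boot. The integrated GPU
  # is then left without a driver, so the local console shows no output.
  boot.blacklistedKernelModules = [ "amdgpu" ];
}
```

A blacklist entry only prevents automatic loading, so after a rebuild and reboot it is worth confirming over SSH that `lsmod` no longer lists `amdgpu`.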
 ## Benign Kernel Errors (Both Units)

 These appear on both units and can be ignored:
@@ -84,8 +100,8 @@ These appear on both units and can be ignored:

 ## Next Steps

-- Monitor both units for stability over the next few days
-- If either freezes again, try disabling unused hardware in BIOS (GPU, WiFi, Bluetooth, audio)
-- If still freezing, may be a hardware defect
+- Monitor pn02 with amdgpu blacklisted; if it is stable, try the less impactful `amdgpu.runpm=0 amdgpu.dpm=0` kernel params instead
+- If pn02 still freezes without amdgpu, a hardware defect on this unit is likely
+- pn01 continues to be stable; keep monitoring
 - Once stable: add second RAM stick back to pn02, reinstall with NVMe
 - Evaluate for Incus hypervisor use (see `nixos-hypervisor.md`)
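The less impactful follow-up named in the first next step could be sketched as below, assuming the blacklist entry is removed at the same time so that the driver loads again. The two parameters are the ones named in the notes. The module layout is an assumption, and whether these settings avoid the freeze on pn02 is untested.

```nix
# Hypothetical replacement for the blacklist on pn02 (layout is an assumption).
{ ... }:
{
  # amdgpu is allowed to load again, with its power management switched off.
  boot.kernelParams = [
    "amdgpu.runpm=0" # disable runtime power management
    "amdgpu.dpm=0"   # disable dynamic power management
  ];
}
```

After a reboot, `/proc/cmdline` should contain both parameters. They have no effect while the module is still blacklisted, because the driver never loads.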