ASUS PN51 Stability Testing
Overview
Two ASUS PN51-E1 mini PCs (Ryzen 7 5700U) purchased years ago but shelved due to stability issues. Revisiting them to potentially add to the homelab.
Hardware
| | pn01 (10.69.12.60) | pn02 (10.69.12.61) |
|---|---|---|
| CPU | AMD Ryzen 7 5700U (8C/16T) | AMD Ryzen 7 5700U (8C/16T) |
| RAM | 2x 32GB DDR4 SO-DIMM (64GB) | 1x 32GB DDR4 SO-DIMM (32GB) |
| Storage | 1TB NVMe | 1TB Samsung 870 EVO (SATA SSD) |
| BIOS | 0508 (2023-11-08) | Updated 2026-02-21 (latest from ASUS) |
Original Issues
- pn01: Would boot but freeze randomly after some time. No console errors, completely unresponsive. memtest86 passed.
- pn02: Had trouble booting. It would start loading the kernel from the installer USB, then instantly reboot. When it did boot, it would also freeze randomly.
Debugging Steps
2026-02-21: Initial Setup
- Disabled fTPM (labeled "Security Device" in ASUS BIOS) on both units
- AMD Ryzen 5000 series had a known fTPM bug causing random hard freezes with no console output
- Both units booted the NixOS installer successfully after this change
- Installed NixOS on both, added to repo as `pn01` and `pn02` on VLAN 12
- Configured monitoring (node-exporter, promtail, nixos-exporter)
2026-02-21: pn02 First Freeze
- pn02 froze approximately 1 hour after boot
- All three Prometheus targets went down simultaneously, which points to a hard freeze rather than a graceful shutdown
- Journal on next boot: `system.journal corrupted or uncleanly shut down`
- Kernel warnings from boot log before freeze:
  - TSC clocksource unstable: `Marking clocksource 'tsc' as unstable because the skew is too large` (TSC skewing ~3.8ms over 500ms relative to HPET watchdog)
  - AMD PSP error: `psp gfx command LOAD_TA(0x1) failed and response status is (0x7)` (Platform Security Processor failing to load trusted application)
- pn01 showed neither of these warnings and remained stable
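To compare the two units quickly, the two warnings quoted above can be searched for in a saved kernel log. The following is a minimal sketch, not part of the actual setup: the function names and labels are made up for illustration, and the match strings are copied from the messages above. It also works out the reported skew as a rate, since 3.8ms over 500ms is roughly 7600 ppm (0.76%).

```python
import re

# Match strings taken from the two warnings quoted in this section.
# The dictionary keys are illustrative labels, not kernel terminology.
SUSPECT_PATTERNS = {
    "tsc_unstable": re.compile(r"Marking clocksource 'tsc' as unstable"),
    "psp_load_ta": re.compile(r"psp gfx command LOAD_TA\(0x1\) failed"),
}


def find_suspect_warnings(log_text):
    """Return {label: [matching lines]} for each suspect pattern found."""
    hits = {}
    for line in log_text.splitlines():
        for label, pattern in SUSPECT_PATTERNS.items():
            if pattern.search(line):
                hits.setdefault(label, []).append(line.strip())
    return hits


def skew_ppm(skew_ms, interval_ms):
    """Express a clock skew observed over an interval as parts per million."""
    return skew_ms / interval_ms * 1_000_000
```

The log text could come from something like `journalctl -k -b -1` on the affected host, assuming the previous boot's journal is still readable. An empty result for pn01 and two labels for pn02 would match the observations above.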
2026-02-21: pn02 BIOS Update
- Updated pn02 BIOS to latest version from ASUS website
- TSC still unstable after BIOS update — same ~3.8ms skew
- PSP LOAD_TA still failing after BIOS update
- Monitoring back up, letting it run to see if freeze recurs
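One way to confirm which clocksource the kernel ended up using after the update is to read the standard sysfs entries. This is a rough sketch: the helper names are hypothetical, and it only assumes the usual Linux location `/sys/devices/system/clocksource/clocksource0`.

```python
from pathlib import Path

# Standard Linux sysfs location for the active clocksource.
CLOCKSOURCE_DIR = Path("/sys/devices/system/clocksource/clocksource0")


def read_clocksource(base=CLOCKSOURCE_DIR):
    """Return (current, available) clocksources as reported by sysfs."""
    current = (Path(base) / "current_clocksource").read_text().strip()
    available = (Path(base) / "available_clocksource").read_text().split()
    return current, available


def tsc_in_use(current):
    """True if the kernel is still using the TSC as its clocksource."""
    return current == "tsc"
```

Calling `read_clocksource()` on pn02 and finding that `tsc_in_use` is false (for example, the current source being `hpet`) would be consistent with the watchdog having demoted the TSC.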
Benign Kernel Errors (Both Units)
These appear on both units and can be ignored:
- `pcie_mp2_amd: amd_sfh_hid_client_init failed err -95`: AMD Sensor Fusion Hub, no sensors connected
- `Bluetooth: hci0: Reading supported features failed`: Bluetooth init quirk
- `Serial bus multi instantiate pseudo device driver INT3515:00: error -ENXIO`: unused serial bus device
- `ata2.00: supports DRM functions and may not be fully accessible`: Samsung SSD DRM quirk (pn02 only)
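When reading through a boot log, it can help to set these known-benign messages aside so that anything new stands out. A minimal sketch follows, assuming plain substring matching is good enough. The substrings come from the messages listed above, and the function name is illustrative.

```python
# Substrings taken from the benign messages listed in this section.
BENIGN_SUBSTRINGS = [
    "amd_sfh_hid_client_init failed err -95",
    "Reading supported features failed",
    "INT3515:00: error -ENXIO",
    "supports DRM functions and may not be fully accessible",
]


def split_benign(log_text):
    """Split log lines into (benign, other), skipping blank lines."""
    benign, other = [], []
    for line in log_text.splitlines():
        if not line.strip():
            continue
        is_benign = any(s in line for s in BENIGN_SUBSTRINGS)
        (benign if is_benign else other).append(line.strip())
    return benign, other
```

Only the `other` list then needs a closer look when comparing pn01 and pn02.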
Next Steps
- Monitor pn01 stability (fTPM disabled, no other changes needed)
- Monitor pn02 stability after BIOS update
- If pn02 continues to freeze, try adding `tsc=unstable` kernel parameter
- If pn02 still freezes, may be a hardware defect on that specific unit
- Once stable: add second RAM stick back to pn02, reinstall with NVMe
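If the `tsc=unstable` parameter is tried, it is worth confirming after a reboot that the running kernel actually received it. The sketch below checks a kernel command line string such as the contents of `/proc/cmdline`. The helper name is hypothetical. On NixOS the parameter would presumably be set through the `boot.kernelParams` option, which is an assumption about how this repo is configured.

```python
def has_kernel_param(cmdline, param):
    """True if param appears as a whole token on the kernel command line."""
    return param in cmdline.split()


def read_cmdline(path="/proc/cmdline"):
    """Read the running kernel's command line (standard Linux procfs path)."""
    with open(path) as f:
        return f.read().strip()
```

For example, `has_kernel_param(read_cmdline(), "tsc=unstable")` would be run on pn02 after the rebuild and reboot.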