Memtest86 ran 38 passes (109 hours) with zero errors, ruling out RAM.
Disable sched_ext scheduler to test whether kernel scheduler crashes stop.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Enable memtest86 in systemd-boot menu on both PN51 units to allow
extended memory testing. Update stability document with March crash
data from pstore/Loki — crashes now traced to sched_ext scheduler
kernel oops, suggesting possible memory corruption.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
pn02 crashed again after ~2d21h uptime despite all mitigations
(amdgpu blacklist, max_cstate=1, NMI watchdog, rasdaemon).
NMI watchdog didn't fire and rasdaemon recorded nothing,
confirming hard lockup below NMI level. Unit is unreliable.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Both units survived 1h stress test at 80-85C. TSC clocksource
is genuinely unstable at runtime (not just boot), HPET is the
correct fallback for this platform.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>