Troubleshooting hardware problems

From Elvanör's Technical Wiki

Memory (RAM)

  • The Gentoo packages memtest86 and memtest86+ provide memory tests for your computer. However, these programs cannot be run from a running system. They must be run from a special bootloader: the Gentoo packages basically install that bootloader and add an option to GRUB to start the tests.
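
After installing one of these packages, it can be useful to confirm that the GRUB option is really there before rebooting. The following is a minimal sketch, not part of the packages themselves: `memtest_entries` is a hypothetical helper, the sample configuration (including its file paths) is invented for illustration, and it assumes that the generated entry has "memtest" somewhere in its title.

```python
import re

# Matches lines such as: menuentry 'Some title' ...  or  menuentry "Some title" ...
MENUENTRY = re.compile(r"""^\s*menuentry\s+(['"])(.*?)\1""", re.MULTILINE)


def memtest_entries(grub_cfg_text):
    """Return the titles of GRUB menu entries that mention memtest.

    Assumption: the entry added by the package contains "memtest"
    (in any letter case) in its title.
    """
    titles = [match.group(2) for match in MENUENTRY.finditer(grub_cfg_text)]
    return [title for title in titles if "memtest" in title.lower()]


# Invented sample resembling a generated grub.cfg (paths are hypothetical).
SAMPLE_CFG = """
menuentry 'Gentoo GNU/Linux' --class gentoo {
    linux /vmlinuz root=/dev/sda2
}
menuentry "Memtest86+" {
    linux16 /memtest86plus/memtest.bin
}
"""

print(memtest_entries(SAMPLE_CFG))
```

On a real system you would pass the contents of your own GRUB configuration file (commonly /boot/grub/grub.cfg, though the location depends on your setup). An empty list suggests that the GRUB configuration has not been regenerated since the package was installed.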

Nvidia GPUs

  • I had a case where the HDMI output of an Nvidia card stopped working (black monitor), even during power-up and in the UEFI / BIOS interface. Booting into Windows (using the built-in Intel HD HDMI output of the motherboard) and then switching the HDMI cable to the Nvidia card "resurrected" the card. The Windows output even worked instantly, without a reboot.
  • Make sure you have a powerful enough PSU (Power Supply Unit). I had a case where the computer would freeze very often when using a 600W PSU (and an nVidia GeForce RTX 3080 Ti). See this report. Switching to a 1000W PSU fixed all the crashes.
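
To get a rough idea of whether a PSU has enough headroom, a back-of-the-envelope estimate like the one below can help. This is only a sketch under stated assumptions, not a measurement and not an explanation of the freezes above: the 350 W board power is an assumed figure for the card, while the 200 W for the rest of the system and the 1.5 factor for short GPU power spikes are illustrative guesses. Both function names are hypothetical.

```python
def estimated_peak_watts(gpu_board_power, rest_of_system, transient_factor):
    """Estimate the worst-case draw: GPU spikes plus everything else."""
    return gpu_board_power * transient_factor + rest_of_system


def psu_is_sufficient(psu_watts, gpu_board_power,
                      rest_of_system=200.0, transient_factor=1.5):
    """Return True if the PSU rating covers the estimated peak draw.

    Assumptions: rest_of_system (CPU, drives, fans...) and transient_factor
    are rough guesses; adjust them for your own hardware.
    """
    peak = estimated_peak_watts(gpu_board_power, rest_of_system, transient_factor)
    return psu_watts >= peak


GPU_BOARD_POWER = 350.0  # assumed board power for the card, in watts

for psu in (600, 1000):
    verdict = "enough" if psu_is_sufficient(psu, GPU_BOARD_POWER) else "too small"
    print(f"{psu} W PSU: {verdict}")
```

With these assumed numbers the estimated peak is 725 W, which is consistent with the 600W unit being marginal and the 1000W unit being comfortable. The real draw depends on the whole configuration.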

EVGA nVidia GeForce RTX 3080 Ti

  • The card worked, but produced a blank screen at boot (during POST/UEFI BIOS, the GRUB bootloader screen, and the Linux console driven by the framebuffer driver). The screen only came on after Linux X11 (or Windows) had booted. This made it very difficult to access the BIOS or the GRUB dual-boot menu.
  • Note that this issue usually happened only when a cold boot was issued (or a reboot triggered by the BIOS/GRUB). Just after a reboot from Linux or Windows, it worked and I could see the BIOS screen.
  • If you're unable to get a BIOS screen, it's very dangerous to keep the "Wait For F1 If Error" setting enabled in the BIOS configuration. Any error during the boot process won't be visible, and the system will stall waiting for a key press. It will then never reach the X11 or Windows stage that would finally power up the display.
  • To fix this situation, I tried updating the motherboard BIOS, the GPU VBIOS and firmware, without any result.
  • Update: this issue was at last fixed by a firmware update on the monitor (an Asus ROG Swift PG32UQ). So apparently, the issue was linked to the monitor rather than to the card.