Wednesday, May 8, 2024
Asked Friday, September 30, 2022

  1. ThinkPad T520; Ubuntu 12.04.1 LTS; kernel 3.2.0-33-generic; 16GB of RAM.

  2. Memtest86+ ran for 26 hours, 9 passes, no errors.

  3. Booted into "recovery mode": ran fsck on all filesystems, no errors; "check all packages", no errors.

  4. Apparent random memory corruption: perl/R/chrome segfault every now and then, seemingly at random; sort(1) produces corrupt, unsorted files.



What could possibly be wrong, and how do I debug it?



 Answers

Random memory corruption does not necessarily mean that a memory module is faulty; there can be many other causes. Starting with software and configuration:




  • You might have been unlucky: your package tree could be "internally consistent" yet "externally inconsistent", that is, a package corrupted in a way that still passes its CRC check and looks valid. This is a 1-in-a-million chance and purely theoretical.

  • A bug in packages from a non-stable branch (application, system, or kernel).

  • A bug in stable-branch packages that only shows up under your specific hardware and software conditions, or outdated package versions.

  • A virus that corrupts files in memory, such as libraries already cached from the hard disk.

  • A kernel problem: one of your kernel drivers is not as stable as it should be. For example, the VirtualBox driver is known to cause random memory problems on the host. Other drivers, especially custom or beta ones, can cause similar trouble.

  • Malfunctioning external devices. Their drivers may skip sanity checks that fully working hardware never needs.

  • Hardware problems other than the memory modules themselves. Malfunctioning internal devices, such as onboard chips (audio, graphics) or PCI/PCIe cards, can write garbage into system memory, because they all share hardware-level memory access. They can also corrupt other components, which in turn corrupt memory.

  • Environmental problems. Your CPU or bridges might be overheating, especially the north bridge, which connects the CPU to system memory (on recent hardware it is integrated into the CPU). Note that they may overheat only because of other activity, such as GPU-hungry applications, so memory-testing software running in a plain VGA session would not show any errors.
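To make the kernel-driver item above more concrete, here is a minimal Python sketch that lists loaded modules worth a closer look. It assumes the usual /proc/modules layout (module name in the first column, an optional taint marker such as "(O)" in the last one); not every kernel prints that marker. The "vbox" prefix and the helper names are illustrative choices, not an official list of problem drivers.

```python
from typing import List, NamedTuple


class Module(NamedTuple):
    name: str
    taint: str  # e.g. "O" for out-of-tree; empty if the kernel prints no marker


# Assumption for illustration: VirtualBox host modules start with "vbox".
SUSPECT_PREFIXES = ("vbox",)


def parse_modules(text: str) -> List[Module]:
    """Parse text in /proc/modules format into Module records."""
    modules = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        last = fields[-1]
        # The taint marker, when present, is the last field, in parentheses.
        taint = last.strip("()") if last.startswith("(") else ""
        modules.append(Module(fields[0], taint))
    return modules


def suspects(modules: List[Module]) -> List[str]:
    """Return names of modules that match a suspect prefix or carry a taint marker."""
    return [m.name for m in modules
            if m.name.startswith(SUSPECT_PREFIXES) or m.taint]


if __name__ == "__main__":
    with open("/proc/modules") as f:
        for name in suspects(parse_modules(f.read())):
            print(name)
```

A module showing up here is not proof of anything; it only narrows down which drivers to unload or uninstall first when testing.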



As you can see, there are many possibilities, although most of the items above are rare in cases like this. I'd recommend booting the system from a live CD and checking whether it still segfaults there. If it does, unplug any hardware you don't really need (or disable it in the BIOS/UEFI). After that, test your memory modules in another computer, and test your computer with different memory modules.
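Since the segfaults are intermittent, it helps to have a repeatable check you can run on the installed system, on the live CD, and again after each hardware change. The sketch below is one possible approach, not a standard tool: it sorts the same pseudo-random data several times and compares SHA-256 digests of the result. The function names, seed, and sizes are arbitrary choices made for illustration.

```python
import hashlib
import random


def sorted_digest(seed: int, count: int) -> str:
    """Generate `count` deterministic 64-bit values, sort them, hash the result."""
    rng = random.Random(seed)
    values = [rng.getrandbits(64) for _ in range(count)]
    values.sort()
    h = hashlib.sha256()
    for v in values:
        h.update(v.to_bytes(8, "big"))
    return h.hexdigest()


def count_mismatches(rounds: int, seed: int = 12345, count: int = 200_000) -> int:
    """Repeat the same computation and count digests that differ from the first."""
    reference = sorted_digest(seed, count)
    return sum(1 for _ in range(rounds - 1)
               if sorted_digest(seed, count) != reference)


if __name__ == "__main__":
    bad = count_mismatches(rounds=50)
    print("mismatching rounds:", bad)
```

The input is identical every round, so on a healthy machine every digest should match. Any mismatch means the same computation produced different results, which points at the machine rather than the data. A clean run proves little on its own, since the corruption is sporadic, so let it run long enough and under the same load that normally triggers the crashes.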


Answered Saturday, October 1, 2022
memorrappin