I have a computer that I would call "rock solid". I run it for months on end without the need to restart and have many programs, tabs, etc open all the time on multiple work spaces. No stability issues.

Recently, I've been trying to get into 3D art with Blender. Blender runs fine on my computer for the most part, but using the Cycles rendering engine with CPU rendering enabled locks up the entire computer within a few seconds, requiring a reboot (Ctrl+Alt+Backspace does not work).

Since it was only Blender with the problem, I reported it as a bug and they had me run mprime (which is sometimes called prime95). I used the "stress test" option with test type 3 "blend" and it freezes my computer within seconds as well.

So there is some issue with my computer that causes it to freeze when doing something that is (extremely) CPU intensive, that never causes any problems with day to day work, and the computer can run for months on end, as long as I don't try something like mprime or cycles render.

Side question: A properly functioning computer should be able to run mprime without issues right? Continuously?

I would like to use Blender with Cycles, so I'd like to determine the cause of the problem.

I've already tried running memtest86 for 12 hours with no errors, so it's probably not the system memory. I've also changed OS versions from Mint 18 to 19, which (probably) included a new kernel, so it is likely not a kernel-related issue.

That leaves us with:

  1. A faulty processor
  2. A faulty motherboard
  3. Low level software like motherboard firmware or processor driver
  4. Something else?

Other than cross-swapping parts (CPUs are expensive and not returnable), is there a way to determine if I need to replace the motherboard and CPU, or just the CPU? Or if this can be fixed with software/firmware updates?

I'm looking for an experienced answer like "if this happens, it's always the CPU and not the motherboard" or "it could be either, but it's certainly hardware that needs to be replaced and software/firmware updates won't help" or "the hardware's fine and just needs more cooling", etc.

Update:

I tried using cpulimit on mprime. I set a 10% limit with:

cpulimit -P ~/mersenne-stress-test.linux64/mprime -l 10

Then I ran the program. cpulimit detected mprime when it started up, but when I told mprime to run, the computer immediately froze. So either mprime is bypassing cpulimit, or the issue I'm having isn't affected by limiting CPU usage; i.e., mprime and Cycles may use the CPU for a certain type of operation that office programs don't.
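
The comments suggest watching temperatures and clock speeds (`grep MHz /proc/cpuinfo`) while the load ramps up. With a hard lockup, those readings are only useful if they reach the disk before the machine freezes. Below is a minimal sketch of how that logging could be scripted. It assumes temperatures are exposed under `/sys/class/hwmon` (which depends on the sensor driver that is loaded) and that `/proc/cpuinfo` reports `cpu MHz` lines; the function names and the log file name are made up for illustration.

```python
import glob
import os
import re
import time


def parse_cpu_mhz(cpuinfo_text):
    """Return the per-core clock speeds (MHz) found in /proc/cpuinfo text."""
    matches = re.findall(r"^cpu MHz\s*:\s*([\d.]+)", cpuinfo_text, re.MULTILINE)
    return [float(m) for m in matches]


def read_temps_celsius(pattern="/sys/class/hwmon/hwmon*/temp*_input"):
    """Read every hwmon temperature input (assumed path; values are millidegrees C)."""
    temps = []
    for path in sorted(glob.glob(pattern)):
        try:
            with open(path) as f:
                temps.append(int(f.read().strip()) / 1000.0)
        except (OSError, ValueError):
            continue  # sensor missing or unreadable; skip it
    return temps


def log_samples(log_path, samples, interval_s=1.0):
    """Append one reading per interval and force each line to disk."""
    with open(log_path, "a") as log:
        for _ in range(samples):
            with open("/proc/cpuinfo") as f:
                mhz = parse_cpu_mhz(f.read())
            temps = read_temps_celsius()
            log.write(f"{time.strftime('%H:%M:%S')} mhz={mhz} temps_c={temps}\n")
            log.flush()
            os.fsync(log.fileno())  # so the last line survives a hard freeze
            time.sleep(interval_s)
```

Starting something like `log_samples("freeze.log", samples=600)` in one terminal and mprime in another should leave the last readings before the freeze at the end of the file after the reboot.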

Update 2:

Additional System Information:

Operating system: Linux Mint 19.1, MATE, 64-bit
Graphics card: GeForce GTX 760, 4096 MB, proprietary NVIDIA driver version 410.93
CPU: AMD FX-8150 Eight-Core Processor, 3.6 GHz
Motherboard: ASUSTeK M5A99FX PRO R2.0
RAM: 2x4 GB (8 GB total) Corsair DDR3-1600, memtested OK

Update 3:

More troubleshooting information:

I got another computer with the same processor (Computer B). It's running Windows. I tested Blender on that one (B), and it worked fine. So, next, I swapped the processors between A and B, cleaning and applying new thermal compound in the process.

This is where things get bizarre. Computer B (running Windows, with A's processor) seems to work fine with Blender.

Computer A, the original running Linux, with computer B's processor installed will now run mprime without the system freezing. But, after a little while, I got:

[Worker #7 Mar 7 15:08] ERROR: ILLEGAL SUMOUT
[Worker #7 Mar 7 15:08] Possible hardware failure, consult readme.txt file, restarting test.

This repeated quickly until it stopped that core. Then another core did the same thing. Then it ran for a while, and eventually ended with a segmentation fault, but the computer did not freeze.

[Worker #5 Mar 7 15:16] Self-test 640K passed!
[Worker #5 Mar 7 15:16] Test 1, 3200000 Lucas-Lehmer iterations of M172031 using AMD K10 type-1 FFT length 8K, Pass1=32, Pass2=256, clm=4.
[Worker #1 Mar 7 15:16] Self-test 640K passed!
[Worker #1 Mar 7 15:16] Test 1, 3200000 Lucas-Lehmer iterations of M172031 using AMD K10 type-1 FFT length 8K, Pass1=32, Pass2=256, clm=4.
[Worker #4 Mar 7 15:16] Self-test 640K passed!
[Worker #4 Mar 7 15:16] Test 1, 3200000 Lucas-Lehmer iterations of M172031 using AMD K10 type-1 FFT length 8K, Pass1=32, Pass2=256, clm=4.
[Worker #3 Mar 7 15:16] Self-test 640K passed!
[Worker #3 Mar 7 15:16] Test 1, 3200000 Lucas-Lehmer iterations of M172031 using AMD K10 type-1 FFT length 8K, Pass1=32, Pass2=256, clm=4.
[Worker #2 Mar 7 15:16] Self-test 640K passed!
[Worker #2 Mar 7 15:16] Test 1, 3200000 Lucas-Lehmer iterations of M172031 using AMD K10 type-1 FFT length 8K, Pass1=32, Pass2=256, clm=4.
[Worker #8 Mar 7 15:19] Self-test 8K passed!
[Worker #8 Mar 7 15:19] Test 1, 21000 Lucas-Lehmer iterations of M14155777 using AMD K10 type-2 FFT length 720K, Pass1=320, Pass2=2304, clm=4.
[Worker #5 Mar 7 15:21] Self-test 8K passed!
[Worker #5 Mar 7 15:21] Test 1, 21000 Lucas-Lehmer iterations of M14155777 using AMD K10 type-2 FFT length 720K, Pass1=320, Pass2=2304, clm=4.
Segmentation fault (core dumped)

So this is bizarre, because neither computer seems to be freezing with swapped processors, but something odd is still going on with system A (at least).
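
To see whether the errors cluster on particular workers, the mprime output can be tallied per worker. This is only a rough sketch: it assumes the terminal output was saved to a text file, that lines follow the `[Worker #N date time] message` shape shown above, and that each worker stays on the same core (which mprime does not guarantee unless affinity is configured). The names `WORKER_LINE` and `summarize` are made up for illustration.

```python
import re
from collections import Counter

# Matches lines such as "[Worker #7 Mar 7 15:08] ERROR: ILLEGAL SUMOUT".
WORKER_LINE = re.compile(r"^\[Worker #(\d+) [^\]]*\]\s*(.*)$")


def summarize(log_text):
    """Count ERROR lines and passed self-tests per mprime worker."""
    errors, passes = Counter(), Counter()
    for line in log_text.splitlines():
        match = WORKER_LINE.match(line.strip())
        if not match:
            continue  # e.g. "Segmentation fault (core dumped)"
        worker, message = int(match.group(1)), match.group(2)
        if message.startswith("ERROR"):
            errors[worker] += 1
        elif message.startswith("Self-test") and message.endswith("passed!"):
            passes[worker] += 1
    return errors, passes
```

If the same one or two workers collect all the errors across several runs, that would point toward specific cores; errors spread evenly across workers would fit better with something shared, such as power delivery or cooling.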

I guess the next step is to try mprime on the Windows box and see if that goes OK, or if ILLEGAL SUMOUT is normal for mprime.

  • When a CPU is stressed to its maximum, it cannot service requests from the mouse, keyboard, or even the display, since all of its processing power is used by the stress test. Blender likely uses your CPU to the max, so that it freezes up; this is common behaviour with all CPU stress tests. I suggest restricting Blender's usage to something like 80% if possible, or using a GPU rendering scheme, which should be quicker anyway. – QuickishFM Feb 19 at 15:44
  • If the Cycles render used 100% CPU such that the mouse couldn't respond, then when the render finished, the mouse would become responsive again, correct? So if a scene takes 15 seconds to render, I should be able to wait 60 seconds or so and get mouse control back. But that isn't the case. I've run the render from the command line, the computer froze in seconds, and I left it "frozen" for a while to see if the render completed. It never unfroze, and upon reboot, the log file and the output were empty. So this looks like a hardware lockup, rather than competition for resources. – Nick Feb 19 at 16:06
  • It was not clear you needed to reboot to recover. Please edit the question and make it clear. Yes, I think this looks like a hardware lockup. Can you use cpulimit? Use it to find the percentage that is on the edge of lockup, and monitor temperatures (sensors) and frequencies (grep MHz /proc/cpuinfo) during this process. Find what the safe temperature range is for your CPU according to the manufacturer. Do you use cpufreqd? – Kamil Maciorowski Feb 19 at 16:14
  • @KamilMaciorowski: I've updated the question per your suggestions. – Nick Feb 19 at 17:31
  • @K7AAY: I've added additional system specs. I don't have /var/log/dmesg on my system. dmesg -wH produces a lot of output, is there something specific I should be looking for? smartctl tests are PASSED for all drives. – Nick Feb 20 at 0:09
