Cycles crashes on GPU render using OpenCL - Dual AMD WX 7100 GPUs
Open, Waiting for Developer to Reproduce (Public)

Description

System Information
Operating system: Windows-10-10.0.17763 64 Bits
Graphics card: AMD Radeon (TM) Pro WX 7100 Graphics ATI Technologies Inc. 4.5.13558 Core Profile Context FireGL 26.20.11015.5009

Blender Version
Broken: version: 2.80 (sub 75), branch: master, commit date: 2019-07-29 14:47, hash: rBf6cb5f54494e

Short description of error
When I open Blender and try to render the default scene using Cycles and GPU compute I get an immediate crash.

Under Preferences > System > Cycles Render Devices > OpenCL, I have four identical lines that say, "AMD Radeon Pro WX 7100 Graphics".
If only the top two are checked, I can render without Blender crashing.
But with all four checked, or even with just the bottom two checked, Blender always crashes.
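
To make the duplicate entries easier to inspect, the device list can also be dumped from Blender's Python console. The following is only a minimal sketch: `duplicated_entries`, `cycles_devices`, and `report` are hypothetical helper names, and the call to `get_devices()` assumes the 2.80-era Cycles add-on preferences API, which may differ in other versions.

```python
from collections import Counter


def duplicated_entries(devices):
    """Return {(type, name): count} for every device label listed more than once."""
    counts = Counter((d.type, d.name) for d in devices)
    return {key: n for key, n in counts.items() if n > 1}


def cycles_devices():
    """Fetch the Cycles device list. Only works inside Blender's Python."""
    import bpy  # available only within Blender

    prefs = bpy.context.preferences.addons["cycles"].preferences
    # Assumption: in 2.80 this call refreshes prefs.devices.
    prefs.get_devices()
    return list(prefs.devices)


def report(devices):
    """Print one line per device, then any labels that appear more than once."""
    for index, d in enumerate(devices):
        state = "enabled" if d.use else "disabled"
        print(index, d.type, d.name, d.id, state)
    for (dev_type, name), n in duplicated_entries(devices).items():
        print("listed", n, "times:", dev_type, name)
```

Inside Blender, `report(cycles_devices())` prints the list. The `id` column may help tell apart entries that share the same name.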

Exact steps for others to reproduce the error

  1. Launch the official Blender 2.8 Windows 10 build from today, July 30, 2019.
  2. In Edit > Preferences > System > Cycles Render Devices > OpenCL, check all four instances of 'AMD Radeon Pro WX 7100 Graphics'.
  3. Under render properties, select the Cycles engine.
  4. Set the device to GPU Compute.
  5. Use F12 or the Render menu to render a single image.

Details

Type
Bug

Event Timeline

Do you have Crossfire enabled for the two cards? It seems like it's double-enumerating them somehow. I believe generally things will work better for rendering if Crossfire is disabled, similar to SLI on Nvidia cards. But I'm not an AMD user so I'm not as familiar with the setup there.

Crossfire isn't enabled, and I've attempted to update Windows 10 and the graphics driver to the latest version, and I'm still getting the same crash as long as the bottom two options are selected.
For now, I'm disabling them to continue working and rendering with my GPU, but I'm not able to compare to see if I'm getting a performance drop without the other two cards checked in the list.

For now, I'll leave them unchecked, but I think, either way, it's a bug.

I've had all of these enabled since the 2.8 alpha releases and it rendered without any issues; only with the release candidates have I been seeing this problem.

Does the speed of rendering approximately double between choosing only the first card and choosing the first two on the list?

Yes, so the first two in the list are behaving correctly. This gives me a workaround, but I'm still not sure why the double-enumeration is occurring.
At my earliest possible convenience, I'm switching to an NVIDIA card. AMD has been nothing but headaches since I first got these cards.
Real-time performance in Eevee is a breeze, but GPU rendering in Cycles always seems to have less stability and support by the BF.

@Gavin Scott (Zoot) Thanks for your help in diagnosing the problem.

Hope the double listing bug gets fixed as it definitely bricks Blender when they're all checked.

I searched around but could not find other examples of this sort of issue either with Blender or with OpenCL apps in general, so I don't think it's really likely to be a Blender bug but some obscure issue in your particular system. As long as it's otherwise working I'm not sure I would spend too much time messing with it, but you could try completely removing all AMD/ATI video related drivers and stuff and re-installing the latest. I will keep my eyes out for anyone else seeing the same issue though.

IIRC the double enumeration occurs because of two different OpenCL platforms being available - one from AMD, one from Intel. I have found the Intel platform for OpenCL to be quite unreliable - I think I've blacklisted it for Cycles in Rhino so that one can not select it.

FWIW I run on a machine that has one WX 9100 in it (and a GTX 1060 and a GTX 760, but that is beside the point ;) )

FWIW2: I see only one WX 9100 in the OpenCL section.

@Brandon Hix (Acolyte) did you by any chance ever set any of the Cycles environment variables? Maybe double-check your user and system environment variables, and remove any that have CYCLES in the name...
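
One way to run the environment-variable check suggested above is a short Python script. This is only a sketch: `find_cycles_variables` is a hypothetical helper, and it simply looks for variable names containing "CYCLES" rather than checking against any official list of Cycles variables.

```python
import os


def find_cycles_variables(environ=None):
    """Return the environment variables whose name contains 'CYCLES' (any case)."""
    env = os.environ if environ is None else environ
    return {name: value for name, value in env.items() if "CYCLES" in name.upper()}


if __name__ == "__main__":
    matches = find_cycles_variables()
    if matches:
        for name, value in sorted(matches.items()):
            print(name, "=", value)
    else:
        print("No environment variable with CYCLES in its name was found.")
```

The script only sees the environment of the process it runs in, so it should be started from a fresh terminal or from Blender's own Python console to reflect the current user and system settings.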

Hey Nathan, thanks for the explanation, this helps me understand a bit more about what's going on behind the scenes. The WX 7100s haven't been bad cards, but I think I'll have fewer issues overall if I just get a card that has CUDA, so that's where I'm headed next I think.

There are no environment variables related to Blender or Cycles listed in my system, but I appreciate the suggestion.
For now, I'll just proceed with the top two checked in the list.
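
The workaround of keeping only the top two entries checked could also be applied from a script. The sketch below is an illustration under several assumptions: `enable_only` and `apply_in_blender` are hypothetical helpers, the preferences calls follow the 2.80-era Cycles add-on API, and the order of `prefs.devices` is assumed to match the order shown in the Preferences list.

```python
def enable_only(devices, indices):
    """Set use=True for devices at the given list positions and False for the rest."""
    wanted = set(indices)
    for position, device in enumerate(devices):
        device.use = position in wanted
    return [d.use for d in devices]


def apply_in_blender(indices=(0, 1)):
    """Enable only the chosen OpenCL entries. Only works inside Blender's Python."""
    import bpy  # available only within Blender

    prefs = bpy.context.preferences.addons["cycles"].preferences
    prefs.compute_device_type = "OPENCL"
    # Assumption: in 2.80 this call refreshes prefs.devices.
    prefs.get_devices()
    # Assumption: list order here matches the order in the Preferences UI.
    opencl = [d for d in prefs.devices if d.type == "OPENCL"]
    enable_only(opencl, indices)
    # Render the current scene with GPU Compute.
    bpy.context.scene.cycles.device = "GPU"
```

Calling `apply_in_blender()` with the default indices leaves the first two OpenCL entries enabled and disables the others.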

@Nathan Letwory (jesterking) Is there any chance this is still a bug and can be fixed so that it doesn't cause an instant crash when rendering with Cycles? Just thinking for other OpenCL users who might run into this issue in the future.

@Brandon Hix (Acolyte) it could be a bug still yes. We'd need to be able to repro this though.

With the one WX 9100 installed I don't see problems. Running Cycles on it works just fine in the 2.80 official release.

I'll dig a bit more into this on the code side. Maybe sometime this month I can get access to a machine with two AMD GPUs.

Nathan Letwory (jesterking) lowered the priority of this task from Needs Triage by Developer to Waiting for Developer to Reproduce.

@Jeroen Bakker (jbakker) I remembered having dealt with something like this when I had the Intel CPU OpenCL driver installed as well - it has been a few years since I tested that, and opted to remove that driver, since it caused a whole lot of instabilities on my machine. Perhaps something to keep in mind while investigating.