
wrong assumptions about running OpenCL kernels on GPU device
Confirmed, Normal, Public, BUG

Description

System Information
Operating system: Linux Fedora 30
Graphics card: Intel Corporation HD Graphics 630

Blender Version
Broken: reproduced on 2.79b-17 and newer
Worked: (optional)

Short description of error
While looking at https://developer.blender.org/diffusion/B/browse/master/intern/cycles/kernel/split/kernel_holdout_emission_blurring_pathtermination_ao.h$76 I found that the assumption stated there is incorrect according to the OpenCL specification of barrier(): https://www.khronos.org/registry/OpenCL/sdk/1.0/docs/man/xhtml/barrier.html

Exact steps for others to reproduce the error
I know Intel GPUs are not officially supported by Blender, but I'm trying to fix issues reported by users: https://github.com/intel/compute-runtime/issues/178 and https://github.com/intel/compute-runtime/issues/195.
While debugging these issues I found problems in the following OpenCL kernels: kernel_direct_lighting, kernel_holdout_emission_blurring_pathtermination_ao, kernel_buffer_update, kernel_split_branched_indirect_light_init, kernel_next_iteration_setup, kernel_enqueue_inactive and kernel_shader_setup, as well as in the barrier in function enqueue_ray_index_local at https://developer.blender.org/diffusion/B/browse/master/intern/cycles/kernel/kernel_queues.h$87
According to the OpenCL spec, every work-item in a work-group must execute the barrier function, so no work-item may return before reaching it. Some of the kernels mentioned above return early, for example when ray_index == QUEUE_EMPTY_SLOT or ray_index >= queue_index. These early returns cause a GPU hang when the remaining work-items execute the barrier.
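
One possible way to satisfy the spec is to replace each early return with an "active" flag, so that inactive work-items still reach every barrier and only skip the work around it. The sketch below is not the Cycles code: it is plain C that emulates a single work-group phase by phase on the host, where each loop over lid stands for one phase and the boundary between phases is where a kernel would call barrier(CLK_LOCAL_MEM_FENCE). The function name enqueue_active_rays, its parameters, WORK_GROUP_SIZE and the value of QUEUE_EMPTY_SLOT are assumptions made for illustration.

```c
#include <assert.h>

#define QUEUE_EMPTY_SLOT (-1) /* sentinel named in the report; value assumed */
#define WORK_GROUP_SIZE 8     /* hypothetical work-group size */

/* Host-side emulation of one work-group appending its active rays to a queue.
 * Each loop over lid is one phase; in a kernel, the gap between two phases is
 * a barrier that all WORK_GROUP_SIZE work-items must reach. */
int enqueue_active_rays(const int *ray_index, int queue_length,
                        int *out_queue, int *out_queue_size)
{
    int active[WORK_GROUP_SIZE];
    int local_slot[WORK_GROUP_SIZE];
    int local_count = 0;
    int base;
    int lid;

    assert(queue_length >= 0 && queue_length <= WORK_GROUP_SIZE);

    /* Phase 1: record inactivity in a flag instead of returning early.
     * In a kernel, local_count++ would be an atomic increment on a
     * work-group-local counter. */
    for (lid = 0; lid < WORK_GROUP_SIZE; lid++) {
        active[lid] = lid < queue_length &&
                      ray_index[lid] != QUEUE_EMPTY_SLOT;
        local_slot[lid] = active[lid] ? local_count++ : -1;
    }
    /* barrier: every work-item arrives here, active or not */

    /* Phase 2: a single work-item reserves room in the shared queue. */
    base = *out_queue_size;
    *out_queue_size += local_count;
    /* barrier: again reached by every work-item */

    /* Phase 3: inactive work-items are skipped only after the last barrier. */
    for (lid = 0; lid < WORK_GROUP_SIZE; lid++) {
        if (active[lid]) {
            out_queue[base + local_slot[lid]] = ray_index[lid];
        }
    }

    return local_count; /* number of rays appended by this work-group */
}
```

The point of the design is that the checks ray_index == QUEUE_EMPTY_SLOT and ray_index >= queue_index only decide whether a work-item does useful work, never whether it reaches the barrier.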

Event Timeline

The comment you are talking about was introduced in rB7f4479da4. Unfortunately, that was a large commit, so that's probably not very helpful...

Brecht Van Lommel (brecht) changed the subtype of this task from "Report" to "Bug".
Campbell Barton (campbellbarton) changed the task status from Needs Triage to Confirmed. Wed, Feb 12, 8:28 AM