OpenCL kernel build fails on Linux with Mesa drivers (RX 480)
Closed, Resolved, Public

Description

Software Versions

Arch Linux x86_64 4.8.13-1
Mesa OpenCL 13.0.3-1
OCL-ICD 2.2.10-1
Blender 2.78a

Hardware

AMD FX-9590
ASUS Sabertooth 990FX
2x AMD RX 480 4GB, 4th generation GCN, also known as Polaris 10 or Ellesmere. (I've also experienced this with only one card, so I don't think the number of cards is the problem).

Error

  • Open up Blender on Linux with the Mesa open source drivers and an RX 480.
  • Switch the render engine to Cycles.
  • Switch the render device to GPU Compute. (You will need to enable GPU compute in the System tab of user settings beforehand).
  • Hit render. The OpenCL kernel build will fail with the following error:
Compiling base_kernel OpenCL kernel ...
Build flags: -D__NODES_MAX_GROUP__=0 -D__NODES_FEATURES__=0 -D__MAX_CLOSURE__=1 -D__NO_HAIR__ -D__NO_OBJECT_MOTION__ -D__NO_CAMERA_MOTION__ -D__NO_BAKING__ -D__NO_VOLUME__ -D__NO_SUBSURFACE__ -D__NO_BRANCHED_PATH__ -D__NO_PATCH_EVAL__
OpenCL kernel build output:
/usr/share/blender/2.78/scripts/addons/cycles/kernel/kernels/opencl/../../kernel_compat_opencl.h:134:9: warning: 'NULL' macro redefined
/usr/include/clc/clctypes.h:89:9: note: previous definition is here
/usr/share/blender/2.78/scripts/addons/cycles/kernel/kernels/opencl/../../kernel_montecarlo.h:41:14: warning: double precision constant requires cl_khr_fp64, casting to single precision
/usr/share/blender/2.78/scripts/addons/cycles/kernel/kernels/opencl/../../util_math.h:56:26: note: expanded from macro 'M_2PI_F'
/usr/share/blender/2.78/scripts/addons/cycles/kernel/kernels/opencl/../../kernel_montecarlo.h:75:14: warning: double precision constant requires cl_khr_fp64, casting to single precision
/usr/share/blender/2.78/scripts/addons/cycles/kernel/kernels/opencl/../../util_math.h:56:26: note: expanded from macro 'M_2PI_F'
/usr/share/blender/2.78/scripts/addons/cycles/kernel/kernels/opencl/../../kernel_montecarlo.h:92:14: warning: double precision constant requires cl_khr_fp64, casting to single precision
/usr/share/blender/2.78/scripts/addons/cycles/kernel/kernels/opencl/../../util_math.h:56:26: note: expanded from macro 'M_2PI_F'
/usr/share/blender/2.78/scripts/addons/cycles/kernel/kernels/opencl/../../kernel_montecarlo.h:107:14: warning: double precision constant requires cl_khr_fp64, casting to single precision
/usr/share/blender/2.78/scripts/addons/cycles/kernel/kernels/opencl/../../util_math.h:56:26: note: expanded from macro 'M_2PI_F'
/usr/share/blender/2.78/scripts/addons/cycles/kernel/kernels/opencl/../../kernel_projection.h:78:62: warning: double precision constant requires cl_khr_fp64, casting to single precision
/usr/share/blender/2.78/scripts/addons/cycles/kernel/kernels/opencl/../../util_math.h:56:26: note: expanded from macro 'M_2PI_F'
/usr/share/blender/2.78/scripts/addons/cycles/kernel/kernels/opencl/../../kernel_projection.h:83:63: warning: double precision constant requires cl_khr_fp64, casting to single precision
/usr/share/blender/2.78/scripts/addons/cycles/kernel/kernels/opencl/../../util_math.h:56:26: note: expanded from macro 'M_2PI_F'
/usr/share/blender/2.78/scripts/addons/cycles/kernel/kernels/opencl/../../closure/alloc.h:65:5: error: invalid argument type 'ShaderClosure *' (aka 'struct ShaderClosure *') to unary expression
/usr/share/blender/2.78/scripts/addons/cycles/kernel/kernels/opencl/../../closure/../closure/bsdf_microfacet.h:62:21: warning: double precision constant requires cl_khr_fp64, casting to single precision
/usr/share/blender/2.78/scripts/addons/cycles/kernel/kernels/opencl/../../util_math.h:56:26: note: expanded from macro 'M_2PI_F'
/usr/share/blender/2.78/scripts/addons/cycles/kernel/kernels/opencl/../../closure/../closure/bsdf_microfacet.h:145:21: warning: double precision constant requires cl_khr_fp64, casting to single precision
/usr/share/blender/2.78/scripts/addons/cycles/kernel/kernels/opencl/../../util_math.h:56:26: note: expanded from macro 'M_2PI_F'
/usr/share/blender/2.78/scripts/addons/cycles/kernel/kernels/opencl/../../closure/../closure/bsdf_microfacet.h:269:11: error: invalid argument type 'MicrofacetExtra *' (aka 'struct MicrofacetExtra *') to unary expression
/usr/share/blender/2.78/scripts/addons/cycles/kernel/kernels/opencl/../../closure/../closure/bsdf_microfacet.h:269:29: error: invalid argument type 'MicrofacetExtra *' (aka 'struct MicrofacetExtra *') to unary expression
/usr/share/blender/2.78/scripts/addons/cycles/kernel/kernels/opencl/../../closure/../closure/bsdf_microfacet_multi.h:47:21: warning: double precision constant requires cl_khr_fp64, casting to single precision
/usr/share/blender/2.78/scripts/addons/cycles/kernel/kernels/opencl/../../util_math.h:56:26: note: expanded from macro 'M_2PI_F'
/usr/share/blender/2.78/scripts/addons/cycles/kernel/kernels/opencl/../../closure/../closure/bsdf_ashikhmin_shirley.h:136:52: warning: implicit declaration of function 'native_tan' is invalid in C99
/usr/share/blender/2.78/scripts/addons/cycles/kernel/kernels/opencl/../../kernel_compat_opencl.h:116:19: note: expanded from macro 'tanf'
/usr/share/blender/2.78/scripts/addons/cycles/kernel/kernels/opencl/../../closure/../closure/bsdf_ashikhmin_shirley.h:166:10: warning: double precision constant requires cl_khr_fp64, casting to single precision
/usr/share/blender/2.78/scripts/addons/cycles/kernel/kernels/opencl/../../util_math.h:56:26: note: expanded from macro 'M_2PI_F'
/usr/share/blender/2.78/scripts/addons/cycles/kernel/kernels/opencl/../../svm/svm_gradient.h:46:25: warning: double precision constant requires cl_khr_fp64, casting to single precision
/usr/share/blender/2.78/scripts/addons/cycles/kernel/kernels/opencl/../../util_math.h:56:26: note: expanded from macro 'M_2PI_F'
/usr/share/blender/2.78/scripts/addons/cycles/kernel/kernels/opencl/../../svm/svm_closure.h:295:64: warning: double precision constant requires cl_khr_fp64, casting to single precision
/usr/share/blender/2.78/scripts/addons/cycles/kernel/kernels/opencl/../../util_math.h:56:26: note: expanded from macro 'M_2PI_F'
/usr/share/blender/2.78/scripts/addons/cycles/kernel/kernels/opencl/../../svm/svm_wave.h:37:8: warning: double precision constant requires cl_khr_fp64, casting to single precision
/usr/share/blender/2.78/scripts/addons/cycles/kernel/kernels/opencl/../../util_math.h:56:26: note: expanded from macro 'M_2PI_F'
/usr/share/blender/2.78/scripts/addons/cycles/kernel/kernels/opencl/../../svm/svm_sky.h:105:32: warning: double precision constant requires cl_khr_fp64, casting to single precision
/usr/share/blender/2.78/scripts/addons/cycles/kernel/kernels/opencl/../../util_math.h:56:26: note: expanded from macro 'M_2PI_F'

OpenCL build failed: errors in console
Build error: CL_BUILD_PROGRAM_FAILURE
Error: OpenCL build failed: errors in console

I attempted to repair the OpenCL source myself, but not being familiar with compute development, I didn't get very far. Fixing those unary expression errors by inserting bool casts causes a different, more complex error. The output is as before, but at the end of the warnings there is now:

<unknown>:0:0: in function kernel_ocl_shader void (%struct.KernelData addrspace(2)*, <4 x i32> addrspace(1)*, <4 x float> addrspace(1)*, float addrspace(1)*, <4 x float> addrspace(1)*, <4 x float> addrspace(1)*, <4 x float> addrspace(1)*, i32 addrspace(1)*, i32 addrspace(1)*, i32 addrspace(1)*, i32 addrspace(1)*, i32 addrspace(1)*, i32 addrspace(1)*, <4 x float> addrspace(1)*, <4 x float> addrspace(1)*, i32 addrspace(1)*, <4 x float> addrspace(1)*, <4 x i32> addrspace(1)*, i32 addrspace(1)*, <2 x float> addrspace(1)*, <4 x float> addrspace(1)*, <4 x float> addrspace(1)*, i32 addrspace(1)*, <4 x i32> addrspace(1)*, float addrspace(1)*, <4 x float> addrspace(1)*, <4 x i8> addrspace(1)*, <4 x float> addrspace(1)*, <4 x float> addrspace(1)*, <2 x float> addrspace(1)*, <2 x float> addrspace(1)*, <4 x float> addrspace(1)*, <4 x i32> addrspace(1)*, i32 addrspace(1)*, i32 addrspace(1)*, float addrspace(1)*, i32 addrspace(1)*, <4 x i8> addrspace(1)*, <4 x float> addrspace(1)*, i8 addrspace(1)*, float addrspace(1)*, <4 x i32> addrspace(1)*, i32, i32, i32, i32, i32): unsupported call to function get_local_size

Which is beyond my ability to diagnose or repair.

Looking at the errors and the code that produces them leads me to believe there must be a version mismatch at work: either Blender targets a slightly different version of OpenCL, or Mesa does not provide the correct version on Ellesmere. I can't find any conclusive information either way, though.

(A similar build error is also observed when using Luxrender on GPU instead of Cycles).

Details

Type
Bug
Sergey Sharybin (sergey) closed this task as "Archived". Jan 25 2017, 11:56 AM
Sergey Sharybin (sergey) claimed this task.

Thanks for the report, but we simply don't officially support the Mesa OpenCL implementation yet.

Nick Ashley (nashley) added a comment. Edited Feb 9 2017, 2:33 PM

I am also experiencing a similar error with my R9 280X running opencl-mesa 13.0.4-1 and ocl-icd 2.2.10-1 on Arch Linux x86_64 4.9.8-1 (Blender version 2.78a).

"Thanks for the report, but we simply don't officially support the Mesa OpenCL implementation yet."

When, if ever, can we expect Blender's OpenCL builds to be compatible with mesa drivers?

With a little dedication, would an amateur OpenGL developer be able to help fix this?

Ian Bruce (ian_bruce) reopened this task as "Open". Edited Mon, Mar 13, 12:08 AM

This same problem (actually, several problems) has been reported here; it's not distribution- or hardware-specific:

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=848258

Fortunately, all these problems are now solvable.

The "double precision constant requires cl_khr_fp64" compiler warning has just been addressed here:

https://developer.blender.org/rB2d3c44389ab7
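
A common source of this kind of warning is a floating-point literal written without an "f" suffix, which C-family compilers type as double. The sketch below illustrates that general mechanism in plain C. The macro and function names are hypothetical, they are not the definitions in util_math.h, and the example is not taken from the linked commit.

```c
#include <stddef.h>

/* Hypothetical constants for illustration only; these are NOT the
 * actual Cycles definitions of M_2PI_F. */
#define TWO_PI_UNSUFFIXED 6.283185307179586   /* no suffix: type is double */
#define TWO_PI_SUFFIXED   6.283185307179586f  /* 'f' suffix: type is float */

/* Size in bytes of the unsuffixed literal's type (double). */
size_t unsuffixed_constant_size(void)
{
    return sizeof(TWO_PI_UNSUFFIXED);
}

/* Size in bytes of the suffixed literal's type (float). */
size_t suffixed_constant_size(void)
{
    return sizeof(TWO_PI_SUFFIXED);
}
```

On a device without cl_khr_fp64, an OpenCL compiler has to narrow the first form to single precision, which is what the warning in the log reports; the second form is single precision from the start.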

I'm not clear on the origin of the compiler warning "implicit declaration of function {lgamma,native_tan} is invalid in C99"; I haven't been able to find out where these OpenCL functions are supposed to be defined. (Does anybody know?) But it's only a warning, so it may not matter.

There are three actual compile errors of the form "invalid argument type 'X *' to unary expression". This seems to be a problem with using a pointer as a boolean value, under some version of the C language standard; it's usually OK, as far as I know. A patch for this is attached below:

The "unsupported call to function get_local_size" error is a Mesa/LLVM problem. A patch is available upstream; it is recommended that Linux distributions apply it locally:

https://bugs.freedesktop.org/show_bug.cgi?id=99856#c24

Perhaps these changes can be made, and then OpenCL GPU rendering retested, to see if it then works.

<ian_bruce@mail.ru>

Sergey Sharybin (sergey) closed this task as "Resolved". Mon, Mar 13, 10:00 AM

I've applied the patch at rB8794a43, just because I prefer explicit NULL comparisons myself.

Keep in mind that this is proper C syntax, and there is no reason why OpenCL, which is C99-based, would forbid it. In fact, this code works just fine on AMD/Intel/NVIDIA drivers.

Since this is valid syntax, there is a very high likelihood of this issue happening again.

So the preference is to get this fixed in Mesa itself.
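
For readers following along, the three spellings discussed in this thread can be sketched in plain C as below. The struct and function names are hypothetical stand-ins, not Cycles code. In standard C all three behave identically; the thread reports that the first form is the one rejected by the affected compilers when the source is built as OpenCL C.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a kernel struct such as ShaderClosure. */
struct closure {
    float weight;
};

/* Logical NOT applied directly to a pointer: valid C99, and the form
 * the build log flags as "invalid argument type ... to unary expression". */
int is_missing_implicit(const struct closure *c)
{
    return !c;
}

/* The reporter's workaround: convert the pointer to bool first. */
int is_missing_cast(const struct closure *c)
{
    return !(bool)c;
}

/* Explicit NULL comparison, the style the applied patch is described
 * as using. */
int is_missing_explicit(const struct closure *c)
{
    return c == NULL;
}
```

Each function returns 1 for a null pointer and 0 otherwise, so switching between the forms should not change behavior on drivers that already accept the original code.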

Please also note that for such collaboration on an unsupported platform, we expect either ML/IRC or the patch tracker to be used. We do not accept bug reports about platforms we do not officially support.

It turns out that the "invalid argument type 'X *' to unary expression" compile errors are the result of a known bug in LLVM. It is said to be fixed in LLVM-v5; see here:

https://bugs.llvm.org/show_bug.cgi?id=30217

https://reviews.llvm.org/D29038

https://reviews.llvm.org/rL294313

A trivial test case illustrating the bug:

$ cat ptr-expr2.c
int ptr_expr(int *p)
{
    if (!p)
        return(0);
    else
        return(1);
}
$ 
$ clang-3.9 -x cl -emit-llvm -S ptr-expr2.c
ptr-expr2.c:3:9: error: invalid argument type 'int *' to unary expression
    if (!p)
        ^~
1 error generated.
$ 
$ clang-4.0 -x cl -emit-llvm -S ptr-expr2.c
ptr-expr2.c:3:9: error: invalid argument type 'int *' to unary expression
    if (!p)
        ^~
1 error generated.
$ 
$ clang-5.0 -x cl -emit-llvm -S ptr-expr2.c
$ # compilation succeeds

Since LLVM is the cause of the problem, patching completely correct Blender code, as I suggested above, is probably not an appropriate solution.

Instead, distributions should either require LLVM-v5 for OpenCL, or backport the bugfix into their current LLVM version -- it appears to be a single-line change (see link above).

Alternatively, the LLVM maintainers could be made aware that pre-version-5 LLVM is currently unusable for OpenCL, and asked to backport the fix into an earlier version themselves.

<ian_bruce@mail.ru>

The two separate compiler errors involved here (neither is actually a Blender issue) are now the subject of two Debian bug reports. Patches are available upstream for both problems, as mentioned above.

unsupported call to function get_local_size

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=857591

invalid argument type 'X *' to unary expression

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=857623

Other distributions may wish to take note of the results, and resolve these issues in a similar way.

<ian_bruce@mail.ru>

It turns out that these compiler warnings --

implicit declaration of function {lgamma,native_tan} is invalid in C99

occur because the version of libclc (Mesa OpenCL) being used is not recent enough to have implemented those functions; one has only been available for a few weeks. These are the relevant development commits:

https://github.com/llvm-mirror/libclc/commit/07fa4ae82da5fa75af174f30c498ff160bbf8644

https://github.com/llvm-mirror/libclc/commit/a2593ed8adbf7386f88dfc828cfc32f788ec3983

Debian bug report for this issue:

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=857710

This bug, and the two above, are aggregated here, so these problems can be resolved as quickly as possible:

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=857718

All three problems are fixable NOW, as I have described. Other distributions should apply these patches and updates in their own archives.

It appears that if this were done, GPU rendering with OpenCL might very well start working.
People who are interested in this outcome should apply pressure in appropriate places; it's no longer a Blender problem.

<ian_bruce@mail.ru>

@Ian Bruce (ian_bruce), I'm not sure what we can do with that knowledge. We are not driver developers and are not involved in any distro development process. I guess this info would be better directed to distro maintainers, to ensure they ship proper drivers for OpenCL support (Blender is not the only program on the planet which will benefit from that ;)

Yes, that's exactly what I said --

I'm not sure what we can do with that knowledge.

"Since LLVM is the cause of the problem, patching completely correct Blender code, as I suggested above, is probably not an appropriate solution."

We are not driver developers and are not involved in any distro development process.

"The two separate compiler errors involved here (neither is actually a Blender issue) are now the subject of two Debian bug reports."

I guess this info would be better directed to distro maintainers, to ensure they ship proper drivers for OpenCL support

"This bug, and the two above, are aggregated here, so these problems can be resolved as quickly as possible:"

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=857718

"Other distributions should apply these patches and updates in their own archives. It appears that if this were done, GPU rendering with openCL might very well start working. People who are interested in this outcome should apply pressure in appropriate places; it's no longer a Blender problem."

Blender is not the only program on the planet which will benefit from that

I found this discussion when I was searching for information on the same problem; it seems that there are more than fifteen people subscribed to it. Presumably other people will find it as well:

https://www.google.com/search?q=blender+mesa+opencl (second result)

I have now told anybody who cares that "it's no longer a Blender problem", that they should "apply pressure in appropriate places", and that "distributions should apply these patches and updates in their own archives", along with exactly which patches and updates need to be applied to get OpenCL working for this or any other application.

It seems to me that this is all useful information for anybody who cares about the issue, and since the question was asked here, this is where I've answered it. Nobody need ask it again; I've told them exactly what their distribution needs to do to fix the problem, and filed appropriate bug reports for Debian, myself. Now other people can file appropriate bug reports with their distributions, and GPU rendering on AMD hardware can finally be made to work.

<ian_bruce@mail.ru>