Need more info when failed to compile OpenCL code #48842
Reference: blender/blender#48842
When I get a message like this (running blender --verbose 99 --debug-all)
I still have two unanswered questions:
Compiling which OpenCL kernel?
Which errors, in which console?
Knowing those answers, someone (maybe even I) could try to fix this kernel, and possibly its template, and send some patches. At the very least, one thing is badly needed: save the sources of the kernel that failed to compile (a single source if no includes were used) to a temporary directory and print its path.
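The request above (save the failing kernel source to a temporary directory and print its path) could be sketched roughly as follows. This is only an illustration in Python, not Cycles code; the function name `dump_failed_kernel` and the file name prefix are hypothetical.

```python
import os
import sys
import tempfile


def dump_failed_kernel(source, kernel_name="kernel"):
    """Write kernel source to a fresh temp file and return its path.

    Hypothetical helper for illustration; not part of Blender or Cycles.
    """
    # mkstemp creates a uniquely named file in the platform's temp directory,
    # so the sketch does not assume that /tmp exists.
    fd, path = tempfile.mkstemp(prefix=f"failed_{kernel_name}_", suffix=".cl")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(source)
    # Report the location on stderr, next to where build errors would appear.
    print(
        f"OpenCL kernel '{kernel_name}' failed to compile; source saved to {path}",
        file=sys.stderr,
    )
    return path
```

The file is deliberately not deleted afterwards, so it can be recompiled elsewhere or attached to a report.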
Changed status to: 'Open'
Added subscriber: @inferrna
Added subscribers: @Sergey, @ThomasDinges
Try --debug-cycles
@Sergey: --debug-all does not include Cycles, Libmv and Memory. Intentional?
@ThomasDinges, I ran into --debug-all not including Libmv the other day and was actually surprised. I don't remember any discussion about this, and personally I would just enable Cycles and Libmv with --debug-all.
Also agree we should add a note about which exact kernel is being compiled. It should be simple to replace "Compiling OpenCL kernel ..." with "Compiling {Bake,Init,SceneIntersect...} OpenCL kernel ..."
However, it is important to note that in this particular situation we can only add info about which kernel failed to compile. The actual compilation error is reported by OpenCL and is always printed to stderr, regardless of the debug flags set. Nothing is printed here because the driver did not provide any information about what exactly went wrong. That we can't fix.
In pure C I can do a black-magic trick like this:
Gives me output:
I really hope you can do it too now ) A little hint: /tmp/OCL27420T1.cl is created by the driver and will be deleted after compilation. To provide the kernel source to curious users, you need to save it at another path.
This is almost the same as what we do, with the following differences:
First we call clGetProgramBuildInfo with a param_value of NULL in order to get the real size of the buffer needed to retrieve the error. This is needed because the error message might be much longer than a buffer allocated on the stack. Querying the parameter size this way is also legitimate according to the OpenCL spec.
Then we allocate a buffer of the returned param_value_size_ret.
After that we call clGetProgramBuildInfo() with the proper buffer size and param_value to get the actual error message.
See - [x] for some more details in the actual code :)
Can you switch your test application to do a similar thing and see if it gives you a proper error message?
Changed to:
Gives:
That seems to be the proper way. I'm currently on 2.77-a-1459879952-0thomas~xenial0 from this PPA https://launchpad.net/~thomas-schiex/+archive/ubuntu/blender?field.series_filter=xenial (compiled 2016-04-08). Is this code already there?
The code has been there since the beginning of device_opencl.cpp, which makes it weird that Cycles does not print the same error message. I would need to have a closer look.
In the meantime, please attach system-info.txt generated by Help -> Save System Info.
system-info.txt
clinfo.txt
Very strange, but after switching the render resolution to 640x480 I got no more errors, even after switching back to 1920x1080 :-) It may be a strange driver bug, with the error happening without any message(?)
Nevertheless, it would be nice to have the *.cl source file if something goes wrong.
Added subscriber: @brecht
Rather than providing access to the temporary combined .cl source file, I would try to use #line preprocessor directives to point to the original .cl files.
Instead, I'd prefer to see the cooked source. Having it, I can try to compile it another way (with pyopencl, for example) on another platform with other options, or attach it to a bug report (for example, the final source may depend on some render parameters, and it is easier to provide just the .cl source than all the Cycles parameters and options).
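For context, the #line suggestion above could look roughly like the sketch below: while expanding #include statements by hand, emit a #line directive at each file boundary so compiler diagnostics point back to the original file and line. This is an illustration in Python, not how Cycles does it. The `sources` dictionary is a hypothetical in-memory stand-in for the .cl files, and it assumes the OpenCL compiler honors #line the way C99 preprocessors do.

```python
import re

# Matches only the quoted form: #include "file"
INCLUDE_RE = re.compile(r'^\s*#include\s+"([^"]+)"\s*$')


def inline_includes(name, sources):
    """Expand #include lines recursively, tagging file boundaries with #line.

    `sources` maps file names to their text (hypothetical stand-in for disk).
    No include guards or cycle detection: this is a sketch only.
    """
    out = [f'#line 1 "{name}"']
    for lineno, line in enumerate(sources[name].splitlines(), start=1):
        match = INCLUDE_RE.match(line)
        if match:
            # Paste the included file, which announces its own name and line 1.
            out.extend(inline_includes(match.group(1), sources))
            # Resume numbering in the including file at the following line.
            out.append(f'#line {lineno + 1} "{name}"')
        else:
            out.append(line)
    return out
```

With this in place, an error on the second line of the outer file would still be reported as line 2 of that file, even though the combined source is much longer.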
First of all, the OpenCL source does not depend on any settings. It is the compilation parameters that depend on them.
Second of all, you can set the CYCLES_OPENCL_DEBUG environment variable to enable some extra OpenCL debugging, which includes a dump of the exact source being compiled. You can get the same thing by setting Debug Value to 256 and enabling Debug in the OpenCL section of the Debug panel within the Render buttons.
Third of all, I've committed tweaks so that --debug-cycles and --debug-libmv are included in --debug-all. I've also committed a change which prints which kernel is currently being compiled.
Fourth of all, I've tested the OpenCL kernel on NVidia cards here and they happily compiled the code. Additionally, after adding some code which cannot be compiled, I got a proper error message printed to the console.
And finally, I can see you've got two OpenCL devices. Are you sure you're using the same device in the test application? It could be that the CPU backend reports the error properly and the GPU backend does not.
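To illustrate the first point (same source, different compilation parameters), here is a small sketch of how settings could be turned into a build options string while the kernel source stays untouched. The macro names are hypothetical placeholders, not actual Cycles defines; only the "-D name" and "-D name=definition" forms and -cl-fast-relaxed-math are standard OpenCL build options.

```python
def build_options(defines, extra_flags=()):
    """Turn a dict of macro definitions into an OpenCL build options string.

    A value of None yields "-D NAME"; anything else yields "-D NAME=value".
    """
    parts = []
    for name, value in sorted(defines.items()):
        parts.append(f"-D {name}" if value is None else f"-D {name}={value}")
    # Strip stray whitespace so the result never contains empty entries.
    parts.extend(flag.strip() for flag in extra_flags)
    return " ".join(part for part in parts if part)


# Two hypothetical render setups share one source and differ only in options.
simple_scene = build_options({"EXAMPLE_MAX_CLOSURE": 1})
complex_scene = build_options(
    {"EXAMPLE_MAX_CLOSURE": 8, "EXAMPLE_FEATURE_HAIR": None},
    ["-cl-fast-relaxed-math"],
)
```

Under this model, reproducing a failed build elsewhere needs two things: the source dump and the exact options string.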
At first it seemed to be a kind of debug magic.
snodbg.txt - without "--debug"
sdbg.txt - with "--debug"
But then
sdbgmem.txt - finally "--debug-memory" solo
"fully guarded memory allocator." - solved my problem (for now, and if no more hidden surprise errors exist)
You may add "--debug-memory" as a possible solution for cases like this, when no compiler errors are produced.
Yes, I tried both devices.
I stole the amazing err_code function from here: https://github.com/HandsOnOpenCL/Exercises-Solutions/blob/master/Exercises/C_common/err_code.h
With its magic power I'm now able to print the error code right to stderr. Below is what I got:
Saving the file to '/tmp/failed_program.cl' is my own invention. Nothing is wrong with it: it compiled successfully using pyopencl with the flags provided.
It is REALLY weird that any of the debug flags makes a difference here, especially --debug-memory. Does it mean we've got some major memory issues (writing outside of arrays)?
In any case, that's a good point that it's handy to dump the error code to the console as well. We have clewErrorString() for that, though.
Since I cannot replicate this issue, do you mind updating your code to use clewErrorString() and providing a patch? :)
P.S. The build options issue we'll also need to look into, but for that I'll need to have a closer look. Maybe it's just a known double-space issue or so :S
What was done:
You may remove the file-saving part or change it to work on platforms that have no "/tmp/".
base_kernel.cl.xz - I also attached the program source. It's very large (about 2.9 MB) and full of duplicates. For example, do we really need 6 copies of
and other functions?
It is quite possible that I was wrong about it: just now I got a successful render without this option and an unsuccessful one with it.
I've committed a change which prints a human-readable description of the error code when compilation fails.
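As a rough illustration of what a human-readable description of an error code means here, the sketch below maps a few standard OpenCL status codes to their symbolic names. It is not clewErrorString() and not the committed change, only a minimal Python stand-in covering a small subset of codes; the function name `describe_cl_error` is hypothetical.

```python
# A small subset of the status codes defined in the OpenCL headers.
CL_ERROR_NAMES = {
    0: "CL_SUCCESS",
    -1: "CL_DEVICE_NOT_FOUND",
    -2: "CL_DEVICE_NOT_AVAILABLE",
    -3: "CL_COMPILER_NOT_AVAILABLE",
    -4: "CL_MEM_OBJECT_ALLOCATION_FAILURE",
    -5: "CL_OUT_OF_RESOURCES",
    -6: "CL_OUT_OF_HOST_MEMORY",
    -11: "CL_BUILD_PROGRAM_FAILURE",
    -30: "CL_INVALID_VALUE",
    -43: "CL_INVALID_BUILD_OPTIONS",
    -44: "CL_INVALID_PROGRAM",
}


def describe_cl_error(code):
    """Return the symbolic name plus the raw number, e.g. for a log line."""
    name = CL_ERROR_NAMES.get(code, "unknown OpenCL error")
    return f"{name} ({code})"
```

Printing the name next to the number matters most when the driver returns an empty build log, because the status code is then the only hint available.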
Those copies are a result of inlining header files, since OpenCL had difficulties using #include statements. We don't do any real preprocessing when expanding #include statements and leave this to the compiler, which then expands #ifdef blocks and strips unneeded parts of the code.
All this is quite harmless actually.
This is getting really weird. For me compilation is always successful. Are the latest builds from builder.blender.org still flaky for you?
Changed status from 'Open' to: 'Archived'
Ok, so a few things.
First of all, I didn't see a reply in more than 5 days. So, per the tracker policy, I'm archiving this task since it lacks feedback from the reporter.
Second of all, the original issue with the lack of extra error output in OpenCL has already been fixed in git.
Third of all, if there are other issues here to be handled, please make a separate report. That will make them much easier to keep track of.