specific .blend with hair crashes in MacOS 2.78 RC1 on render
Closed, Resolved (Public)

Description

System Information
MacOS El Capitan, Intel CPU, NVIDIA GPU (using CPU rendering)

Blender Version
Broken: 2.78 RC1
Worked: not sure as the file is new

Short description of error
The specific file causes a crash after hitting render

Exact steps for others to reproduce the error
Download this .blend {F355677}
Open it on a Mac running El Capitan
Press render
It crashes here on multiple machines, just after the importance sampling message flashes on screen

Event Timeline

seems like all tube/wires for empathy files are crashing now. Our lab just upgraded macos, somehow this didn't happen on the old version :/

Didn't go through the whole render (takes ages here, even at 10% resolution), but did see some purple hands in tiles, so I guess I cannot reproduce the issue on Linux with latest master, not even with an ASan-enabled debug build…

Sergey Sharybin (sergey) triaged this task as Needs Information from User priority.

Can't reproduce the crash here, but some notes:

  • Please always simplify the file. It's even stated in the bug report guidelines: as simple as possible .blend. It is much easier from your side to isolate what exactly causes the error, so we can deal with the file in full debug mode.
  • Even though the file is new, nothing prevents you from testing it in an older Blender. Please do that.
  • Also check if it's GPU only issue or also happens on CPU.
  • If it's GPU, are you using NVidia or AMD card?

Ok, with help of @Bastien Montagne (mont29) i've learned how to read. It is CPU rendering :)

Sorry for the spam :)

Please also test latest builds from builder.blender.org. There were some crash fixes here and there since RC1.

Hey guys, some extra info:

  • This file does NOT crash on linux, don't worry about testing it there, (other than to confirm that it doesn't crash on linux)
  • I'll try to simplify the file and see what happens, rendering a cube does work on the macs.
  • The File crashes both on buildbot builds and RC1
  • Missing textures do not affect the crash. I didn't pack them to save space, I just packed the .blend links.
  • I'll try to simplify. Actually every single wires for empathy file crashes on render on the macs, so there is something in common there.
  • This did not crash for us on the previous Mac OS version, only on El Capitan, which is 10.6 I believe. Meaning it may not crash on older or newer OS versions. Unfortunately, I can't pick the version as this is running in a uni computer lab.
  • Finally too bad Apple just didn't use the Linux Kernel instead of pig-headedly deciding to roll their own 'special' thing. Imagine how much easier things would be!

I can't reproduce the crash with Blender 2.78 RC1 on OS X 10.11.6. Is there a backtrace?

I'll get one tomorrow when I'm at the uni. as far as I remember it's really short, just a few lines.
(this is from the crash.txt)

Hey folks, so:
OS is 10.11.6 also
2.77a does not crash
I copied the report and pasted it here http://pasteall.org/79074

I've narrowed it down to one asset, our main character. I'm going to see if I can get it down smaller:

  • once again, all you need to do is press 'render', using the standard features, on CPU


Hey folks!
So I narrowed it down to the simplest element.
You'll see there is a single particle system on the single mesh in the file, called 'lashes'
The file crashes on render unless the particle system is deleted. Again, only on macOS. It is a 100% reliable crash here.
Since Brecht couldn't replicate (are you using the release candidate or your own build?) I would like to try bisecting until I find the commit. (2.77a works)
Are the buildbot builds saved somewhere or is only the most recent one available? I haven't built blender on macos so far.

I can reproduce the crash with this latest file, if I disable AVX and AVX2 in the debug panel. Seems to be something in the hair QBVH, don't understand what's happening exactly yet.

Brecht Van Lommel (brecht) raised the priority of this task from Needs Information from User to Confirmed, Medium. Sep 12 2016, 10:41 PM

thanks brecht! trying to use git bisect to pinpoint the commit where it breaks, not sure I'm doing it right ;)

Output of git bisect:

4beae09bae56e3f0116b745099a1226c82572bdd is the first bad commit
commit 4beae09bae56e3f0116b745099a1226c82572bdd
Author: Sergey Sharybin <sergey.vfx@gmail.com>
Date:   Thu Jul 7 12:41:45 2016 +0200

    Cycles: Enable unaligned BVH builder for scenes with hair
    
    This commit enables new unaligned BVH builder and traversal for scenes
    with hair. This happens automatically, no need of manual control over
    this.
    
    There are some possible optimization still to happen here and there,
    but overall there's already nice speedup:
    
                          Master                 Hair BVH
      bunny.blend         8:06.54                 5:57.14
      victor.blend       16:07.44                15:37.35
    
    Unfortunately, such more complexity is not really coming for free,
    so there's some downsides, but those are within acceptable range:
    
                          Master                Hair BVH
      classroom.blend     5:31.79                5:35.11
      barcelona.blend     4:38.58                4:44.51
    
    Memory usage is also somewhat bigger for hairy scenes, but speed
    benefit pays well for that. Additionally as was mentioned in one
    of previous commits we can add an option to disable hair BVH and
    have similar render time but have memory saving.
    
    Reviewers: brecht, dingto, lukasstockner97, juicyfruit, maiself
    
    Differential Revision: https://developer.blender.org/D2086

:040000 040000 f9404985491d0180a02382481c219144cb408ea8 84e2e13e8ef83478e26691ffe2ff54fa2de6b522 M	intern
bassam kurdali (bassamk) renamed this task from specific .blend crashes in MacOS 2.78 RC1 on render to specific .blend with hair crashes in MacOS 2.78 RC1 on render. Sep 13 2016, 2:43 AM

It doesn't crash in debug mode so it's a bit difficult to investigate. Here is the backtrace using RelWithDebInfo. I'd expect it to trigger an assert if we are reading outside of the array bounds but that doesn't happen.

* thread #23: tid = 0x17c55, 0x000000010169ee48 blender`ccl::kernel_branched_path_surface_connect_light(ccl::KernelGlobals*, unsigned int*, ccl::ShaderData*, ccl::ShaderData*, ccl::PathState*, ccl::float3, float, ccl::PathRadiance*, int) [inlined] ccl::__float_as_uint(float) at util_math.h:1173, stop reason = EXC_BAD_ACCESS (code=1, address=0x503f8dda0)
    frame #0: 0x000000010169ee48 blender`ccl::kernel_branched_path_surface_connect_light(ccl::KernelGlobals*, unsigned int*, ccl::ShaderData*, ccl::ShaderData*, ccl::PathState*, ccl::float3, float, ccl::PathRadiance*, int) [inlined] ccl::__float_as_uint(float) at util_math.h:1173
   1170	ccl_device_inline uint __float_as_uint(float f)
   1171	{
   1172		union { uint i; float f; } u;
-> 1173		u.f = f;
   1174		return u.i;
   1175	}
   1176
(lldb) up
frame #1: 0x000000010169ee48 blender`ccl::kernel_branched_path_surface_connect_light(ccl::KernelGlobals*, unsigned int*, ccl::ShaderData*, ccl::ShaderData*, ccl::PathState*, ccl::float3, float, ccl::PathRadiance*, int) [inlined] ccl::QBVH_bvh_intersect_hair(kg=<unavailable>, visibility=128, difl=0, extmax=0) + 308 at qbvh_traversal.h:125
   122
   123 					if(UNLIKELY(node_dist > isect->t)
   124 	#ifdef __VISIBILITY_FLAG__
-> 125 					   || (__float_as_uint(inodes.x) & visibility) == 0)
   126 	#endif
   127 					{
   128 						/* Pop. */
(lldb) bt
* thread #23: tid = 0x17c55, 0x000000010169ee48 blender`ccl::kernel_branched_path_surface_connect_light(ccl::KernelGlobals*, unsigned int*, ccl::ShaderData*, ccl::ShaderData*, ccl::PathState*, ccl::float3, float, ccl::PathRadiance*, int) [inlined] ccl::__float_as_uint(float) at util_math.h:1173, stop reason = EXC_BAD_ACCESS (code=1, address=0x503f8dda0)
    frame #0: 0x000000010169ee48 blender`ccl::kernel_branched_path_surface_connect_light(ccl::KernelGlobals*, unsigned int*, ccl::ShaderData*, ccl::ShaderData*, ccl::PathState*, ccl::float3, float, ccl::PathRadiance*, int) [inlined] ccl::__float_as_uint(float) at util_math.h:1173
  * frame #1: 0x000000010169ee48 blender`ccl::kernel_branched_path_surface_connect_light(ccl::KernelGlobals*, unsigned int*, ccl::ShaderData*, ccl::ShaderData*, ccl::PathState*, ccl::float3, float, ccl::PathRadiance*, int) [inlined] ccl::QBVH_bvh_intersect_hair(kg=<unavailable>, visibility=128, difl=0, extmax=0) + 308 at qbvh_traversal.h:125
    frame #2: 0x000000010169ed14 blender`ccl::kernel_branched_path_surface_connect_light(ccl::KernelGlobals*, unsigned int*, ccl::ShaderData*, ccl::ShaderData*, ccl::PathState*, ccl::float3, float, ccl::PathRadiance*, int) [inlined] ccl::bvh_intersect_hair(kg=<unavailable>, visibility=128, difl=0, extmax=0) at bvh_traversal.h:440
    frame #3: 0x000000010169ed14 blender`ccl::kernel_branched_path_surface_connect_light(ccl::KernelGlobals*, unsigned int*, ccl::ShaderData*, ccl::ShaderData*, ccl::PathState*, ccl::float3, float, ccl::PathRadiance*, int) [inlined] ccl::scene_intersect(kg=<unavailable>, visibility=128, difl=0, extmax=0) at bvh.h:229
    frame #4: 0x000000010169ed14 blender`ccl::kernel_branched_path_surface_connect_light(ccl::KernelGlobals*, unsigned int*, ccl::ShaderData*, ccl::ShaderData*, ccl::PathState*, ccl::float3, float, ccl::PathRadiance*, int) [inlined] ccl::shadow_blocked(kg=<unavailable>, shadow_sd=<unavailable>, state=0x0000700000786b68) at kernel_shadow.h:158
    frame #5: 0x000000010169ed14 blender`ccl::kernel_branched_path_surface_connect_light(kg=0x000070000078d600, rng=0x000070000078d46c, sd=0x0000700000787960, emission_sd=0x0000700000788f20, state=0x0000700000786b68, throughput=(x = 0, y = 0, z = 0.319999963, w = 0), num_samples_adjust=1, L=0x000070000078baa0, sample_all_lights=<unavailable>) + 73508 at kernel_path_surface.h:63
    frame #6: 0x000000010167c9ad blender`ccl::kernel_path_indirect(kg=0x000070000078d600, sd=0x0000700000787960, emission_sd=0x0000700000788f20, rng=0x000070000078d46c, ray=0x0000700000786c60, throughput=<unavailable>, num_samples=<unavailable>, state=0x0000700000786b68, L=0x000070000078baa0) + 348333 at kernel_path.h:380
    frame #7: 0x0000000101708b95 blender`ccl::kernel_branched_path_surface_indirect_light(kg=0x000070000078d600, rng=0x000070000078d46c, sd=0x000070000078a4e0, indirect_sd=0x0000700000787960, emission_sd=0x0000700000788f20, throughput=(x = 1, y = 1, z = 1, w = 0), num_samples_adjust=1, state=0x000070000078bc70, L=0x000070000078baa0) + 4309 at kernel_path_branched.h:114
    frame #8: 0x00000001015a2d03 blender`ccl::kernel_branched_path_integrate(kg=0x000070000078d600, rng=0x000070000078d46c, sample=0, ray=ccl::Ray @ 0x000070000078d190, buffer=0x000000010ea08160) + 264819 at kernel_path_branched.h:548
    frame #9: 0x000000010149d3de blender`ccl::kernel_branched_path_trace(kg=<unavailable>, buffer=0x000000010ea08160, rng_state=<unavailable>, sample=0, x=<unavailable>, y=<unavailable>, offset=-93555, stride=135) + 25006 at kernel_path_branched.h:610
    frame #10: 0x0000000100e1439a blender`ccl::CPUDevice::thread_path_trace(this=0x000000010eb4a000, task=0x000000010ee32410) + 618 at device_cpu.cpp:271

@Brecht Van Lommel (brecht), asserts will not have an effect in RelWithDebInfo builds, so you'll have to replace #define kernel_assert() with something like:

#define kernel_assert(statement) \
  do { \
    if (!(statement)) { \
      printf("Assert failure at %s:%d: %s\n", __FILE__, __LINE__, #statement); \
      abort(); \
    } \
  } while (0)

I can't reproduce the error with either of the CPU flags on my desktop or laptop. Will try to occupy our buildbot iMac and see if I can reproduce the issue.

So far my bet is on a difference in the child intersection mask for a degenerate bounding box or for some NaN/inf ray origin/direction (similar issues happened in the past with the original QBVH traversal). This could overflow the traversal stack.
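
To illustrate the kind of mask difference suspected here, below is a minimal sketch, not the Cycles code. It assumes a hypothetical 4-wide node where each child has an entry and exit distance along the ray; ChildRanges, child_mask_le and child_mask_not_gt are made-up names for illustration. Two comparisons that are equivalent for finite values disagree as soon as one distance is NaN, because every ordered comparison with NaN is false.

```cpp
#include <cassert>
#include <limits>

// Hypothetical 4-wide node: per-child entry/exit distances along the ray.
struct ChildRanges {
  float tnear[4];
  float tfar[4];
};

// Variant A: a child counts as hit when tnear <= tfar.
int child_mask_le(const ChildRanges &r)
{
  int mask = 0;
  for (int i = 0; i < 4; i++) {
    if (r.tnear[i] <= r.tfar[i]) {
      mask |= (1 << i);
    }
  }
  return mask;
}

// Variant B: a child counts as hit unless tnear > tfar.
// Identical to variant A for finite input, different for NaN.
int child_mask_not_gt(const ChildRanges &r)
{
  int mask = 0;
  for (int i = 0; i < 4; i++) {
    if (!(r.tnear[i] > r.tfar[i])) {
      mask |= (1 << i);
    }
  }
  return mask;
}

// Example input where child 2 has a NaN entry distance.
ChildRanges example_with_nan()
{
  const float nan = std::numeric_limits<float>::quiet_NaN();
  ChildRanges r = {{0.0f, 1.0f, nan, 2.0f}, {5.0f, 5.0f, 5.0f, 5.0f}};
  return r;
}

// Usage: variant A skips the NaN child, variant B keeps it.
void demo()
{
  assert(child_mask_le(example_with_nan()) == 11);      // children 0, 1, 3
  assert(child_mask_not_gt(example_with_nan()) == 15);  // all four children
}
```

A traversal that pushes every child in the mask does more pushes under variant B, which is one way two code paths (for example different SIMD variants) could end up with different stack depths for the same scene.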

Ok, managed to reproduce on iMac. And indeed it's an overflow of the traversal stack. Digging deeper.
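
For readers unfamiliar with this failure mode, here is a minimal sketch of a fixed-size traversal stack with an explicit capacity guard. It is an illustration under assumptions, not the actual fix: TraversalStack, kStackSize and push_many are hypothetical names, and the capacity of 4 is chosen only to keep the example small. Without the check in push(), the extra writes would land past the end of the array, which in a release build can go unnoticed until a later read crashes.

```cpp
#include <cassert>

// Hypothetical capacity, kept tiny for the example.
constexpr int kStackSize = 4;

struct TraversalStack {
  int entries[kStackSize];
  int size = 0;

  // Returns false instead of writing past the end of the array.
  bool push(int node)
  {
    if (size >= kStackSize) {
      return false;
    }
    entries[size++] = node;
    return true;
  }

  // Pops the most recently pushed node; the stack must not be empty.
  int pop()
  {
    assert(size > 0);
    return entries[--size];
  }
};

// Pushes node indices 0..count-1 and returns how many the guard rejected.
int push_many(TraversalStack &stack, int count)
{
  int rejected = 0;
  for (int i = 0; i < count; i++) {
    if (!stack.push(i)) {
      rejected++;
    }
  }
  return rejected;
}
```

With a capacity of 4, pushing 6 nodes leaves the stack full and reports 2 rejected pushes, so an unexpected traversal pattern shows up as a countable event rather than as memory corruption.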

Thanks for fixing what seems to have been a tricky bug (and fixing our farm!)