Crash during path tracing, particle system (master) #52635

Closed
opened 2017-09-03 12:57:01 +02:00 by Alan Taylor · 22 comments

macOS 10.12.6, MacBook Air, NVIDIA GeForce 320M 256 MB

worked: 2.78c, 2.79rc2
broken: master, daily build 2017-09-02, 32e36a1782

Rendering the image from the viewport or the command line, or setting the viewport shading to rendered, causes a crash during path tracing.
[test.blend](https://archive.blender.org/developer/F769033/test.blend)
[debug_backtrace.txt](https://archive.blender.org/developer/F769031/debug_backtrace.txt)
[crash_report.txt](https://archive.blender.org/developer/F769032/crash_report.txt)
Author

Changed status to: 'Open'

Author

Added subscriber: @skororu

Added subscriber: @mont29

Please follow our [submission template and guidelines](https://developer.blender.org/maniphest/task/edit/form/1/), also read [these tips about bug reports](https://wiki.blender.org/index.php/Dev:Doc/Process/Bug_Reports), and make a complete, valid bug report, with the required info, a precise description of the issue, precise steps to reproduce it, a **small and simple** .blend and/or other files to do so if needed, etc.

I can't confirm any crash here on Linux. Which kernel are you using for the render: CPU or CUDA? (The backtrace seems to point to CPU, but…)
Author

This is indeed a CPU render. The issue is consistent and repeatable, and has affected the master daily builds for at least a week. I have checked this issue with today's daily build (2017-09-03, 718af8e8b3) and the issue remains.

Instructions to repeat issue:

(1) Render the supplied .blend file via the command line, e.g. `blender -d -b test.blend -f 1`

or

(2) Please open the supplied .blend file in the Blender GUI; to trigger the issue you may:

(2a) Press F12 or click Render -> Render Image
or
(2b) Set Viewport Shading to Rendered

Added subscriber: @brecht

I couldn't repro this on either macOS or Linux, testing with the same 718af8e8b3 build from builder.blender.org.

Can you attach the output of Help > System Info? Rendering from the command line with the `--debug-cycles` option might also give insight.

From the backtrace it's not clear what the issue could be, perhaps some threading issue but then I would expect the crash to be more random. It could be related to the specific CPU model since this is the place we call different kernels depending on what's supported, but I didn't find issues running the various kernels manually with the debug options. Or maybe the backtrace is a bit deceptive and the actual issue is elsewhere.

Author

I've attached the debug info and backtrace from a run with `--debug-cycles`:

[debug_cycles_daily.txt](https://archive.blender.org/developer/F775912/debug_cycles_daily.txt)

I actually can't provide you with the system info from the daily build because that option is missing from the menu (it's there on 2.78c and 2.79rc2) - Is it possible to trigger the system info dump from the Python console? I've attached the system info from 2.79rc2 in case that provides some helpful background information.

To illustrate, the upper screen capture is from 2.79rc2, the lower capture is from the daily build. I downloaded a fresh copy of daily 718af8e8b3 to double check the help menu: it behaved identically.

![out.png](https://archive.blender.org/developer/F775944/out.png)

[system-info-279rc2.txt](https://archive.blender.org/developer/F775911/system-info-279rc2.txt)
Author

This comment was removed by @skororu

Thanks for the detailed info, I managed to repro the crash now.

AVX2 being set to True in the `--debug-cycles` output is not a problem, but it is confusing: it only means that the AVX kernel has not been disabled by a debug option. The CPU capabilities are still checked after that.
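
To make the distinction concrete, here is a minimal sketch of a two-stage kernel selection: a debug flag only says a kernel is allowed, and a separate capability check decides whether it is actually used. All names (`DebugFlags`, `CpuCaps`, `select_kernel`) are hypothetical and chosen for illustration; this is not the Cycles source.

```cpp
// Hypothetical types for illustration only; not taken from Cycles.
struct DebugFlags {
    bool avx2 = true;   // true means "not disabled by a debug option"
    bool sse41 = true;
};

struct CpuCaps {
    bool avx2 = false;  // what the hardware actually supports
    bool sse41 = false;
};

enum class Kernel { Generic, SSE41, AVX2 };

// A kernel is selected only if it is both allowed by the debug flags
// and supported by the CPU; otherwise the next candidate is tried.
inline Kernel select_kernel(const DebugFlags &flags, const CpuCaps &caps)
{
    if (flags.avx2 && caps.avx2) {
        return Kernel::AVX2;
    }
    if (flags.sse41 && caps.sse41) {
        return Kernel::SSE41;
    }
    return Kernel::Generic;
}
```

Under these assumptions, a machine that reports SSE4.1 but not AVX2 ends up on the SSE4.1 path even though the AVX2 flag reads True.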

The missing System Info menu entry is already fixed in master and will be resolved in the next macOS build.

Added subscribers: @MaiLavelle, @Sergey

This seems to be caused by commit dfae3de6bdf5d3e63d34281c840b9e568d0da613. I guess it's the combination of an SSE4.1 kernel and hair (unaligned nodes) that is failing.

I'm not sure which exact case that commit solved: was it CPU / OpenCL / ..? Either way, we have to be careful backporting this to 2.79.

For NaNs in node intersection to work here, we want them to be preserved through `max4()` / `min4()`, so that `tnear <= tfar` fails and we skip the NaN nodes.

When running the following code, we get different results in the SSE4.1 kernel (`nan nan`) and the AVX2 kernel (`0.0 0.0`).

    printf("%f\n", ssef(_mm_max_ps(_mm_set_ps1(0.0f), _mm_set_ps1(0.0f/0.0f)))[0]);
    printf("%f\n", ssef(_mm_max_ps(_mm_set_ps1(0.0f/0.0f), _mm_set_ps1(0.0f)))[0]);

It seems -ffast-math is to blame for that.

This issue was referenced by blender/cycles@3dd201da1b4fb607d38816c7ebdf5288504a37d0

This issue was referenced by ce1f2e271d84f0bb7798c04cb9ca8459f12cee50

@brecht, the initial fix was needed to fix the AVX2 CPU case, where robust intersection was giving false positive results on empty children (multiplying FLT_MAX by difl would make it in).

Ideas:

  • Get rid of fast math; nowadays it seems to cause more issues than it solves.
  • Use finite fast math (I don't remember if we do it on the host or for the kernel as well).
  • Try to fix min/max, but how?
  • Revert the change and use FLT_MAX/100 so we don't cause inf.

@Sergey, I tried some code tweaks but could not get it to work reliably on all architectures.

I think disabling `-ffast-math` is the way to go, see [D2828: Cycles: disable fast math flags, only use a subset.](https://archive.blender.org/developer/D2828)

@brecht, I've committed a fix to the 2.79 release branch. It's not fully ideal, but it is closer to what 2.78 was doing and should fix both Mai's initial bug and this one. However, I couldn't reproduce either bug on my machine, so please give it a test and see if we're good for 2.79.

Changed status from 'Open' to: 'Resolved'

Brecht Van Lommel self-assigned this 2017-09-08 15:11:45 +02:00

I can confirm it solves the crash on macOS. Also, koro and victor still seem to render OK.

@brecht, the change was only made in the release branch; master is still broken. For master I'd prefer to get rid of `-ffast-math`, but we could also apply the same fix there for the time being. Any strong preference from your side?

I'll commit the fast math changes in a minute, so no need to apply the same fix in master I think.

Reference: blender/blender#52635