Non-deterministic geometry statistics when baking (race condition?) #48061
Reference: blender/blender#48061
System Information
Windows 10 x64, 2x Radeon R9 270X
Blender Version
Broken: 2.77
Short description of error
When baking the supplied .blend, the geometry counts (vertices, faces, edges) jump around. Judging from the values, I think the variation comes from three duplicated objects that are assigned to different groups for selective smoke simulation, where the domains and the smoke emitters share the same mesh (not object).
So the setup is: 3 Objects are in 3 distinct groups, but share 1 mesh; the objects are children of a base object, and have different animations.
The problem is: Geometry counts jump between a few hard values, from ~7k vertices (which seems to be the basic scene geometry) to ~80k vertices (which is more than the whole scene should contain).
Additionally, the 3 emitters sometimes do not emit smoke, or have "gaps" when emitting; also, when using the left/right arrow keys to step through the animations, the objects' positions are occasionally not synchronized, meaning an object is lagging behind.
I have a crashdump (Windows) lying around and can submit it when needed.
Other CPU- and RAM-heavy applications run fine, and memtest86 didn't report any errors.
Exact steps for others to reproduce the error
Open the .blend, select a domain, and hit "Update to current frame" in the physics tab. Occasionally Blender also crashes when doing this.
2nd_reality.blend
Changed status to: 'Open'
Added subscriber: @stohrendorf
Added subscriber: @burton
Added subscriber: @Sergey
Jumping frames is not really a bug. Baking could create some run-time geometry for intermediate calculations. You don't see it in the viewport because the interface is entirely locked during the bake.
The crash shouldn't happen, though, but I cannot reproduce it here. Please test with 2.77a and the latest builds from builder.blender.org; there were some fixes in related areas.
If you still get the crash, please try isolating it and making it easier to reproduce.
First results with 2.77a, now relatively reproducible with the above blend:
Only one domain is showing smoke/flames, but all three should. This isn't a display error, as it's also not rendered.
BTW, I'm running the RenderSheep client in parallel, which also renders using 2.77a, and there are no problems as far as I can tell.
I will continue testing nightly builds, and also do some debugging to narrow the problem down.
Added subscriber: @mont29
Actually, here with latest master and an ASan debug build, I got a crash almost immediately on the first bake of the right domain. Here is the traceback (note that I also tried with the -t 1 mono-threaded option and got the same result, but from a non-main thread too…):
From a quick glance at the code, I'd say that in FLUID_3D_STATIC.cpp:241 the array values are overwritten with uninitialized data unless the array has been allocated with a size that contains the (probably virtual) borders.
In this specific case, WTURBULENCE::_resSm is (1,1,1), so the float array passed to FLUID_3D::copyBorderX has a size of 1; res is here (1,1,1), zBegin is 0 and zEnd is 1. This leads to the first index value of index = y * res[x] + z * slabSize = 0 * 1 + 0 * 1 = 0, which results in an out-of-bounds access at field[index] = field[index + 1]. Putting a few if statements around the field accesses resolved the crashes, but there are still some troubles with multiple emitters.
Here, again only one domain shows some smoke, and it also starts in the midst of the simulation:
And in this blend (icetunnel.blend), the right sphere's smoke should be much more like the left one's (which is cloned from the right one, without changes to the smoke physics):
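Returning to the out-of-bounds access in copyBorderX analysed above: the arithmetic can be illustrated with a minimal sketch. This is not Blender's code. The Res struct and the function names are hypothetical stand-ins, and the guard shown is only one possible way of "putting a few if statements around the field accesses".

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the grid resolution (x, y, z); not Blender's Vec3Int.
struct Res { int x, y, z; };

// Index of the first cell in row (y, z), following the formula quoted above.
inline int leftIndex(const Res& res, int y, int z) {
    const int slabSize = res.x * res.y;
    return y * res.x + z * slabSize;
}

// True if reading field[index + 1] for row (y, z) stays inside the array.
inline bool leftCopyInBounds(const Res& res, int y, int z) {
    const int total = res.x * res.y * res.z;
    return leftIndex(res, y, z) + 1 < total;
}

// Guarded left-border copy: rows that have no neighbouring cell are skipped.
// Returns the number of cells that were actually written.
inline int copyLeftBorderGuarded(std::vector<float>& field, const Res& res,
                                 int zBegin, int zEnd) {
    assert(field.size() == static_cast<std::size_t>(res.x) * res.y * res.z);
    int written = 0;
    for (int z = zBegin; z < zEnd; ++z) {
        for (int y = 0; y < res.y; ++y) {
            if (res.x < 2) continue;  // a row of width 1 has nothing to copy from
            const int index = leftIndex(res, y, z);
            field[index] = field[index + 1];
            ++written;
        }
    }
    return written;
}
```

With a resolution of (1,1,1), leftIndex yields 0 and index + 1 equals the array size, so the unguarded copy would read one element past the end; the guarded variant leaves the single cell untouched.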
After doing a larger refactoring of the smoke code (results here, containing the whole blender/intern/smoke folder: smoke.7z), the crashes disappeared.
The reason for the strange smoke behaviour above seems to be that the influence maps aren't properly used, so the smoke's temperature isn't applied to the cells' heat values. In fact, non-zero influence values are calculated, but in apply_inflow_fields, the emission_value parameter is always zero for the right sphere. Deleting the left sphere didn't change anything.
Also (which is also strange), there's no noise applied to the smoke emitted from the right sphere, and the low-resolution smoke isn't visible at all.
I double-checked the emitter smoke settings, and they are all the same for both spheres.
I think I finally found the problem: Smoke is emitted to both the low-res and the high-res maps, but because it only gets successfully emitted in the high-res map and not in the low-res maps due to the low maximum surface distance, it doesn't get processed further, except for dissolving effects. Increasing the domain's resolution or the emitters' maximum surface distance solves the problem.
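The effect described above can be illustrated with a toy model. This is a sketch under simplifying assumptions, not the actual smoke code: the domain is a 1D grid over [0, 1), the emitter is a single point, and a cell receives emission only if its centre lies within the maximum surface distance. The function name cellsReached is hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Counts the cells of a 1D grid over [0, 1) whose centre lies within
// maxDist of a point emitter. Toy model only; names are hypothetical.
inline int cellsReached(int cells, float emitter, float maxDist) {
    assert(cells > 0);
    const float h = 1.0f / static_cast<float>(cells);  // cell size
    int count = 0;
    for (int i = 0; i < cells; ++i) {
        const float centre = (static_cast<float>(i) + 0.5f) * h;
        if (std::fabs(centre - emitter) <= maxDist) {
            ++count;
        }
    }
    return count;
}
```

In this model, a coarse grid of 4 cells with the emitter at 0.5 and a maximum distance of 0.05 reaches no cell, while a grid of 16 cells reaches two. Raising the maximum distance to 0.2 lets the coarse grid receive emission as well, which matches the two workarounds mentioned above.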
Although I presume that the whole smoke simulation code will be revamped by the current GSoC project aiming at the consolidation of fluid and smoke simulation, or at least for 2.8, I'd really like to see a fix for the out-of-bounds accesses in the next release. Still, there's the problem I mentioned in my previous comment, which also looks a bit like memory corruption, because when I use the patched code with the 2nd_reality blend I initially posted, emission of smoke is still non-deterministic and totally changes behaviour when selecting different domains to do a full scene bake (although that could simply be because I messed up the code in some other place).
Added subscriber: @angavrilov
The reason it crashes seems to be that somebody simply forgot to add a check for an empty smoke domain for the highres step. Doing the same check as can be found in the lowres case fixes it:
Now, the more interesting question is why there is so little fire and smoke that the adaptive domain collapses to nothing. It turns out that it may be a bad idea to use the vertex group option to separate emitting and totally non-emitting faces, especially if the low res domain cell size is bigger or comparable to the size of the emitting area on the object. This is because the emitting code first finds the closest face to the center of the domain cell, and only then checks the vertex group. If that face turns out to be non-emitting - tough luck, even if there are emitting faces within range.
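The ordering problem described above can be sketched in a simplified form. This is not the emission code itself: faces are reduced to hypothetical 1D positions with a boolean standing in for the vertex group, and both function names are invented for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <vector>

// Hypothetical, simplified face: a 1D position plus an "emits" flag
// standing in for the vertex-group membership.
struct Face { float pos; bool emits; };

// Order described above: find the closest face first, then check whether
// that face emits. Returns true if the cell receives emission.
inline bool emitsClosestThenFilter(const std::vector<Face>& faces,
                                   float cell, float maxDist) {
    const Face* best = nullptr;
    float bestDist = std::numeric_limits<float>::max();
    for (const Face& f : faces) {
        const float d = std::fabs(f.pos - cell);
        if (d < bestDist) { bestDist = d; best = &f; }
    }
    return best != nullptr && bestDist <= maxDist && best->emits;
}

// Alternative order: ignore non-emitting faces, then test the distance.
inline bool emitsFilterThenClosest(const std::vector<Face>& faces,
                                   float cell, float maxDist) {
    for (const Face& f : faces) {
        if (f.emits && std::fabs(f.pos - cell) <= maxDist) return true;
    }
    return false;
}
```

With a non-emitting face at 0.0, an emitting face at 0.4, a cell centre at 0.1 and a maximum distance of 0.5, the first function reports no emission because the closest face is the non-emitting one, while the second finds the emitting face within range.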
If you separate the emitting faces of those ships into a separate object, suddenly they look like they are on fire and about to explode.
This issue was referenced by d7bd64df5d
Changed status from 'Open' to: 'Resolved'
Many, many thanks for resolving this!