Page MenuHome

Assert/crash during COW phase of depsgraph evaluation due to mismatch mask info in code updating mesh data from multires subdiv ccg runtime data.
Confirmed, NormalPublicTO DO

Description

Found after fixing T84002: Sculpt: Masking operations crash if multires is in play.

BLI_assert(grid_element.mask != NULL) in multires_reshape_assign_final_coords_from_ccg() will fail after undoing from Sculpt mode to Object mode, since in Object mode mesh data does not have paint mask cdlayers yet (it is added when switching to Sculpt mode), while runtime subdiv_ccg coming from sculpt undo step does.

Will commit a temp workaround to make Blender usable, but not sure at all if this is actually valid/expected situation, or if there are more layers of worms in that giant barrel. Need @Sergey Sharybin (sergey)'s enlightenment here, both on depsgraph side and multires side of the question.

Steps to reproduce:

  1. Default startup cube
  2. Add multires, subdivide at least once.
  3. Switch to sculpt mode, draw a sculpt stroke.
  4. Undo twice to object mode

And it asserts:

1   __GI_raise                                                                                                                                                                                                                                                                                                  raise.c                           50   0x7fae6ac53c81 
2   __GI_abort                                                                                                                                                                                                                                                                                                  abort.c                           79   0x7fae6ac3d537 
3   _BLI_assert_abort                                                                                                                                                                                                                                                                                           BLI_assert.c                      50   0x290e6d84     
4   multires_reshape_assign_final_coords_from_ccg                                                                                                                                                                                                                                                               multires_reshape_ccg.c            78   0x1091bc37     
5   multiresModifier_reshapeFromCCG                                                                                                                                                                                                                                                                             multires_reshape.c                158  0x1091613b     
6   object_update_from_subsurf_ccg                                                                                                                                                                                                                                                                              object.c                          1476 0x109bcc4d     
7   BKE_object_free_derived_caches                                                                                                                                                                                                                                                                              object.c                          1557 0x109bd60d     
8   blender::deg::ObjectRuntimeBackup::restore_to_object                                                                                                                                                                                                                                                        deg_eval_runtime_backup_object.cc 112  0x28d41ddc     
9   blender::deg::RuntimeBackup::restore_to_id                                                                                                                                                                                                                                                                  deg_eval_runtime_backup.cc        97   0x28d3584f     
10  blender::deg::deg_update_copy_on_write_datablock                                                                                                                                                                                                                                                            deg_eval_copy_on_write.cc         954  0x28c8f38f     
11  blender::deg::deg_evaluate_copy_on_write                                                                                                                                                                                                                                                                    deg_eval_copy_on_write.cc         1088 0x28c902bf     
12  std::__invoke_impl<void, void ( *&)(Depsgraph *, blender::deg::IDNode const *), Depsgraph *, blender::deg::IDNode *&>                                                                                                                                                                                       invoke.h                          60   0x28d19f6d     
13  std::__invoke<void ( *&)(Depsgraph *, blender::deg::IDNode const *), Depsgraph *, blender::deg::IDNode *&>                                                                                                                                                                                                  invoke.h                          95   0x28d153cf     
14  std::_Bind<void ( *(std::_Placeholder<1>, blender::deg::IDNode *))(Depsgraph *, blender::deg::IDNode const *)>::__call<void, Depsgraph *&&, 0ul, 1ul>(std::tuple<Depsgraph *&&>&&, std::_Index_tuple<0ul, 1ul>)                                                                                             functional                        416  0x28d0cea4     
15  std::_Bind<void ( *(std::_Placeholder<1>, blender::deg::IDNode *))(Depsgraph *, blender::deg::IDNode const *)>::operator()<Depsgraph *, void>(Depsgraph *&&)                                                                                                                                                functional                        499  0x28d05809     
16  std::__invoke_impl<void, std::_Bind<void ( *(std::_Placeholder<1>, blender::deg::IDNode *))(Depsgraph *, blender::deg::IDNode const *)>&, Depsgraph *>(std::__invoke_other, std::_Bind<void ( *(std::_Placeholder<1>, blender::deg::IDNode *))(Depsgraph *, blender::deg::IDNode const *)>&, Depsgraph *&&) invoke.h                          60   0x28cfca8c     
17  std::__invoke_r<void, std::_Bind<void ( *(std::_Placeholder<1>, blender::deg::IDNode *))(Depsgraph *, blender::deg::IDNode const *)>&, Depsgraph *>(std::_Bind<void ( *(std::_Placeholder<1>, blender::deg::IDNode *))(Depsgraph *, blender::deg::IDNode const *)>&, Depsgraph *&&)                         invoke.h                          110  0x28cf1cca     
18  std::_Function_handler<void (Depsgraph *), std::_Bind<void ( *(std::_Placeholder<1>, blender::deg::IDNode *))(Depsgraph *, blender::deg::IDNode const *)>>::_M_invoke(std::_Any_data const&, Depsgraph *&&)                                                                                                 std_function.h                    291  0x28ce9b7c     
19  std::function<void (Depsgraph *)>::operator()(Depsgraph *) const                                                                                                                                                                                                                                            std_function.h                    622  0x28e2c749     
20  blender::deg::(anonymous namespace)::evaluate_node                                                                                                                                                                                                                                                          deg_eval.cc                       113  0x28e27e58     
... <More>

Event Timeline

Bastien Montagne (mont29) changed the task status from Needs Triage to Confirmed.Dec 23 2020, 4:00 PM
Bastien Montagne (mont29) triaged this task as High priority.
Bastien Montagne (mont29) created this task.
Bastien Montagne (mont29) changed the subtype of this task from "Report" to "Bug".

Based on the fact that reasonable amount of time was spent on looking into this issue and into the original report, few thinks which will make development process more efficient:

  • Always explicitly include steps to reproduce the issue. Including steps related on modifying the code to trigger the assert/crash again.
  • Include all findings gathered during investigation. For example, why/where is it exactly something went out of sync.

Doing this will avoid doing a duplicate work on digging into roots of the issue, and often will even help seeing other usecases which might be relevant but not immediately clear for one developer.

I am confused about priority of the report. Ignoring the state of the code, there is no crash/assert. If there is still something crashes for people, the information about it is to be added to the description. If it's purely code-keeping, then I do not think it should be considered a high priority.

Another confusing aspect, is why a workaround is done in the first place. If this is a regression caused by expected changes somewhere else this information is essential to be known. If it's not a recent regression, then it is much better to not hide code which catches invalid usages.


Now, to the actual solution here.

To me the root cause seems to be sculpt mode abusing the undo system:

  • It (sculpt mode) does has local undo stack.
  • Entering sculpt mode can modify object data (via BKE_sculpt_mask_layers_ensure), without creating undo step for the modification.
  • Sculpt undo node always marks SubdivCCG as modified, even when it is returned to a non-modified state.
  • Sculpt mode never updates SubdivCCG other than via DEG_ID_RECALC_GEOMETRY
  • As a result of 3 above, leaving sculpt mode using undo will have SubdivCCG un-intuitively marked as modified, containing both coordinate and mask data, and will try to update object data with modifications from SubdivCCG.

The proper fix would be the sculpt mode to be more sane about the undo stack, and ensure state is always consistent rather than relying on fragile workarounds in other areas of code. This is something I would expect the sculpt team to take care of.

Since the proper fix is likely to take some time, i would consider the committed workaround good enough for the time being. I would, however, update the comment to state actual root of the problem, something like this P1856.

This, of course, is only applicable if there are no known user-level issues with the workaround you've committed.

Bastien Montagne (mont29) lowered the priority of this task from High to Normal.Dec 28 2020, 2:20 PM
Bastien Montagne (mont29) changed the subtype of this task from "Bug" to "To Do".

Committed suggested comment changes.

Also from @Sergey Sharybin (sergey)'s comments this could actually be caused by first Sculpt undo step not being properly undone (which is one of the issues to be addressed by T83806: Undo - Current Status and Important Fixes Needed).

So might be worth waiting for T83806 to be addressed first, before checking again on this issue.

Bastien Montagne (mont29) renamed this task from Assert/crash during COW phase of depsgraph evaluation due to mismtach mask info in code updating mesh data from multires subdiv ccg runtime data. to Assert/crash during COW phase of depsgraph evaluation due to mismatch mask info in code updating mesh data from multires subdiv ccg runtime data..Dec 28 2020, 2:25 PM

@Bastien Montagne (mont29), Seems like this is a task assigned assigned to me, but I'm not sure how can I help here. Is there anything particular you expect me to take care of here?

Woops no sorry, was a left-over from initial own understanding of the issue. no reason for this to be assigned to you anymore indeed.