
Duplicate Object, then Linked Duplicate Object, then Repeat Last Tool causes crash
Closed, Resolved | Public | BUG

Description

System Information
Operating system: Windows-10-10.0.18362-SP0 64 Bits
Graphics card: GeForce RTX 2070 SUPER/PCIe/SSE2 NVIDIA Corporation 4.5.0 NVIDIA 441.20

Blender Version
Broken: version: 2.90.0 Alpha, branch: master, commit date: 2020-06-12 17:01, hash: rBfd8d245e6a80
Worked: 2.83.0

Short description of error
Crash when redoing Duplicate Linked several times in a row

Exact steps for others to reproduce the error

  • On a new scene, with default cube, Shift D (move and confirm),
  • then Alt D (move and confirm)
  • then Shift R R R R ... repeatedly until it crashes

Event Timeline

I can confirm this crash here as well using a slightly newer build even. However, it sometimes doesn't happen during the first shift-d, alt-d, repeat cycle. It usually happens on the second attempt for me.

Also, I cannot get a good stack trace for the issue. Of the 3 times I attempted it, I'm getting crashes in 3 different areas... 2 of those have BLI_task_* functions in the stack somewhere so there's a good chance this is threading related :-/

[EDIT] Actually, my 3 crashes happened when undoing, not strictly when repeating the last operation.

Germano Cavalcante (mano-wii) changed the task status from Needs Triage to Needs Information from User. Jun 16 2020, 11:53 PM

I cannot reproduce this with the current development versions of Blender:

Go to File → Defaults → Load Factory Settings and then load your file to see if you still can reproduce this issue.

If the problem persists, please give us more clear instructions on how to reproduce it from scratch.

Here's a video showing the exact steps.
Using today's daily build from builder.blender.org with factory settings loaded.

Germano Cavalcante (mano-wii) changed the task status from Needs Information from User to Needs Triage. Jun 17 2020, 12:34 AM
Germano Cavalcante (mano-wii) changed the task status from Needs Triage to Confirmed. Jun 18 2020, 3:13 PM
Germano Cavalcante (mano-wii) triaged this task as High priority.
Germano Cavalcante (mano-wii) changed the subtype of this task from "Report" to "Bug".

I can confirm this in Release builds only.
But I'm still not sure what causes the crash.
From the traceback, it seems to be in the Depsgraph.

>	blender.exe!MEM_lockfree_mallocN(unsigned __int64 len, const unsigned char * str) Line 273	C
 	blender.exe!DEG::DepsNodeFactoryImpl<DEG::OperationNode>::create_node(const ID * id, const char * subdata, const char * name) Line 52	C++
 	blender.exe!DEG::ComponentNode::add_operation(const std::function<void __cdecl(Depsgraph *)> & op, DEG::OperationCode opcode, const char * name, int name_tag) Line 180	C++
 	blender.exe!DEG::DepsgraphNodeBuilder::add_operation_node(DEG::ComponentNode * comp_node, DEG::OperationCode opcode, const std::function<void __cdecl(Depsgraph *)> & op, const char * name, int name_tag) Line 215	C++
 	blender.exe!DEG::DepsgraphNodeBuilder::add_operation_node(ID * id, DEG::NodeType comp_type, DEG::OperationCode opcode, const std::function<void __cdecl(Depsgraph *)> & op, const char * name, int name_tag) Line 249	C++
 	blender.exe!DEG::DepsgraphNodeBuilder::build_object_transform(Object * object) Line 828	C++
 	blender.exe!DEG::DepsgraphNodeBuilder::build_object(int base_index, Object * object, DEG::eDepsNode_LinkedState_Type linked_state, bool is_visible) Line 601	C++
 	blender.exe!DEG::DepsgraphNodeBuilder::build_view_layer(Scene * scene, ViewLayer * view_layer, DEG::eDepsNode_LinkedState_Type linked_state) Line 117	C++
 	blender.exe!DEG_graph_build_from_view_layer(Depsgraph * graph, Main * bmain, Scene * scene, ViewLayer * view_layer) Line 256	C++
 	blender.exe!scene_graph_update_tagged(Depsgraph * depsgraph, Main * bmain, bool only_if_tagged) Line 1477	C
 	blender.exe!createTransData(bContext * C, TransInfo * t) Line 1143	C
 	blender.exe!initTransform(bContext * C, TransInfo * t, wmOperator * op, const wmEvent * event, int mode) Line 1915	C
 	blender.exe!transformops_data(bContext * C, wmOperator * op, const wmEvent * event) Line 394	C
 	blender.exe!transform_exec(bContext * C, wmOperator * op) Line 488	C
 	blender.exe!wm_macro_exec(bContext * C, wmOperator * op) Line 335	C
 	blender.exe!wm_operator_exec(bContext * C, wmOperator * op, const bool repeat, const bool store) Line 1002	C
 	blender.exe!WM_operator_repeat_last(bContext * C, wmOperator * op) Line 1084	C
 	blender.exe!repeat_last_exec(bContext * C, wmOperator * UNUSED_op) Line 3690	C
 	blender.exe!wm_operator_invoke(bContext * C, wmOperatorType * ot, wmEvent * event, PointerRNA * properties, ReportList * reports, const bool poll_only, bool use_last_properties) Line 1296	C
 	blender.exe!wm_handler_operator_call(bContext * C, ListBase * handlers, wmEventHandler * handler_base, wmEvent * event, PointerRNA * properties, const unsigned char * kmi_idname) Line 2116	C
 	blender.exe!wm_handlers_do_keymap_with_keymap_handler(bContext * C, wmEvent * event, ListBase * handlers, wmEventHandler_Keymap * handler, wmKeyMap * keymap, const bool do_debug_handler) Line 2426	C
 	blender.exe!wm_handlers_do_intern(bContext * C, wmEvent * event, ListBase * handlers) Line 2727	C
 	blender.exe!wm_handlers_do(bContext * C, wmEvent * event, ListBase * handlers) Line 2949	C
 	blender.exe!wm_event_do_handlers(bContext * C) Line 3395	C
 	blender.exe!WM_main(bContext * C) Line 478	C
 	blender.exe!main(int argc, const unsigned char * * UNUSED_argv_c) Line 534	C
 	[External Code]
Germano Cavalcante (mano-wii) raised the priority of this task from High to Unbreak Now!. Jun 18 2020, 3:16 PM

After further testing, it seems that the source of the problem is in draw_manager.
Commenting out this line seems to have resolved it so far:

diff --git a/source/blender/draw/intern/draw_cache_extract_mesh.c b/source/blender/draw/intern/draw_cache_extract_mesh.c
index f3dc8f0fd2a..37bedd811d3 100644
--- a/source/blender/draw/intern/draw_cache_extract_mesh.c
+++ b/source/blender/draw/intern/draw_cache_extract_mesh.c
@@ -440,7 +440,7 @@ static void mesh_render_data_free(MeshRenderData *mr)
   MEM_SAFE_FREE(mr->loop_normals);
 
   MEM_SAFE_FREE(mr->lverts);
-  MEM_SAFE_FREE(mr->ledges);
+  //MEM_SAFE_FREE(mr->ledges);
 
   MEM_freeN(mr);
 }

@Jeroen Bakker (jbakker), does this have anything to do with recent changes to fix the wires drawing problem?

Germano Cavalcante (mano-wii) changed the task status from Confirmed to Needs Information from User. Jun 18 2020, 9:13 PM
Germano Cavalcante (mano-wii) lowered the priority of this task from Unbreak Now! to Normal.

I can't reproduce the problem anymore.
I think this problem was related to the same problem described in T78004 (which was recently fixed).

A build with the fix will be available tomorrow.
Please check and confirm as soon as available: https://builder.blender.org/download/

Jesse Y (deadpin) added a comment. Edited Jun 18 2020, 10:50 PM

Unfortunately I can now repro more reliably (without having to use Undo; but still involving tbb / threading) with latest master rBb89898cbd381. Here's the full crash report:

Stack trace:
blender.exe         :0x00007FF6387767B0  extract_pos_nor_loop_mesh F:\source\blender-git\blender\source\blender\draw\intern\draw_cache_extract_mesh.c:1597
blender.exe         :0x00007FF63877B7F0  extract_run F:\source\blender-git\blender\source\blender\draw\intern\draw_cache_extract_mesh.c:4708
blender.exe         :0x00007FF63877BE20  extract_single_threaded_task_node_exec F:\source\blender-git\blender\source\blender\draw\intern\draw_cache_extract_mesh.c:4804
tbb.dll             :0x00007FFBE25F51D0  tbb::interface7::internal::isolate_within_arena
blender.exe         :0x00007FF63E9B8BA0  tbb::interface7::internal::isolate_impl<void,<lambda_03c216e6db53fa8f9f63abfae8a04589> const > F:\source\blender-git\lib\win64_vc15\tbb\include\tbb\task_arena.h:160
blender.exe         :0x00007FF63E9BA160  TaskNode::run F:\source\blender-git\blender\source\blender\blenlib\intern\task_graph.cc:98
blender.exe         :0x00007FF63E9B8640  std::_Call_binder<std::_Unforced,0,1,tbb::flow::interface11::continue_msg (__cdecl TaskNode::*)(tbb C:\Program Files (x86)\Microsoft Visual Studio\2019\Enterprise\VC\Tools\MSVC\14.26.28801\include\functional:1415
blender.exe         :0x00007FF63E9B93C0  tbb::flow::interface11::internal::function_body_leaf<tbb::flow::interface11::continue_msg,tbb::flow F:\source\blender-git\lib\win64_vc15\tbb\include\tbb\internal\_flow_graph_body_impl.h:147
blender.exe         :0x00007FF63E9B9B20  tbb::flow::interface11::internal::apply_body_task_bypass<tbb::flow::interface11::internal::continue F:\source\blender-git\lib\win64_vc15\tbb\include\tbb\internal\_flow_graph_body_impl.h:312
tbb.dll             :0x00007FFBE26037A0  tbb::recursive_mutex::scoped_lock::internal_try_acquire
tbb.dll             :0x00007FFBE26037A0  tbb::recursive_mutex::scoped_lock::internal_try_acquire
tbb.dll             :0x00007FFBE25F51D0  tbb::interface7::internal::isolate_within_arena
tbb.dll             :0x00007FFBE25FA490  tbb::task_scheduler_init::terminate
tbb.dll             :0x00007FFBE26019C0  tbb::thread_bound_filter::try_process_item
tbb.dll             :0x00007FFBE26019C0  tbb::thread_bound_filter::try_process_item

Still getting the same crash with same Shift D, Alt D, Shift RRRRRRRRRR steps as before.
Using today's build (Windows) 6899cb3c073e

Germano Cavalcante (mano-wii) changed the task status from Needs Information from User to Needs Triage. Jun 19 2020, 12:40 AM
Germano Cavalcante (mano-wii) changed the task status from Needs Triage to Confirmed. Jun 19 2020, 5:11 AM
Germano Cavalcante (mano-wii) triaged this task as High priority.

It also crashes when right-clicking a collection in the Outliner > Instance to Scene, then pressing Shift D on the newly instanced collection.
However, it does not crash when using Alt D on the instanced collection.

Can someone retest this issue? Several issues were solved in this area and I am not able to reproduce it. I also checked related tasks.

Still reproducible in 2.90.0 Alpha, branch: master, commit date: 2020-07-13 15:08, hash: rB29da019cb353

The condition of the assert from my merged report T78448 still fails. It's BLI_assert(mr->cache->surface_per_mat[0]->elem == ibo) from extract_tris_finish in draw_cache_extract_mesh.c. Its condition fails reliably in release builds whenever the Duplicate Linked (Alt+D) operation is performed. Tested only on Windows.

I was also able to reproduce the crash using the steps from this report, though for me the probability of reproducing it is, and always has been, about 0.001. The crash happened in extract_pos_nor_iter_poly_mesh of draw_cache_extract_mesh.c: vert->pos was NULL in copy_v3_v3(vert->pos, mv->co);.

Stack trace of thread 1 (crashed):

>	[Inline Frame] copy_v3_v3() Line 63	C
 	extract_pos_nor_iter_poly_mesh(mr=0x00000281e5abecc8, params, _data=0x00000281f258efd8) Line 1913	C
 	[Inline Frame] mesh_extract_iter(mr=0x00000281e5abecc8, iter_type=MR_ITER_POLY | MR_ITER_LEDGE | MR_ITER_LVERT, start=0, end=2147483647, extract=0x00007ff74c50f510, user_data=0x00000281f258efd8) Line 5130	C
 	extract_run(taskdata=0x00000281f1f41778) Line 5166	C
 	[Inline Frame] extract_init_and_run() Line 5187	C
 	extract_single_threaded_task_node_exec(task_data) Line 5261	C
 	[External Code]	
 	[Inline Frame] tbb::interface7::internal::isolate_impl() Line 160	C++
 	[Inline Frame] tbb::interface7::this_task_arena::isolate() Line 395	C++
 	TaskNode::run(UNUSED_input={...}) Line 97	C++
 	[Inline Frame] std::_Invoker_pmf_pointer::_Call() Line 146	C++
 	[Inline Frame] std::invoke() Line 146	C++
 	[Inline Frame] std::_Invoker_ret<std::_Unforced,0>::_Call() Line 146	C++
 	[Inline Frame] std::_Call_binder() Line 1858	C++
 	[Inline Frame] std::_Binder<std::_Unforced,tbb::flow::interface11::continue_msg (__cdecl TaskNode::*)(tbb::flow::interface11::continue_msg),TaskNode *,std::_Ph<1> const &>::operator()() Line 1914	C++
 	tbb::flow::interface11::internal::function_body_leaf<tbb::flow::interface11::continue_msg,tbb::flow::interface11::continue_msg,std::_Binder<std::_Unforced,tbb::flow::interface11::continue_msg (__cdecl TaskNode::*)(tbb::flow::interface11::continue_msg),TaskNode *,std::_Ph<1> const &> >::operator()(i) Line 147	C++
 	[Inline Frame] tbb::flow::interface11::internal::continue_input<tbb::flow::interface11::continue_msg,tbb::flow::interface11::internal::Policy<void> >::apply_body_bypass() Line 821	C++
 	tbb::flow::interface11::internal::apply_body_task_bypass<tbb::flow::interface11::internal::continue_input<tbb::flow::interface11::continue_msg,tbb::flow::interface11::internal::Policy<void> >,tbb::flow::interface11::continue_msg>::execute() Line 312	C++
 	[External Code]

Stack trace of thread 2 (in parallel with the crashed one):

> 	[External Code]	
	[Inline Frame] gpu_uniformbuffer_update() Line 203	C
 	GPU_uniformbuffer_update(ubo=0x00000281f1c4a398, data=0x00000281f7132060) Line 210	C
 	workbench_update_material_ubos(UNUSED_wpd=0x00000281e5e67f38) Line 331	C
 	workbench_cache_finish(ved) Line 448	C
 	[Inline Frame] drw_engines_cache_finish() Line 1038	C
 	DRW_draw_render_loop_ex(depsgraph=0x00000281dc02b2b8, engine_type=0x00007ff74e1de6c0, region=0x00000281de0854e8, v3d=0x00000281de0861a8, viewport=0x00000281e5b87078, evil_C=0x00000281db51aef8) Line 1495	C
 	DRW_draw_view(C=0x00000281db51aef8) Line 1405	C
 	[Inline Frame] view3d_draw_view() Line 1608	C
 	view3d_main_region_draw(C=0x00000281db51aef8, region=0x00000281de0854e8) Line 1632	C
 	ED_region_do_draw(C=0x00000281db51aef8, region=0x00000281de0854e8) Line 543	C
 	wm_draw_window_offscreen(C=0x00000281db51aef8, win=0x00000281ddb7a668, stereo) Line 713	C
 	wm_draw_window(C=0x00000281db51aef8, win=0x00000281ddb7a668) Line 841	C
 	wm_draw_update(C=0x00000281db51aef8) Line 1042	C
 	WM_main(C=0x00000281db51aef8) Line 482	C
 	main(argc=1, UNUSED_argv_c=0x0000000000000000) Line 534	C
 	[External Code]

Note: The line numbers for draw_cache_extract_mesh.c in the stack traces may be a few lines off, because I added print statements to check the assert condition in the release build. Nothing else was changed.

I am able to reproduce on linux (release builds) using the repeat last action method. Thanks for the clarification!
I think that the reason why you cannot reproduce this in debug builds is that the threading method is synced to validate if every batch is created. During release mode this isn't the case and the sync is skipped. I will remove this section and see what is going on.
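The debug/release difference described above can be illustrated with a small single-threaded model. This is only a sketch under assumptions: Buffer, Task, TaskQueue, create_requested and simulate are hypothetical names, not Blender code, and the premise that a second user of a shared mesh discards the buffer while an earlier task is still pending is an assumption based on the discussion in this report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; none of these are Blender types. */
typedef struct Buffer {
  int generation; /* Bumped every time the buffer is discarded and re-requested. */
} Buffer;

typedef struct Task {
  Buffer *target;
  int expected_generation; /* Generation the task was scheduled against. */
} Task;

#define MAX_TASKS 16

typedef struct TaskQueue {
  Task tasks[MAX_TASKS];
  size_t len;
  int stale_runs; /* Tasks whose buffer was discarded after they were scheduled. */
} TaskQueue;

/* Run every pending task and count the ones whose target changed under them. */
static void queue_work_and_wait(TaskQueue *q)
{
  for (size_t i = 0; i < q->len; i++) {
    if (q->tasks[i].target->generation != q->tasks[i].expected_generation) {
      q->stale_runs++;
    }
  }
  q->len = 0;
}

/* One object requests batches for a (possibly shared) mesh buffer. */
static void create_requested(TaskQueue *q, Buffer *shared, bool sync_each_object)
{
  shared->generation++; /* Assumed behaviour: discard and re-request the buffer. */
  assert(q->len < MAX_TASKS);
  q->tasks[q->len++] = (Task){shared, shared->generation};
  if (sync_each_object) {
    /* Models the wait that only happens inside the debug-only check. */
    queue_work_and_wait(q);
  }
}

/* Process `users` objects sharing one mesh; return the number of stale task runs. */
static int simulate(int users, bool sync_each_object)
{
  TaskQueue q = {0};
  Buffer shared = {0};
  for (int i = 0; i < users; i++) {
    create_requested(&q, &shared, sync_each_object);
  }
  queue_work_and_wait(&q);
  return q.stale_runs;
}
```

With per-object syncing (the debug-like path), simulate never reports a stale run. Without it, every task except the last one runs against a buffer that was discarded after the task was scheduled, which in the real, threaded code would correspond to writing into freed or NULL memory.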

I have been looking into some alternatives to solve this issue.

Alternatives

Solution 1: Calculate single mesh/ob

diff --git a/source/blender/draw/intern/draw_cache_impl_mesh.c b/source/blender/draw/intern/draw_cache_impl_mesh.c
index e69fb795948..bb79df604c1 100644
--- a/source/blender/draw/intern/draw_cache_impl_mesh.c
+++ b/source/blender/draw/intern/draw_cache_impl_mesh.c
@@ -1536,12 +1536,12 @@ void DRW_mesh_batch_cache_create_requested(struct TaskGraph *task_graph,
                                      scene,
                                      ts,
                                      use_hide);
+  BLI_task_graph_work_and_wait(task_graph);
 #ifdef DEBUG
 check:
   /* Make sure all requested batches have been setup. */
   /* TODO(jbakker): we should move this to the draw_manager but that needs refactoring and
    * additional looping.*/
-  BLI_task_graph_work_and_wait(task_graph);
   for (int i = 0; i < sizeof(cache->batch) / sizeof(void *); i++) {
     BLI_assert(!DRW_batch_requested(((GPUBatch **)&cache->batch)[i], 0));
   }
  • + This fixes the issue,
  • - but has a huge performance penalty (02_020_A.anim.blend: 12.5 fps => 11 fps).
  • - IBOs are thrown away before any use.

Solution 2: Check for validity when extracting the IBO

  • - The code is not clean and could lead to use after free.
  • - IBOs are thrown away before any use.
  • - Can still fail due to threading issues, just less likely.

Solution 3: Split DRW_mesh_batch_cache_create_requested

Split DRW_mesh_batch_cache_create_requested so that it only does what its name says (create requested). This means
that it is only valid to call it once all meshes have been processed. Create a GHash of all meshes and of what needs
to be done for each, based on the given object. Schedule and start the extraction when all meshes have been checked.

NOTE: This could be optimized if we are 100% sure there is only a single user of the object. If that is the case there is only a single object and we know that another object won't discard the IBO/VBO that are being calculated at that moment.
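The two-phase idea of Solution 3 could look roughly like the sketch below. It is not the actual patch: RequestTable, request_add, request_flush and record_extract are hypothetical names, a small linear table stands in for the GHash, and batch requests are modelled as a plain bit mask.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_MESHES 32

/* Hypothetical record of what has to be built for one mesh. */
typedef struct MeshRequest {
  const void *mesh;     /* Identity of the (possibly shared) mesh data. */
  uint32_t batch_flags; /* Union of the batches requested by all users. */
} MeshRequest;

/* Linear table used in place of a GHash to keep the sketch self-contained. */
typedef struct RequestTable {
  MeshRequest items[MAX_MESHES];
  size_t len;
} RequestTable;

/* Phase 1: record what an object needs; nothing is extracted yet. */
static void request_add(RequestTable *t, const void *mesh, uint32_t flags)
{
  for (size_t i = 0; i < t->len; i++) {
    if (t->items[i].mesh == mesh) {
      t->items[i].batch_flags |= flags; /* Merge with earlier users of this mesh. */
      return;
    }
  }
  assert(t->len < MAX_MESHES);
  t->items[t->len++] = (MeshRequest){mesh, flags};
}

/* Phase 2: called once, after all objects were visited. Each mesh is handed to
 * `extract` exactly once. Returns the number of extractions scheduled. */
static size_t request_flush(RequestTable *t,
                            void (*extract)(const void *mesh, uint32_t flags))
{
  size_t scheduled = t->len;
  for (size_t i = 0; i < t->len; i++) {
    extract(t->items[i].mesh, t->items[i].batch_flags);
  }
  t->len = 0;
  return scheduled;
}

/* Demo callback that only counts calls; a real one would schedule a task. */
static size_t g_extract_calls = 0;
static void record_extract(const void *mesh, uint32_t flags)
{
  (void)mesh;
  (void)flags;
  g_extract_calls++;
}
```

Because a linked duplicate shares its mesh with the original, both objects end up in the same table entry, so no second request can discard buffers that an already running extraction is filling.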

Solution 4: Combine Surface and Surface per material batch creation

Solution 3 would still work around the hack in the code-base. Solution 4 works on the idea that surface per material is most likely what users want. So why don't we combine the two batch creations, so that we can remove the hack?

NOTE: This solution could also add two fast paths for meshes with one or two materials.