Page MenuHome

Depsgraph: execute all COPY_ON_WRITE nodes first as a separate stage.
ClosedPublic

Authored by Alexander Gavrilov (angavrilov) on Mon, Dec 3, 8:55 PM.

Details

Summary

COW nodes in the graph are mostly connected via a relation type
that doesn't propagate the update flags. Unfortunately, due to
the scheduling implementation that means the relations don't
actually guarantee execution order for indirect dependencies.
Relations also don't guarantee order in case of cycles.

As mentioned in IRC, the simplest way to fix possible problems
is to execute all COW nodes as a separate execution stage. This
seems to fix crashes with Data Transfer modifier in a cycle.

Diff Detail

Repository
rB Blender

Event Timeline

Would need to have a closer look into exact scheduling, just to be sure dependencies between CoW operations are nicely preserved and such.
For now just quick pass of review.

source/blender/blenlib/BLI_task.h
101–102 ↗(On Diff #12748)

For the clarity of the API i'd prefer have this is a separate call to reset the pool. All those _ex functions are only causing madness in the amount of arguments being passed around.

What we can do, is to wrap all those settings into PoolSettings passed to pool creation/"resetion".

source/blender/depsgraph/intern/eval/deg_eval.cc
76

is_cow_stage.

source/blender/blenlib/BLI_task.h
101–102 ↗(On Diff #12748)

Alternatively, can accept name BLI_task_pool_work_wait_and_reset().

Alexander Gavrilov (angavrilov) marked 3 inline comments as done.Tue, Dec 4, 9:33 AM

The scheduling is really simple: during COW phase it skips actually scheduling any tasks for non-COW nodes, and then at the start of next phase it goes through all nodes again with schedule_graph and schedules everything that is now ready.

source/blender/blenlib/intern/task.c
927 ↗(On Diff #12751)

One thing about this do_work thing and handle_local_queue at the end of BLI_task_pool_work_and_wait: I wonder if there is actually a bug here, because leaving do_work on allows adding stuff to the local queue of the main thread when it's not ready to handle it; and executing any tasks outside the wait loop seems fishy. Without using the _reset function the depsgraph evaluation isn't just slower due to no suspend, it actually breaks and fails to execute some tasks occasionally in the second stage.

Moved do_work = false to main function.

Harbormaster completed remote builds in B2579: Diff 12752.
This revision is now accepted and ready to land.Tue, Dec 4, 11:37 AM
This revision was automatically updated to reflect the committed changes.