Status: 1st milestone almost complete, need final optimizations.
Team
Commissioner: @Brecht Van Lommel (brecht)
Project leader: @Bastien Montagne (mont29)
Project members: @Sebastian Parborg (zeddb)
Big picture: In heavy scenes, new features in Blender 2.8x are making the inherently linear (O(N)), when not quadratic (O(N²)), complexity of most operations over data-blocks more problematic than in the 2.7x era. The general goal is to bring those operations back to constant (O(1)) or logarithmic (O(log(N))) complexity.
Description
Use cases:
Need specific test cases.
- Undo changes in object mode in a heavy scene.
- Undo changes in pose mode in a heavy scene.
- Duplicating objects in a heavy scene.
- Adding many objects in a scene.
Note: There are two types of “heaviness” in scenes (which can of course be combined):
- Scenes that have many objects and/or collections.
- Scenes that have very heavy geometry, either in the geometry data itself, or generated from modifiers like subdivision surface.
Design:
- Dependency graph should not have to re-evaluate objects that did not change.
- This is mainly a problem currently during global undo, as all data-blocks are replaced by new ones on each undo step.
- Handling naming of data-blocks should be O(log(N)) (it is currently close to O(N²), or O(N log(N)) in the best cases).
  - This will require caching currently used names.
  - We will probably also have to handle the numbering in that cache smartly, if we really want to be efficient in the typical “worst case” scenarios (e.g. addition of thousands of objects with the same base name).
- Caches/runtime data helpers should always be either:
  - #A: Lazily built/updated on demand (code affecting the related data should only ensure that the 'dirty' flag is set).
    - This approach can usually be managed in a mostly lock-less way in a threaded context (proper locking is only needed when actually rebuilding a dirty cache).
  - #B: Kept valid/in sync at all times (code affecting the related data takes care of updating the cache itself; code using the cache can always assume it is valid and up to date).
    - Such a cache should be easy to update incrementally (it should almost never have to be rebuilt from scratch).
      - This implies that the cache is highly local (changes to a data-block only ever affect that data-block and a well-defined, easily reachable, small number of “neighbors”).
    - In general, keeping this kind of cache valid will always be harder and more error-prone than approach #A.
    - This approach often needs proper, complete locking (mutexes & co) in a threaded context.
Note: The ViewLayer's collections/objects cache currently uses the worst mix of #A and #B: it is always assumed valid, but updating it requires a complete rebuild from scratch almost all the time.
Note: Approach #B is usually interesting only if a complete rebuild of the cache is very costly, and/or if the cache is invalidated very often while being used.
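The lazy approach (#A) can be sketched as follows. This is only an illustration of the dirty-flag pattern in Python, not Blender code; all names here are hypothetical:

```python
import threading

class LazyCache:
    """Approach #A sketch: a cache rebuilt on demand.

    Writers only set a dirty flag (cheap, effectively lock-less);
    the lock is taken only when a reader finds the cache dirty
    and has to rebuild it.
    """

    def __init__(self, build_func):
        self._build = build_func   # expensive function producing the cache
        self._value = None
        self._dirty = True
        self._lock = threading.Lock()

    def invalidate(self):
        # Called by any code that modifies the underlying data.
        self._dirty = True

    def get(self):
        # Fast path: cache is valid, no locking needed.
        if not self._dirty:
            return self._value
        # Slow path: rebuild under a lock so only one thread does the work.
        with self._lock:
            if self._dirty:  # re-check: another thread may have rebuilt it
                self._value = self._build()
                self._dirty = False
        return self._value


# Usage: a view-layer-like sorted cache over a list of objects.
objects = ["Cube", "Light", "Camera"]
cache = LazyCache(lambda: sorted(objects))
cache.get()          # rebuilt on first access
objects.append("Empty")
cache.invalidate()   # the writer only flags the cache dirty
cache.get()          # rebuilt lazily on the next access
```

Note how writers never touch the lock; contention only happens on the (rare) rebuild, which is what makes #A attractive in threaded code.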
Engineer plan: -
Work plan
Milestone 1 - Optimized per-datablock global undo
Time estimate: 6 months
- Re-use existing data as much as possible, and try to swap newly read (modified) data into the original memory, during a 'memfile' undo step. Updating the depsgraph should then be much less work.
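The idea can be illustrated with a toy model, where an undo step maps ID names to serialized data and only IDs whose data actually differs are replaced and tagged for re-evaluation. This is a hypothetical sketch of the principle, not the actual 'memfile' implementation:

```python
def apply_undo_step(current, step):
    """Apply an undo step, re-using unchanged data-blocks.

    `current` and `step` are dicts mapping ID name -> serialized data.
    Returns (new_state, changed_ids): only `changed_ids` would need
    depsgraph re-evaluation, instead of every data-block.
    """
    new_state = {}
    changed = set()
    for name, data in step.items():
        if name in current and current[name] == data:
            # Unchanged: keep the existing in-memory data untouched.
            new_state[name] = current[name]
        else:
            # Changed (or re-created by the undo): swap in the step's data.
            new_state[name] = data
            changed.add(name)
    # IDs present now but absent from the step are removed by the undo.
    changed |= {name for name in current if name not in step}
    return new_state, changed


# Usage: undoing an edit that only touched "Light".
state = {"Cube": b"v1", "Light": b"v1"}
step = {"Cube": b"v1", "Light": b"v0"}
state, changed = apply_undo_step(state, step)
# Only "Light" needs re-evaluation; "Cube" keeps its evaluated data.
```

In this model the depsgraph update cost scales with the number of *modified* data-blocks per undo step, not with the total scene size.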
Milestone 2 - Lazy collection synchronizations
Time estimate: 1 month
- Fix view layer collection syncing (so that it does not require looping over all objects, or is run much less often...).
Milestone 3 - Data-block management performance with many data-blocks
Time estimate: 5 months
- Investigate how to best handle naming conflict issues.
  - The most obvious idea would be to store all used base names in a GHash (one per ID type), along with the used indices. This needs further design though.
- T73412: Improve name conflict handling in ID management
- Investigate the best way to cache bi-directional relationships info between data-blocks.
- We already have a start of this with BKE_main_relations_create() & co, but it is heavily under-used and needs better design to be usable more widely.
- T73413: Cache data-block relationships info
- Fix/improve handling of the Bone pointers (which are sub-data of the Armature ID) stored in poses (which are sub-data of the Object ID).
  - Ideally such pointers should not exist at all; they are a recurrent source of bugs in Blender.
  - It is not clear how to best deal with them; at the very least we could add some generic ID API to ensure that those kinds of caches are up to date?
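The GHash-of-base-names idea above could look roughly like the following sketch (using a Python dict in place of a GHash, with one cache per ID type assumed). All names are hypothetical, and the linear scan for a free suffix is a simplification; a smarter structure (e.g. tracking free ranges) would be needed for truly logarithmic behavior:

```python
import re
from collections import defaultdict

# Matches Blender-style numbered names such as "Cube.001".
_num_re = re.compile(r"^(.*)\.(\d{3,})$")

class NameCache:
    """Sketch: base name -> set of used numeric suffixes (0 = no suffix).

    Finding a free ".001"-style suffix then only consults this cache,
    instead of scanning every data-block of the ID type.
    """

    def __init__(self):
        self._used = defaultdict(set)

    @staticmethod
    def _split(name):
        m = _num_re.match(name)
        return (m.group(1), int(m.group(2))) if m else (name, 0)

    def unique_name(self, name):
        base, idx = self._split(name)
        used = self._used[base]
        if idx not in used:
            used.add(idx)
            return name
        # Requested name is taken: find the smallest free suffix.
        idx = 1
        while idx in used:
            idx += 1
        used.add(idx)
        return f"{base}.{idx:03d}"


# Usage: adding thousands of objects with the same base name stays cheap,
# since only the per-base suffix set is consulted.
cache = NameCache()
cache.unique_name("Cube")   # "Cube"
cache.unique_name("Cube")   # "Cube.001"
cache.unique_name("Cube")   # "Cube.002"
```

The cache would of course have to be kept in sync with renames and deletions, which is exactly the kind of invalidation question the design section above is about.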
See also T68938: Blender editing performance with many datablocks.
Later
- Dependency graph rebuild to be O(1) or O(log(N)) (it is also O(N) at the moment), though arguably linear complexity would already be decent performance for such code?
Notes: -