Use Copy on Write in more places. #95845
Labels
No Label
Interest
Alembic
Interest
Animation & Rigging
Interest
Asset Browser
Interest
Asset Browser Project Overview
Interest
Audio
Interest
Automated Testing
Interest
Blender Asset Bundle
Interest
BlendFile
Interest
Collada
Interest
Compatibility
Interest
Compositing
Interest
Core
Interest
Cycles
Interest
Dependency Graph
Interest
Development Management
Interest
EEVEE
Interest
EEVEE & Viewport
Interest
Freestyle
Interest
Geometry Nodes
Interest
Grease Pencil
Interest
ID Management
Interest
Images & Movies
Interest
Import Export
Interest
Line Art
Interest
Masking
Interest
Metal
Interest
Modeling
Interest
Modifiers
Interest
Motion Tracking
Interest
Nodes & Physics
Interest
OpenGL
Interest
Overlay
Interest
Overrides
Interest
Performance
Interest
Physics
Interest
Pipeline, Assets & IO
Interest
Platforms, Builds & Tests
Interest
Python API
Interest
Render & Cycles
Interest
Render Pipeline
Interest
Sculpt, Paint & Texture
Interest
Text Editor
Interest
Translations
Interest
Triaging
Interest
Undo
Interest
USD
Interest
User Interface
Interest
UV Editing
Interest
VFX & Video
Interest
Video Sequencer
Interest
Virtual Reality
Interest
Vulkan
Interest
Wayland
Interest
Workbench
Interest: X11
Legacy
Blender 2.8 Project
Legacy
Milestone 1: Basic, Local Asset Browser
Legacy
OpenGL Error
Meta
Good First Issue
Meta
Papercut
Meta
Retrospective
Meta
Security
Module
Animation & Rigging
Module
Core
Module
Development Management
Module
EEVEE & Viewport
Module
Grease Pencil
Module
Modeling
Module
Nodes & Physics
Module
Pipeline, Assets & IO
Module
Platforms, Builds & Tests
Module
Python API
Module
Render & Cycles
Module
Sculpt, Paint & Texture
Module
Triaging
Module
User Interface
Module
VFX & Video
Platform
FreeBSD
Platform
Linux
Platform
macOS
Platform
Windows
Priority
High
Priority
Low
Priority
Normal
Priority
Unbreak Now!
Status
Archived
Status
Confirmed
Status
Duplicate
Status
Needs Info from Developers
Status
Needs Information from User
Status
Needs Triage
Status
Resolved
Type
Bug
Type
Design
Type
Known Issue
Type
Patch
Type
Report
Type
To Do
No Milestone
No project
No Assignees
11 Participants
Notifications
Due Date
No due date set.
Dependencies
No dependencies set.
Reference: blender/blender#95845
Loading…
Reference in New Issue
No description provided.
Delete Branch "%!s(<nil>)"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
The goal is to improve performance and reduce memory consumption by allowing for shared immutable ownership for more pieces of data.
General Idea of Copy-on-Write
This kind of system allows avoiding unnecessary copies at the cost of a more complex ownership model which requires more checks when data is modified. However, it also makes places that modify data more explicit.
Current State
We use copy-on-write in varying degrees in a few places:
CustomData
allows sharing data layers with "original" data blocks (CD_REFERENCE
).GeometrySet
. Only when a shared geometry is modified, a copy is made.BKE_volume_grid_openvdb_for_write
.Proposal
The goal of the proposal is to provide a somewhat general system for using copy-on-write in Blender.
It should be possible to use the same system to share data between e.g. Blender and Cycles.
At the core there is a new simple DNA struct in
DNA_copy_on_write.h
:int user_count;
In theory, just a single integer without a struct around it could work as well, but having the struct makes it more explicit and simplifies documentation.
On top of that struct, there are some utilities for C and C++ to make working with the copy-on-write system more explicit and less error prone. Those utilities are in
BLI_copy_on_write.h
.A work-in-progress version of that proposal is available in D14139.
Usage for Attributes
This section describes how
bCopyOnWrite
could be used to implement sharing attribute layers between different data blocks.Currently, when a generated mesh is copied in geometry nodes, all attributes have to be copied as well. This is wasteful when there is no intention to modify the copied data.
To avoid the copy, we can attach a
bCopyOnWrite
to each attribute layer.Practically, that means that a
bCopyOnWrite *
has to be embedded inCustomDataLayer
. Note, it has to be a pointer, because otherwise differentCustomDataLayer
s couldn't share the same user count.Before attributes can be shared effectively, the custom data api has to be refactored a bit. Every
CustomData_get_*
function has to be split up into a*_for_read
and*_for_write
method. More details are available in T95842.I'm not entirely sure if it is necessary, but in theory Cycles could also take ownership of the attribute layers to avoid copying them. As long as Cycles adds itself as owner to the
bCopyOnWrite
for every attribute, it can be sure that these arrays won't change anymore.It's unclear whether we want to share attribute layers in original data. We could make sure that original data blocks never shares data, however it might be easier to just allow it. For the most part it should just work when the
CustomData
api is used correctly. Maybe a separate copy for each data block is necessary when writing to a .blend file. For undo it might actually be quite useful to share attribute arrays between multiple undo steps. Not sure if duplicated attribute arrays are optimized today already.What I described above is specific to attribute layers, but the same approach can work for all kinds of shared data.
Added subscriber: @JacquesLucke
Changed status from 'Needs Triage' to: 'Confirmed'
Added subscriber: @brecht
This sounds great. It could help gain back some memory that was lost when we stopped using
CD_REFERENCE
to original data in 2.8.User counting with atomics is not free, but I guess it will be worth it here.
This would be really nice at some point.
The way file read/write works, I think you will automatically get a separate copy per datablock. These kind of data arrays and their pointer mapping are per datablock.
There is delta compression between undo steps, so for an unchanging attribute in a single datablock there should only a single copy in undo memory. But with this you could optimize it so there is no copy at all.
Added subscriber: @Garek
Added subscriber: @HooglyBoogly
Added subscriber: @jmztn
I also think this would be a great improvement.
I wonder if ideally, the CoW user count and the data would be allocated together, to avoid the need to de-reference two pointers for each access. I know it's really small overhead at that point, but we'll probably end up using this quite a lot, so the overhead might add up?
In practice it's probably not worth it, especially for
CustomData
, but just a thought.The convenience pointers like
Mesh.mvert
are a bit related here. I guess we would still have to call some function like "ensure writeable" before changing the data they point to.In the long term we could get rid of them and use a method like
CurvesGeometry
with apositions()
method that handles accessing the attribute layers for you.In practice it's probably not worth it, especially for CustomData, but just a thought.
There are cases where the data can be reasonably allocated together with the data. See e.g. how
bCopyOnWrite
is embedded inGeometryComponent
in D14139. However, this does not work in general. It's well possible that some lower level utility or data structure allocates and manages the memory, and only later it becomes managed by the copy-on-write system. We could add a bool tobCopyOnWrite
which indicates whether it is part of the same allocation as the referenced data or not, but I'm not convinced that this is worth the additional complexity. In case you didn't know, this would be a bit similar tostd::shared_ptr
actually (in that the reference count can be stored in the same or separate allocation depending on how it's created).What might be more worth it, is to use the copy-on-write system only for sufficiently large resources. E.g. having a
bCopyOnWrite *
that isnullptr
could indicate that one is the single owner of the resource. Only when we start sharing the data, thebCopyOnWrite
could be allocated.As a side note, we probably should allocate
bCopyOnWrite
with an alignment of 64 bytes to avoid overhead due to false sharing.That overhead I'm not worried about at all, because it only happens when the data is "set up" for an actual computation.
That sounds reasonable.
Definitely not free, but (1) it shouldn't be done in hot loops, (2) is probably cheaper than copying the data and (3) if we find that for some specific piece of data it is cheaper to copy it, we can still do that.
Added subscriber: @mont29
This looks like a great improvement to me as well.
I would never, ever allow that. We already suffer (a lot!) from the very few places where some internal IDs sub-data are shared with others (e.g. the
Armature
bones referenced by theObject
pose data), I do not want to see other cases like that added without an extremely good reason to break the 'isolation' between IDs. Sharing data by addresses between IDs essentially makes the whole ID management code a nightmare.I would be very surprised is cost of atomic user count was ever more expansive than useless data copying indeed, so not worried here either.
Added subscriber: @GeorgiaPacific
I get that normally, and agree. But I wonder if this case is different, since the user-counting is done automatically.
The problem seems to be that the data referenced from another ID is owned by the other ID.
With these copy-on-write cases, the data is owned by both IDs, completely transparently.
Maybe there's some aspect I'm missing. I would bet there's a lot of cases where sharing layers between original IDs would
help a lot. i.e. duplicate a large sculpt with a bunch of painted colors, do some deformation to test something out, etc.
Added subscriber: @hzuika
Added subscriber: @Sergey
Added subscriber: @LukasTonne
Added subscriber: @EAW
Added subscriber: @AlexeyAdamitsky
The main part of this proposal is implemented now. See
7eee378ecc
.