Use Copy on Write in more places. #95845

Closed
opened 2022-02-17 17:23:18 +01:00 by Jacques Lucke · 19 comments
Member

The goal is to improve performance and reduce memory consumption by allowing for shared immutable ownership for more pieces of data.

General Idea of Copy-on-Write

  • A piece of data can be owned by an arbitrary number of owners that don't have to know about each other.
  • If a piece of data has more than one owner, it is immutable for all owners.
  • If a piece of data has a single owner, it is mutable.
  • If a piece of data has zero owners, it has to be freed.
  • If an owner wants to modify the data, it first has to make sure that the data only has a single owner. If the data was shared, a single-owner copy has to be made.

This kind of system allows avoiding unnecessary copies at the cost of a more complex ownership model which requires more checks when data is modified. However, it also makes places that modify data more explicit.
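As a minimal sketch of these rules (with made-up names purely for illustration, not the proposed Blender API; a real implementation would also use atomic operations for thread safety), write access could look like this:

```cpp
#include <algorithm>

/* Illustration only: a shared array with a plain (non-atomic) user count. */
struct SharedArray {
  int user_count = 1; /* Created with a single owner. */
  float *values = nullptr;
  int size = 0;
};

/* To modify the data, an owner first has to become its only owner.
 * If the data is currently shared, a single-owner copy is made and the
 * reference to the shared data is released. */
SharedArray *ensure_single_owner(SharedArray *shared)
{
  if (shared->user_count == 1) {
    return shared; /* Already the only owner: mutable in place. */
  }
  SharedArray *copy = new SharedArray();
  copy->size = shared->size;
  copy->values = new float[shared->size];
  std::copy_n(shared->values, shared->size, copy->values);
  shared->user_count--; /* Drop this owner's reference to the shared data. */
  return copy;
}
```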

Current State

We use copy-on-write in varying degrees in a few places:

  • Depsgraph copies the data blocks that it intends to modify.
  • CustomData allows sharing data layers with "original" data blocks (CD_REFERENCE).
    • Original data block is the only owner.
    • This is more like an immutable borrow.
  • Geometry components can be owned by multiple GeometrySets. A copy is only made when a shared geometry is modified.
  • Volume grids are only copied when write access is requested with BKE_volume_grid_openvdb_for_write.

Proposal

The goal of the proposal is to provide a somewhat general system for using copy-on-write in Blender.
It should be possible to use the same system to share data between e.g. Blender and Cycles.

At the core there is a new simple DNA struct in DNA_copy_on_write.h:

```c
typedef struct bCopyOnWrite {
  int user_count;
} bCopyOnWrite;
```

In theory, just a single integer without a struct around it could work as well, but having the struct makes it more explicit and simplifies documentation.

On top of that struct, there are some utilities for C and C++ to make working with the copy-on-write system more explicit and less error prone. Those utilities are in BLI_copy_on_write.h.
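Roughly, such utilities could look like the following sketch. The function names are illustrative only, not the actual API from D14139, and a real implementation would use Blender's atomic_ops for the user count instead of plain increments:

```cpp
/* Sketch of possible helpers; names are illustrative, not the D14139 API. */

/* As proposed in DNA_copy_on_write.h above. */
typedef struct bCopyOnWrite {
  int user_count;
} bCopyOnWrite;

inline void cow_init(bCopyOnWrite *cow)
{
  cow->user_count = 1; /* The creator is the first owner. */
}

inline void cow_user_add(bCopyOnWrite *cow)
{
  cow->user_count++; /* Another owner starts sharing the data. */
}

/* Returns true when the caller removed the last owner and is now
 * responsible for freeing the shared data. */
inline bool cow_user_remove(bCopyOnWrite *cow)
{
  return --cow->user_count == 0;
}

/* Data is only mutable while it has exactly one owner. */
inline bool cow_is_mutable(const bCopyOnWrite *cow)
{
  return cow->user_count == 1;
}
```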

A work-in-progress version of that proposal is available in D14139.

Usage for Attributes

This section describes how bCopyOnWrite could be used to implement sharing attribute layers between different data blocks.

Currently, when a generated mesh is copied in geometry nodes, all attributes have to be copied as well. This is wasteful when there is no intention to modify the copied data.
To avoid the copy, we can attach a bCopyOnWrite to each attribute layer.
Practically, that means that a bCopyOnWrite * has to be embedded in CustomDataLayer. Note that it has to be a pointer, because otherwise different CustomDataLayers couldn't share the same user count.

Before attributes can be shared effectively, the CustomData API has to be refactored a bit. Every CustomData_get_* function has to be split into a *_for_read and a *_for_write variant. More details are available in T95842.
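To illustrate that split, using the illustrative cow_* helpers sketched above (the embedded cow pointer, the stand-in struct and the function names are assumptions for this sketch, not the API designed in T95842):

```cpp
#include "MEM_guardedalloc.h" /* For MEM_dupallocN. */

/* CustomDataLayerSketch stands in for CustomDataLayer with the proposed
 * embedded bCopyOnWrite pointer. */
struct CustomDataLayerSketch {
  void *data = nullptr;        /* The attribute array. */
  bCopyOnWrite *cow = nullptr; /* Shared user count, as proposed above. */
};

/* Read access never needs a copy: shared layer data can be returned as-is. */
const void *layer_data_for_read(const CustomDataLayerSketch *layer)
{
  return layer->data;
}

/* Write access first makes sure the layer data has a single owner. */
void *layer_data_for_write(CustomDataLayerSketch *layer)
{
  if (layer->cow != nullptr && !cow_is_mutable(layer->cow)) {
    /* The data is shared with another owner: switch to a private copy. */
    void *copy = MEM_dupallocN(layer->data);
    cow_user_remove(layer->cow);
    layer->data = copy;
    layer->cow = nullptr; /* Single owner again, no sharing bookkeeping needed. */
  }
  return layer->data;
}
```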

I'm not entirely sure if it is necessary, but in theory Cycles could also take ownership of the attribute layers to avoid copying them. As long as Cycles adds itself as owner to the bCopyOnWrite for every attribute, it can be sure that these arrays won't change anymore.

It's unclear whether we want to share attribute layers in original data. We could make sure that original data blocks never share data; however, it might be easier to just allow it. For the most part it should just work when the CustomData API is used correctly. Maybe a separate copy for each data block is necessary when writing to a .blend file. For undo it might actually be quite useful to share attribute arrays between multiple undo steps. I'm not sure whether duplicated attribute arrays are already optimized today.

What I described above is specific to attribute layers, but the same approach can work for all kinds of shared data.

Author
Member

Added subscriber: @JacquesLucke

Author
Member

Changed status from 'Needs Triage' to: 'Confirmed'

Added subscriber: @brecht

This sounds great. It could help gain back some memory that was lost when we stopped using CD_REFERENCE to point to original data in 2.8.

User counting with atomics is not free, but I guess it will be worth it here.

I'm not entirely sure if it is necessary, but in theory Cycles could also take ownership of the attribute layers to avoid copying them. As long as Cycles adds itself as owner to the bCopyOnWrite for every attribute, it can be sure that these arrays won't change anymore.

This would be really nice at some point.

Maybe a separate copy for each data block is necessary when writing to a .blend file.

The way file read/write works, I think you will automatically get a separate copy per datablock. These kinds of data arrays and their pointer mapping are per datablock.

For undo it might actually be quite useful to share attribute arrays between multiple undo steps. Not sure if duplicated attribute arrays are optimized today already.

There is delta compression between undo steps, so for an unchanging attribute in a single datablock there should only be a single copy in undo memory. But with this you could optimize it so there is no copy at all.

Added subscriber: @Garek

Member

Added subscriber: @HooglyBoogly

Contributor

Added subscriber: @jmztn

Member

I also think this would be a great improvement.

I wonder if ideally, the CoW user count and the data would be allocated together, to avoid the need to de-reference two pointers for each access. I know it's really small overhead at that point, but we'll probably end up using this quite a lot, so the overhead might add up?
In practice it's probably not worth it, especially for CustomData, but just a thought.

The convenience pointers like Mesh.mvert are a bit related here. I guess we would still have to call some function like "ensure writeable" before changing the data they point to.
In the long term we could get rid of them and use a method like CurvesGeometry with a positions() method that handles accessing the attribute layers for you.
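For reference, that accessor pattern could look roughly like this (an interface sketch, not the actual CurvesGeometry code, using blender::Span just for illustration):

```cpp
#include "BLI_math_vec_types.hh" /* For blender::float3. */
#include "BLI_span.hh"

/* Interface sketch of the accessor pattern; not the real CurvesGeometry. */
class CurvesGeometrySketch {
 public:
  /* Read-only access never triggers a copy of a shared positions layer. */
  blender::Span<blender::float3> positions() const;

  /* The single place that performs the "ensure writable" step (making the
   * layer single-owner) before handing out mutable data. */
  blender::MutableSpan<blender::float3> positions_for_write();
};

/* Callers never deal with copy-on-write directly. */
void translate(CurvesGeometrySketch &curves, const blender::float3 &offset)
{
  for (blender::float3 &position : curves.positions_for_write()) {
    position += offset;
  }
}
```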

Author
Member

I wonder if ideally, the CoW user count and the data would be allocated together, to avoid the need to de-reference two pointers for each access. I know it's really small overhead at that point, but we'll probably end up using this quite a lot, so the overhead might add up?

In practice it's probably not worth it, especially for CustomData, but just a thought.

There are cases where the user count can reasonably be allocated together with the data. See e.g. how bCopyOnWrite is embedded in GeometryComponent in D14139. However, this does not work in general. It's quite possible that some lower-level utility or data structure allocates and manages the memory, and only later it becomes managed by the copy-on-write system. We could add a bool to bCopyOnWrite which indicates whether it is part of the same allocation as the referenced data or not, but I'm not convinced that this is worth the additional complexity. In case you didn't know, this would actually be a bit similar to std::shared_ptr (in that the reference count can be stored in the same or a separate allocation, depending on how it's created).
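For comparison, this is the std::shared_ptr behavior mentioned above:

```cpp
#include <memory>

struct BigArray {
  float values[1000];
};

void shared_ptr_allocation_comparison()
{
  /* Reference count (control block) and object live in one allocation. */
  std::shared_ptr<BigArray> combined = std::make_shared<BigArray>();

  /* Reference count is allocated separately from the existing object. */
  std::shared_ptr<BigArray> separate(new BigArray());
}
```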

What might be more worthwhile is to use the copy-on-write system only for sufficiently large resources. E.g. a bCopyOnWrite * that is nullptr could indicate that one is the single owner of the resource. Only when we start sharing the data would the bCopyOnWrite be allocated.

As a side note, we probably should allocate bCopyOnWrite with an alignment of 64 bytes to avoid overhead due to false sharing.
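A sketch combining both ideas, lazy allocation of the user count and cache-line alignment (the struct and helper names here are made up for illustration, and real code would use an atomic increment):

```cpp
#include "MEM_guardedalloc.h" /* For MEM_mallocN_aligned. */

/* Sketch only. */
struct SharedResourceSketch {
  void *data = nullptr;
  /* nullptr means the resource has exactly one owner and is mutable. */
  bCopyOnWrite *cow = nullptr;
};

/* Called when a second owner wants to start sharing the resource. */
void share_resource(SharedResourceSketch &resource)
{
  if (resource.cow == nullptr) {
    /* The resource is shared for the first time: allocate the user count now,
     * aligned to a cache line to avoid false sharing between threads. */
    resource.cow = static_cast<bCopyOnWrite *>(
        MEM_mallocN_aligned(sizeof(bCopyOnWrite), 64, __func__));
    resource.cow->user_count = 1;
  }
  resource.cow->user_count++; /* The new owner is added. */
}
```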

to avoid the need to de-reference two pointers for each access

That overhead I'm not worried about at all, because it only happens when the data is "set up" for an actual computation.

The convenience pointers like Mesh.mvert are a bit related here. I guess we would still have to call some function like "ensure writeable" before changing the data they point to.
In the long term we could get rid of them and use a method like CurvesGeometry with a positions() method that handles accessing the attribute layers for you.

That sounds reasonable.

User counting with atomics is not free, but I guess it will be worth it here.

Definitely not free, but (1) it shouldn't be done in hot loops, (2) is probably cheaper than copying the data and (3) if we find that for some specific piece of data it is cheaper to copy it, we can still do that.

Added subscriber: @mont29

This looks like a great improvement to me as well.

It's unclear whether we want to share attribute layers in original data. We could make sure that original data blocks never shares data, however it might be easier to just allow it.

I would never, ever allow that. We already suffer (a lot!) from the very few places where some internal ID sub-data is shared with others (e.g. the Armature bones referenced by the Object pose data). I do not want to see other cases like that added without an extremely good reason to break the 'isolation' between IDs. Sharing data by addresses between IDs essentially makes the whole ID management code a nightmare.

User counting with atomics is not free, but I guess it will be worth it here.

Definitely not free, but (1) it shouldn't be done in hot loops, (2) is probably cheaper than copying the data and (3) if we find that for some specific piece of data it is cheaper to copy it, we can still do that.

I would be very surprised if the cost of atomic user counting was ever more expensive than useless data copying, so I'm not worried here either.

Added subscriber: @GeorgiaPacific

Member

Sharing data by addresses between IDs essentially makes the whole ID management code a nightmare.

I get that normally, and agree. But I wonder if this case is different, since the user-counting is done automatically.
The problem seems to be that the data referenced from another ID is owned by the other ID.
With these copy-on-write cases, the data is owned by both IDs, completely transparently.

Maybe there's some aspect I'm missing. I would bet there are a lot of cases where sharing layers between original IDs would help a lot, e.g. duplicate a large sculpt with a bunch of painted colors, do some deformation to test something out, etc.

Added subscriber: @hzuika

Author
Member

Added subscriber: @Sergey

Member

Added subscriber: @LukasTonne

Member

Added subscriber: @EAW

Added subscriber: @AlexeyAdamitsky

Bastien Montagne added this to the Core project 2023-02-09 15:42:54 +01:00
Bastien Montagne removed this from the Core project 2023-02-09 18:20:36 +01:00
Author
Member

The main part of this proposal is implemented now. See 7eee378ecc.

Blender Bot added 'Status: Archived' and removed 'Status: Confirmed' labels 2023-04-13 14:59:16 +02:00