Use Copy on Write in more places. #95845

Closed
opened 2022-02-17 17:23:18 +01:00 by Jacques Lucke · 19 comments
Member

The goal is to improve performance and reduce memory consumption by allowing for shared immutable ownership for more pieces of data.

General Idea of Copy-on-Write

  • A piece of data can be owned by an arbitrary number of owners that don't have to know about each other.
  • If a piece of data has more than one owner, it is immutable for all owners.
  • If a piece of data has a single owner, it is mutable.
  • If a piece of data has zero owners, it has to be freed.
  • If an owner wants to modify the data, it first has to make sure that the data only has a single owner. If the data was shared, a single-owner copy has to be made.

This kind of system allows avoiding unnecessary copies at the cost of a more complex ownership model which requires more checks when data is modified. However, it also makes places that modify data more explicit.
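As a minimal sketch of these rules (with made-up names purely for illustration, not the proposed Blender API; a real implementation would also use atomic operations for thread safety), write access could look like this:

```cpp
#include <algorithm>

/* Illustration only: a shared array with a plain (non-atomic) user count. */
struct SharedArray {
  int user_count = 1; /* Created with a single owner. */
  float *values = nullptr;
  int size = 0;
};

/* To modify the data, an owner first has to become its only owner.
 * If the data is currently shared, a single-owner copy is made and the
 * reference to the shared data is released. */
SharedArray *ensure_single_owner(SharedArray *shared)
{
  if (shared->user_count == 1) {
    return shared; /* Already the only owner: mutable in place. */
  }
  SharedArray *copy = new SharedArray();
  copy->size = shared->size;
  copy->values = new float[shared->size];
  std::copy_n(shared->values, shared->size, copy->values);
  shared->user_count--; /* Drop this owner's reference to the shared data. */
  return copy;
}
```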

Current State

We use copy-on-write in varying degrees in a few places:

  • Depsgraph copies the data blocks that it intends to modify.
  • CustomData allows sharing data layers with "original" data blocks (CD_REFERENCE).
    • Original data block is the only owner.
    • This is more like an immutable borrow.
  • Geometry components can be owned by multiple GeometrySets. A copy is only made when a shared geometry is modified.
  • Volume grids are only copied when write access is requested with BKE_volume_grid_openvdb_for_write.

Proposal

The goal of the proposal is to provide a somewhat general system for using copy-on-write in Blender.
It should be possible to use the same system to share data between e.g. Blender and Cycles.

At the core there is a new simple DNA struct in DNA_copy_on_write.h:

```c
typedef struct bCopyOnWrite {
  int user_count;
} bCopyOnWrite;
```

In theory, just a single integer without a struct around it could work as well, but having the struct makes it more explicit and simplifies documentation.

On top of that struct, there are some utilities for C and C++ to make working with the copy-on-write system more explicit and less error prone. Those utilities are in BLI_copy_on_write.h.
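Roughly, such utilities could look like the following sketch. The function names are illustrative only, not the actual API from D14139, and a real implementation would use Blender's atomic_ops for the user count instead of plain increments:

```cpp
/* Sketch of possible helpers; names are illustrative, not the D14139 API. */

/* As proposed in DNA_copy_on_write.h above. */
typedef struct bCopyOnWrite {
  int user_count;
} bCopyOnWrite;

inline void cow_init(bCopyOnWrite *cow)
{
  cow->user_count = 1; /* The creator is the first owner. */
}

inline void cow_user_add(bCopyOnWrite *cow)
{
  cow->user_count++; /* Another owner starts sharing the data. */
}

/* Returns true when the caller removed the last owner and is now
 * responsible for freeing the shared data. */
inline bool cow_user_remove(bCopyOnWrite *cow)
{
  return --cow->user_count == 0;
}

/* Data is only mutable while it has exactly one owner. */
inline bool cow_is_mutable(const bCopyOnWrite *cow)
{
  return cow->user_count == 1;
}
```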

A work-in-progress version of that proposal is available in D14139.

Usage for Attributes

This section describes how bCopyOnWrite could be used to implement sharing attribute layers between different data blocks.

Currently, when a generated mesh is copied in geometry nodes, all attributes have to be copied as well. This is wasteful when there is no intention to modify the copied data.
To avoid the copy, we can attach a bCopyOnWrite to each attribute layer.
Practically, that means that a bCopyOnWrite * has to be embedded in CustomDataLayer. Note that it has to be a pointer, because otherwise different CustomDataLayers couldn't share the same user count.

Before attributes can be shared effectively, the CustomData API has to be refactored a bit. Every CustomData_get_* function has to be split into a *_for_read and a *_for_write variant. More details are available in T95842.
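To illustrate that split, using the illustrative cow_* helpers sketched above (the embedded cow pointer, the stand-in struct and the function names are assumptions for this sketch, not the API designed in T95842):

```cpp
#include "MEM_guardedalloc.h" /* For MEM_dupallocN. */

/* CustomDataLayerSketch stands in for CustomDataLayer with the proposed
 * embedded bCopyOnWrite pointer. */
struct CustomDataLayerSketch {
  void *data = nullptr;        /* The attribute array. */
  bCopyOnWrite *cow = nullptr; /* Shared user count, as proposed above. */
};

/* Read access never needs a copy: shared layer data can be returned as-is. */
const void *layer_data_for_read(const CustomDataLayerSketch *layer)
{
  return layer->data;
}

/* Write access first makes sure the layer data has a single owner. */
void *layer_data_for_write(CustomDataLayerSketch *layer)
{
  if (layer->cow != nullptr && !cow_is_mutable(layer->cow)) {
    /* The data is shared with another owner: switch to a private copy. */
    void *copy = MEM_dupallocN(layer->data);
    cow_user_remove(layer->cow);
    layer->data = copy;
    layer->cow = nullptr; /* Single owner again, no sharing bookkeeping needed. */
  }
  return layer->data;
}
```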

I'm not entirely sure if it is necessary, but in theory Cycles could also take ownership of the attribute layers to avoid copying them. As long as Cycles adds itself as owner to the bCopyOnWrite for every attribute, it can be sure that these arrays won't change anymore.

It's unclear whether we want to share attribute layers in original data. We could make sure that original data blocks never share data; however, it might be easier to just allow it. For the most part it should just work when the CustomData API is used correctly. Maybe a separate copy for each data block is necessary when writing to a .blend file. For undo it might actually be quite useful to share attribute arrays between multiple undo steps. I'm not sure whether duplicated attribute arrays are already optimized today.

What I described above is specific to attribute layers, but the same approach can work for all kinds of shared data.

Author
Member

Added subscriber: @JacquesLucke

Author
Member

Changed status from 'Needs Triage' to: 'Confirmed'

Added subscriber: @brecht

This sounds great. It could help gain back some memory that was lost when we stopped using CD_REFERENCE to point to original data in 2.8.

User counting with atomics is not free, but I guess it will be worth it here.

I'm not entirely sure if it is necessary, but in theory Cycles could also take ownership of the attribute layers to avoid copying them. As long as Cycles adds itself as owner to the bCopyOnWrite for every attribute, it can be sure that these arrays won't change anymore.

This would be really nice at some point.

Maybe a separate copy for each data block is necessary when writing to a .blend file.

The way file read/write works, I think you will automatically get a separate copy per datablock. These kinds of data arrays and their pointer mapping are per datablock.

For undo it might actually be quite useful to share attribute arrays between multiple undo steps. Not sure if duplicated attribute arrays are optimized today already.

There is delta compression between undo steps, so for an unchanging attribute in a single datablock there should only be a single copy in undo memory. But with this you could optimize it so there is no copy at all.

Added subscriber: @Garek

Member

Added subscriber: @HooglyBoogly

Contributor

Added subscriber: @jmztn

Member

I also think this would be a great improvement.

I wonder if ideally, the CoW user count and the data would be allocated together, to avoid the need to de-reference two pointers for each access. I know it's really small overhead at that point, but we'll probably end up using this quite a lot, so the overhead might add up?
In practice it's probably not worth it, especially for CustomData, but just a thought.

The convenience pointers like Mesh.mvert are a bit related here. I guess we would still have to call some function like "ensure writeable" before changing the data they point to.
In the long term we could get rid of them and use a method like CurvesGeometry with a positions() method that handles accessing the attribute layers for you.
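For reference, that accessor pattern could look roughly like this (an interface sketch, not the actual CurvesGeometry code, using blender::Span just for illustration):

```cpp
#include "BLI_math_vec_types.hh" /* For blender::float3. */
#include "BLI_span.hh"

/* Interface sketch of the accessor pattern; not the real CurvesGeometry. */
class CurvesGeometrySketch {
 public:
  /* Read-only access never triggers a copy of a shared positions layer. */
  blender::Span<blender::float3> positions() const;

  /* The single place that performs the "ensure writable" step (making the
   * layer single-owner) before handing out mutable data. */
  blender::MutableSpan<blender::float3> positions_for_write();
};

/* Callers never deal with copy-on-write directly. */
void translate(CurvesGeometrySketch &curves, const blender::float3 &offset)
{
  for (blender::float3 &position : curves.positions_for_write()) {
    position += offset;
  }
}
```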

Author
Member

I wonder if ideally, the CoW user count and the data would be allocated together, to avoid the need to de-reference two pointers for each access. I know it's really small overhead at that point, but we'll probably end up using this quite a lot, so the overhead might add up?

In practice it's probably not worth it, especially for CustomData, but just a thought.

There are cases where the user count can reasonably be allocated together with the data. See e.g. how bCopyOnWrite is embedded in GeometryComponent in D14139. However, this does not work in general. It's quite possible that some lower-level utility or data structure allocates and manages the memory, and only later it becomes managed by the copy-on-write system. We could add a bool to bCopyOnWrite which indicates whether it is part of the same allocation as the referenced data or not, but I'm not convinced that this is worth the additional complexity. In case you didn't know, this would actually be a bit similar to std::shared_ptr (in that the reference count can be stored in the same or a separate allocation, depending on how it's created).
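For comparison, this is the std::shared_ptr behavior mentioned above:

```cpp
#include <memory>

struct BigArray {
  float values[1000];
};

void shared_ptr_allocation_comparison()
{
  /* Reference count (control block) and object live in one allocation. */
  std::shared_ptr<BigArray> combined = std::make_shared<BigArray>();

  /* Reference count is allocated separately from the existing object. */
  std::shared_ptr<BigArray> separate(new BigArray());
}
```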

What might be more worthwhile is to use the copy-on-write system only for sufficiently large resources. E.g. a bCopyOnWrite * that is nullptr could indicate that one is the single owner of the resource. Only when we start sharing the data would the bCopyOnWrite be allocated.

As a side note, we probably should allocate bCopyOnWrite with an alignment of 64 bytes to avoid overhead due to false sharing.
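A sketch combining both ideas, lazy allocation of the user count and cache-line alignment (the struct and helper names here are made up for illustration, and real code would use an atomic increment):

```cpp
#include "MEM_guardedalloc.h" /* For MEM_mallocN_aligned. */

/* Sketch only. */
struct SharedResourceSketch {
  void *data = nullptr;
  /* nullptr means the resource has exactly one owner and is mutable. */
  bCopyOnWrite *cow = nullptr;
};

/* Called when a second owner wants to start sharing the resource. */
void share_resource(SharedResourceSketch &resource)
{
  if (resource.cow == nullptr) {
    /* The resource is shared for the first time: allocate the user count now,
     * aligned to a cache line to avoid false sharing between threads. */
    resource.cow = static_cast<bCopyOnWrite *>(
        MEM_mallocN_aligned(sizeof(bCopyOnWrite), 64, __func__));
    resource.cow->user_count = 1;
  }
  resource.cow->user_count++; /* The new owner is added. */
}
```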

to avoid the need to de-reference two pointers for each access

That overhead I'm not worried about at all, because it only happens when the data is "set up" for an actual computation.

The convenience pointers like Mesh.mvert are a bit related here. I guess we would still have to call some function like "ensure writeable" before changing the data they point to.
In the long term we could get rid of them and use a method like CurvesGeometry with a positions() method that handles accessing the attribute layers for you.

That sounds reasonable.

User counting with atomics is not free, but I guess it will be worth it here.

Definitely not free, but (1) it shouldn't be done in hot loops, (2) is probably cheaper than copying the data and (3) if we find that for some specific piece of data it is cheaper to copy it, we can still do that.

Added subscriber: @mont29

This looks like a great improvement to me as well.

It's unclear whether we want to share attribute layers in original data. We could make sure that original data blocks never shares data, however it might be easier to just allow it.

I would never, ever allow that. We already suffer (a lot!) from the very few places where some internal ID sub-data is shared with others (e.g. the Armature bones referenced by the Object pose data). I do not want to see other cases like that added without an extremely good reason to break the 'isolation' between IDs. Sharing data by addresses between IDs essentially makes the whole ID management code a nightmare.

User counting with atomics is not free, but I guess it will be worth it here.

Definitely not free, but (1) it shouldn't be done in hot loops, (2) is probably cheaper than copying the data and (3) if we find that for some specific piece of data it is cheaper to copy it, we can still do that.

I would be very surprised if the cost of atomic user counting was ever more expensive than useless data copying, so I'm not worried here either.

Added subscriber: @GeorgiaPacific

Member

Sharing data by addresses between IDs essentially makes the whole ID management code a nightmare.

I get that normally, and agree. But I wonder if this case is different, since the user-counting is done automatically.
The problem seems to be that the data referenced from another ID is owned by the other ID.
With these copy-on-write cases, the data is owned by both IDs, completely transparently.

Maybe there's some aspect I'm missing. I would bet there are a lot of cases where sharing layers between original IDs would help a lot, e.g. duplicate a large sculpt with a bunch of painted colors, do some deformation to test something out, etc.

Added subscriber: @hzuika

Author
Member

Added subscriber: @Sergey

Member

Added subscriber: @LukasTonne

Member

Added subscriber: @EAW

Added subscriber: @AlexeyAdamitsky

Bastien Montagne added this to the Core project 2023-02-09 15:42:54 +01:00
Bastien Montagne removed this from the Core project 2023-02-09 18:20:36 +01:00
Author
Member

The main part of this proposal is implemented now. See 7eee378ecc.

Blender Bot added 'Status: Archived' and removed 'Status: Confirmed' labels 2023-04-13 14:59:16 +02:00