This adds BLI_array_store for storing many similar arrays.
Users of this API only need to add, access and remove data.
Internally the data is de-duplicated.
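The add/access/remove pattern can be sketched with a toy Python model (the class and method names here are illustrative, not the actual C API): arrays are split into fixed-size chunks, chunks are keyed by a content hash, and each stored state is just a list of references into a shared chunk table.

```python
import hashlib

class ArrayStore:
    """Toy model of a de-duplicating array store (illustrative only)."""

    def __init__(self, chunk_size):
        self.chunk_size = chunk_size
        self.chunks = {}  # content hash -> (chunk bytes, reference count)

    def _chunk_key(self, chunk):
        return hashlib.sha1(chunk).digest()

    def state_add(self, data):
        """Store a byte string, sharing chunks with existing states."""
        keys = []
        for i in range(0, len(data), self.chunk_size):
            chunk = data[i:i + self.chunk_size]
            key = self._chunk_key(chunk)
            stored, users = self.chunks.get(key, (chunk, 0))
            self.chunks[key] = (stored, users + 1)
            keys.append(key)
        return keys  # the "state": chunk references only

    def state_data_get(self, keys):
        """Expand a state back into a contiguous byte string."""
        return b"".join(self.chunks[key][0] for key in keys)

    def state_remove(self, keys):
        """Drop a state, freeing chunks no longer referenced by any state."""
        for key in keys:
            chunk, users = self.chunks[key]
            if users == 1:
                del self.chunks[key]
            else:
                self.chunks[key] = (chunk, users - 1)
```

The real implementation additionally takes a reference state when adding, so runs of matching chunks can be found quickly even when data is inserted or removed mid-array; this sketch only shows the de-duplication idea.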
Currently this patch is finished, in that as far as I know there are no bugs left to resolve.
This patch also includes ifdef'd code in editmesh_undo.c to de-duplicate memory for undo storage.
However, there is some possible further work:
- Compare speed of hash lookups with a stride-aligned memstr (in practice this may be faster, since in many cases blocks aren't re-ordered completely).
- Investigate different methods of hashing blocks.
- Some of the constants could be tuned against real-world data; for example, chunks are hashed based on their first 7 elements (see: BCHUNK_HASH_TABLE_ACCUMULATE_STEPS), which may need to be more.
- Currently the array size is limited to ~4GB (an unsigned int is used for offsets); this should probably switch to size_t.
- Add a fast path for hashing blocks with a stride of 1 (no need for a function call here).
Note, we may want to split this off into its own diff.
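The prefix-hashing idea above can be sketched as follows (a toy Python model; the djb2-style mixing and the function name are illustrative, not the hash used in the patch). Because only the first few elements of a chunk key the lookup table, a full comparison is still required when keys match:

```python
def chunk_table_key(chunk, stride, steps=7):
    """Hash only the first `steps` elements of a chunk to key a lookup table.

    `steps` mirrors the role of BCHUNK_HASH_TABLE_ACCUMULATE_STEPS;
    the mixing below is illustrative (djb2-style), not the patch's hash.
    """
    h = 5381  # djb2 starting value
    n = min(steps * stride, len(chunk))
    for byte in chunk[:n]:
        h = ((h * 33) ^ byte) & 0xFFFFFFFF
    return h
```

Since the key covers only a prefix, two chunks with equal keys must still be compared in full (the equivalent of memcmp) before being treated as duplicates.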
This patch makes use of BLI_array_store to reduce memory usage for edit-mesh undo.
The existing undo code, which stores a mesh data-block for each undo step, is kept; however, all large arrays are moved into a BLI_array_store,
which handles de-duplication.
Before reading, these arrays are temporarily expanded into the mesh, then freed.
- A static array-store and task-pool are kept allocated as long as there is at least one edit-mesh undo state.
- An array-store is created for each struct size, as needed.
- All arrays are stored in the array-store (vert/edge/loop/poly customdata, keyblocks and selection history).
- The array-store and the array-store thread can each be disabled with ifdefs.
- For reading a single undo step this doubles memory use (since it uses memory for the array-store and the expanded arrays).
- For writing a single undo step the memory is moved into the array-store and freed along the way. So peak memory will only increase by the size of the largest array.
- The array-stores themselves have some minimal overhead which isn't taken into account when printing statistics (the states themselves, storage for chunks); none of these are very big and they use memory pools.
- The data for each chunk doesn't use a memory pool; however, its size is set to the nearest power-of-two (taking guarded-alloc overhead into account), so chunks use minimal slop-space.
- While peak memory usage for a single undo is higher, overall memory usage drops over many steps, though exact numbers depend a lot on the use-case. In my own testing, dropping to anywhere from 25% down to 10% of the expanded memory use is quite common.
- Currently the compacting is done in a single background thread, since it does add some lag and there's no reason this task needs to block.
- The compacting itself could be multi-threaded, however it's not especially slow and doing so would complicate the code.
- There is potential for user-noticeable lag if the user performs two actions very quickly (where adding the second undo state waits for the first to finish), or when undo is called immediately after performing an action.
- In practice, I don't think this is something users will notice; for example, a destructive edit on dragon_adult.blend takes around 0.02 seconds, a non-destructive edit (dragging vertices/UV/weight-paint/selection changes) ~0.005 seconds (both in a background thread).
- Expanding the array-store adds some overhead to each undo step, but not much since it's just copying memory (~0.004 seconds on dragon_adult.blend, for both destructive and non-destructive edits).
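The nearest power-of-two chunk sizing mentioned above can be illustrated with a small sketch (the overhead value used below is hypothetical; guarded-alloc's real per-allocation header depends on the build):

```python
def chunk_size_for_power_of_two(size_min, alloc_overhead):
    """Pick a chunk size so (size + allocator overhead) lands exactly on a
    power of two, leaving no slop-space in the allocation (illustrative)."""
    total = size_min + alloc_overhead
    # round the total allocation up to the next power of two
    power = 1
    while power < total:
        power *= 2
    # the usable chunk size is whatever remains after the allocator's header
    return power - alloc_overhead
```

For example, requesting at least 4096 bytes with a 16-byte header yields a chunk size of 8176, so the underlying allocation is exactly 8192 bytes.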