Page MenuHome

Improve name conflict handling in ID management
Confirmed, NormalPublicDESIGN

Description

Commits rBd84065807810, rB2aab72700909a, rB4cc8201a65, and rB46607bc09d5 already made things significantly faster here, at least when dealing with large amounts of data-blocks.

Going further will require caching info about already used names (more precisely, used base names, and used suffix numbers).

Proposal

Use a GHash per ID type, stored in Main, using base names as keys, and storing as values a structure to help quickly finding the best available number.

Using only base names as keys will seriously reduce the memory usage of the whole cache, in case of massive amounts of IDs with same base name. And avoid lots of trial & failure while searching for an available number suffix.

Those hashes will have to be kept up-to-date all the time, as building them on demand would require way too much computations. This should not be too difficult though, code affecting ID's names is small and well confined. ID deletion code shall also update that cache.

Regarding Numbering

Current approach is to ensure we always get the smallest available number up to a specific value (around 1K in current code), then we just use smallest value after last one used (i.e. do not re-use freed numbers, so if e.g. we have Object to Object.2000 fully used by 2k objects, and delete Object.1500, then add a new one, it will get Object.2001 as name).

We can probably efficiently do that for an arbitrary amount of numbers by storing bitflags (using chunks, maybe single 32bits integers, and a mempool) to tag used numbers (that would add about one extra bit of memory per ID). Using a n-tree structure should help with quickly finding an available number then (by greatly reducing the amount of comparisons needed to find the smallest available one).