ID Management: Sanitize and clarify our ID tags #88555

New Issue

Bastien Montagne · 2021-05-25T13:38:50+02:00

Bastien Montagne commented

2021-05-25 13:38:50 +02:00

Our current handling of ID management flags and their meaning is a bit fuzzy currently, and needs cleanup, and documentation.

For example, LIB_TAG_NO_MAIN is often implicitly implying LIB_TAG_NO_USER_REFCOUNT, but not always - and is not always a proper behavior.

The meaning and differences between LIB_TAG_COPIED_ON_WRITE, LIB_TAG_COPIED_ON_WRITE_EVAL_RESULT and LIB_TAG_LOCALIZED is not clear, and not documented at all.

etc.

IMPORTANT: This task is a sister of #90610 (ID Management: Sanitize and clarify our ID creation/copy tags), both will be worked on together.

Existing Tags

Here is a list of current existing tags which roles/meanings are not fully clear, or are inconsistent across our code base, with proposed new definition and behaviors for them if needed.

Proposed Removal

LIB_TAG_LOCAL is set to 0 (so it's not actually a tag), and is only used in one place in readfile.c... This is both confusing and useless, think we should get rid of it. If we really want a named enum for 0, we should name it e.g. LIB_TAG_NONE, but think it's not needed.

Proposed Renames

Those tags have a fairly clear and valid role, but their name could be more precise/specific.

Linked IDs tags
`LIB_TAG_EXTERN`	=>	`LIB_TAG_LINKED_DIRECT`	Directly linked ID.
`LIB_TAG_INDIRECT`	=>	`LIB_TAG_LINKED_INDIRECT`	Indirectly linked ID.
`LIB_TAG_MISSING`	=>	`LIB_TAG_LINKED_MISSING`	Missing linked ID.
readfile only tags
`LIB_TAG_NEED_EXPAND`	=>	`LIB_TAG_READFILE_NEED_EXPAND`	ID has been read, but still needs to be expanded (typically, find and read indirectly linked IDs used by directly linked ones).
`LIB_TAG_ID_LINK_PLACEHOLDER`	=>	`LIB_TAG_READFILE_LINK_PLACEHOLDER`	Temporary tag during readfile process, when the placeholder of a directly linked ID has been read from current file, but actual data still needs to be read from another library file.
`LIB_TAG_NEED_LINK`	=>	`LIB_TAG_READFILE_NEED_LINK`	The ID data has been read, but it still needs to be lib-linked (i.e. its usages of other IDs still need to be updated to proper new pointers).

Proposed Changes

The five tags (NO_MAIN, LOCALIZED, COPIED_ON_WRITE and COPIED_ON_WRITE_EVAL_RESULT, and TEMP_MAIN) used to mark an ID as not in G_MAIN are the most problematic ones, as they are:

Lacking clear definition of their meanings and roles sometimes;
Partially overlapping sometimes;
Not always behaving in a consistent way across the whole codebase.

General Proposed Changes

Decouple fully specific information conveyed by other tags (like NOT_ALLOCATED or NO_USER_REFCOUNT) from the 'no main' tags.
Clearly document which tags are only allowed for 'no main' IDs (see below).
Make all 'no main' tags fully exclusive (i.e. an ID could only be TEMP_MAIN, or NO_MAIN, or LOCALIZED, or COPIED_ON_WRITE).
** This could be made clearer at API level by using a dedicated enum instead of passing tags?
Add a macro to check if an ID is in G_MAIN or not.
Ensure that code allowing to add a 'no main' ID into G_MAIN asserts on any unsupported cases:
** LOCALIZED
** COPIED_ON_WRITE
** COPIED_ON_WRITE_EVAL_RESULT

Regular IDs

For regular IDs (in G_MAIN), the following flags are forbidden:
** NOT_ALLOCATED
** NO_USER_REFCOUNT
** NO_MAIN
** TEMP_MAIN
** LOCALIZED
** COPIED_ON_WRITE
** COPIED_ON_WRITE_EVAL_RESULT

This is already the case in practice, but should be clearly documented in comments of those flags.

`LIB_TAG_TEMP_MAIN`

IDs with this flag are basically regular IDs, they are simply isolated out the G_MAIN.

PROPOSAL

All rules applying to regular IDs apply to those TEMP_MAINones as well.
** Only potential exception: Do we allow NO_USER_REFCOUNT for them? Think we should not

`LIB_TAG_NO_MAIN`

The main issue with this tag is the current behavior it implies regarding ID refcounting. Some code (like ID remapping in lib_remap.c) assumes a NO_MAIN id should not be refcounting its ID usages, but has a flag to force it to do so.

Another issue is that parts of the code add NO_MAIN as some kind of generic 'out of G_MAIN' tag, on top of more specific/restrictive tags.

PROPOSAL:

Fully disconnect NO_MAIN from refcounting ID usages or not. NO_MAIN simply means 'not in a Main struct'.
Code setting this flag is responsible to also set NO_USER_REFCOUNT if necessary.
Code clearing this flag is responsible to ensure NO_USER_REFCOUNT is also cleared.
A NO_MAIN ID should never share any data (like caches etc.) with any other ID, it should be purely independent.
A NO_MAIN ID should always use ID management code for allocation, freeing, etc.
** This implies that NOT_ALLOCATED is forbidden for NO_MAIN IDs.
A NO_MAIN ID should always be in a state that allows adding it back into G_MAIN or another temp Main. In other words, it should always be possible to 'convert' it into a TEMP_MAIN or regular ID with minimal, generic ID management work (like user refcounting).

`LIB_TAG_COPIED_ON_WRITE` & `LIB_TAG_COPIED_ON_WRITE_EVAL_RESULT`

Those flags roles and behaviors are fairly clear and valid. Just need to check and ensure that the exclusion rule with the other 'no main' tags is always applied. Besides that, they are managed by depsgraph and their scope is clearly defined and limited.

`LIB_TAG_LOCALIZED`

This flag usage is by far the worse. It is currently somewhat niche, not very well documented, and extremely fuzzy.

AFAICT, it is currently only effectively assigned through one code paths:

IDs duplicated during preview generation (through the usage of LIB_ID_CREATE_LOCAL flag) (preview_get_localized_world and duplicate_ids in render_preview.c).
**Those generate localized nodetrees (by calling ntreeLocalize in node.cc), even though ID itself is duplicated with LIB_ID_COPY_LOCALIZE, which implies LIB_ID_CREATE_NO_MAIN. bad, very bad.PROPOSAL:
Those IDs are allowed to share data (like caches etc.) with some IDs in G_MAIN. This has to be handled by the each IDType ID management code.
Those IDs should not be convertible (at least by generic ID management code like BKE_libblock_management_main_add) into any other type of ID (in G_MAIN or other 'no main' ones).
Typically those IDs have a short life-time within a well defined and controlled context.
** They are very similar to CoW IDs: code creating them should be responsible to control their life time and destruction.

NOTE: It is unclear currently if there would be more usages (besides preview code) for those LOCALIZED IDs.

Embedded ID case

Embedded IDs should share the same 'ID amangement' tags as their owner. See #69169 (Improve handling of embedded data-blocks (root nodetrees and master collections)) for details.

Our current handling of ID management flags and their meaning is a bit fuzzy currently, and needs cleanup, and documentation. For example, `LIB_TAG_NO_MAIN` is often implicitly implying `LIB_TAG_NO_USER_REFCOUNT`, but not always - and is not always a proper behavior. The meaning and differences between `LIB_TAG_COPIED_ON_WRITE`, `LIB_TAG_COPIED_ON_WRITE_EVAL_RESULT` and `LIB_TAG_LOCALIZED` is not clear, and not documented at all. etc. IMPORTANT: This task is a sister of #90610 (ID Management: Sanitize and clarify our ID creation/copy tags), both will be worked on together. ---------------------- # Existing Tags Here is a list of current existing tags which roles/meanings are not fully clear, or are inconsistent across our code base, with proposed new definition and behaviors for them if needed. ## Proposed Removal `LIB_TAG_LOCAL` is set to `0` (so it's not actually a tag), and is only used in one place in `readfile.c`... This is both confusing and useless, think we should get rid of it. If we **really** want a named enum for `0`, we should name it e.g. `LIB_TAG_NONE`, but think it's not needed. ## Proposed Renames Those tags have a fairly clear and valid role, but their name could be more precise/specific. | *Linked IDs tags* | | | | | -- | -- | -- | -- | | `LIB_TAG_EXTERN` | => | `LIB_TAG_LINKED_DIRECT` | *Directly linked ID.* | | `LIB_TAG_INDIRECT` | => | `LIB_TAG_LINKED_INDIRECT` | *Indirectly linked ID.* | | `LIB_TAG_MISSING` | => | `LIB_TAG_LINKED_MISSING` | *Missing linked ID.* | | *readfile only tags* | | | | | `LIB_TAG_NEED_EXPAND` | => | `LIB_TAG_READFILE_NEED_EXPAND` | *ID has been read, but still needs to be expanded (typically, find and read indirectly linked IDs used by directly linked ones).* | | `LIB_TAG_ID_LINK_PLACEHOLDER` | => | `LIB_TAG_READFILE_LINK_PLACEHOLDER` | *Temporary tag during readfile process, when the placeholder of a directly linked ID has been read from current file, but actual data still needs to be read from another library file.* | | `LIB_TAG_NEED_LINK` | => | `LIB_TAG_READFILE_NEED_LINK` | *The ID data has been read, but it still needs to be lib-linked (i.e. its usages of other IDs still need to be updated to proper new pointers).* | ## Proposed Changes The five tags (`NO_MAIN`, `LOCALIZED`, `COPIED_ON_WRITE` and `COPIED_ON_WRITE_EVAL_RESULT`, and `TEMP_MAIN`) used to mark an ID as not in `G_MAIN` are the most problematic ones, as they are: * Lacking clear definition of their meanings and roles sometimes; * Partially overlapping sometimes; * Not always behaving in a consistent way across the whole codebase. ### General Proposed Changes * Decouple fully specific information conveyed by other tags (like `NOT_ALLOCATED` or `NO_USER_REFCOUNT`) from the 'no main' tags. * Clearly document which tags are only allowed for 'no main' IDs (see below). * Make all 'no main' tags fully exclusive (i.e. an ID could only be `TEMP_MAIN`, or `NO_MAIN`, or `LOCALIZED`, or `COPIED_ON_WRITE`). ** This could be made clearer at API level by using a dedicated enum instead of passing tags? * Add a macro to check if an ID is in `G_MAIN` or not. * Ensure that code allowing to add a 'no main' ID into `G_MAIN` asserts on any unsupported cases: ** `LOCALIZED` ** `COPIED_ON_WRITE` ** `COPIED_ON_WRITE_EVAL_RESULT` ### Regular IDs For regular IDs (in `G_MAIN`), the following flags are forbidden: ** `NOT_ALLOCATED` ** `NO_USER_REFCOUNT` ** `NO_MAIN` ** `TEMP_MAIN` ** `LOCALIZED` ** `COPIED_ON_WRITE` ** `COPIED_ON_WRITE_EVAL_RESULT` This is already the case in practice, but should be clearly documented in comments of those flags. ### `LIB_TAG_TEMP_MAIN` IDs with this flag are basically regular IDs, they are simply isolated out the `G_MAIN`. **PROPOSAL** * All rules applying to regular IDs apply to those `TEMP_MAIN`ones as well. ** Only potential exception: Do we allow `NO_USER_REFCOUNT` for them? *Think we should not* ### `LIB_TAG_NO_MAIN` The main issue with this tag is the current behavior it implies regarding ID refcounting. Some code (like ID remapping in `lib_remap.c`) assumes a `NO_MAIN` id should not be refcounting its ID usages, but has a flag to force it to do so. Another issue is that parts of the code add `NO_MAIN` as some kind of generic 'out of `G_MAIN`' tag, on top of more specific/restrictive tags. **PROPOSAL:** * Fully disconnect `NO_MAIN` from refcounting ID usages or not. `NO_MAIN` simply means 'not in a Main struct'. * Code **setting** this flag is responsible to also set `NO_USER_REFCOUNT` if necessary. * Code **clearing** this flag is responsible to ensure `NO_USER_REFCOUNT` is also cleared. * A `NO_MAIN` ID should **never** share any data (like caches etc.) with any other ID, it should be purely independent. * A `NO_MAIN` ID should always use ID management code for allocation, freeing, etc. ** This implies that `NOT_ALLOCATED` is forbidden for `NO_MAIN` IDs. * A `NO_MAIN` ID should always be in a state that allows adding it back into `G_MAIN` or another temp `Main`. In other words, it should always be possible to 'convert' it into a `TEMP_MAIN` or regular ID with minimal, generic ID management work (like user refcounting). ### `LIB_TAG_COPIED_ON_WRITE` & `LIB_TAG_COPIED_ON_WRITE_EVAL_RESULT` Those flags roles and behaviors are fairly clear and valid. Just need to check and ensure that the exclusion rule with the other 'no main' tags is always applied. Besides that, they are managed by depsgraph and their scope is clearly defined and limited. ### `LIB_TAG_LOCALIZED` This flag usage is by far the worse. It is currently somewhat niche, not very well documented, and extremely fuzzy. AFAICT, it is currently only effectively assigned through one code paths: * IDs duplicated during preview generation (through the usage of `LIB_ID_CREATE_LOCAL` flag) (`preview_get_localized_world` and `duplicate_ids` in `render_preview.c`). **Those generate localized nodetrees (by calling `ntreeLocalize` in `node.cc`), even though ID itself is duplicated with `LIB_ID_COPY_LOCALIZE`, which implies `LIB_ID_CREATE_NO_MAIN`. bad, very bad.**PROPOSAL:** * Those IDs are allowed to share data (like caches etc.) with some IDs in `G_MAIN`. This has to be handled by the each IDType ID management code. * Those IDs should not be convertible (at least by generic ID management code like `BKE_libblock_management_main_add`) into any other type of ID (in `G_MAIN` or other 'no main' ones). * Typically those IDs have a short life-time within a well defined and controlled context. ** They are very similar to CoW IDs: code creating them should be responsible to control their life time and destruction. NOTE: It is unclear currently if there would be more usages (besides preview code) for those `LOCALIZED` IDs. # Embedded ID case Embedded IDs should share the same 'ID amangement' tags as their owner. See #69169 (Improve handling of embedded data-blocks (root nodetrees and master collections)) for details.

Bastien Montagne commented

2021-05-25 13:38:51 +02:00

Changed status from 'Needs Triage' to: 'Confirmed'

Bastien Montagne commented

2021-05-25 13:38:51 +02:00

Added subscriber: @mont29

Bastien Montagne changed title from ~~Sanitize and clarify our ID management flags~~ to ID Management: Sanitize and clarify our ID tags

2021-08-11 16:08:40 +02:00

Jacques Lucke commented

2021-10-27 16:30:36 +02:00

Added subscriber: @JacquesLucke

Jacques Lucke commented

2021-10-27 16:30:36 +02:00

Generally sounds good, haven't looked at everything in detail yet. I have one general question. Have you considered adding an enum for the different "modes" an ID can be in (Main, NoMain, COW, ...), instead of having a flag for each. This way it's easier to guarantee that an ID is in exactly one of these modes.

Bastien Montagne commented

2021-10-27 16:42:38 +02:00

@JacquesLucke we could, but then we'd need to first add a runtime struct to ID... I'd rather not add yet another runtime field to this struct, we already have too many of those. Keeping using flags is less disruptive for now imho?

@JacquesLucke we could, but then we'd need to first add a `runtime` struct to ID... I'd rather not add yet another runtime field to this struct, we already have too many of those. Keeping using flags is less disruptive for now imho?

Jacques Lucke commented

2021-10-27 16:48:45 +02:00

Maybe we could emulate this enum on top of the existing flags. So we'd get and set the "mode" through some helper functions, but internally it still uses all the different flags "for legacy reasons". That would make the API a bit more self-documenting and reduces the risk of setting invalid flag combinations.

Bastien Montagne commented

2021-10-27 17:13:43 +02:00

yes that sounds doable indeed.

Brecht Van Lommel commented

2021-11-02 19:26:39 +01:00

Added subscriber: @brecht

Brecht Van Lommel commented

2021-11-02 19:26:39 +01:00

I like the LIB_TAG_LINKED and LIB_TAG_READFILE names much better.

For the 'no main' tags, it would be good if they could all share a prefix somehow and have these be mutually exclusive. Some ideas for naming.

OWNER_NONE (NO_MAIN)
OWNER_MAIN
OWNER_ASSET_MAIN (TEMP_MAIN, not sure if asset is correct or if this has a wider purpose)
OWNER_DEPSGRAPH (COPIED_ON_WRITE and COPIED_ON_WRITE_EVAL_RESULT)

Then some flags available when OWNER_DEPSGRAPH is set:

LIB_TAG_DEPSGRAPH_LOCALIZED
LIB_TAG_DEPSGRAPH_EVAL_RESULT

Not sure how practical it is to make such changes.

I like the `LIB_TAG_LINKED` and `LIB_TAG_READFILE` names much better. For the 'no main' tags, it would be good if they could all share a prefix somehow and have these be mutually exclusive. Some ideas for naming. * `OWNER_NONE` (NO_MAIN) * `OWNER_MAIN` * `OWNER_ASSET_MAIN` (TEMP_MAIN, not sure if asset is correct or if this has a wider purpose) * `OWNER_DEPSGRAPH` (COPIED_ON_WRITE and COPIED_ON_WRITE_EVAL_RESULT) Then some flags available when `OWNER_DEPSGRAPH` is set: * `LIB_TAG_DEPSGRAPH_LOCALIZED` * `LIB_TAG_DEPSGRAPH_EVAL_RESULT` Not sure how practical it is to make such changes.

Monique Dewanchand commented

2022-11-24 14:27:28 +01:00

Added subscriber: @monique

Philipp Oeser removed the

Download

What's New

Blender Studio

Manual

Developers Blog

Documentation

Benchmark

Blender Conference

Development Fund

One-time Donations

ID Management: Sanitize and clarify our ID tags #88555

Existing Tags

Proposed Removal

Proposed Renames

Proposed Changes

General Proposed Changes

Regular IDs

LIB_TAG_TEMP_MAIN

LIB_TAG_NO_MAIN

LIB_TAG_COPIED_ON_WRITE & LIB_TAG_COPIED_ON_WRITE_EVAL_RESULT

LIB_TAG_LOCALIZED

Embedded ID case

`LIB_TAG_TEMP_MAIN`

`LIB_TAG_NO_MAIN`

`LIB_TAG_COPIED_ON_WRITE` & `LIB_TAG_COPIED_ON_WRITE_EVAL_RESULT`

`LIB_TAG_LOCALIZED`