Page MenuHome

Add 'asset uuid' to IDs.
AbandonedPublic

Authored by Bastien Montagne (mont29) on Mon, Nov 25, 6:21 PM.

Details

Summary

This commit is a subset of the asset-engine branch, only adding the uuid struct to data-blocks, with a basic minimal RNA/Python API to use it.

It does not contain anything regarding asset engines and asset management itself.

Besides being a first step towards full integration of asset engine work into master, it is also the 'minimal requirement' from the studio here for next Cosmos production pipeline (regarding own in-house management tools).

AssetUUIDs are opaque data for Blender, it is the responsibility of the asset engine (or the user code of the lower-level basic UUID API) to generate them, ensure they are valid and unique, free them when not necessary anymore, etc.). Blender does take care of saving and re-reading those UUIDs to/from .blend files, so they are by default persistent data.

AssetUUIDs are supposed to be unique to one ID, so they are not duplicated when copying an ID (as a reminder, by default ID copying code does not copy the content of the ID struct itself, except for a few explicit hand-picked members).

Diff Detail

Repository
rB Blender
Branch
asset-uuid
Build Status
Buildable 5780
Build 5780: arc lint + arc unit

Event Timeline

Sergey Sharybin (sergey) requested changes to this revision.Mon, Nov 25, 8:02 PM

Just some quick pass. The biggest part which I am missing is what is it all about:

  • Who is responsible for generating UUID?
  • What is the livetime?
  • What is supposed to happen with UUID when you duplicate ID (for both user edits and copy-on-write)?

Sure, when someone starts using this they'll have more questions.

Asset engine is something expected for many people to use, and I expect some quality in communication how they are expected to be hooked up to Blender.
There might be some documentation which I am not aware about. Hawing it stated in the patch description would be a great idea.

Without such documentation I can not do proper review and greenlight it.

release/scripts/modules/amber/__init__.py
1 ↗(On Diff #19826)

Is it intended to be used by this change?
If so, how, what is expected to be happening with this Amber thing?

source/blender/blenkernel/intern/asset_engine.c
19

2019

70

Ellipsis is usually indicates an intentional omission of a word, sentence, or whole section from a text without altering its original meaning.

None of those are part of the comment. Comment should state a finished thought or idea.

Either there is something you don't tell in the comment or you are misusing ellipsis.

75

Same as above.

76

Suggestion: pad values to a known width, make logs easier to read by avoiding UUID jumping all over the place in terms of screen space.

107

Did you consider having first 2 bytes dedicated to an ID type to narrow down lookups like this?

source/blender/blenkernel/intern/library.c
1788

In which cases UUID is not allocated?

source/blender/blenloader/BLO_readfile.h
31 ↗(On Diff #19826)

Unused forward declaration.

source/blender/blenloader/intern/readfile.c
2687
/* Make sure runtime fields are always zeroed out. */
9316

Commend it a full sentence, starting with a capital and ending with a fullstop.

9320
/* Make sure runtime fields are always zeroed out. */

Also seems like there should be BKE_uuid_runtime_reset().

11893

Captial.

There is also something missing here explaining why this doesn't cause double-free.

source/blender/editors/space_outliner/outliner_tools.c
700

I can see aggravation, but what would be more helpful is to know what possible issues are and what are the solutions.

source/blender/makesdna/DNA_ID.h
242

What is :|. ?

source/blender/python/intern/bpy_rna_id_collection.c
401
/* TODO: Find a better way of getting `self ` in this case. */
This revision now requires changes to proceed.Mon, Nov 25, 8:02 PM

I did not look at the Python side of this, just the C code.

source/blender/blenloader/intern/readfile.c
2689

Should id->uuid->tag be cleared on file read, since it's runtime only?

9315–9326

Can a placeholder even have an AssetUUID? If so, for what purpose?

It doesn't seem to be written in writefile.c at least.

source/blender/makesdna/DNA_ID.h
248–252

Can you explain why this stores 5 fixed size UUIDs, rather than e.g. a single arbitrary length string or array that the asset engine can encode/decode as needed? I couldn't figure out the rationale from reading T46049: Assets Integration in Blender.

If the assert storage is built from the ground up to use UUIDs for lookup then it's easy (like Amber I guess). But I expect existing system are often based on strings like filenames, USD namespace, etc. Imagine for example an asset engine that can read from an SVN repository.

To make that work you'd need a database mapping between UUIDs and strings. And this database would need to be on the server and be kept in sync with the database that has the assets, a local cache would not be enough. That all seems rather complicated.

256

Can you explain how this preview image differs from PreviewImage found in some datablocks? And why it's part of the UUID?

It does not appear to be saved to file. If it's some preview of an asset loaded by the asset manager at runtime, it's not clear to me why it is stored in the datablock.

259

Use ibuf rather than ibuff as we do elsewhere? Or even preview_width, preview_height and preview_image_buffer?

262

Isn't there a more elegant solution to this?

263

I don't see a way to identify the engine from the AssertUUID, is it there somehow or will it be added later?

282

to indicates that that -> to indicate that

source/blender/makesrna/intern/rna_ID.c
622–626

I think this should either generate a UUID that can be trusted to be universally unique, or not initialize at all. If it's intended to do the former, I would not call this "dummy init".

The code to generate a UUID should be moved to a utility function.

Casting time to uint seems like you'd lose a lot of useful bits, I think it should get more time bits from the system if this intended to be universally unique, and maybe info from other sources too.

source/blender/python/intern/bpy_rna_id_collection.c
465

This functions loops over all IDs to find an asset. If this is used in asset engines or Python scripts in general, it can lead to poor time complexity and performance.

Can you explain what the purpose of this API function, how you expect it to be used?

Bastien Montagne (mont29) marked 21 inline comments as done.
  • Merge branch 'master' into asset-uuid
  • Updates from code review.
  • Some more minor fixes/cleanups.
  • Rename some variables in AssetUUID struct.
  • Remove the Amber asset engine code, this has nothing to do in that patch.

Updated code from review.

release/scripts/modules/amber/__init__.py
1 ↗(On Diff #19826)

Nope, this was not supposed to be here... will nuke it.

source/blender/blenkernel/intern/asset_engine.c
19

Not according to the logs, first commit here goes back to 2015.
not sure what is our politic here, is it actual work that matters, or commit to master?
Could also be 2015-2019.

70

Depending on their context and placement in a sentence, ellipses can indicate an unfinished thought, a leading statement, a slight pause, an echoing voice, or a nervous or awkward silence. Aposiopesis is the use of an ellipsis to trail off into silence—for example: "But I thought he was …" When placed at the beginning or end of a sentence, the ellipsis can also inspire a feeling of melancholy or longing.

Anyway, agree I should avoid literary things in our code comments in general.

107

No, since UUIDs are entirely the responsibility of the asset engine/external user of the API. For blender this is and must remain a totally opaque value.

If speed really becomes an issue here, I'd rather build a runtime GHash.

source/blender/blenkernel/intern/library.c
1788

In a lot of cases, Blender never allocates UUID by itself, it's always third party asset engine (or the minimal low-level RNA API we want to have for Cosmos) that does it. So most IDs should not have any UUID allocated actually.

source/blender/blenloader/intern/readfile.c
9315–9326

Nice catch! Yes, it is mandatory especially for placeholders to keep that info, otherwise all asset references could be very easily lost.

Note that it is written in asset-engine code, was removed by mistake when creating the asset-uuid branch... Added back.

9316

Agreed, but please note that this is just a copy/paste of exact same existing comment few lines below ;)

11893

Indeed this was not clear, and not 100% safe from memleaks either. Fixed.

source/blender/makesdna/DNA_ID.h
242

A moody smiley followed by a dot? :P
Removed, we are serious 'pro' devs now... ;(

248–252

There are several reasons:

  • Common sense: I'd expect any asset engine to use, one way or another, a database (be it a temp mapping built on the fly), this is the only way to get a quick listing and search on a vast amount of assets. You do not want to make requests to the svn server to list and search your assets. And with a database UUIDs come "for free".
  • Abstraction: a USD path or SVN repo reference can change. UUID should never change for a given asset. That way, Blender does not have to care about remapping an asset reference, or updating it... this is entirely the responsibility of the asset engine.
  • Handling arbitrary long strings (even worse with arrays) in RNA is painful (to say the least, in the past it was even impossible with registrable callbacks, not sure how things are currently). And this struct and/or items of this UUID are used all over the place in asset engine registrable class callbacks.
  • Memory usage: not critical, but arbitrary long strings could end up eating a fair share of space.
  • .blend file parsing: it is much, much efficient, especially for an external tool, to parse a pre-defined fixed size set of integers, than a string that is not even written in the struct itself, but 'somewhere' in another block after it.
  • Performances: not critical either, but hashing, comparing, passing around, duplicating a fixed array of integers is much more efficient than managing an arbitrary string.
256

This is used by the asset engines to give the filebrowser (and potentially other pre-loading UI later) a preview of the asset *before* the data-block is actually loaded. This should not be defined in the data-blocks once loaded, only during listing of available assets...

262

Updated description a bit here...
I would really like to keep this, AssetUUID struct is also one of the main communication data used with the asset engines, and it really needs to pass along the linked ID for some callbacks.
Also, its not worse that the ID.newid member, and it usage is similar.

263

This is added as a separate struct in the Library datablock, in the asset-engine branch, yes. Since a same library should never be shared by several asset engines, ever. So no point in storing that info in every data-blocks/assets.

source/blender/makesrna/intern/rna_ID.c
622–626

Adding that basic minimal initialization was a request from Cosmos Pipeline team (aka Andy ;) ), who finds it 'nicer' basically to have a usable UUID by default.

Am also not a huge fan of this, and not really interested in implementing a full, real UUID generator, as explained elsewhere Blender should not have to care about what are those values, they should be fully opaque for it. And strictly under the responsibility of the asset engine/API user code.

So unless there is a strong case for generating a fully valid UUID by default, I'd rather just remove this and accept that Blender only generates NULL UUIDs.

source/blender/python/intern/bpy_rna_id_collection.c
401

That's also a copy-pasted comment from same case above in the code btw.

465

This is very quickly added thing in the idea of not using real asset-engine for Cosmos, but just a basic low-level API access to only UUIDs.

We could easily use a mapping here if performances are really an issue, not so sure about that currently though.

Note that 'proper' asset engines should not need this, unless they want to do some fancy management of already loaded assets in the file...

Bastien Montagne (mont29) edited the summary of this revision. (Show Details)
Bastien Montagne (mont29) edited the summary of this revision. (Show Details)

My understanding from the bf-committers mail was that the initial asset system would be about appending datablocks, where you don't even need to keep any persistent relation between the asset database and the datablock that was added in Blender. This Asset UUID seems to be for the more advanced use case where the asset manager is used as an alternative or extension to the library linking system?

This I guess is the interesting part for Cosmos, but I would like to see clarified if the plan is really to work on that aspect of the asset system already.

The design for linking systems is really complex. After reading the design tasks and looking through the asset-engine branch a bit, the specific use cases and design are still unclear. If there was some deep integration in the library loading code where a datablock was loaded from an asset manager instead of a .blend file, I could maybe see where this is going. But as far as I can tell in the asset-engine branch it's still loading .blend files as usual, and then maybe the asset engine can replace some datablocks afterwards?

It needs to be really well defined where a datablock comes from, what happens when it goes missing, whether edits to it will be saved, and which other .blend files it will affect.

The main problems I see in this patch are as follows.

  • There is no code yet for asset libraries to link the asset to a specific asset engine, which means once this is committed .blend files will be created with datablock UUIDs whose meaning is undefined, and it's unclear which asset engine is supposed to do anything with it.
  • The meaning of a datablock placeholder with UUID different from the original datablock is unclear. It seems like the start of a system to load linked datablocks from an asset engine directly, but the link to another .blend file remains and gets used. So what is this exactly?
  • AssetUUID has both a persistent UUID and runtime metadata about assets. That could be two separate data structures, and maybe the runtime part does not even need to be part of the Blender main database if it's just needed to display information in the asset manager UI.
  • Existing asset systems that might get integrated in Blender generally do not use 128 bit integers as identifiers as far as I know. For example identifiers in USD are strings. Maybe that's not a use case intended to be supported by the asset manager. But if it is, requiring a database that maps between Blender UUIDs and USD identifiers would add a lot of complexity also for users, if all they wanted was to have some USD files and .blend files and link them together.

For copying datablocks, another way to do that is to copy the .blend file or save the .blend with a different name. This isn't uncommon if someone is creating a bunch of assets, or making a variation for a different shot. And then you'd have a conflicting UUID, unless the filepath is taken into account by the asset engine.

My understanding from the bf-committers mail was that the initial asset system would be about appending datablocks, where you don't even need to keep any persistent relation between the asset database and the datablock that was added in Blender. This Asset UUID seems to be for the more advanced use case where the asset manager is used as an alternative or extension to the library linking system?

I don't know where you read that in that mail, but this has certainly never be agreed on. Both append and link should be supported. But indeed asset UUID is only used (beyond filebrowser using asset engines, which is not in this patch) for linked IDs, and/or IDs in library files (to be linked).

This I guess is the interesting part for Cosmos, but I would like to see clarified if the plan is really to work on that aspect of the asset system already.

Yes, plan has always been to work on the part from the beginning, most work in core/low level data management code done in the past 5 years (both in master, like lib remapping, and in asset-engine branch) was done with that goal in mind.

The design for linking systems is really complex. After reading the design tasks and looking through the asset-engine branch a bit, the specific use cases and design are still unclear. If there was some deep integration in the library loading code where a datablock was loaded from an asset manager instead of a .blend file, I could maybe see where this is going. But as far as I can tell in the asset-engine branch it's still loading .blend files as usual, and then maybe the asset engine can replace some datablocks afterwards?

The design is actually pretty simple: we keep linking process (almost) unchanged, and add an optional extra layer of processing handled by the asset engine afterwards. This to both keep loading of working .blend file itself as quick as possible, and allow to keep a reasonably working .blend file even if you do not have access to the asset-engine and/or repositories.

It needs to be really well defined where a datablock comes from, what happens when it goes missing, whether edits to it will be saved, and which other .blend files it will affect.

Think all this is fairly well defined currently - datablocks always come from .blend files (besides the added feature of 'files as data-blocks' used for image files e.g., but this is orthogonal to and fairly independent from the asset project really). When it goes missing, just like currently, we generate a placeholder keeping the uuid references of the asset.

Editing of the assets should be handled by the asset engine really (easiest is just to launch another blender instance with some scripts, although other scenarii are also possible).

Anyway, all this is kind of unrelated to that specific patch? This only adds an (non-blender managed) UUID to the IDs. Its a first step towards the full asset engine project, but...

For copying datablocks, another way to do that is to copy the .blend file or save the .blend with a different name. This isn't uncommon if someone is creating a bunch of assets, or making a variation for a different shot. And then you'd have a conflicting UUID, unless the filepath is taken into account by the asset engine.

This is the problem of the asset engine really, again it is supposed to manage its UUIDs, not Blender.

The main problems I see in this patch are as follows.

  • There is no code yet for asset libraries to link the asset to a specific asset engine, which means once this is committed .blend files will be created with datablock UUIDs whose meaning is undefined, and it's unclear which asset engine is supposed to do anything with it.

This is not really needed currently, since this patch does not add any concept of asset engine at all. later on, once/when an asset engine gets 'ownership' of an ID, it will almost certainly overwrite that UUID if it already exists.

  • The meaning of a datablock placeholder with UUID different from the original datablock is unclear. It seems like the start of a system to load linked datablocks from an asset engine directly, but the link to another .blend file remains and gets used. So what is this exactly?

As said above, asset engine system builds on top, adds an extra layer of processing over linking, it does not replace it. Placeholder UUID being different from linked one is a possibility, in case the asset engine replaces the UUID of the datablock in its owned storage/.blend files. Though I would not recommend it to do that, it might happen, in that case Blender still should keep the UUID it was stored with to query the asset engine for updates, and that engine shall handle the update of the UUID...

  • AssetUUID has both a persistent UUID and runtime metadata about assets. That could be two separate data structures, and maybe the runtime part does not even need to be part of the Blender main database if it's just needed to display information in the asset manager UI.

It could be two different structures, yes, but imho that would make things more confusing and complicated to handle, especially in the whole RNA registrable asset engine structure and its callbacks, for very little benefits... The overhead of storing a few runtime values in IDs is fairly low.

  • Existing asset systems that might get integrated in Blender generally do not use 128 bit integers as identifiers as far as I know. For example identifiers in USD are strings. Maybe that's not a use case intended to be supported by the asset manager. But if it is, requiring a database that maps between Blender UUIDs and USD identifiers would add a lot of complexity also for users, if all they wanted was to have some USD files and .blend files and link them together.

Again, USD paths (and 'name' based identifiers in general) can change, we also have to deal with that with our IDs names, and this is precisely why we want to use (typically never-changing once defined) UUIDs. I would consider that if an asset engine wants to provide assets from a USD repository, it should indeed generate its own mapping/database between USD paths and proper UUIDs.

Also, USD is not really an asset system to me, more like a storage system, like a .blend file, or a file system. You would not use USD files themselves to perform asset search and filtering e.g., you'd need a proper database for that anyway?

Finally, USD also falls (like other formats, OBJ, FBX, etc., and 'raw' files like image or video ones) in the orthogonal 'load file as an ID' project, which has a very minimal initial implementation in the asset-engine branch currently, with the concept of 'virtual' library data-block... But we are again getting way beyond current patch here. ;)

I don't know where you read that in that mail, but this has certainly never be agreed on. Both append and link should be supported. But indeed asset UUID is only used (beyond filebrowser using asset engines, which is not in this patch) for linked IDs, and/or IDs in library files (to be linked).

Then this is missing a design doc that explains a practical use case and linking workflow from the user point of view.

Like, which specific problems is this intended to solve and how? When would a user use an asset engine rather than regular library linking?

The design is actually pretty simple: we keep linking process (almost) unchanged, and add an optional extra layer of processing handled by the asset engine afterwards. This to both keep loading of working .blend file itself as quick as possible, and allow to keep a reasonably working .blend file even if you do not have access to the asset-engine and/or repositories.

This might be simple from an implementation point of view. But from the point of view of an artists or TD, it's not.

From my understanding, you would then have datablocks saved both in the .blend, and in some asset repository. So that means users are now responsible for saving their data in two places, and ensuring they are in sync? For e.g. Cosmos I guess that would be on top of the version control where they already have a local version of the .blend file and one in the svn repository. I have a hard time understanding how as a user you can keep track of all this, which version of the data you're looking at, and ensuring that you've saved it all in the right places.

An asset engine can add additional UI to help with this, but I don't think it can do a good enough job. And if it's supposed to be in sync and the UI helps ensure that, then it goes back to the question of why you would store a datablock in two places at all.

Think all this is fairly well defined currently - datablocks always come from .blend files (besides the added feature of 'files as data-blocks' used for image files e.g., but this is orthogonal to and fairly independent from the asset project really). When it goes missing, just like currently, we generate a placeholder keeping the uuid references of the asset.

In this patch, the UUID on the placeholder always overrides the UUID from the datablock in the original .blend file. That means if you go into the original .blend file, edit the datablock and save a new reversion, then that revision will not be propagated to other .blend files. The placeholder UUID will always refer to the old revision.

When it loads from the .blend file it will get the latest revision. But then if the asset engine gets involved it might load the older revision to replace it? There may be cases where you to specifically use an older revision, but that should be explicitly set by the user.

Anyway, all this is kind of unrelated to that specific patch? This only adds an (non-blender managed) UUID to the IDs. Its a first step towards the full asset engine project, but...

I think it's related. It seems the reason to have this as a persistent data structure is linking workflows, and it's not clear to me that it is designed correctly for that.

It's obviously useful to provide a good user interface for either a local set of .blend files with assets, or a production repository. That means efficient searching, filtering, previews, etc. For that kind of functionality the asset system does not need to affect how linking works or even save any persistent data in .blend files. It can index data in existing files or repositories, and append data or create library links as usual, but with a better UI.

The other useful functionality is more user friendly version control, ability to not have the entire repository on disk, multiple datablock revisions, incremental loading, etc. There I can see justification to modify the .blend file saving and loading flow. But I think this approach with Assert UUIDs is doing it at the wrong level, it's too late after the datablocks have already been loaded from local .blend files.

I get the feeling we keep skipping each other (as in, what we mean) in that discussion, and am also not really ready to talk again about the basic root choices and concepts on which that whole project has been built for the past five years.

Not to mention that talking like that with mail-type communication when we are all basically in the same place does not helps alleviating the frustration here.

Until we can sort that out, putting this in the freezer. Am growing more and more tired about that whole project tbh, too many people tearing it in too many directions with too many different opinions on what it should be/do and how it should do it.