Page MenuHome

VSE cache rewrite
ClosedPublic

Authored by Richard Antalik (ISS) on Mar 4 2019, 1:30 AM.

Details

Summary

Sequencer cache

Overview:
Structure of this cache was modified to allow implementation of "image dumping" (replacement of non-movie proxies) and Prefetching. This means, that sequencer cache no longer uses IMB_moviecache API, but its own. This will duplicate some functionality, wont't introduce unnecessary complexity to IMB_moviecache.

Each Scene owns single sequencer cache.

This cache should be viewed as "3D" space for stored images:

  • each frame of timeline can be cached
  • frame is composed of number of strips, that can be cached individually
  • each strip has number of rendering stages that can be cached

Cached stages

  • SEQ_CACHE_RAW Image as provided by input codec / read from disk.
  • SEQ_CACHE_PREPROCESSED: Image after scaling and color transformations applied, ...
  • SEQ_CACHE_COMPOSITE: Intermediate composited images(blended with another strips)
  • SEQ_CACHE_FINAL_OUT: Final image for each frame. This is a copy of last(topmost) strip composite image, so if both cached, memory usage is about the same as if only 1 of them is cached

Image cost
Each stored image has a cost assigned. Cost is calculated as ratio of time spent on rendering to maximum possible time to keep up with chosen frame rate. The higher the cost, the harder it is to render image.

Maximum image cost is limited to arbitrary value of 10

Cache size limiting
When cache is full, all images of most distant frame from playhead are removed. Modifying property "Recycle Up To Cost" recycle_max_cost can prevent removal of expensive frames, "locking" them in memory.

Temporary cache
Sometimes it is required to use single image multiple times to render single frame. To prevent unnecessary re-rendering, all stages of rendering are stored in cache until the frame is rendered completely and some(depending on storage configuration) are removed before next frame is rendered.

This means, that you can modify topmost strip in stack without need to re-render all strips leading to faster update rate in some cases

Dependency
To ensure, that stages chosen by user are always cached, a dependency system is used. To achieve this, when removing an image from cache we also have to remove all images that were created using this image.

Currently, when image is invalidated we remove the whole frame.

Memory visualization
Stored images can be visualized directly on timeline by setting "Show Cache" show_cache and show_cache_TYPE property to True

Configuration / RNA API
show_cache - Visualize cached images on the timeline
show_cache_TYPE - Visualize TYPE on timeline
cache_TYPE - Enable storage of TYPE
recycle_max_cost - Only frames with cost lower than this value will be recycled
override_cache_settings - Override global cache settings(per sequence)

sequence_editor properties

  • show_cache
  • show_cache_final_out
  • show_cache_raw
  • show_cache_preprocessed
  • show_cache_composite
  • cache_raw
  • cache_preprocessed
  • cache_composite
  • cache_final
  • recycle_max_cost

sequence properties

  • cache_raw
  • cache_preprocessed
  • cache_composite
  • override_cache_settings

Diff Detail

Repository
rB Blender

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
source/blender/imbuf/IMB_moviecache.h
45 ↗(On Diff #13992)

Sorry I didn't know this would be a problem.

I could do that, question is, if it is good idea to introduce such low level features to moviecache API.

So I would need IMB_moviecache_* functions:

  • expose moviecache_keyfree()
  • expose moviecache_valfree()
  • get MovieCacheItem->ibuf to calculate size and manage memory_used
  • get MovieCacheKey->userkey for following key links
  • get cache->hash for iterator
  • get cache->last_key for linking keys

get/set cache->memory_used can be external to cache
is provided by limiter, but I've had to duplicate this functionality as I don't use limiter

cache->cache_mutex can be external to cache

source/blender/makesdna/DNA_sequence_types.h
650–651

Yes, this would be preview image transformed to display colorspace

Not sure if I should add support for this as we can use GLSL, which is fast.
Also if using GLSL I would have to read texture data from GPU to store it in RAM?

The other thing would be then eventual prefetch support to generate this kind of image.

source/blender/makesrna/intern/rna_sequencer.c
1755–1758

Cost is proportional to rendering time, inversely proportional to playback frame rate. Cost of 1 is when render rate of image is the same as playback frame rate.

Is this too long?

Or cost = rendering time / playback frame rate

1776–1780

cost has defined maximum of SEQ_CACHE_COST_MAX which is 10

You would want to set this value if you have part of timeline that renders really slowly, and rest is fast enough. Slow part of timeline stays in cache while rest of available memory is used to cycle "cheap" frames.

When you run out of memory, no new frames will be added to cache. This property should be set to SEQ_CACHE_COST_MAX by default, so no frames are "locked" in cache.

This is easiest method I can think of to achieve this effect. As I said it may be not good enough, but I can not think about anything better.

Brecht Van Lommel (brecht) requested changes to this revision.Mar 6 2019, 6:59 PM
Brecht Van Lommel (brecht) added inline comments.
release/scripts/startup/bl_ui/space_sequencer.py
1150

Indeed all the category names should be capitalized like a title.

Leaving Settings in this case seems fine then.

1195

If it's just a few places might as well. Also fine to commit the change any time before this patch.

source/blender/blenkernel/intern/seqcache.c
423

Not sure I understand, in this patch this function is only called from BKE_sequencer_cache_get_item_for_removal just below.

source/blender/imbuf/IMB_moviecache.h
45 ↗(On Diff #13992)

Would it be possible to fold more features into the general movie cache?

I didn't look at it close enough, but I would guess that the logic from BKE_sequencer_cache_put_if_possible can be made generic. Maybe with some additional API functions or callback functions.

Right now I find it rather hard to understand the cache implementation spread out over two places that are somehow still tightly coupled.

source/blender/makesdna/DNA_sequence_types.h
650–651

For GLSL you would only write a float buffer to GPU memory and let it do the color management on every draw. The result would never be read back to the CPU.

If GLSL is fast enough in practice that seems preferable.

source/blender/makesrna/intern/rna_sequencer.c
1663

For property UI names, every word needs to be capitalized like a title. So this should be Cache Raw, and similar for other properties.

1755–1758

I think something like this could be ok:
Visualize cost of rendering the frame (render time divided by playback frame rate)

Also note descriptions should not have a . at the end.

1776–1780

Ok, I guess what we are trying to control here is relative priority between recent frames and older expensive frames.

Perhaps it should still leave e.g. 30% of the cache for cheaper frames, so it doesn't working entirely? The cost could also be relative. You could consider frame A in the cache to be "expensive" if its cost is at least e.g. 2x as high as frame B you are trying to insert.

If it's an advanced option disabled by default we can experiment with different strategies after this patch is committed though, it doesn't have to be perfect.

This revision now requires changes to proceed.Mar 6 2019, 6:59 PM
Richard Antalik (ISS) marked 11 inline comments as done.Mar 7 2019, 12:14 AM
Richard Antalik (ISS) updated this revision to Diff 14034.

renamed free_after_render -> is_temp_cache
removed some prefetch specific code
Added this text to seqcache.c:

Sequencer cache design notes
Most of the functionality is compatible with IMB_moviecache.
Cache size limiting and "threaded access" is not.

Cache key members:
is_temp_cache - this cache entry will be freed before rendering next frame
creator_id - ID of thread that created entry
cost - In short: render time divided by playback frame rate
link_prev/next - link to another entry created during rendering of the frame

Linking: We use links to reduce number of iterations needed to manage cache.
Only pernament (is_temp_cache = 0) cache entries are linked.

Function:
All images created during rendering are added to cache, even if the cache is already full.
This is because:

  • one image may be needed multiple times during rendering.
  • keeping the last rendered frame allows us for faster re-render when user edits strip in stack
  • we can decide if we keep frame only when it's completely rendered. Otherwise we risk having "holes" in the cache, which can be annoying

If the cache is full all entries for pending frame will have is_temp_cache set.

Only entire frame can be freed to release resources for new entries (recycling).
Once again, this is to reduce number of iterations, but also more controllable than removing
entries one by one in reverse order to their creation.

User can exclude caching of some images. Such entries will have is_temp_cache set.


I am sure I can add functionality to IMB_moviecache, I am just not sure if it would look nice...
From above text it seems that it just needs linking, and quite specific mode for limiter. Also I never did anything in CPP, so

source/blender/blenkernel/intern/seqcache.c
371

Forgot to ask about creator_id

This is to identify which thread is clearing "temp cache"
I heard that some API to identify threads was introduced, can I use that?

Or perhaps I should create enum listing "applications" that can run in background using cache. Seems less prone to erratic function(leftover fragments in cache etc.)

source/blender/makesrna/intern/rna_sequencer.c
1776–1780

This can be considered advanced option. Experimental for sure - lot of development can be done here. I would start with something simple.

The cost could also be relative. You could consider frame A in the cache to be "expensive" if its cost is at least e.g. 2x as high as frame B you are trying to insert.

I had it working this way. Sounds good on paper, but in reality it caused problems and was not reliable. this way you can be sure that further playback wont affect "locked" frames.

Perhaps it should still leave e.g. 30% of the cache for cheaper frames, so it doesn't working entirely?

I can do this, but I would rather implement prefetching first, that way I can better test features to not be in conflict.

Also sequencer does not need cache to work properly. Currently when you fill up the cache in sequencer, frames are not freed until you manually hit the refresh button.

Richard Antalik (ISS) edited the summary of this revision. (Show Details)Mar 7 2019, 12:39 AM

I was thinking about this IMB_moviecache issue we have here - in D3934: VSE cache with frame prefetching (WIP) I pointed to a direction I want to go with cache

Plans for future work
Image dumping:

  • Add MovieCache->dumper
    • This will be callback, that will accept void *userkey, ImBuf *ibuf (anything else?)
  • Sequencer should set callback similar to function seq_proxy_build_frame only it will get SeqCacheKey and ImBuf
  • Save image to <project-path>/<scene-name>/<seq-name>/<cache-type[Number]>/cfra.ext
    • formats to use should be (at least to start with) JPEG, PNG, EXR
    • path MUST be properly escaped (ideally scene/strip names should be limited to be filename compliant)

Cache invalidation:

  • Proper system must be developed, that will determine, what images must be deleted, by provided action and strips affected
  • With Image dumping, We will need a means to manage validity of external files.
    • Should be fast and shouldn't use too much memory. Saving checksums of images is probably most effective solution.
    • Does not have to be 100% correct depending on implementation? This may not be the best idea, as it would mean, that there would have to be a manual full check routine at least for range, which would complicate code significantly...
    • checksum table file must be pernament. Either as external file(slow?), or included in .blend itself as a data structure(may result in larger .blend file). Tables must be resizeable, appendable, randomly accessible.
    • if file is to be read to calculate checksum, can we save time, by reading every n-th byte?

Example:
10min of footage in timeline at 60FPS, SEQ_CACHE_COMPOSITE and SEQ_CACHE_FINAL_OUT are dumped
36000 frames total, ~70KB of data with CRC8 checksum


IMB_moviecache doesn't really need such features, so maybe it would be better to not use IMB_moviecache at all.
This may lead to some code duplication, but I guess cleaner implementation.

Richard Antalik (ISS) marked 11 inline comments as done.Mar 12 2019, 11:01 AM
Richard Antalik (ISS) edited the summary of this revision. (Show Details)
Richard Antalik (ISS) updated this revision to Diff 14124.

Added default values for RNA properties
SeqCache is "self contained" - no reference to IMB_moviecache

Implemented 2 iterators for memview

Ghash

  • fast
  • need to lock cache while iterating

SeqCacheStatsItem array

  • slower (need to alloc memory)
  • no lock needed
  • user has to free memory

With iterators I was considering application such as https://devtalk.blender.org/t/vse-strip-previews/5854
But that may need special handling...

  • change type of BKE_sequencer_cache_put_if_possible to bool
  • implement ghash iterator utilizing callback

This allows for fastest iteration, while managing cache lock internally.

Brecht Van Lommel (brecht) requested changes to this revision.Mar 25 2019, 7:11 PM

Works well in my tests, still some comments which are hopefully easy to address

  • I expected the options to show the cache to be in the View menu, similar to how it is in the timeline.
  • Cache visualization should be more similar to the timeline (just add e.g. a softbody modifier to see it). Thinner, orange and semi-transparent and maybe at the bottom? It's a bit too eye-catching now I think, and drawing similar to the timeline helps users recognize what it is.
  • The little blocks drawing over strips seems rather distracting to me. I'm not sure this is something we should show by default, it seems more like a debugging thing? I'd suggest to have that off by default, or leave it only as a debugging option for developers. Unless there is a purpose that I'm missing.
  • Do you plan to add more fine-grained strip invalidation? For example if you tweak a strip saturation, it could leave the raw cache intact and only clear the preprocessed one? Doesn't have to be part of this commit, just thinking this could go a long way to making parameter tweaking smoother.
release/scripts/startup/bl_ui/space_sequencer.py
1161–1169

This should not use toggle, we only use that for specific cases, not regular booleans.

1164–1167

Looking at this UI, it still seems complicated to explain to users how to configure this, why you would enable certain options and what the pros/cons are.

The defaults seems quite reasonable to me though, so hopefully it won't need much manual tweaking.

1270

Add layout.active = strip.override_cache_settings, so these settings are grayed out when not in use.

1271–1273

Same here, this should not use toggle.

source/blender/blenkernel/intern/seqcache.c
177

This function (except for the IB_PERSISTENT check) could be a general utility function in ImBuf and shared with moviecache?

463

This is not strictly threadsafe, two threads could be creating the sequencer cache at the same time. But it's easy to fix with a global mutex lock, which is fine in this case since it's not a performance bottleneck.

static ThreadMutex cache_create_lock = BLI_MUTEX_INITIALIZER;

BLI_mutex_lock(&cache_create_lock);
if (scene->ed->cache == NULL) {
    SeqCache *cache = MEM_callocN(sizeof(SeqCache), "SeqCache");
    ...
    scene->ed->cache = cache;
}
BLI_mutex_unlock(&cache_create_lock);

return scene->ed->cache;
485

There is no corresponding unlock which is problematic.

But there is no need to lock the cache here, this should never be called from multiple threads. So just remove this line.

source/blender/blenkernel/intern/sequencer.c
2845

sequencer.c:2773:89: warning: unused parameter ‘nr’

3534

Use seq_estimate_render_cost_begin(void) to fix warning.

3685

This line does nothing? I think you can just declare clock_t begin later on instead of the start of the function.

3721

Use seq_estimate_render_cost_begin()

3746

Use seq_estimate_render_cost_begin()

3761

This line does nothing?

3795

You shouldn't have to cast to (Sequence **) here? I think passing , seq_arr) would be correct?

4030

sequencer.c:4030:95: warning: unused parameter ‘invalidate_preprocess

source/blender/blenloader/intern/versioning_280.c
2842–2846

You should increase the subversion in BKE_blender_version.h and add this in a new if (!MAIN_VERSION_ATLEAST(bmain, 280, XX)) { block. Otherwise it won't correctly initialize all cases.

source/blender/editors/space_sequencer/sequencer_draw.c
1671

sequencer_draw.c:1671:16: warning: initialization discards ‘const’ qualifier from pointer target type

1755

sequencer_draw.c:1755:8: warning: unused variable ‘final_stripe_offs’

1766

sequencer_draw.c:1766:48: warning: passing argument 3 of ‘BKE_sequencer_cache_iterate’ from incompatible pointer type

source/blender/editors/space_sequencer/space_sequencer.c
681–682

Remove both the comment and this line I think, if it's no longer needed. Seems like a workaround for some kind of bug.

source/blender/makesdna/DNA_sequence_types.h
675–676

Does this TODO need to be in the commit?

source/blender/makesrna/intern/rna_camera.c
128

I think this should clear the sequencer cache of scenes that are using a scene strip with this camera. So not the active scene that the camera is in.

Simplest solution is just to clear the cache of all scenes which should be good enough.

source/blender/makesrna/intern/rna_color.c
576

This should do all scenes, not just the active one.

582

rna_color.c:582:10: warning: declaration of ‘scene’ shadows a parameter

source/blender/makesrna/intern/rna_scene.c
1810

Same comment here, it's not this scene whose cache needs to be cleared but the one that has the scene strip.

1967

Same comment.

source/blender/makesrna/intern/rna_sequencer.c
1760–1774

Descriptions here could explain a bit more? Perhaps something like this:

  • Cache raw images read from disk, for faster tweaking of strip parameters at the cost of memory usage
  • Cache preprocessed images, for faster tweaking of effects at the cost of memory usage
  • Cache intermediate composited images, for faster tweaking of stacked strips at the cost of memory usage
  • Cache final image for each frame
This revision now requires changes to proceed.Mar 25 2019, 7:11 PM

Works well in my tests, still some comments which are hopefully easy to address

  • I expected the options to show the cache to be in the View menu, similar to how it is in the timeline.
  • Cache visualization should be more similar to the timeline (just add e.g. a softbody modifier to see it). Thinner, orange and semi-transparent and maybe at the bottom? It's a bit too eye-catching now I think, and drawing similar to the timeline helps users recognize what it is.

Not familiar with softbody cache visualization - I can make it similar to that. I took inspiration from movieclip.
Certainly thinner would be OK
Placement on top was chosen due to fact, that that area is more likely to be free.
in 2.79 channel 0 was drawn above the scrollers, and was always empty, so I was drawing there.

I can offset drawing strips above scrollers and remove channel 0. Would that be OK?

  • The little blocks drawing over strips seems rather distracting to me. I'm not sure this is something we should show by default, it seems more like a debugging thing? I'd suggest to have that off by default, or leave it only as a debugging option for developers. Unless there is a purpose that I'm missing.

Should this be controlled by #define or just as a menu item?

  • Do you plan to add more fine-grained strip invalidation? For example if you tweak a strip saturation, it could leave the raw cache intact and only clear the preprocessed one? Doesn't have to be part of this commit, just thinking this could go a long way to making parameter tweaking smoother.

Yes.
Plan is to tackle performance first with prefetch and cache disk dumping, which should cover most (all) cases.
Finer invalidation should follow these changes.

source/blender/blenkernel/intern/seqcache.c
485

A thread may be using cache when it's being freed, so it should be locked.

Here I unlocked the cache after cache is freed, but looking at a way I did this may be flawed.

Not familiar with softbody cache visualization - I can make it similar to that. I took inspiration from movieclip.
Certainly thinner would be OK

Seems we are already a bit inconsistent in Blender itself. I don't really care if it's blue or orange, but this thinner semi-transparent style that both the timeline and movieclip editor use would be good.

Placement on top was chosen due to fact, that that area is more likely to be free.
in 2.79 channel 0 was drawn above the scrollers, and was always empty, so I was drawing there.
I can offset drawing strips above scrollers and remove channel 0. Would that be OK?

To me it seems reasonable. But I don't know what channel 0 is/was even for so don't know the implications of removing it.

  • The little blocks drawing over strips seems rather distracting to me. I'm not sure this is something we should show by default, it seems more like a debugging thing? I'd suggest to have that off by default, or leave it only as a debugging option for developers. Unless there is a purpose that I'm missing.

I'd be inclined to just use a #define, unless you expect users will want to see this.

Plan is to tackle performance first with prefetch and cache disk dumping, which should cover most (all) cases.
Finer invalidation should follow these changes.

I was thinking of a different problem than prefetch and cache disk solve. What I think could be faster is tweaking strip parameters and effect parameters within a single frame. Right now it seems to load raw images and re-render scene strips for that all the time. Using a variation of rna_Sequence_update for certain properties that don't need to clear raw images in the cache would be relatively simple to add I expect. And as I understand it the raw images for the current frame are always in the cache regardless of the cache settings.

source/blender/blenkernel/intern/seqcache.c
485

This function is only called from BKE_sequencer_editing_free which frees all the sequencer data structures, I don't think it's possible a thread would still be using the cache then?

I was thinking of a different problem than prefetch and cache disk solve. What I think could be faster is tweaking strip parameters and effect parameters within a single frame. Right now it seems to load raw images and re-render scene strips for that all the time. Using a variation of rna_Sequence_update for certain properties that don't need to clear raw images in the cache would be relatively simple to add I expect. And as I understand it the raw images for the current frame are always in the cache regardless of the cache settings.

I understand.
This case could be solved easily. I will do separate patch for this.

Richard Antalik (ISS) marked 28 inline comments as done.Apr 15 2019, 12:59 AM
Richard Antalik (ISS) updated this revision to Diff 14731.

Cache visualization is now displayed in similar way to timeline cache visualization.

I decided to visualize raw, preprocessed and composite images inside of strips as before, otherwise it wouldn't make sense, but style is changed to be consistent. Also to see these types you have to manually enable them.

release/scripts/startup/bl_ui/space_sequencer.py
1164–1167

Best to explain in detail in user manual I guess?

source/blender/blenloader/intern/versioning_280.c
2842–2846

I guess I will have to resolve possible conflicts before committing here

source/blender/editors/space_sequencer/sequencer_draw.c
1766

Got no warning here. Should be fixed.

source/blender/makesrna/intern/rna_color.c
582

got no warning here, should be fixed. Even though renaming that variable may be better?

Brecht Van Lommel (brecht) updated this revision to Diff 14935.

Rebase on latest master, apply clang-format, subversion bump.

Fix compiler warnings.

Harbormaster completed remote builds in B3431: Diff 14936.

Fix missing Show Cache submenu in the View menu.

I reviewed the latest changes, and rebased the code along with it. To me this seems ready to go in.

If you commit this, do not use arc land. It might have me as the author due to being the last to update this diff. And regardless, the commit message should be manually edited to follow the guidelines.

release/scripts/startup/bl_ui/space_sequencer.py
1164–1167

Indeed.

This revision is now accepted and ready to land.Apr 25 2019, 9:49 PM

Thanks, I applied clang-format yesterday.

Also found typo: SEQ_CACH_ALL_TYPES will fix that and commit.

This revision was automatically updated to reflect the committed changes.
Richard Antalik (ISS) edited the summary of this revision. (Show Details)Apr 29 2019, 12:29 AM

Committed with modification to SEQUENCER_MT_view_cache, so it is consistent with timeline.
Modified diff description with actual design documentation