Page MenuHome

VSE cache rewrite
Needs ReviewPublic

Authored by Richard Antalik (ISS) on Mar 4 2019, 1:30 AM.

Details

Summary

Part of D3934: VSE cache with frame prefetching (WIP) and a basis for prefetching

there are few changes to D3934: VSE cache with frame prefetching (WIP)

  • removed some unnecessary code
  • induvidual cache settings for sequences
  • toggle for memview to show cost
  • prefetch/store ratio replaced with direct min cost property.
    • prefetch/store ratio behavior felt unnatural to me.
    • Would have to edit some video to test how this works

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
source/blender/blenkernel/BKE_sequencer.h
273–285

A lot of this stuff should not be exposed as public functions, but rather kept as static functions in the file.

Otherwise it's not clear which functions from this long list are ones that you are actually supposed to use.

source/blender/blenkernel/intern/seqcache.c
187–188

This is going to give warnings with some compilers, just assign test on the line above and check if it's != NULL in the if.

193–196

The cache will not be locked here after BKE_sequencer_cache_create() even though the rest of the function expects it.

The cache mutex will need to be moved outside of the cache itself to support this kind of on-demand cache creation.

198

Nitpick: best to move the SeqCacheKey key line right above key.seq = seq, reduces risk of using uninitialized variables.

377

static functions should not have BKE_ prefix.

I guess this function needs a lock too but has no comment about it.

468

shuuldn't -> shouldn't, safe, then -> safe than.

source/blender/blenkernel/intern/sequencer.c
3540–3541

Add a comment like this, since we don't use clock() much and it's not obvious why it is used here: "Estimate time spent by the program rendering the strip. clock() is used so we count time spent by the program itself, not total time passed."

This begin/end time estimation logic is also duplicated a few times, best to wrap that in a function.

clock_t begin = seq_estimate_render_cost_begin();
float cost = seq_estimate_render_cost_end(scene, begin);
3748

Is this ever not true?

source/blender/editors/space_sequencer/sequencer_draw.c
1638

{ on new line.

1644

Just return if this is false so code does need to be indented so much.

1652–1655

I suggest to just use float bg_color[4] = {0.5f, 0.5f, 0.5f, 1.0f}; and Color4f here.

1694

This is too late to clamp to 255, it will already have overflowed since char can't store more than 255. Just use float instead.

source/blender/editors/transform/transform_generics.c
1058–1063

Seems BKE_sequence_tx_fullupdate_test is no longer needed if it does the same either way.

source/blender/imbuf/IMB_moviecache.h
45 ↗(On Diff #13992)

Exposing these structs in the public API is not ideal, can we add a few IMB_moviecache_* functions to keep the implementation hidden?

source/blender/makesdna/DNA_sequence_types.h
650–651

Is this still a TODO?

This revision now requires changes to proceed.Mar 5 2019, 5:36 PM
Richard Antalik (ISS) marked 17 inline comments as done.
Richard Antalik (ISS) added inline comments.
release/scripts/startup/bl_ui/space_sequencer.py
1150

So it should be also Proxy Settings Strip Cache?

Settings may seem redundant, but both proxy and cache panels are split for global settings and per strip settings.

I want this distinction to be as evident as possible

1195

fix in this patch?

source/blender/blenkernel/intern/seqcache.c
112

This key linking system is not obvious and hard to understand just from code.
I should probably write few sentences in comment explaining purpose and functionality.

To answer if BKE_sequencer_cache_put_if_possible can not put 'final' image in the cache, all images that has been put in after last 'final' image will be marked as temporary and freed before next frame is rendered.

basically this is so we don't stop caching in the middle of rendering.

377

This function should be called only from within "locked" function.

I only commented about lock usage in public functions.
Haven't cleaned api though.

423

Needed this for prefetch as it has to have it's own cache handling.

At least in latest stage I don't exit thread, but rather put thread to sleep when memory is full. When freed, BKE_sequencer_cache_recycle_linked is called.

Will rethink this not even sure if I should just "pause" execution or rather exit thread.

source/blender/editors/transform/transform_generics.c
1058–1063

Good catch.
I probably changed this in stage when I was trying to use builtin limiter, which doesn't care of any dependencies.

Now I may be able to restore this functionality.

I have plans to go over whole invalidation system, as it seems more broken then functional to me.

source/blender/imbuf/IMB_moviecache.h
45 ↗(On Diff #13992)

Sorry I didn't know this would be a problem.

I could do that, question is, if it is good idea to introduce such low level features to moviecache API.

So I would need IMB_moviecache_* functions:

  • expose moviecache_keyfree()
  • expose moviecache_valfree()
  • get MovieCacheItem->ibuf to calculate size and manage memory_used
  • get MovieCacheKey->userkey for following key links
  • get cache->hash for iterator
  • get cache->last_key for linking keys

get/set cache->memory_used can be external to cache
is provided by limiter, but I've had to duplicate this functionality as I don't use limiter

cache->cache_mutex can be external to cache

source/blender/makesdna/DNA_sequence_types.h
650–651

Yes, this would be preview image transformed to display colorspace

Not sure if I should add support for this as we can use GLSL, which is fast.
Also if using GLSL I would have to read texture data from GPU to store it in RAM?

The other thing would be then eventual prefetch support to generate this kind of image.

source/blender/makesrna/intern/rna_sequencer.c
1755–1758

Cost is proportional to rendering time, inversely proportional to playback frame rate. Cost of 1 is when render rate of image is the same as playback frame rate.

Is this too long?

Or cost = rendering time / playback frame rate

1776–1780

cost has defined maximum of SEQ_CACHE_COST_MAX which is 10

You would want to set this value if you have part of timeline that renders really slowly, and rest is fast enough. Slow part of timeline stays in cache while rest of available memory is used to cycle "cheap" frames.

When you run out of memory, no new frames will be added to cache. This property should be set to SEQ_CACHE_COST_MAX by default, so no frames are "locked" in cache.

This is easiest method I can think of to achieve this effect. As I said it may be not good enough, but I can not think about anything better.

Brecht Van Lommel (brecht) requested changes to this revision.Mar 6 2019, 6:59 PM
Brecht Van Lommel (brecht) added inline comments.
release/scripts/startup/bl_ui/space_sequencer.py
1150

Indeed all the category names should be capitalized like a title.

Leaving Settings in this case seems fine then.

1195

If it's just a few places might as well. Also fine to commit the change any time before this patch.

source/blender/blenkernel/intern/seqcache.c
423

Not sure I understand, in this patch this function is only called from BKE_sequencer_cache_get_item_for_removal just below.

source/blender/imbuf/IMB_moviecache.h
45 ↗(On Diff #13992)

Would it be possible to fold more features into the general movie cache?

I didn't look at it close enough, but I would guess that the logic from BKE_sequencer_cache_put_if_possible can be made generic. Maybe with some additional API functions or callback functions.

Right now I find it rather hard to understand the cache implementation spread out over two places that are somehow still tightly coupled.

source/blender/makesdna/DNA_sequence_types.h
650–651

For GLSL you would only write a float buffer to GPU memory and let it do the color management on every draw. The result would never be read back to the CPU.

If GLSL is fast enough in practice that seems preferable.

source/blender/makesrna/intern/rna_sequencer.c
1663

For property UI names, every word needs to be capitalized like a title. So this should be Cache Raw, and similar for other properties.

1755–1758

I think something like this could be ok:
Visualize cost of rendering the frame (render time divided by playback frame rate)

Also note descriptions should not have a . at the end.

1776–1780

Ok, I guess what we are trying to control here is relative priority between recent frames and older expensive frames.

Perhaps it should still leave e.g. 30% of the cache for cheaper frames, so it doesn't working entirely? The cost could also be relative. You could consider frame A in the cache to be "expensive" if its cost is at least e.g. 2x as high as frame B you are trying to insert.

If it's an advanced option disabled by default we can experiment with different strategies after this patch is committed though, it doesn't have to be perfect.

This revision now requires changes to proceed.Mar 6 2019, 6:59 PM
Richard Antalik (ISS) marked 11 inline comments as done.

renamed free_after_render -> is_temp_cache
removed some prefetch specific code
Added this text to seqcache.c:

Sequencer cache design notes
Most of the functionality is compatible with IMB_moviecache.
Cache size limiting and "threaded access" is not.

Cache key members:
is_temp_cache - this cache entry will be freed before rendering next frame
creator_id - ID of thread that created entry
cost - In short: render time divided by playback frame rate
link_prev/next - link to another entry created during rendering of the frame

Linking: We use links to reduce number of iterations needed to manage cache.
Only pernament (is_temp_cache = 0) cache entries are linked.

Function:
All images created during rendering are added to cache, even if the cache is already full.
This is because:

  • one image may be needed multiple times during rendering.
  • keeping the last rendered frame allows us for faster re-render when user edits strip in stack
  • we can decide if we keep frame only when it's completely rendered. Otherwise we risk having "holes" in the cache, which can be annoying

If the cache is full all entries for pending frame will have is_temp_cache set.

Only entire frame can be freed to release resources for new entries (recycling).
Once again, this is to reduce number of iterations, but also more controllable than removing
entries one by one in reverse order to their creation.

User can exclude caching of some images. Such entries will have is_temp_cache set.


I am sure I can add functionality to IMB_moviecache, I am just not sure if it would look nice...
From above text it seems that it just needs linking, and quite specific mode for limiter. Also I never did anything in CPP, so

source/blender/blenkernel/intern/seqcache.c
371

Forgot to ask about creator_id

This is to identify which thread is clearing "temp cache"
I heard that some API to identify threads was introduced, can I use that?

Or perhaps I should create enum listing "applications" that can run in background using cache. Seems less prone to erratic function(leftover fragments in cache etc.)

source/blender/makesrna/intern/rna_sequencer.c
1776–1780

This can be considered advanced option. Experimental for sure - lot of development can be done here. I would start with something simple.

The cost could also be relative. You could consider frame A in the cache to be "expensive" if its cost is at least e.g. 2x as high as frame B you are trying to insert.

I had it working this way. Sounds good on paper, but in reality it caused problems and was not reliable. this way you can be sure that further playback wont affect "locked" frames.

Perhaps it should still leave e.g. 30% of the cache for cheaper frames, so it doesn't working entirely?

I can do this, but I would rather implement prefetching first, that way I can better test features to not be in conflict.

Also sequencer does not need cache to work properly. Currently when you fill up the cache in sequencer, frames are not freed until you manually hit the refresh button.

Richard Antalik (ISS) edited the summary of this revision. (Show Details)Mar 7 2019, 12:39 AM

I was thinking about this IMB_moviecache issue we have here - in D3934: VSE cache with frame prefetching (WIP) I pointed to a direction I want to go with cache

Plans for future work
Image dumping:

  • Add MovieCache->dumper
    • This will be callback, that will accept void *userkey, ImBuf *ibuf (anything else?)
  • Sequencer should set callback similar to function seq_proxy_build_frame only it will get SeqCacheKey and ImBuf
  • Save image to <project-path>/<scene-name>/<seq-name>/<cache-type[Number]>/cfra.ext
    • formats to use should be (at least to start with) JPEG, PNG, EXR
    • path MUST be properly escaped (ideally scene/strip names should be limited to be filename compliant)

Cache invalidation:

  • Proper system must be developed, that will determine, what images must be deleted, by provided action and strips affected
  • With Image dumping, We will need a means to manage validity of external files.
    • Should be fast and shouldn't use too much memory. Saving checksums of images is probably most effective solution.
    • Does not have to be 100% correct depending on implementation? This may not be the best idea, as it would mean, that there would have to be a manual full check routine at least for range, which would complicate code significantly...
    • checksum table file must be pernament. Either as external file(slow?), or included in .blend itself as a data structure(may result in larger .blend file). Tables must be resizeable, appendable, randomly accessible.
    • if file is to be read to calculate checksum, can we save time, by reading every n-th byte?

Example:
10min of footage in timeline at 60FPS, SEQ_CACHE_COMPOSITE and SEQ_CACHE_FINAL_OUT are dumped
36000 frames total, ~70KB of data with CRC8 checksum


IMB_moviecache doesn't really need such features, so maybe it would be better to not use IMB_moviecache at all.
This may lead to some code duplication, but I guess cleaner implementation.

Richard Antalik (ISS) marked 11 inline comments as done.
Richard Antalik (ISS) edited the summary of this revision. (Show Details)

Added default values for RNA properties
SeqCache is "self contained" - no reference to IMB_moviecache

Implemented 2 iterators for memview

Ghash

  • fast
  • need to lock cache while iterating

SeqCacheStatsItem array

  • slower (need to alloc memory)
  • no lock needed
  • user has to free memory

With iterators I was considering application such as https://devtalk.blender.org/t/vse-strip-previews/5854
But that may need special handling...

  • change type of BKE_sequencer_cache_put_if_possible to bool
  • implement ghash iterator utilizing callback

This allows for fastest iteration, while managing cache lock internally.

Brecht Van Lommel (brecht) requested changes to this revision.Mar 25 2019, 7:11 PM

Works well in my tests, still some comments which are hopefully easy to address

  • I expected the options to show the cache to be in the View menu, similar to how it is in the timeline.
  • Cache visualization should be more similar to the timeline (just add e.g. a softbody modifier to see it). Thinner, orange and semi-transparent and maybe at the bottom? It's a bit too eye-catching now I think, and drawing similar to the timeline helps users recognize what it is.
  • The little blocks drawing over strips seems rather distracting to me. I'm not sure this is something we should show by default, it seems more like a debugging thing? I'd suggest to have that off by default, or leave it only as a debugging option for developers. Unless there is a purpose that I'm missing.
  • Do you plan to add more fine-grained strip invalidation? For example if you tweak a strip saturation, it could leave the raw cache intact and only clear the preprocessed one? Doesn't have to be part of this commit, just thinking this could go a long way to making parameter tweaking smoother.
release/scripts/startup/bl_ui/space_sequencer.py
1161–1169

This should not use toggle, we only use that for specific cases, not regular booleans.

1164–1167

Looking at this UI, it still seems complicated to explain to users how to configure this, why you would enable certain options and what the pros/cons are.

The defaults seems quite reasonable to me though, so hopefully it won't need much manual tweaking.

1270

Add layout.active = strip.override_cache_settings, so these settings are grayed out when not in use.

1271–1273

Same here, this should not use toggle.

source/blender/blenkernel/intern/seqcache.c
177

This function (except for the IB_PERSISTENT check) could be a general utility function in ImBuf and shared with moviecache?

463

This is not strictly threadsafe, two threads could be creating the sequencer cache at the same time. But it's easy to fix with a global mutex lock, which is fine in this case since it's not a performance bottleneck.

static ThreadMutex cache_create_lock = BLI_MUTEX_INITIALIZER;

BLI_mutex_lock(&cache_create_lock);
if (scene->ed->cache == NULL) {
    SeqCache *cache = MEM_callocN(sizeof(SeqCache), "SeqCache");
    ...
    scene->ed->cache = cache;
}
BLI_mutex_unlock(&cache_create_lock);

return scene->ed->cache;
485

There is no corresponding unlock which is problematic.

But there is no need to lock the cache here, this should never be called from multiple threads. So just remove this line.

source/blender/blenkernel/intern/sequencer.c
2845

sequencer.c:2773:89: warning: unused parameter ‘nr’

3534

Use seq_estimate_render_cost_begin(void) to fix warning.

3685

This line does nothing? I think you can just declare clock_t begin later on instead of the start of the function.

3721

Use seq_estimate_render_cost_begin()

3746

Use seq_estimate_render_cost_begin()

3761

This line does nothing?

3795

You shouldn't have to cast to (Sequence **) here? I think passing , seq_arr) would be correct?

4030

sequencer.c:4030:95: warning: unused parameter ‘invalidate_preprocess

source/blender/blenloader/intern/versioning_280.c
2842–2846

You should increase the subversion in BKE_blender_version.h and add this in a new if (!MAIN_VERSION_ATLEAST(bmain, 280, XX)) { block. Otherwise it won't correctly initialize all cases.

source/blender/editors/space_sequencer/sequencer_draw.c
1671

sequencer_draw.c:1671:16: warning: initialization discards ‘const’ qualifier from pointer target type

1755

sequencer_draw.c:1755:8: warning: unused variable ‘final_stripe_offs’

1766

sequencer_draw.c:1766:48: warning: passing argument 3 of ‘BKE_sequencer_cache_iterate’ from incompatible pointer type

source/blender/editors/space_sequencer/space_sequencer.c
681–682

Remove both the comment and this line I think, if it's no longer needed. Seems like a workaround for some kind of bug.

source/blender/makesdna/DNA_sequence_types.h
675–676

Does this TODO need to be in the commit?

source/blender/makesrna/intern/rna_camera.c
128

I think this should clear the sequencer cache of scenes that are using a scene strip with this camera. So not the active scene that the camera is in.

Simplest solution is just to clear the cache of all scenes which should be good enough.

source/blender/makesrna/intern/rna_color.c
576

This should do all scenes, not just the active one.

582

rna_color.c:582:10: warning: declaration of ‘scene’ shadows a parameter

source/blender/makesrna/intern/rna_scene.c
1810

Same comment here, it's not this scene whose cache needs to be cleared but the one that has the scene strip.

1967

Same comment.

source/blender/makesrna/intern/rna_sequencer.c
1760–1774

Descriptions here could explain a bit more? Perhaps something like this:

  • Cache raw images read from disk, for faster tweaking of strip parameters at the cost of memory usage
  • Cache preprocessed images, for faster tweaking of effects at the cost of memory usage
  • Cache intermediate composited images, for faster tweaking of stacked strips at the cost of memory usage
  • Cache final image for each frame
This revision now requires changes to proceed.Mar 25 2019, 7:11 PM

Works well in my tests, still some comments which are hopefully easy to address

  • I expected the options to show the cache to be in the View menu, similar to how it is in the timeline.
  • Cache visualization should be more similar to the timeline (just add e.g. a softbody modifier to see it). Thinner, orange and semi-transparent and maybe at the bottom? It's a bit too eye-catching now I think, and drawing similar to the timeline helps users recognize what it is.

Not familiar with softbody cache visualization - I can make it similar to that. I took inspiration from movieclip.
Certainly thinner would be OK
Placement on top was chosen due to fact, that that area is more likely to be free.
in 2.79 channel 0 was drawn above the scrollers, and was always empty, so I was drawing there.

I can offset drawing strips above scrollers and remove channel 0. Would that be OK?

  • The little blocks drawing over strips seems rather distracting to me. I'm not sure this is something we should show by default, it seems more like a debugging thing? I'd suggest to have that off by default, or leave it only as a debugging option for developers. Unless there is a purpose that I'm missing.

Should this be controlled by #define or just as a menu item?

  • Do you plan to add more fine-grained strip invalidation? For example if you tweak a strip saturation, it could leave the raw cache intact and only clear the preprocessed one? Doesn't have to be part of this commit, just thinking this could go a long way to making parameter tweaking smoother.

Yes.
Plan is to tackle performance first with prefetch and cache disk dumping, which should cover most (all) cases.
Finer invalidation should follow these changes.

source/blender/blenkernel/intern/seqcache.c
485

A thread may be using cache when it's being freed, so it should be locked.

Here I unlocked the cache after cache is freed, but looking at a way I did this may be flawed.

Not familiar with softbody cache visualization - I can make it similar to that. I took inspiration from movieclip.
Certainly thinner would be OK

Seems we are already a bit inconsistent in Blender itself. I don't really care if it's blue or orange, but this thinner semi-transparent style that both the timeline and movieclip editor use would be good.

Placement on top was chosen due to fact, that that area is more likely to be free.
in 2.79 channel 0 was drawn above the scrollers, and was always empty, so I was drawing there.

I can offset drawing strips above scrollers and remove channel 0. Would that be OK?

To me it seems reasonable. But I don't know what channel 0 is/was even for so don't know the implications of removing it.

  • The little blocks drawing over strips seems rather distracting to me. I'm not sure this is something we should show by default, it seems more like a debugging thing? I'd suggest to have that off by default, or leave it only as a debugging option for developers. Unless there is a purpose that I'm missing.

I'd be inclined to just use a #define, unless you expect users will want to see this.

Plan is to tackle performance first with prefetch and cache disk dumping, which should cover most (all) cases.
Finer invalidation should follow these changes.

I was thinking of a different problem than prefetch and cache disk solve. What I think could be faster is tweaking strip parameters and effect parameters within a single frame. Right now it seems to load raw images and re-render scene strips for that all the time. Using a variation of rna_Sequence_update for certain properties that don't need to clear raw images in the cache would be relatively simple to add I expect. And as I understand it the raw images for the current frame are always in the cache regardless of the cache settings.

source/blender/blenkernel/intern/seqcache.c
485

This function is only called from BKE_sequencer_editing_free which frees all the sequencer data structures, I don't think it's possible a thread would still be using the cache then?

I was thinking of a different problem than prefetch and cache disk solve. What I think could be faster is tweaking strip parameters and effect parameters within a single frame. Right now it seems to load raw images and re-render scene strips for that all the time. Using a variation of rna_Sequence_update for certain properties that don't need to clear raw images in the cache would be relatively simple to add I expect. And as I understand it the raw images for the current frame are always in the cache regardless of the cache settings.

I understand.
This case could be solved easily. I will do separate patch for this.

Richard Antalik (ISS) marked 28 inline comments as done.

Cache visualization is now displayed in similar way to timeline cache visualization.

I decided to visualize raw, preprocessed and composite images inside of strips as before, otherwise it wouldn't make sense, but style is changed to be consistent. Also to see these types you have to manually enable them.

release/scripts/startup/bl_ui/space_sequencer.py
1164–1167

Best to explain in detail in user manual I guess?

source/blender/blenloader/intern/versioning_280.c
2842–2846

I guess I will have to resolve possible conflicts before committing here

source/blender/editors/space_sequencer/sequencer_draw.c
1766

Got no warning here. Should be fixed.

source/blender/makesrna/intern/rna_color.c
582

got no warning here, should be fixed. Even though renaming that variable may be better?