VSE 2.0: Performance, Cache System #80278

Open
opened 2020-08-31 14:47:29 +02:00 by Sergey Sharybin · 25 comments

Video Sequencer Cache

NOTE: This is the first pass of the design. It will be refined after discussion within the module, and the presentation and diagrams will become clearer.

This document describes the design and implementation of the caching system for the VSE 2.0 project (#78986).
There is some overlap with the performance topics listed in VSE 2.0: Performance (#78992).

User level design

On the user level, the cache system should follow the zero-configuration principle: video playback and tweaking should be as fast as possible without the user spending time on fine-tuning per-project settings.

The only settings the user should interact with are in the User Preferences:

  • In-memory cache size limit
  • Disk cache size limit
  • Disk cache folder

These settings are set once in the User Preferences and are used by all sequencer projects. The default values are based on the minimal [[ https://www.blender.org/download/requirements/ | hardware requirements ]].

Code design

Levels of caching

For best editing and playback performance multiple levels of cache are needed.
The most important ones are:

  • Cache of strip input (color managed using the general Blender color space rules).
This makes it possible to move an image strip in time, or adjust strip settings such as transform, quickly and without lag, by avoiding the need to re-read the file on every modification. Let's call this cache level `STRIP_INPUT`.
  • Cache of the final sequencer result.
This allows realtime playback once the sequencer result has been cached. Let's call this cache level `SEQUENCER_FINAL`.

The simplified flow of the image from disk to the artist is presented in the following diagram:

CacheFlow.png


NOTE

Need to think about whether having a strip output cache is helpful. If the stack rendering is fast, having extra levels of cache will have a negative effect, because fewer final frames will fit into memory.


Cache referencing

Cache levels are to utilize reference counting as much as possible. For example, with a single Image Strip that has no modifications set up, the final sequencer frame in the SEQUENCER_FINAL cache is to reference the image from the STRIP_INPUT cache. This minimizes the memory footprint and improves playback performance for storyboarding types of tasks performed in the sequencer.

The following example visualizes cache frame referencing for this scenario:

  • The sequencer has a single Image Strip using HappyFroggo.png as an input. The strip has a length of 4.

CacheReference.png

In Blender terms, the cache contains a single copy of the ImBuf created for HappyFroggo.png. All the sequencer cache entries reference this ImBuf for the lowest possible memory footprint.
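
The referencing scheme described above can be illustrated with a minimal sketch. It assumes a simplified `Image` type standing in for `ImBuf`, and the `CacheEntry` type and helper names are hypothetical; none of this is the actual Blender cache code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for ImBuf: only the reference count matters here. */
typedef struct Image {
  int refcount;
  int width, height;
} Image;

/* Hypothetical cache slot: a frame number plus a shared image. */
typedef struct CacheEntry {
  int frame;
  Image *image;
} CacheEntry;

/* Number of images currently allocated, tracked only for illustration. */
static int live_images = 0;

Image *image_new(int width, int height)
{
  Image *img = malloc(sizeof(Image));
  if (img == NULL) {
    return NULL;
  }
  img->refcount = 1; /* The creator holds the first reference. */
  img->width = width;
  img->height = height;
  live_images++;
  return img;
}

void image_ref(Image *img)
{
  assert(img->refcount > 0);
  img->refcount++;
}

/* Drop one reference; the pixels are freed only when nobody uses them. */
void image_unref(Image *img)
{
  assert(img->refcount > 0);
  if (--img->refcount == 0) {
    free(img);
    live_images--;
  }
}

/* Store a reference to an existing image instead of copying it. */
void cache_entry_set(CacheEntry *entry, int frame, Image *img)
{
  image_ref(img);
  entry->frame = frame;
  entry->image = img;
}

void cache_entry_clear(CacheEntry *entry)
{
  image_unref(entry->image);
  entry->image = NULL;
}
```

With a strip of length 4, one STRIP_INPUT entry and four SEQUENCER_FINAL entries would all point at the same `Image`, so `live_images` stays at 1 no matter how many frames reference it.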

Cache resolution

In a typical video editing scenario an artist views the sequencer result in a rather small area of the screen layout:

VSELayout.png

This behavior can be exploited in the following way: sequencer processing and caching can happen at a lower resolution. This is what the current proxy design solves, but in a fully manual manner.

Proxy behavior could be made more automatic by downscaling an image after it is read from disk, but before it is put into the STRIP_INPUT cache. The default behavior could be something like:

  • Use closest power-of-two downscaling (similar to mipmaps)
  • The target resolution is 50% of the window resolution, but no more than 1080p.
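
One possible reading of these two defaults is sketched below. The function names are hypothetical, "1080p" is taken to mean a height of 1080 lines, and "closest" is interpreted here as the largest power-of-two divisor that still keeps the image at or above the target height; the design does not fix this rounding rule, so treat it as an assumption.

```c
#include <assert.h>

/* Target preview height: 50% of the window height, capped at 1080 lines. */
int target_preview_height(int window_height)
{
  int target = window_height / 2;
  if (target < 1) {
    target = 1;
  }
  return target > 1080 ? 1080 : target;
}

/* Largest power-of-two divisor (1, 2, 4, ...) for which the downscaled
 * height is still greater than or equal to the target height. */
int pow2_downscale_divisor(int source_height, int target_height)
{
  assert(source_height > 0 && target_height > 0);
  int divisor = 1;
  while (source_height / (divisor * 2) >= target_height) {
    divisor *= 2;
  }
  return divisor;
}
```

For example, a 2160-line source shown in a 1440-line window has a target of 720 lines, and the sketch picks a divisor of 2 (1080 lines), since a divisor of 4 would drop below the target.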

To support workflows where an artist needs to inspect the final result up close, there will be a display option Best Quality (with Best Performance as the default). This could fit into the existing Proxy Display Size menu.

In the future, more automated input resolution selection could be implemented. For example, it would be possible to automatically switch to the Best Quality mode when zoom-in is detected.

Image scaling with a power-of-two scale factor can be implemented very efficiently using threading and vectorization.
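
To show why this is cheap, here is a minimal scalar sketch of a single halving step using a 2x2 box filter on a single-channel 8-bit image. It is an illustration only, not Blender's image scaling code; the function name and the even-size restriction are assumptions made to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Halve a single-channel 8-bit image by averaging each 2x2 block.
 * src is src_w x src_h pixels (both even); dst must hold
 * (src_w / 2) * (src_h / 2) bytes. */
void downscale_half(const uint8_t *src, int src_w, int src_h, uint8_t *dst)
{
  assert(src_w > 0 && src_h > 0 && src_w % 2 == 0 && src_h % 2 == 0);
  const int dst_w = src_w / 2;
  const int dst_h = src_h / 2;
  /* Every output row depends on exactly two input rows, so rows can be
   * distributed across threads without any synchronization. */
  for (int y = 0; y < dst_h; y++) {
    const uint8_t *row0 = src + (size_t)(2 * y) * src_w;
    const uint8_t *row1 = row0 + src_w;
    uint8_t *out = dst + (size_t)y * dst_w;
    /* Fixed stride and no branches: a good candidate for vectorization. */
    for (int x = 0; x < dst_w; x++) {
      const int sum = row0[2 * x] + row0[2 * x + 1] + row1[2 * x] + row1[2 * x + 1];
      out[x] = (uint8_t)((sum + 2) / 4); /* Rounded average. */
    }
  }
}
```

Larger power-of-two factors can be obtained by repeating the step, or by averaging larger blocks in one pass.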

Performance-wise, for image sequences such downscaling is an extra computation, which will only pay off if effects or transformations are used.

For movie files, this step will actually make things faster: color space conversion is required anyway (it happens in sws_scale) and is not threadable. The downscale will be done together with the color space conversion, which is expected to perform better than the current state of sequencer playback.


Task progress

  • [x] Cache referencing
      • d837923a56
      • c74086376f
  • [x] User level design
  • [x] Levels of caching
      • f448ff2afe
      • 445ebcaa30
  • [ ] Cache resolution
      • D9414: VSE: Render in size nearest to preview image (https://archive.blender.org/developer/D9414)
      • Update/followup above patch to work on images

Changed status from 'Needs Triage' to: 'Confirmed'

Added subscribers: @Sergey, @iss, @fsiddi

User level design

I agree, though I think tools like prefetching and baking should exist and be controlled by the user. I guess that is out of scope here anyway.

Code design - Levels of caching

First of all, we must have a clear definition of what STRIP_INPUT is. As I understand it, STRIP_INPUT is the image as read, possibly prescaled.

> For best editing and playback performance multiple levels of cache are needed.

This highly depends on the rendering implementation. For example, right now we cache only final images, because processing is very slow if you use a lot of effects. You can use the cache to store rendered images, and you can store the maximum number of images when you store only final images. The downside is that the cache is invalidated by any modification.

Storing raw or STRIP_INPUT images doesn't make much sense in the current state if you can store only 10 s or less of original footage with a not-so-crazy timeline. It may save you a few seconds of rendering, but on the other hand you can't store as many rendered frames in RAM.

> Need to think about whether having strip output cache is helpful. If the stack rendering is fast, having extra levels of cache will have negative affect due to less final frames fitting into the memory.

This is actually an important thing to consider. If we could significantly improve processing speed, we could better focus on "optimizing" IO operations with the cache.
Personally, I would rather work on processing performance and then have only a STRIP_INPUT type cache. In some cases, like processing on the GPU, you can't really have other types.

Code design - Cache referencing

> Cache levels are to utilize reference counting as much as possible. For example, when having single Image Strip without modifications set up in the strip the final sequencer frame in the SEQUENCER_FINAL cache is to reference the image from STRIP_INPUT cache. This allows to minimize memory footprint, playback performance for the story boarding type of tasks performed in the sequencer.

This already happens in the current cache design.

> The following example visualizes cache frame referencing in the following scenario:
>
>   • Sequencer have single Image Strip using HappyFroggo.png as an input. The strip has length of 4.
>
> CacheReference.png
>
> In Blender terms, the cache contains a single copy of ImBuf created for HappyFroggo.png. All the sequencer cache entries are referencing this ImBuf for lowest possible memory footprint.

This change needs to be done partially in the rendering code: look up the file and frame we are reading.
The change in the cache would be to hash images against the input file instead of the strip. I am not sure whether this would require its own design, unless we just use the filepath, which would probably be sufficient.

Cache resolution

If I understand correctly, with Best Performance, if we have media with a resolution that doesn't match any fraction of the project resolution, we quickly prescale to the closest resolution and then use this image as if it were the original?

Should we then drop the 75% preview fraction, as it is not close to a power-of-two fraction? Or keep it for the case where we are willing to build proxies at that size?

These changes are definitely possible, but they have little to do with the cache? Regardless, this is not a bad idea. I guess this would also require movie rendering to be handled a bit differently from other images (passing the desired resolution as an argument, at least).

Added subscriber: @brecht

> The only settings which user should be interacting with are in the User Preferences:

Are you proposing to remove the Cache Settings panel entirely? Or is the plan only to change the default settings and behavior so that there is less need to tweak these settings?

Currently there is "Recycle Up To Cost" to prioritize keeping entries that took a long time to compute (e.g. scene strips) in the cache. That's not mentioned in this design doc, so I'm not sure if you intend to keep it?

> Cache of strip input (colormanaged using general Blender color space rules)
> Cache of the final sequencer result.

If I understand correctly, this corresponds to the current Raw and Final caches.

> Need to think about whether having strip output cache is helpful. If the stack rendering is fast, having extra levels of cache will have negative affect due to less final frames fitting into the memory.

I believe the current system always caches all strip outputs at the current frame. That seems like a good thing to keep at least.

@iss,

> Though tools like prefetching and baking I think should exist and be controlled by user. I guess that is out of scope here anyway.

Implementation is indeed out of scope for this design. But the mindset should be the same: zero configuration, best editor experience.

Not sure what you mean by baking here. You shouldn't need to bake anything to be able to edit videos.

>> Cache levels are to utilize reference counting as much as possible. For example, when having single Image Strip without modifications set up in the strip the final sequencer frame in the SEQUENCER_FINAL cache is to reference the image from STRIP_INPUT cache. This allows to minimize memory footprint, playback performance for the story boarding type of tasks performed in the sequencer.

> This already happens in current cache design.

Does it happen in the design or in the implementation? With the attached file I expect the heavy operation to be performed only once, the follow-up scrubbing and frame navigation to be realtime, and the memory footprint to stay constant.

vse_cache_reference.zip

This is not the behavior I'm getting with this file.

> Storing raw or STRIP_INPUT doesn't really make too much sense in current state if you can save 10s or less of original footage with not so crazy timeline. It may save you few seconds of rendering, but on the other hand you can't store as much rendered frames in RAM.

Keep in mind that in a video editor you don't only play back or render, but also perform correction operations. Those must not clog interface communication.
If the input data is not cached, I don't see how you can keep the interface responsive while tweaking settings.

> This change needs to be done partially in rendering code - lookup file and frame we are reading.
> Change in cache would have to be to hash images against input file instead of strip. I am not sure if this would require own design unless we just use filepath, which would be probably sufficient.

To me this is an implementation detail.
The cache does not exist on its own, and strip rendering does not exist on its own. They work together. Also, this task is not about "changes are only done in seqcache.c and nowhere else".

> If I understand it correctly, with Best Performance, if we have media with resolution that doesn't match any fraction of project resolution, we prescale fast to closest resolution and then we use this image as if it was original?

The behavior is similar to mipmaps.

@brecht,

> Are you proposing to remove the Cache Settings panel entirely? Or is the plan only to change the default settings and behavior so that there is less need to tweak these settings?

Remove entirely.

I do not see any editor going and really fine-tuning settings for a specific project. You would need to come up with a really strong argument to move me out of option extermination mode ;)

> Currently there is "Recycle Up To Cost" to prioritize keeping entries that took a long time to compute (e.g. scene strips) in the cache. That's not mentioned in this design doc, so I'm not sure if you intend to keep it?

This option slipped under my radar. I find it counter-intuitive, and I don't know why it exists. Remove it.

> If I understand correctly, this corresponds to current Raw and Final caches.

Indeed.
But currently we have too many cache levels, and their interaction seems to be broken.

> I believe the current system always caches all strip outputs at the current frame. That seems like a good thing to keep at least.

Is there a design doc explaining the exact behavior of the current system?
When does this caching of all outputs happen? During playback? After playback has stopped? Is it limited to outputs, or does it also cover inputs?


From reading these two replies, it seems like all the required building blocks are either implemented or were intended in the original cache design. Meaning, this step in the project should be simple and straightforward, right? :)

> In #80278#1007188, @Sergey wrote:
>
>> I believe the current system always caches all strip outputs at the current frame. That seems like a good thing to keep at least.
>
> Is there a design doc explaining exact behavior of the current system?
> When this cache of all outputs happens? During playback? After playback has stopped? Is it limited to outputs or also does inputs?

There are some notes about this in seqcache.c:

 * All images created during rendering are added to cache, even if the cache is already full.
 * This is because:
 *  - one image may be needed multiple times during rendering.
 *  - keeping the last rendered frame allows us for faster re-render when user edits strip in stack
 *  - we can decide if we keep frame only when it's completely rendered. Otherwise we risk having
 *    "holes" in the cache, which can be annoying
 * If the cache is full all entries for pending frame will have is_temp_cache set.

...

  bool is_temp_cache; /* this cache entry will be freed before rendering next frame */

I believe it's for all cache levels. And freeing happens right before another frame is rendered, it's not related to playback specifically.
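
The behavior described in those notes can be sketched roughly as follows. This is a simplified model with hypothetical names and a fixed-size array; only the `is_temp_cache` flag is taken from the quoted source, and the real logic in `seqcache.c` is more involved.

```c
#include <assert.h>
#include <stdbool.h>

#define CACHE_LIMIT 2  /* Hypothetical limit on persistent entries. */
#define MAX_ENTRIES 16 /* Storage size of this sketch only. */

typedef struct Entry {
  int frame;
  bool used;
  bool is_temp_cache; /* Freed before the next frame is rendered. */
} Entry;

typedef struct Cache {
  Entry entries[MAX_ENTRIES];
  int persistent_count;
} Cache;

/* Always store the image. If the cache is already full, the entry is
 * kept only temporarily by flagging it as is_temp_cache. */
bool cache_put(Cache *cache, int frame)
{
  for (int i = 0; i < MAX_ENTRIES; i++) {
    Entry *entry = &cache->entries[i];
    if (!entry->used) {
      entry->used = true;
      entry->frame = frame;
      entry->is_temp_cache = cache->persistent_count >= CACHE_LIMIT;
      if (!entry->is_temp_cache) {
        cache->persistent_count++;
      }
      return true;
    }
  }
  return false; /* Sketch storage exhausted. */
}

/* Called right before another frame is rendered. */
void cache_free_temp(Cache *cache)
{
  for (int i = 0; i < MAX_ENTRIES; i++) {
    Entry *entry = &cache->entries[i];
    if (entry->used && entry->is_temp_cache) {
      entry->used = false;
    }
  }
}

int cache_count(const Cache *cache)
{
  int count = 0;
  for (int i = 0; i < MAX_ENTRIES; i++) {
    if (cache->entries[i].used) {
      count++;
    }
  }
  return count;
}
```

In this model the images of the last rendered frame stay available for a fast re-render even when the cache is full, and they are dropped as soon as a different frame is about to be rendered.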

Is it correct that images are only put into the non-final cache on a sequencer "re-render" (when the user changes a strip property)?

I think at this point it's better to let @iss have a look at the file I attached earlier. If it becomes possible to move the strip under the playhead without any latency or lag, and to have a single image stored in the cache, that would address a lot of points from this design.

> In #80278#1007326, @Sergey wrote:
> Is it correct that putting images to the non-final-cache only happens on sequencer "re-render" (when user changes strip property)?

I don't think so, there is no distinction between render and re-render here as far as I know.

> I think at this point it's better to let @iss to have a look into the file I've attached before. If it will be possible to move the strip under the playhead without any latency and lags, and have a single image stored in the cache that would address a lot of points from this design.

The cache key is the sequence strip + scene frame + cache type. To make single images and dragging strips work, that scene frame would need to be replaced with a local frame within the strip, and that will probably require a bunch of changes to invalidation, clearing and drawing of the cache.
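
The difference between the two key schemes can be shown with a small sketch. The `Strip` and `CacheKey` types and the function names are hypothetical and much simpler than the real structures; the point is only that a key built from a strip-local frame survives dragging the strip, while a key built from the scene frame does not.

```c
#include <assert.h>
#include <stdbool.h>

enum { CACHE_STRIP_INPUT = 0, CACHE_SEQUENCER_FINAL = 1 };

/* Hypothetical minimal strip: an identifier and its start on the timeline. */
typedef struct Strip {
  int id;
  int start_frame;
} Strip;

typedef struct CacheKey {
  int strip_id;
  int frame;
  int cache_type;
} CacheKey;

/* Key as described above: strip + scene frame + cache type. */
CacheKey key_from_scene_frame(const Strip *strip, int scene_frame, int cache_type)
{
  CacheKey key = {strip->id, scene_frame, cache_type};
  return key;
}

/* Alternative: strip + frame relative to the strip start + cache type. */
CacheKey key_from_local_frame(const Strip *strip, int scene_frame, int cache_type)
{
  CacheKey key = {strip->id, scene_frame - strip->start_frame, cache_type};
  return key;
}

bool key_equal(CacheKey a, CacheKey b)
{
  return a.strip_id == b.strip_id && a.frame == b.frame && a.cache_type == b.cache_type;
}
```

After moving a strip by 5 frames, the same source frame is requested at a scene frame that is 5 higher: the local-frame key still matches the cached entry, while the scene-frame key misses.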

Added subscriber: @ChristopherAnderssarian

In #80278#1007188, @Sergey wrote:

Are you proposing to remove the Cache Settings panel entirely? Or is the plan only to change the default settings and behavior so that there is less need to tweak these settings?

Remove entirely. I do not see any editor to go and really fine-tune settings for a specific project. You would need to come with really strong argument to move me from the option extermination mode ;)

One thing I would argue for is the ability to disallow proxy data from being cached. As they're already optimised for fast decode, they really shouldn't need to be cached (especially when primarily doing temporal editing).
But my concern is that they will overuse resources that would benefit other types of strips (strips that can't be proxied or those generated in Blender).

In #80278#1007188, @Sergey wrote:

Cache levels are to utilize reference counting as much as possible. For example, when having single Image Strip without modifications set up in the strip the final sequencer frame in the SEQUENCER_FINAL cache is to reference the image from STRIP_INPUT cache. This allows to minimize memory footprint, playback performance for the story boarding type of tasks performed in the sequencer.

This already happens in current cache design.

Does it happen in the design or implementation? With the attached file I expect the heavy operation to be only performed once, and the followup scrubbing and frame navigation should be realtime, and the memory footprint is to stay constant.

vse_cache_reference.zip

This is not the behavior I'm getting with this file.

No, because the image is not without modification. It is scaled to preview resolution. So this would happen if we cached images after scaling to preview resolution.
That's why I was asking for a more concrete definition of STRIP_INPUT, but I guess that would be an implementation detail as long as it would "just work"? By design this can be the last operation that the user has no control over. So "mipmap" in the case of this design.
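A small sketch of the sharing rule under discussion, assuming Python object identity as a stand-in for reference counting. `Image` and `render_final` are hypothetical names; the copy in the modified branch only stands in for an operation such as scaling to preview resolution.

```python
class Image:
    def __init__(self, pixels):
        self.pixels = pixels


def render_final(strip_input, has_modifications):
    # With nothing to apply, the final frame can be the very same buffer.
    if not has_modifications:
        return strip_input
    # Any modification produces a new buffer that must be stored separately.
    return Image(list(strip_input.pixels))


strip_input_cache = {}
final_cache = {}

src = Image([0.1, 0.5, 0.9])
strip_input_cache[("image", 0)] = src
final_cache[1] = render_final(src, has_modifications=False)
final_cache[2] = render_final(src, has_modifications=True)

shared = final_cache[1] is strip_input_cache[("image", 0)]
copied = final_cache[2] is not strip_input_cache[("image", 0)]
unique_buffers = len({id(img) for img in (src, final_cache[1], final_cache[2])})
```

Only the unmodified path keeps the footprint constant: three cache entries end up backed by two buffers, and the extra one comes from the modified frame.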

Storing raw or STRIP_INPUT doesn't really make much sense in the current state if you can save 10 s or less of original footage with a not-so-crazy timeline. It may save you a few seconds of rendering, but on the other hand you can't store as many rendered frames in RAM.

Keep in mind, in video editor you don't only playback or render, but also perform correction operations. Those must not be clogging the interface communication.
If the input data is not cached, I don't see how you can keep responsive interface while tweaking settings.

As Brecht explained, currently we use a "temp cache" for the one currently displayed frame, where we store each possible cache level for the fastest possible tweaking. The way it works is that the image you change and everything above it are invalidated. Everything that is needed to render the image you are changing is cached. When you change frame, the images are discarded and the cache is filled with new images.

This is currently hacked into the overall cache because it is convenient, but it could be its own cache, or it could be not used at all. It benefits you when you change images close to the final output, but also when you build up a stack of effects.
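A minimal sketch of a single-frame temp cache as described here, assuming it were split out into its own structure. The `TempCache` class and the level names used below are illustrative only.

```python
class TempCache:
    """Keeps every cache level, but only for one frame at a time."""

    def __init__(self):
        self.frame = None
        self.entries = {}  # (strip name, cache level) -> image

    def put(self, frame, strip, level, image):
        # Moving to another frame discards everything stored for the old one.
        if frame != self.frame:
            self.entries.clear()
            self.frame = frame
        self.entries[(strip, level)] = image

    def get(self, frame, strip, level):
        if frame != self.frame:
            return None
        return self.entries.get((strip, level))


cache = TempCache()
cache.put(10, "image", "STRIP_INPUT", "img@10")
cache.put(10, "transform", "PREPROCESSED", "xf@10")
hit_same_frame = cache.get(10, "image", "STRIP_INPUT")

cache.put(11, "image", "STRIP_INPUT", "img@11")
stale = cache.get(10, "image", "STRIP_INPUT")
remaining = len(cache.entries)
```

The memory cost stays bounded by one frame's worth of intermediate images, which is what makes it cheap to keep every level around while tweaking.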

See this example - playback speed is quite bad, but moving whole image with last transform strip is swift.
temp_cache.blend

From reading these two replies, it seems like all the required building blocks are either implemented or were intended in the original cache design. Meaning, this step in the project should be simple and straightforward, right? :)

I would say yes, but I have some doubts about that power of 2 prescaling. I would probably need to see it in action and then evaluate the results.
I always edit videos with uniform resolution, so I am not the best case for testing this. I will have to do artificial tests.


In #80278#1007370, @ChristopherAnderssarian wrote:
One thing I would argue for is the ability to disallow proxy data from being cached. As they're already optimised for fast decode, they really shouldn't need to be cached (especially when primarily doing temporal editing).
But my concern is that they will overuse resources that would benefit other types of strips (strips that can't be proxied or those generated in Blender).

This was done as part of bugfix (5372924983) and I am not sure if it should be changed. Also off-topic a bit :)

In #80278#1007455, @iss wrote:

In #80278#1007370, @ChristopherAnderssarian wrote:
One thing I would argue for is the ability to disallow proxy data from being cached. As they're already optimised for fast decode, they really shouldn't need to be cached (especially when primarily doing temporal editing).
But my concern is that they will overuse resources that would benefit other types of strips (strips that can't be proxied or those generated in Blender).

This was done as part of bugfix (5372924983) and I am not sure if it should be changed. Also off-topic a bit :)

I see that it doesn't cache for Raw, but aren't the strip options "Cache pre-processed images" and "Cache Final Image" also supposed to skip proxy data? Because they do cache it (you can see this with the sample file from #80060).

All I'm saying is there should be the ability (either hard coded or preference) to not have proxies pointlessly cached, not sure how that's off topic for a task about the cache system...


The cache key is the sequence strip + scene frame + cache type. To make single images and dragging strips work that scene frame would need to be replaced with a local frame within the strip, and that will probably require a bunch of changes to invalidation, clearing and drawing of the cache.

I see. Indeed the cache key in my proposal would behave the way you've described it.
Is this something we agree that is reasonable to do?

That's why I was asking for more concrete definition of STRIP_INPUT, but I guess that would be implementation detail as long as it would "just work"? By design this can be last operation that user have no control of. So "mipmap" in case of this design.

I think what you've described here is a good definition. At least, I can not currently think of a case when this definition "breaks".

As Brecht explained, currently we use "temp cache" for 1 currently displayed frame where we store each possible cache level for fastest possible tweaking.

This is great.

What I'm trying to understand is what exactly happens in the following scenario:

  • You tweak strip settings
  • You play a few frames forward (so that all "intermediate" cache is invalidated)
  • You tweak a setting again

At which point the input needed for the tweak will be loaded?
Not sure if it matters too much for this specific design task, but I kind of want to understand existing behavior better :)

I would say yes, but I have some doubts about that power of 2 prescaling.

The power of two is for the scaling performance.
It doesn't need any trickery for average weights, can be vectorized, and things like that.
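To illustrate why power-of-two factors are convenient, here is a minimal sketch of a 2x downscale in which every output pixel is the plain average of a 2x2 block, so all four weights are equal. It assumes a grayscale image stored as a list of rows with even dimensions; `halve` and `prescale` are hypothetical helpers, and a real implementation would operate on image buffers with SIMD.

```python
def halve(image):
    """Downscale by 2 in each dimension using a 2x2 box filter."""
    height, width = len(image) // 2, len(image[0]) // 2
    return [
        [
            (
                image[2 * y][2 * x]
                + image[2 * y][2 * x + 1]
                + image[2 * y + 1][2 * x]
                + image[2 * y + 1][2 * x + 1]
            )
            / 4
            for x in range(width)
        ]
        for y in range(height)
    ]


def prescale(image, levels):
    """Apply the 2x downscale `levels` times, i.e. scale by 1 / 2**levels."""
    for _ in range(levels):
        image = halve(image)
    return image


source = [[float(x + y) for x in range(8)] for y in range(8)]
quarter = prescale(source, 2)  # 8x8 -> 2x2
```

A non-power-of-two factor would make output pixels straddle source pixels, which requires per-pixel fractional weights. That is the "trickery" avoided here.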

In #80278#1007564, @Sergey wrote:

The cache key is the sequence strip + scene frame + cache type. To make single images and dragging strips work that scene frame would need to be replaced with a local frame within the strip, and that will probably require a bunch of changes to invalidation, clearing and drawing of the cache.

I see. Indeed the cache key in my proposal would behave the way you've described it.
Is this something we agree that is reasonable to do?

Yes. I would add, that in case of static image strips (image, color, text), the local frame should always point to frame 1 for STRIP_INPUT type.
I think we already translate cfra to local frame anyway in seq_cache_cfra_to_frame_index.
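A minimal sketch of that rule, assuming 1-based local frames and a hypothetical `local_frame` helper. It is not the actual `seq_cache_cfra_to_frame_index` implementation, and the strip type names are illustrative.

```python
STATIC_STRIP_TYPES = {"IMAGE", "COLOR", "TEXT"}


def local_frame(strip_type, strip_start, scene_frame):
    # Static strips show the same content on every frame, so one entry suffices.
    if strip_type in STATIC_STRIP_TYPES:
        return 1
    return scene_frame - strip_start + 1


def input_cache_keys(strip_type, strip_start, length):
    """All STRIP_INPUT keys produced while playing through the strip once."""
    return {
        (strip_type, local_frame(strip_type, strip_start, frame), "STRIP_INPUT")
        for frame in range(strip_start, strip_start + length)
    }


still_keys = input_cache_keys("IMAGE", 20, 100)
movie_keys = input_cache_keys("MOVIE", 20, 100)
```

A 100-frame still image strip then occupies a single STRIP_INPUT entry, while a movie strip of the same length needs one entry per frame.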

That's why I was asking for more concrete definition of STRIP_INPUT, but I guess that would be implementation detail as long as it would "just work"? By design this can be last operation that user have no control of. So "mipmap" in case of this design.

I think what you've described here is a good definition. At least, I can not currently think of a case when this definition "breaks".

As Brecht explained, currently we use "temp cache" for 1 currently displayed frame where we store each possible cache level for fastest possible tweaking.

This is great.

What I'm trying to understand is what exactly happens in the following scenario:

  • You tweak strip settings
  • You play a few frames forward (so that all "intermediate" cache is invalidated)
  • You tweak a setting again

At which point the input needed for the tweak will be loaded?
Not sure if it matters too much for this specific design task, but I kind of want to understand existing behavior better :)

During rendering (BKE_sequencer_give_ibuf):

  • The "intermediate" cache is freed for any frame that is not current (BKE_sequencer_cache_free_temp_cache).
  • Each image from every possible stage is put into the cache. All will be stored (some images temporarily, some for the long term).

When a strip is changed:

  • The final image and the changed strip's output are invalidated, so we need to render the strip stack to display the changed image (seq_render_strip_stack).
  • The strip stack is traversed backwards; if an image is not in the cache, it is rendered (at this point most images are still there).

We start rendering in the "forward direction", starting where the first image is missing (the changed strip).

That is pretty much whole cycle.
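The cycle can be sketched roughly as follows, assuming a stack given as a bottom-to-top list of strip names and a dictionary as the cache. `render_stack` and `invalidate` are hypothetical stand-ins, and string concatenation stands in for actual image processing.

```python
def render_stack(stack, cache, render_log):
    """Return the final image for a bottom-to-top list of strip names."""
    # Walk from the top of the stack down to the highest cached result.
    first_missing = 0
    for i in range(len(stack) - 1, -1, -1):
        if stack[i] in cache:
            first_missing = i + 1
            break
    # Render forward, starting at the first strip whose output is missing.
    below = cache[stack[first_missing - 1]] if first_missing > 0 else None
    for name in stack[first_missing:]:
        below = f"{name}({below})" if below is not None else name
        cache[name] = below
        render_log.append(name)
    return below


def invalidate(stack, cache, changed):
    # The changed strip and everything above it lose their cached output.
    for name in stack[stack.index(changed):]:
        cache.pop(name, None)


stack = ["image", "transform", "glow"]
cache, log = {}, []
first_result = render_stack(stack, cache, log)
first_log = list(log)

invalidate(stack, cache, "transform")
log.clear()
second_result = render_stack(stack, cache, log)
second_log = list(log)
```

After tweaking the middle strip only that strip and the one above it are rendered again; the bottom strip's output comes straight from the cache.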

This feature requires the image to be cached at all used states. That's why we also have SEQ_CACHE_STORE_PREPROCESSED and SEQ_CACHE_STORE_COMPOSITE, as the stack works with composited images and effects expect preprocessed images that haven't been composited.

I would say yes, but I have some doubts about that power of 2 prescaling.

The power of two is for the scaling performance.
It doesn't need any trickery for average weights, can be vectorized, and things like that.

Yes, I can just imagine situations where it would be very useful and also situations where it could be detrimental. The question is rather how to distinguish these situations, and where to draw the line (at runtime) between prescaling and not prescaling, so that it works fairly well.

But I definitely want to look at this method.

Added subscriber: @tintwotin

Could it be considered to implement caching for the playback functions needed for navigating to a specific frame by using shortcut keys? Normally (industry standard) when working with video you would use J (reverse), K (stop), L (forward) to increase and decrease playback speed and direction (more key presses for more/less), meaning that prefetch caching should also cache in reverse, and in steps for fast forward/reverse.

When the prefetch cache reaches the end of the range, it should automatically continue at the beginning of the range.

The Prefetch Cache is quite resource hungry, and most VSE operations will slow down while it's on (e.g. "dragging" values), so a stricter pause-prefetch-caching regime could be implemented.

Also, a way to cache for preview when using the blade tool and trim tools could be worked into the cache system.
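As a rough illustration of the requested prefetch order, the sketch below lists the frames a prefetcher would visit for a given direction and step, wrapping around at the ends of the range. `prefetch_order` and its parameters are hypothetical and do not correspond to an existing sequencer function.

```python
def prefetch_order(playhead, range_start, range_end, step, count):
    """Frames to prefetch, starting at the playhead.

    A negative step means reverse playback; abs(step) > 1 means fast playback.
    Frames wrap around to the other end of the range.
    """
    length = range_end - range_start + 1
    offset = playhead - range_start
    return [range_start + (offset + i * step) % length for i in range(count)]


forward = prefetch_order(8, 1, 10, step=1, count=5)
fast_reverse = prefetch_order(3, 1, 10, step=-2, count=4)
```

Here `forward` continues at the beginning of the range after reaching frame 10, and `fast_reverse` visits every second frame going backwards.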

@Sergey my plan of action would be:

  • Implement fast prescaling for image strips and movie strips
  • Make sure that static image strips reference only one frame. This won't apply to generator effects (color, text) until we have an animation cache for them. F-curve lookup is slow.
  • Keep current cache levels for current "temp cache" optimization, but use only STRIP_INPUT and/or SEQUENCER_FINAL
    • I will try to find good solution where we can check if it makes sense to use SEQUENCER_FINAL type to conserve memory usage. STRIP_INPUT would be prioritized
  • Remove cache settings panels, and remove "Use Disk Cache" checkbox and compression enum.
    • I would like to still have control over caching from panel, because I use it for debugging performance and caching issues though.

This order is chosen to ensure that the more risky changes get in first, so they can be tested properly.

@tintwotin I would suggest to report this as a bug. It doesn't really fit into this design document.

Added subscriber: @clayhands

Added subscriber: @AndreaMonzini

Hello, I agree with Peter about the JKL forward/reverse cache.

For now I enable "Prefetch Frames" in the area where I want to cache and then disable "Prefetch Frames" to be able to scrub smoothly (forward/reverse) around the area of interest.
A solution could be to add an option to dynamically "center" the cache around the playhead:

jkl.png
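One way such centering could order the frames is sketched below, assuming the cache is filled outward from the playhead in both directions. `centered_order` and its `radius` parameter are hypothetical.

```python
def centered_order(playhead, range_start, range_end, radius):
    """Frames to cache, nearest to the playhead first, clipped to the range."""
    frames = [playhead]
    for distance in range(1, radius + 1):
        for frame in (playhead + distance, playhead - distance):
            if range_start <= frame <= range_end:
                frames.append(frame)
    return frames


middle = centered_order(5, 1, 10, radius=2)
at_start = centered_order(1, 1, 10, radius=2)
```

Filling nearest-first means that scrubbing a few frames in either direction is likely to hit the cache even before prefetching has finished.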

Added subscriber: @Pipeliner

Added subscriber: @erjiang-3

This issue was referenced by 445ebcaa30

Added subscriber: @wknowleskellett

Richard Antalik added this to the Video Sequencer project 2023-02-09 21:00:44 +01:00
Philipp Oeser removed the Interest: VFX & Video label 2023-02-10 09:31:59 +01:00