- refactors sequencer cache and allows configuration of cache from UI
- implements memory visualization tool for viewing cache content
- implements prefetching for sequencer
- refactors draw_image_seq - separate patch: D4315: Refactor drawing preview
These items should be (quite) functionally independent, so the patch can be split.
Each Scene owns 1 sequencer cache of MovieCache type which is used by sequencer belonging to this scene.
This cache should be viewed as "3D" space for stored images:
- each frame of timeline can be cached
- frame is composed of number of strips, that can be cached individually
- each strip has number of rendering stages of which 3 can be cached
- per strip
- SEQ_CACHE_RAW raw image: image as provided by input codec
- SEQ_CACHE_PREPROCESSED pre-processed: after scaling and color transformations, ...
- SEQ_CACHE_COMPOSITE composite: combined with another strip(s)
- sequencer strip stack output
- SEQ_CACHE_FINAL_OUT final output: a copy of the last (topmost) strip's composite image, so if both are cached, memory usage is about the same as if only 1 of them is cached
- SEQ_CACHE_FINAL_COLOR_MANAGED color corrected strip stack output: this is the image, that is displayed in preview window
User can enable/disable storage of any of these types, to allocate resources efficiently.
Each stored image has a cost assigned (SeqCacheKey->cost). Cost is calculated as the ratio of time spent on rendering to the maximum time available per frame to keep up with the chosen frame rate. The higher the cost, the harder it is to render the image.
Initially, frames with a cost less than 1 are considered cheap. This threshold is recalculated each time the cache is full or when the prefetch process is started.
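As a rough illustration of the cost described above, here is a minimal sketch. It is not the code from this patch: demo_seq_cache_cost and demo_seq_cache_is_cheap are hypothetical names, and the threshold parameter only mirrors the role of expensive_min_val.

```c
#include <stdbool.h>

/* Hypothetical helper: cost of one rendered image.
 * render_time_s is the time spent rendering in seconds, fps is the scene frame rate. */
static float demo_seq_cache_cost(float render_time_s, float fps)
{
  /* Time available for one frame if playback is to keep up with the frame rate. */
  const float frame_budget_s = 1.0f / fps;
  return render_time_s / frame_budget_s;
}

/* An image counts as cheap while its cost is below the current threshold (initially 1). */
static bool demo_seq_cache_is_cheap(float cost, float expensive_min_val)
{
  return cost < expensive_min_val;
}
```

For example, at 25 FPS the budget is 40 ms per frame, so an image that takes 20 ms to render gets a cost of about 0.5 and counts as cheap.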
Cache size limiting
To limit cache size, a custom method is used because of rather complex cache arrangement. "Stock" cache limiter is currently used only for setting used / free memory in UI. (TODO - definitely not ideal)
When the cache is full, the cheap frame most distant from the playhead is removed. With prefetching enabled, the most distant cheap frame is still removed, but frames behind the playhead take precedence.
Function BKE_sequencer_cache_recycle_item defines the behavior of removing "old" frames.
Sometimes 1 image must be used multiple times to render a single frame. To prevent unnecessary re-rendering, all stages of rendering are stored in cache until the frame is rendered completely. Images created in stages with disabled cache storage have the SeqCacheKey property free_after_render set and are freed after the frame is rendered.
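The removal policy can be sketched roughly as follows. This is only my reading of the rule, not the body of BKE_sequencer_cache_recycle_item: DemoCachedFrame and demo_pick_frame_to_recycle are hypothetical, and "taking precedence" is assumed to mean that any cheap frame behind the playhead is chosen before any cheap frame ahead of it.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for one cached final frame. */
typedef struct DemoCachedFrame {
  int cfra;   /* timeline frame number */
  float cost; /* cost as described in the cost section */
} DemoCachedFrame;

/* Returns the index of the frame to remove, or -1 if there is no cheap frame. */
static int demo_pick_frame_to_recycle(const DemoCachedFrame *frames,
                                      int count,
                                      int playhead,
                                      float expensive_min_val,
                                      bool prefetch_enabled)
{
  int best = -1;
  int best_dist = -1;
  bool best_behind = false;

  for (int i = 0; i < count; i++) {
    if (frames[i].cost >= expensive_min_val) {
      continue; /* Expensive frames are not candidates. */
    }
    const int dist = abs(frames[i].cfra - playhead);
    const bool behind = frames[i].cfra < playhead;
    bool better;
    if (prefetch_enabled && best != -1 && behind != best_behind) {
      /* Assumption: a frame behind the playhead always beats one ahead of it. */
      better = behind;
    }
    else {
      /* Same side (or no prefetching): the more distant frame wins. */
      better = dist > best_dist;
    }
    if (better) {
      best = i;
      best_dist = dist;
      best_behind = behind;
    }
  }
  return best;
}
```

With the playhead at frame 50 and cheap frames at 10 and 100, this picks frame 100 when prefetching is off and frame 10 when it is on.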
To ensure that stages chosen by the user are always cached, a dependency system is used. When removing an image from cache, we also have to remove all images that were created using this image.
Currently, this process is simplified a bit by removing only final images of a particular frame. When removing a final image (has SeqCacheKey->is_base_frame set), we follow SeqCacheKey->prev or SeqCacheKey->next links to remove all images used to create the final image.
To avoid iterating over the content of the whole cache, cache keys are linked by pointers in both directions.
In case of a speed strip, one image can be used to render multiple frames. This is not addressed yet.
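A minimal sketch of following such links, assuming a simplified key type. DemoKey, demo_key_append and demo_remove_frame_chain are hypothetical and only mimic the prev / next / is_base_frame members of SeqCacheKey; a real implementation would also have to remove the entries from the cache hash and free the image buffers.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for SeqCacheKey. */
typedef struct DemoKey {
  struct DemoKey *prev, *next; /* images of the same frame, linked both ways */
  bool is_base_frame;          /* set on the final image of a frame */
} DemoKey;

/* Allocates a key and links it after `tail` (which may be NULL). */
static DemoKey *demo_key_append(DemoKey *tail, bool is_base_frame)
{
  DemoKey *key = calloc(1, sizeof(*key));
  if (key == NULL) {
    return NULL;
  }
  key->is_base_frame = is_base_frame;
  if (tail != NULL) {
    tail->next = key;
    key->prev = tail;
  }
  return key;
}

/* Frees the final image and every image linked to it.
 * Returns the number of freed keys, or 0 if `base` is not a final image. */
static int demo_remove_frame_chain(DemoKey *base)
{
  if (base == NULL || !base->is_base_frame) {
    return 0;
  }
  int removed = 0;
  for (DemoKey *key = base->prev; key != NULL;) {
    DemoKey *prev = key->prev;
    free(key);
    removed++;
    key = prev;
  }
  for (DemoKey *key = base->next; key != NULL;) {
    DemoKey *next = key->next;
    free(key);
    removed++;
    key = next;
  }
  free(base);
  return removed + 1;
}
```

Only the links of one frame are walked, so the cost of removal does not depend on the total number of cached images.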
- moved to header:
- Typedefs MovieCache, MovieCacheKey, MovieCacheItem, to avoid redefinitions
- IMB_moviecache_get_mem_total, IMB_moviecache_get_mem_used, IMB_get_size_in_memory
- added to API
- void IMB_moviecache_limiter_enable(MovieCache *cache)
- void IMB_moviecache_limiter_disable(MovieCache *cache)
- added MovieCache members:
- MEM_CacheLimiterC *limiter - can be set to null to avoid limiter usage
- float expensive_min_val - for evaluating value of stored images
- MovieCacheKey *last_key, void *last_userkey - can be used to build dependencies between stored images
- size_t memory_used - To track memory usage of cache (limiter has to update this value)
- ThreadMutex cache_mutex - To protect ghash iterator / user limiting application. Maybe user_mutex would be a better name?
SEQ_CACHE_FINAL_COLOR_MANAGED is not implemented yet. First draw_image_seq must be refactored
I should also check how GLSL color transformation works - I am not sure if I can read transformed image back. I guess it is possible to read back generated texture?
Plans for future work
Setting cached stages per strip:
- may be actually included in this patch - not too much code
- Add MovieCache->dumper
- This will be callback, that will accept void *userkey, ImBuf *ibuf (anything else?)
- Sequencer should set callback similar to function seq_proxy_build_frame only it will get SeqCacheKey and ImBuf
- Save image to <project-path>/<scene-name>/<seq-name>/<cache-type[Number]>/cfra.ext
- formats to use should be (at least to start with) JPEG, PNG, EXR
- path MUST be properly escaped (ideally scene/strip names should be limited to be filename compliant)
- Proper system must be developed, that will determine, what images must be deleted, by provided action and strips affected
- With image dumping, we will need a means to manage validity of external files.
- Should be fast and shouldn't use too much memory. Saving checksums of images is probably most effective solution.
- Does not have to be 100% correct depending on implementation? This may not be the best idea, as it would mean, that there would have to be a manual full check routine at least for range, which would complicate code significantly...
- checksum table file must be permanent. Either as an external file (slow?), or included in the .blend itself as a data structure (may result in a larger .blend file). Tables must be resizable, appendable and randomly accessible.
- if file is to be read to calculate checksum, can we save time, by reading every n-th byte?
Example: 10 min of footage in the timeline at 60 FPS, with SEQ_CACHE_COMPOSITE and SEQ_CACHE_FINAL_OUT dumped:
36000 frames total, ~70KB of data with CRC8 checksums
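The arithmetic behind this estimate, as a small sketch (demo_checksum_table_bytes is a hypothetical helper; it assumes one checksum per dumped image and no per-entry overhead in the table):

```c
#include <stddef.h>

/* Size in bytes of a checksum table holding one checksum per dumped image. */
static size_t demo_checksum_table_bytes(size_t minutes,
                                        size_t fps,
                                        size_t dumped_types,
                                        size_t checksum_bytes)
{
  const size_t frames = minutes * 60 * fps; /* 10 min at 60 FPS -> 36000 frames */
  return frames * dumped_types * checksum_bytes;
}
```

With 10 minutes at 60 FPS, 2 dumped cache types and 1 byte per CRC8 checksum, this gives 72000 bytes, which is about 70 KB.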
Memory visualization tool
This is a tool for visualizing cache content inspired by Movie Clip editor.
When enabled, a grey stripe is drawn in channel 0. Cached images are represented by a 1 frame wide line, either on the strip to which they belong or in channel 0 for types SEQ_CACHE_FINAL_OUT and SEQ_CACHE_FINAL_COLOR_MANAGED.
In strip itself types SEQ_CACHE_RAW, SEQ_CACHE_PREPROCESSED and SEQ_CACHE_COMPOSITE are drawn in order from bottom to top.
Represented image color tint can vary from blue(cheap) to red(expensive) according to calculated cost with cost of 1 being a crossing point from cheap to expensive.
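One possible way to map cost to a tint, as a sketch only. This is not what draw_cache_memview does: DemoTint and demo_cost_to_tint are hypothetical, and the linear blend with cost 1 as the exact midpoint is an assumption. Clamping keeps very large or invalid costs from producing out-of-range colors.

```c
/* Hypothetical RGB tint, each channel in the range 0..1. */
typedef struct DemoTint {
  float r, g, b;
} DemoTint;

/* Assumed mapping: cost 0 -> pure blue, cost 1 -> midpoint, cost >= 2 -> pure red. */
static DemoTint demo_cost_to_tint(float cost)
{
  float t = cost * 0.5f;
  if (!(t > 0.0f)) {
    t = 0.0f; /* Also catches negative values and NaN. */
  }
  if (t > 1.0f) {
    t = 1.0f;
  }
  DemoTint tint = {t, 0.0f, 1.0f - t};
  return tint;
}
```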
Color scheme is not the best one, cost calculation is not protected from overflowing + method may be flawed
This tool can be enabled or disabled by setting memview_enable RNA property.
Function draw_cache_memview is used to draw this tool
- cost calculation is not protected from overflowing + method may be flawed. Will have to review this
- Is ugly...
Prefetching is a background process whose goal is to fill the cache with frames that are (most likely) going to be displayed. Each Scene owns 1 PrefetchJob struct that contains the data necessary to run the prefetching job.
To run 2 or more rendering threads at once, one copy of scene per thread needs to be created. This is to allow threaded animdata evaluation and animating strip properties without affecting playback in the main thread.
Only data required by sequencer are copied and scene is copied outside of bmain struct, so user can not access it.
Currently, only 1 prefetch job can run at a time.
Prefetch thread is running until stop bit is set. When there is no work to be done, pthread_cond_wait is used to suspend prefetch thread.
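The suspend / resume pattern can be sketched like this. It is a simplified illustration, not the PrefetchJob code from this patch: DemoPrefetchJob and the demo_prefetch_* functions are hypothetical, and a counter stands in for actual rendering.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the prefetch job data. */
typedef struct DemoPrefetchJob {
  pthread_mutex_t mutex;
  pthread_cond_t cond;
  bool stop;    /* the "stop bit" */
  int pending;  /* frames waiting to be prefetched */
  int rendered; /* frames already prefetched */
} DemoPrefetchJob;

static void demo_prefetch_init(DemoPrefetchJob *job)
{
  pthread_mutex_init(&job->mutex, NULL);
  pthread_cond_init(&job->cond, NULL);
  job->stop = false;
  job->pending = 0;
  job->rendered = 0;
}

/* Thread body: runs until the stop bit is set, sleeps while there is no work. */
static void *demo_prefetch_thread(void *arg)
{
  DemoPrefetchJob *job = arg;
  pthread_mutex_lock(&job->mutex);
  while (!job->stop) {
    if (job->pending == 0) {
      /* Atomically releases the mutex and suspends until signaled. */
      pthread_cond_wait(&job->cond, &job->mutex);
      continue;
    }
    job->pending--;
    pthread_mutex_unlock(&job->mutex);
    /* A real job would render one frame here, without holding the mutex. */
    pthread_mutex_lock(&job->mutex);
    job->rendered++;
  }
  pthread_mutex_unlock(&job->mutex);
  return NULL;
}

/* Queues work and wakes the thread if it is suspended. */
static void demo_prefetch_add_work(DemoPrefetchJob *job, int frames)
{
  pthread_mutex_lock(&job->mutex);
  job->pending += frames;
  pthread_cond_signal(&job->cond);
  pthread_mutex_unlock(&job->mutex);
}

/* Sets the stop bit and wakes the thread so it can exit. */
static void demo_prefetch_stop(DemoPrefetchJob *job)
{
  pthread_mutex_lock(&job->mutex);
  job->stop = true;
  pthread_cond_signal(&job->cond);
  pthread_mutex_unlock(&job->mutex);
}
```

The state is always re-checked after pthread_cond_wait returns, so spurious wakeups are harmless. Stopping may leave some frames pending; that is intended in this sketch.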
To start prefetching:
- Enable prefetching
- Some Cache storage must be enabled
- Prefetch thread must not be running.
- User must not scrub timeline
- "expensive" footage must not be played
- Rendering a frame preview (move playhead) will try to start prefetching
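The conditions above can be read as a single check that runs whenever a frame preview is rendered. A minimal sketch, assuming a hypothetical DemoPrefetchState struct (none of these field names come from the patch):

```c
#include <stdbool.h>

/* Hypothetical snapshot of the state relevant for starting prefetching. */
typedef struct DemoPrefetchState {
  bool prefetch_enabled;          /* prefetching is enabled by the user */
  bool any_cache_storage_enabled; /* at least one cache type is stored */
  bool thread_running;            /* prefetch thread is already running */
  bool user_scrubbing;            /* user is scrubbing the timeline */
  bool playing_expensive_footage; /* "expensive" footage is being played */
} DemoPrefetchState;

/* Returns true only if every condition from the list is met. */
static bool demo_prefetch_can_start(const DemoPrefetchState *state)
{
  return state->prefetch_enabled && state->any_cache_storage_enabled &&
         !state->thread_running && !state->user_scrubbing &&
         !state->playing_expensive_footage;
}
```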
Memory usage is analyzed during run time to determine when to pause prefetching.
Invalidation of scene copy
When the user edits a strip or otherwise affects its output, the strip's cache is invalidated. This process was extended with invalidation of the prefetch scene: the old scene copy is freed and replaced by a new one. If scene strips are to be supported, this should be optimized by more selective invalidation or by shallow copy + lazy loading in the style of depsgraph. Note that cache invalidation in the sequencer is not complete yet and some edits will go unnoticed.
User has control over prefetching by following controls:
- prefetch_enable (bool) - enables/disables prefetching
- prefetch_store_ratio (float 0 to 1) - ratio of new prefetched frames to expensive frames already in cache
- prefetch_offset (float 0 to 1) - after number of frames to be prefetched is calculated, it is multiplied by this number, so cached frames are trailing behind playhead
While prefetch process is running and memview is enabled, WM is "forced" to redraw sequencer by BKE_sequencer_prefetch_wm_notify, so user can see progress in real time.
- old prefetch code removed.
- Limit multithreaded effects to keep 1 thread free when called from the prefetch thread. This is because using all available threads leads to framerate drops and an unresponsive UI.
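The thread limit from the last item could look roughly like this (a sketch; demo_effect_thread_count is a hypothetical helper, and how the caller knows it runs in the prefetch thread is left out):

```c
#include <stdbool.h>

/* Number of threads a multithreaded effect may use.
 * When called from the prefetch thread, 1 thread is left free. */
static int demo_effect_thread_count(int system_threads, bool called_from_prefetch)
{
  int threads = system_threads;
  if (called_from_prefetch && threads > 1) {
    threads -= 1;
  }
  /* Always use at least 1 thread, even for invalid input. */
  return threads < 1 ? 1 : threads;
}
```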
Plans for future work
- performance of most strips is quite constant, so we can spare resources on movies with proxies and focus on more distant parts of the timeline that need actual prefetching.
- strip worker as a replacement of proxy for scenes and other strips
- depends on cache dumping images to disk and proper / better cache invalidation(including management of dumped files)
- "Thread collision" in blf lib while rendering text strips
- Scene copying unfinished
Prior to this patch only (equivalent to) SEQ_CACHE_PREPROCESSED and SEQ_CACHE_COMPOSITE were cached.
Caching only SEQ_CACHE_FINAL_OUT or SEQ_CACHE_FINAL_COLOR_MANAGED will result in a *consistent* length of cached content, with the "temporary cache" providing a means for quite responsive editing of strips (as long as the edited strip is last in the chain).
Memview is quite a nice tool that provides essential info to the user, so it should be enabled. The problem may be that it's ugly.
Cost visualization should be disabled by default.
prefetch_store_ratio is questionable. About 0.5 is probably fine.
prefetch_offset should be 0
The biggest question is whether prefetching should be enabled by default. I would agree to this, but rather not in the first release and only after setting the default cache size limit to about 512MB.
With footage optimized for fast loading you can fill the whole RAM in just a few seconds. There are a lot of users with 4GB of RAM or less.
When I accidentally left the 4GB limit as it is set now and started prefetching, my system wasn't able to recover from such an event.
Even when I realized what I had just done, there was no time to react and stop prefetching.