- refactors sequencer cache and allows configuration of cache from UI
- implements memory visualization tool for viewing cache content
- implements prefetching for sequencer
- refactors draw_image_seq - I didn't completely understand what was going on in that function, and refactoring helped me. Now I don't need it to be refactored :)
These 3 items should be (quite) functionally independent, so the patch can be split.
Each Scene owns 1 sequencer cache of MovieCache type which is used by sequencer belonging to this scene.
This cache should be viewed as "3D" space for stored images:
- each frame of timeline can be cached
- frame is composed of number of strips, that can be cached individually
- each strip has number of rendering stages of which 3 can be cached
- per strip
  - SEQ_CACHE_RAW raw image: image as provided by input codec
  - SEQ_CACHE_PREPROCESSED pre-processed: after scaling and color transformations, ...
  - SEQ_CACHE_COMPOSITE composite: combined with other strip(s)
- sequencer strip stack output
  - SEQ_CACHE_FINAL_OUT this is a copy of the last (topmost) strip's composite image, so if both are cached, memory usage is about the same as if only 1 of them is cached
  - SEQ_CACHE_FINAL_COLOR_MANAGED color corrected strip stack output: this is the image that is displayed in the preview window
User can enable/disable storage of any of these types, to allocate resources efficiently.
Each stored image has a cost assigned (SeqCacheKey->cost). Cost is calculated as the ratio of time spent on rendering to the maximum time available per frame at the chosen frame rate. The higher the cost, the harder the image is to render.
Initially, frames with cost less than 1 are considered to be cheap. This threshold is recalculated each time the cache is full or when the prefetch process is started.
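The cost ratio described above can be sketched roughly as follows. This is only an illustration: the function names and the handling of a zero frame rate are my assumptions, not code from the patch.

```c
#include <stdbool.h>

/* Hypothetical helper: cost is the time spent rendering one image divided by
 * the time budget of a single frame (1 / fps). */
float seq_cache_cost(double render_time_sec, double fps)
{
  if (fps <= 0.0) {
    return 0.0f; /* Assumption: without a valid frame rate nothing is expensive. */
  }
  const double frame_budget_sec = 1.0 / fps;
  return (float)(render_time_sec / frame_budget_sec);
}

/* Hypothetical helper: an image is cheap while its cost stays below the
 * current threshold (initially 1). */
bool seq_cache_is_cheap(float cost, float expensive_min_val)
{
  return cost < expensive_min_val;
}
```

For example, an image that takes 20 ms to render at 25 FPS (40 ms budget) gets a cost of about 0.5 and counts as cheap, while one that takes 100 ms gets about 2.5 and counts as expensive.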
Cache size limiting
To limit cache size, a custom method is used because of rather complex cache arrangement. "Stock" cache limiter is currently used only for setting used / free memory in UI. (TODO - definitely not ideal)
When cache is full, most distant cheap frame from playhead is removed. With prefetching enabled most distant cheap frame is removed, with frames behind playhead taking precedence.
Function BKE_sequencer_cache_recycle_item defines the behavior of removing "old" frames.
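A simplified sketch of the selection rule described above (most distant cheap frame, with frames behind the playhead preferred when prefetching is enabled). The struct and function here are hypothetical and work on a flat array. The actual BKE_sequencer_cache_recycle_item operates on the cache itself and may differ.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical flat view of cached final frames. */
typedef struct CachedFrame {
  int cfra;   /* Timeline frame number. */
  float cost; /* Render time / frame time budget. */
} CachedFrame;

/* Returns the index of the frame to remove, or -1 if no cheap frame exists. */
int pick_frame_to_recycle(const CachedFrame *frames, int count, int playhead,
                          float expensive_min_val, bool prefetch_enabled)
{
  int best = -1;
  int best_dist = -1;
  bool best_behind = false;

  for (int i = 0; i < count; i++) {
    if (frames[i].cost >= expensive_min_val) {
      continue; /* Expensive frames are kept. */
    }
    const int dist = abs(frames[i].cfra - playhead);
    /* Frames behind the playhead only get precedence while prefetching. */
    const bool behind = prefetch_enabled && frames[i].cfra < playhead;

    if (best == -1 || (behind && !best_behind) ||
        (behind == best_behind && dist > best_dist)) {
      best = i;
      best_dist = dist;
      best_behind = behind;
    }
  }
  return best;
}
```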
Sometimes it is required to use 1 image multiple times to render a single frame. To prevent unnecessary re-rendering, all stages of rendering are stored in cache until the frame is rendered completely. Images created in stages with disabled cache storage have the SeqCacheKey property free_after_render set and are freed after the frame is rendered.
To ensure that stages chosen by the user are always cached, a dependency system is used: when removing an image from cache, we also have to remove all images that were created using this image.
Currently, this process is simplified a bit by removing only final images of a particular frame. When removing a final image (has SeqCacheKey->is_base_frame set), we follow SeqCacheKey->prev or SeqCacheKey->next links to remove all images used to create the final image.
To avoid iterating content of the whole cache, cache keys are linked by pointers in both ways.
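The linked-key removal can be illustrated with a small stand-alone sketch. DemoKey is a hypothetical stand-in for SeqCacheKey that only carries the prev/next links and the is_base_frame flag mentioned above. The real key has more members and its images live in MovieCache.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for SeqCacheKey. */
typedef struct DemoKey {
  struct DemoKey *prev, *next; /* Images used to build the same frame. */
  bool is_base_frame;          /* Set on the final image of a frame. */
} DemoKey;

/* Appends a new key after `tail` (may be NULL) and returns it. */
DemoKey *demo_key_append(DemoKey *tail, bool is_base_frame)
{
  DemoKey *key = calloc(1, sizeof(DemoKey));
  if (key == NULL) {
    return NULL;
  }
  key->is_base_frame = is_base_frame;
  key->prev = tail;
  if (tail != NULL) {
    tail->next = key;
  }
  return key;
}

/* Frees the final image and everything linked to it, without scanning the
 * whole cache. Returns the number of keys freed (0 if `base` is not final). */
int free_frame_chain(DemoKey *base)
{
  if (base == NULL || !base->is_base_frame) {
    return 0;
  }
  int freed = 0;
  for (DemoKey *k = base->prev; k != NULL;) {
    DemoKey *prev = k->prev;
    free(k);
    freed++;
    k = prev;
  }
  for (DemoKey *k = base->next; k != NULL;) {
    DemoKey *next = k->next;
    free(k);
    freed++;
    k = next;
  }
  free(base);
  return freed + 1;
}
```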
In case of a speed strip, one image can be used to render multiple frames. This is not addressed yet.
- moved to header:
  - Typedefs MovieCache, MovieCacheKey, MovieCacheItem, to avoid redefinitions
  - IMB_moviecache_get_mem_total, IMB_moviecache_get_mem_used, IMB_get_size_in_memory
- added MovieCache members:
  - MEM_CacheLimiterC *limiter - can be set to null to avoid limiter usage
  - float expensive_min_val - for evaluating value of stored images
  - bool insert_allowed - when false, cache will not accept new images (even re-entry)
  - MovieCacheKey *last_key, void *last_userkey - can be used to build dependencies between stored images
  - bool use_limiter_to_free - when false, limiter won't be used to free stored images (this one is probably a bad idea)
  - size_t memory_used - to track memory usage of cache (limiter has to update this value)
SEQ_CACHE_FINAL_COLOR_MANAGED is not implemented yet. First draw_image_seq must be refactored
I should also check how GLSL color transformation works - I am not sure if I can read transformed image back. I guess it is possible to read back generated texture?
On rare occasions, ImBufs are not freed.
It seems like ~1 in 1000 ibufs is left unfreed - I will probably be able to track this down.
A huge leak, where no ibuf was freed at all, also happened to me a few times. I am not sure if this was due to mishandling refcount or due to bad code. I will have to run more tests to confirm this.
Plans for future work
Setting cached stages per strip:
- may be actually included in this patch - not too much code
- Add MovieCache->dumper
- This will be callback, that will accept void *userkey, ImBuf *ibuf (anything else?)
- Sequencer should set callback similar to function seq_proxy_build_frame only it will get SeqCacheKey and ImBuf
- Save image to <project-path>/<scene-name>/<seq-name>/<cache-type[Number]>/cfra.ext
- formats to use should be (at least to start with) JPEG, PNG, EXR
- path MUST be properly escaped (ideally scene/strip names should be limited to be filename compliant)
- Proper system must be developed, that will determine, what images must be deleted, by provided action and strips affected
- With image dumping, we will need a means to manage validity of external files.
- Should be fast and shouldn't use too much memory. Saving checksums of images is probably most effective solution.
- Does not have to be 100% correct depending on implementation? This may not be the best idea, as it would mean, that there would have to be a manual full check routine at least for range, which would complicate code significantly...
- checksum table file must be permanent. Either as an external file (slow?), or included in the .blend itself as a data structure (may result in a larger .blend file). Tables must be resizable, appendable, randomly accessible.
- if file is to be read to calculate checksum, can we save time, by reading every n-th byte?
Example: 10 min of footage in timeline at 60 FPS, with SEQ_CACHE_COMPOSITE and SEQ_CACHE_FINAL_OUT dumped:
36000 frames total, 2 images per frame, ~70KB of data with CRC8 checksums (1 byte per image)
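The ~70KB estimate follows from 10 * 60 * 60 = 36000 frames, 2 dumped image types per frame and 1 byte per CRC8 checksum, i.e. 72000 bytes. A minimal sketch of both the checksum and the table size is below. The polynomial 0x07 with zero initial value is my assumption (one common CRC-8 variant), and both function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-8, polynomial 0x07, initial value 0 (assumed variant). */
uint8_t crc8(const uint8_t *data, size_t len)
{
  uint8_t crc = 0;
  for (size_t i = 0; i < len; i++) {
    crc ^= data[i];
    for (int bit = 0; bit < 8; bit++) {
      crc = (crc & 0x80) ? (uint8_t)((crc << 1) ^ 0x07) : (uint8_t)(crc << 1);
    }
  }
  return crc;
}

/* Size of a table holding one CRC8 byte per dumped image. */
size_t checksum_table_bytes(size_t minutes, size_t fps, size_t dumped_types)
{
  return minutes * 60 * fps * dumped_types * sizeof(uint8_t);
}
```

With these inputs, checksum_table_bytes(10, 60, 2) gives 72000 bytes, which is roughly 70 KiB. A wider checksum would scale the table linearly.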
Memory visualization tool
This is a tool for visualizing cache content inspired by Movie Clip editor.
When enabled, a grey stripe is drawn in channel 0. Cached images are represented by a 1 frame wide line, either on the strip to which they belong or in channel 0 for types SEQ_CACHE_FINAL_OUT and SEQ_CACHE_FINAL_COLOR_MANAGED.
In strip itself types SEQ_CACHE_RAW, SEQ_CACHE_PREPROCESSED and SEQ_CACHE_COMPOSITE are drawn in order from bottom to top.
Represented image color tint can vary from blue(cheap) to red(expensive) according to calculated cost with cost of 1 being a crossing point from cheap to expensive.
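One possible way to map cost to a tint with cost 1 as the crossing point is sketched below. The linear ramp and the saturation at cost 2 are assumptions for illustration only, and draw_cache_memview may compute the color differently.

```c
/* Hypothetical color type for the sketch. */
typedef struct RGB {
  float r, g, b;
} RGB;

/* Maps cost 0 to pure blue, cost 1 to the blue/red midpoint and
 * cost >= 2 to pure red (assumed ramp). */
RGB cost_to_tint(float cost)
{
  float t = cost * 0.5f;
  if (!(t > 0.0f)) {
    t = 0.0f; /* Also catches NaN and negative costs. */
  }
  if (t > 1.0f) {
    t = 1.0f;
  }
  RGB tint = {t, 0.0f, 1.0f - t};
  return tint;
}
```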
Color scheme is not the best one, cost calculation is not protected from overflowing + method may be flawed
This tool can be enabled or disabled by setting memview_enable RNA property.
Function draw_cache_memview is used to draw this tool.
- cost calculation is not protected from overflowing + method may be flawed. Will have to review this
- Is ugly...
Prefetching is a background process whose goal is to fill the cache with frames that are (most likely) going to be displayed. Each Scene owns 1 PrefetchJob struct that contains the necessary data to run a prefetching job.
To run 2 or more rendering threads at once, one copy of scene per thread needs to be created. This is to allow threaded animdata evaluation and animating strip properties without affecting playback in the main thread.
Only data required by the sequencer are copied, and the scene is copied outside of the bmain struct, so the user cannot access it.
Currently, only 1 prefetch job can run at a moment.
Initially job exists for longer period of time, until cache is completely filled with data. Then job restarts for each frame, which is not ideal (doesn't seem to be too problematic either).
To start prefetching:
- Enable prefetching
- Some Cache storage must be enabled
- Prefetch thread must not be running.
- User must not scrub timeline
- "expensive" footage must not be played
- Rendering a frame preview (move playhead) will try to start prefetching
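The conditions listed above can be read as a single predicate that is checked whenever a frame preview is rendered. This is a sketch only: PrefetchState and its member names are hypothetical, not the fields of the actual PrefetchJob struct.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the state relevant for starting prefetch. */
typedef struct PrefetchState {
  bool prefetch_enabled;
  bool any_cache_storage_enabled;
  bool thread_running;
  bool user_scrubbing;
  bool playing_expensive_footage;
} PrefetchState;

/* True when every start condition from the list is satisfied. */
bool prefetch_may_start(const PrefetchState *state)
{
  return state->prefetch_enabled && state->any_cache_storage_enabled &&
         !state->thread_running && !state->user_scrubbing &&
         !state->playing_expensive_footage;
}
```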
The number of frames to be prefetched is calculated by a simulated run of prefetch. Image size is approximated by examining strip inputs, strip settings and view settings. This is mainly to prevent unnecessary restarts of the prefetching process, but the disadvantage is the need for single purpose code that has to be maintained.
This part of code is not polished, and may (hopefully) be removed.
Function BKE_sequencer_prefetch_set_target gets the range of frames that can be prefetched.
Invalidation of scene copy
When the user edits or affects a strip's output, the strip's cache is invalidated. This process has been extended with prefetch scene invalidation: the old scene copy is freed and replaced by a new one. If scene strips are to be supported, this should be optimized by more selective invalidation or shallow copy + lazy loading in the style of depsgraph. Note that cache invalidation in the sequencer is not complete yet and some edits will go unnoticed.
User has control over prefetching by following controls:
- prefetch_enable (bool) - enables/disables prefetching
- prefetch_store_ratio (float 0 to 1) - ratio of new prefetched frames to expensive frames already in cache
- prefetch_offset (float 0 to 1) - after number of frames to be prefetched is calculated, it is multiplied by this number, so cached frames are trailing behind playhead
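My reading of prefetch_offset, expressed as a sketch: the calculated frame count is multiplied by the offset, and the result is the number of frames kept behind the playhead. The struct, the function name and the exact rounding are assumptions. The patch may place the range differently.

```c
/* Hypothetical range type: frames [start, end) are to be prefetched. */
typedef struct PrefetchRange {
  int start;
  int end;
} PrefetchRange;

/* Shifts the prefetched range back by frame_count * prefetch_offset. */
PrefetchRange prefetch_range(int playhead, int frame_count, float prefetch_offset)
{
  if (prefetch_offset < 0.0f) {
    prefetch_offset = 0.0f;
  }
  if (prefetch_offset > 1.0f) {
    prefetch_offset = 1.0f;
  }
  const int trailing = (int)((float)frame_count * prefetch_offset);

  PrefetchRange range;
  range.start = playhead - trailing;
  range.end = range.start + frame_count;
  return range;
}
```

With playhead at frame 100, 40 frames to prefetch and an offset of 0.25, this keeps 10 frames behind the playhead (range 90 to 130). With an offset of 0 the whole range lies ahead of the playhead.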
While prefetch process is running and memview is enabled, WM is "forced" to redraw sequencer by BKE_sequencer_prefetch_wm_notify, so user can see progress in real time.
- old prefetch code removed.
- Limit multithreaded effects to keep 1 thread free when called from the prefetch thread. This is because using all possible threads leads to framerate drops and an unresponsive UI.
Plans for future work
- performance of most strips is quite constant, so we can spare resources on movies with proxies and focus on more distant parts of the timeline that need actual prefetching.
- strip worker as a replacement of proxy for scenes and other strips
- depends on cache dumping images to disk and proper / better cache invalidation(including management of dumped files)
- "Thread collision" in blf lib while rendering text strips
- Scene copying unfinished
- prefetch thread "communicates" through 2 volatile bools... Prefetch job can run until stopped by the user or until a change in data occurs, in which case we need to recreate the scene copy and restart the thread. Otherwise it can be very independent.
Prior to this patch only (equivalent to) SEQ_CACHE_PREPROCESSED and SEQ_CACHE_COMPOSITE were cached.
Caching only SEQ_CACHE_FINAL_OUT or SEQ_CACHE_FINAL_COLOR_MANAGED will result in *consistent* length of cached content, with "Temporary cache" providing a means for quite responsive editing of strips (as long as the edited strip is last in the chain).
Memview is quite a nice tool that provides essential info to the user, so it should be enabled. The problem may be that it's ugly.
Cost visualization should be disabled by default.
prefetch_store_ratio is questionable. About 0.5 is probably fine.
prefetch_offset should be 0
The biggest question is if prefetching should be enabled by default. I would agree to this, but rather not in the first release and only after setting the default cache size limit to about 512MB.
With footage optimized for fast loading you can fill the whole RAM in just a few seconds. There are a lot of users with 4GB of RAM and less.
If I accidentally left the 4GB limit as it is set now and started prefetching, my system wasn't able to recover from such an event.
Even if I realized what I had just done, there was no time to react and stop prefetching.