Blender 2.8: New Blender Cycles Client for dividing the rendering workflow into three parts: Pre-processing (Blender), Rendering (Blender Client), Post-processing (Blender)
Needs Revision · Public

Authored by Milan Jaros (jar091) on Sep 14 2018, 10:45 AM.

Details

Summary

Dividing rendering into three parts:

  1. Pre-processing (loading the scene, loading images, creating the BVH, ...): input - .blend file / Scene, output - binary file KernelGlobal (similar to a memory dump from CUDA/OpenCL)
  2. Path-tracing (kernel path tracing over all pixels): input - binary file KernelGlobal, output - binary file CyclesBuffer, BMP
  3. Post-processing (filters, tonemapping, ...): input - binary file CyclesBuffer, output - image EXR/PNG/JPEG

You can easily return to a render with more samples, and you can run the filtering again.

Client source code:

  • blender/client/api
  • blender/client/cycles
  • blender/client/main
  • blender/client/scripts - example scripts for building and running

Global env. variables:
export CLIENT_FILE_KERNEL_GLOBAL=${scene}.${frame}.kg
export CLIENT_FILE_CYCLES_BUFFER=${scene}.${frame}.bf
export CLIENT_FILE_CYCLES_BMP=${scene}.${frame}.bmp
export CLIENT_FILE_ADDITIONAL_SAMPLES=0
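As an illustration of how a client process could pick these up, here is a minimal C++ sketch that reads the variables above with getenv. The read_dump() helper and the idea of resuming from the two dump files are assumptions made for the sketch; this is not code from the patch.

  /* Minimal sketch, not from the patch: a client locating its inputs through
   * the environment variables above. read_dump() is a hypothetical helper. */
  #include <cstdlib>
  #include <fstream>
  #include <iterator>
  #include <vector>

  static std::vector<char> read_dump(const char *env_name)
  {
    const char *path = std::getenv(env_name);
    if (path == nullptr) {
      return {}; /* variable not set: nothing to load */
    }
    std::ifstream file(path, std::ios::binary);
    return std::vector<char>(std::istreambuf_iterator<char>(file),
                             std::istreambuf_iterator<char>());
  }

  int main()
  {
    /* Stage 2 input: the KernelGlobal dump written by pre-processing. */
    std::vector<char> kernel_global = read_dump("CLIENT_FILE_KERNEL_GLOBAL");
    /* Stage 3 input: the CyclesBuffer written by path-tracing. */
    std::vector<char> cycles_buffer = read_dump("CLIENT_FILE_CYCLES_BUFFER");
    /* Extra samples to add when resuming an earlier render. */
    const char *extra = std::getenv("CLIENT_FILE_ADDITIONAL_SAMPLES");
    const int additional_samples = extra ? std::atoi(extra) : 0;

    (void)kernel_global;
    (void)cycles_buffer;
    (void)additional_samples;
    return 0;
  }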

The Blender Client needs only about 30% of Blender's memory for rendering (the same amount of memory as on the GPU).

Diff Detail

Repository
rB Blender

The patch is compatible with the commit: d4959165 (d49591654701b032bfea31b487941c7187ddd4a7).

Sergey Sharybin (sergey) requested changes to this revision. Sep 14 2018, 11:37 AM

Doing such binary dumps is somewhat dangerous in the type of production we do here -- it wouldn't be possible to come back to the file after fixing some bug if the Cycles memory layout changed in between.
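To make that concern concrete, here is a small illustrative sketch (the structs are invented stand-ins, not real Cycles kernel types): a raw dump written by one build is silently misread by a build in which a single field was added.

  /* Illustrative only: KernelDataV1/V2 are invented stand-ins, not Cycles types. */
  #include <cstdio>
  #include <cstring>

  struct KernelDataV1 {
    float exposure;
    int num_samples;
  };

  /* A later build inserts one field; every field after it shifts. */
  struct KernelDataV2 {
    float exposure;
    float clamp; /* new field */
    int num_samples;
  };

  int main()
  {
    KernelDataV1 old_build = {1.0f, 128};
    char dump[sizeof(old_build)];
    std::memcpy(dump, &old_build, sizeof(old_build)); /* old build writes the dump */

    KernelDataV2 new_build = {};
    std::memcpy(&new_build, dump, sizeof(old_build)); /* fixed build reads it back */
    /* num_samples no longer contains the 128 that was written: the old bytes
     * land in the wrong fields, so it reads back as 0 here. */
    std::printf("samples read back: %d\n", new_build.num_samples);
    return 0;
  }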

I am also not sure why GPU memory would be lower? Only if you're rendering from the interface, where Blender occupies OpenGL resources. When rendering from the command line, Blender itself doesn't use any OpenGL resources. Or maybe I misread the last sentence, and you're only mentioning a reduction in host memory usage?

For the host memory we can do so much, especially in Blender 2.8.

First of all, in 2.8 we can delete the whole render dependency graph after objects have been synchronized to Cycles. With the CoW concept that would free up a lot of memory before textures get loaded and such. The tricky parts to be aware of here are packed images and the scene itself (which is only used to access the scene name; this is simple to fix though).

Second of all, we can also free all evaluated states on all dependency graphs which might be used by the viewport. There is D1123, which needs to be adapted to work with per-render-engine dependency graphs.

This will still keep Blender itself and the original .blend files in memory, but those are usually way smaller than the evaluated scene state. If that's still an issue after implementing all of the above, we can kick the current scene state to an on-disk undo buffer and restore it after rendering is done.

The good thing about the approaches I've listed above is that they help artists who are using Blender from the interface, without going into all the trouble of setting up some crazy configuration with render clients and so on. Those approaches will also help command line rendering, with almost the same memory usage as your approach.

While I see how it helps rendering, I don't think this is something we can accept in Blender in its current state.

This revision now requires changes to proceed. Sep 14 2018, 11:37 AM

It would have been good to discuss the strategies to reduce memory in advance. It's certainly an interesting experiment to see how much room for memory optimization there is; 30% is a great result.

But to be honest I think this adds too much complexity to Cycles to accept in our repository. It will make it harder for us to add new features and rendering algorithms in the future, due to having to maintain yet another code path that duplicates a lot of existing code. And if we add adaptive sampling for example, we can't simply serialize the device commands; these will depend on the rendered result. So we can't get locked into an architecture where the device commands are serializable into a file; it's just not the right place to do it.
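As a sketch of that dependency (illustrative toy code, not Cycles code; the tile and noise model are invented), the commands issued for the next pass are only known after inspecting the result of the previous pass, so they cannot be fixed up front:

  /* Illustrative only: a toy adaptive-sampling loop. The point is that the
   * next batch of work depends on the result of the previous one. */
  #include <cstdio>
  #include <vector>

  struct Tile {
    int index;
    float noise; /* stand-in for a per-tile variance estimate */
  };

  int main()
  {
    std::vector<Tile> tiles = {{0, 0.8f}, {1, 0.2f}, {2, 1.6f}};
    const float threshold = 0.1f;

    for (int pass = 0;; pass++) {
      /* Decide the next device commands from the current rendered result... */
      int still_noisy = 0;
      for (Tile &t : tiles) {
        if (t.noise > threshold) {
          still_noisy++;
          t.noise *= 0.5f; /* "render more samples" on this tile */
        }
      }
      if (still_noisy == 0) {
        break; /* ...so the number and content of passes is data-dependent. */
      }
      std::printf("pass %d: %d tiles re-rendered\n", pass, still_noisy);
    }
    return 0;
  }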

Also, from a usability point of view it's important that blender -b -f can render with the minimum amount of memory; if it requires special setup, very few people will end up using it.

Just skimming the code, I can't claim I fully comprehend it yet. A few thoughts:

  • There's a kernel_omp.cpp in there. Could that be a separate patch? How does it compare to our existing CPU kernel, and are there any advantages?
  • How does this compare to Cycles' Network device? Could the same functionality be implemented on top of the network device (via localhost), reducing code duplication?
  • I see some duplicated code in there (tone mapping, color space conversion); can that be reduced? What's the motivation for writing a new BMP exporter instead of using one of the (already several) image writing libraries in Blender?

If the goal is to separate the rendering process from the rest of Blender (reducing memory footprint, render farms, etc.), what was the motivation for not going with the existing Cycles standalone? That's what we did for Poser/Cycles: we completed Cycles' XML import (to the extent of the features Poser supports) and dumped mesh and curve data to binary files to save space (USD(Z), Alembic or binary glTF could be options too). Having a human-readable format (or at least an otherwise standardised format) helps with debugging issues and also allows for pipeline tools that operate on those files. Think of scripts to automatically generate additional viewpoints, replace proxies with hi-res instances, etc.

Passing raw memory between machines can potentially cause trouble. Endianness and alignment requirements must be identical on all machines. Most of us live in an i386/x86-64 monoculture, but not too long ago Apple built PPC machines, and ARM may become a more important architecture for Blender in the future. It also prevents us from using any pointers in Cycles' kernel - we're not doing that at this point, but with SVM on OpenCL and Unified Memory on CUDA it is becoming an option.
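For illustration, a minimal sketch (the struct is an invented stand-in, not a Cycles kernel type): the byte image of the same struct depends on the endianness and on the padding the compiler inserts, so a raw dump is only portable between identical ABIs.

  /* Illustrative only: the raw bytes of this struct depend on the endianness
   * and alignment/padding rules of the machine and compiler that wrote them. */
  #include <cstddef>
  #include <cstdint>
  #include <cstdio>
  #include <cstring>

  struct PackedSample {
    std::uint8_t flags;  /* 1 byte, usually followed by compiler-inserted padding */
    std::uint32_t pixel; /* byte order differs between little- and big-endian */
  };

  int main()
  {
    PackedSample s = {0x01u, 0x11223344u};
    unsigned char raw[sizeof(s)];
    std::memcpy(raw, &s, sizeof(s));

    /* On a typical x86-64 build: size 8, pixel at offset 4, pixel bytes 44 33 22 11.
     * A big-endian or differently aligned target lays the same dump out differently. */
    std::printf("size=%zu offsetof(pixel)=%zu bytes:",
                sizeof(s), offsetof(PackedSample, pixel));
    for (std::size_t i = 0; i < sizeof(s); i++) {
      std::printf(" %02x", raw[i]);
    }
    std::printf("\n");
    return 0;
  }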