
Cycles: Add optional Blue-Noise Dithered Sobol Sampling
Needs Revision, Public

Authored by Lukas Stockner (lukasstockner97) on Aug 10 2016, 6:01 PM.

This patch implements precomputed dithered sampling as described in the paper "Blue-noise Dithered Sampling".

Mainly for low sample counts, such as the first few seconds of viewport rendering, the improvement in quality is substantial - especially the brightness flickering happening with regular Sobol and HDRIs is pretty much gone.

For now, the dithering matrix included in sobol.cpp isn't perfect yet - I'm currently running a 400-million-iteration simulated annealing process, but that will still take about a day to finish. I'll update the patch as soon as the result is available.
Because of the temporary matrix, you can still see some tiling happening at low sample counts, but with a good dithering matrix that shouldn't be as visible.
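
For readers unfamiliar with the technique, here is a minimal sketch of the general idea, not the patch's actual kernel code. Every pixel shares the same low-discrepancy sequence, and a per-pixel offset read from a small tiled mask shifts each sample (a Cranley-Patterson rotation). The names DitherMask, dither_offset and dithered_sample are hypothetical, and so is the mask layout.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical dither mask: size x size offsets in [0, 1), stored row by row.
struct DitherMask {
  int size;
  std::vector<float> values;
};

// Look up the offset for pixel (x, y). The mask repeats across the image,
// which is why a poor mask can show up as visible tiling.
inline float dither_offset(const DitherMask &mask, int x, int y)
{
  assert(mask.size > 0 && (int)mask.values.size() == mask.size * mask.size);
  const int tx = ((x % mask.size) + mask.size) % mask.size;
  const int ty = ((y % mask.size) + mask.size) % mask.size;
  return mask.values[ty * mask.size + tx];
}

// Cranley-Patterson rotation: shift the shared sample by the pixel's offset
// and wrap the result back into [0, 1).
inline float dithered_sample(float base_sample, float offset)
{
  const float shifted = base_sample + offset;
  return shifted - std::floor(shifted);
}
```

With a blue-noise mask, neighbouring pixels receive very different offsets, so the error is spread out as high-frequency noise instead of clumping.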

Diff Detail

rB Blender
Build Status
Buildable 98
Build 98: arc lint + arc unit

Event Timeline

Lukas Stockner (lukasstockner97) retitled this revision to Cycles: Add optional Blue-Noise Dithered Sobol Sampling.
Thomas Dinges (dingto) requested changes to this revision. Aug 10 2016, 7:21 PM

Nice work, I can see some visible improvements at low sample counts. Some minor things inline.


Reference needs to be removed from svm_image.h as well.


New variable here changes padding, remove pad1 to ensure alignment.
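
To illustrate the padding concern, here is a sketch under the assumption that the kernel data structs are kept at a multiple of 16 bytes by explicit pad members. The struct and field names below are hypothetical and are not taken from kernel_types.h.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical data block before the change: three ints plus one filler int.
struct IntegratorDataBefore {
  int max_bounce;
  int sampling_pattern;
  int aa_samples;
  int pad1;  // filler so the struct size stays a multiple of 16 bytes
};

// After the change: the new member takes the slot pad1 used to occupy,
// so the overall size and the layout of the other members are unchanged.
struct IntegratorDataAfter {
  int max_bounce;
  int sampling_pattern;
  int aa_samples;
  int use_dither;  // hypothetical new member
};

// Helper used to state the assumed size rule explicitly.
constexpr bool is_padded_to_16(std::size_t size)
{
  return size % 16 == 0;
}

static_assert(is_padded_to_16(sizeof(IntegratorDataBefore)), "size must be a multiple of 16");
static_assert(sizeof(IntegratorDataAfter) == sizeof(IntegratorDataBefore), "layout size changed");
```

Adding the new member while keeping pad1 would grow this struct to 20 bytes, which is the kind of mismatch a compile-time check like this catches.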

This revision now requires changes to proceed. Aug 10 2016, 7:21 PM
Lukas Stockner (lukasstockner97) edited edge metadata.

Updated svm_image.h with the correct texture amount.
As for the padding, turns out that it was wrong in master to begin with. I fixed it now.

  • Missing tex_free for __sobol_dither.
  • Add some comments about what all those bitwise operations do?
  • Put the simulated annealing code somewhere in the repo?
Lukas Stockner (lukasstockner97) edited edge metadata.
  • Added device_free call
  • Made bitwise operations in kernel_random.h a bit clearer
  • Added simulated annealing tool in intern/cycles/app/ (not in CMake, just as a single file)
  • Updated matrix with better one, still not the "final" one

The updated simulated annealing tool now also comes with some approximate math and SSE4.1 code, which makes it about 5 times faster.

I can't review the sampling algorithms because I'm too familiar with the original implementation, so I don't have much more to add here.

But before merging this kind of change we should definitely see test renders of various scenes to verify it's all working as intended.

Hi, would like to test but patch does not work with master da77d987.

Opensuse Leap 42.1 x86_64
Intel i5 3570K
GTX 760 4 GB /Display card
GTX 670 2 GB
Driver 367.35
gcc (SUSE Linux) 4.8.5

/blender_build/blender> patch -p1 < D2149.diff
patching file intern/cycles/app/cycles_dithering.cpp
patching file intern/cycles/blender/addon/
patching file intern/cycles/blender/blender_sync.cpp
patching file intern/cycles/kernel/kernel_bake.h
patching file intern/cycles/kernel/kernel_path_branched.h
patching file intern/cycles/kernel/kernel_path_surface.h
patching file intern/cycles/kernel/kernel_path_volume.h
patching file intern/cycles/kernel/kernel_random.h
patching file intern/cycles/kernel/kernel_textures.h
patching file intern/cycles/kernel/kernel_types.h
Hunk #1 succeeded at 1117 with fuzz 2 (offset 4 lines).
patching file intern/cycles/kernel/kernel_volume.h
patching file intern/cycles/kernel/svm/svm_image.h
patching file intern/cycles/render/integrator.h
patching file intern/cycles/render/integrator.cpp
patching file intern/cycles/render/scene.h
patching file intern/cycles/render/sobol.h
patching file intern/cycles/render/sobol.cpp
patching file intern/cycles/util/util_texture.h
Hunk #1 FAILED at 37.
1 out of 1 hunk FAILED -- saving rejects to file intern/cycles/util/util_texture.h.rej


As util_texture.h is the only file that doesn't patch, you can ignore it. Just compile it and it should run. The changes there are just about texture limits for Fermi.
Alternatively, install Arcanist and run "arc patch D2149" to apply the patch.

The first tests I made with dithered Sobol show slight improvements in very simple scenes (some cubes), but not in benchmark files at low sample counts (at least nothing visible to the naked eye at 8, 16, or 64 samples). The noise patterns are clearly different, but neither seems to be better than the other.
By the way, selecting dithered Sobol with the CPU device and then switching to GPU works well with OpenCL, so the option can be made available for GPU/OpenCL.

Sergey Sharybin (sergey) requested changes to this revision. Apr 24 2017, 9:42 PM

The patch clearly needs an update against latest master.

When that's done, I have the following issues.

  • There are correlation artifacts in the following file which looks like

Not sure whether it's a merge conflict I resolved wrongly or there is a real problem with the particular dither.

The volume part can be solved by restoring the re-hashing in the volume code, but this doesn't seem to be a good/proper solution: ideally, the distribution of path_rng_1D should be good enough on its own.

You also mentioned some work being done on getting a better dither matrix, any luck with that?

This revision now requires changes to proceed. Apr 24 2017, 9:42 PM

Thanks for the link, I just tried his tool and it seems to work reasonably well.
However, from what I can see so far, it doesn't perform as well as intern/cycles/app/cycles_dithering.cpp, the tool that's included in this patch.
The two main problems I see with it are:

  • It just runs straight into the first local minimum because it doesn't do simulated annealing - it just rejects every swap that increases the total energy
  • It recalculates the energy of the entire system after every swap, which is extremely wasteful - due to the local nature of the energy term used, only a handful of terms have to be recalculated. Doing so reduces the computational cost a lot (my tool used to perform billions of swaps iirc, while his tool currently takes 30min to do 4096 iterations) and also reduces numerical issues.
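
The two points above can be sketched as follows. This is an illustration only, not the code from cycles_dithering.cpp. The Mask type, the energy formula, the constant kSpatialSigma and the function names are all assumptions made for the example. The step evaluates only the terms around the two swapped pixels and uses a Metropolis acceptance test, so a swap that raises the energy is still accepted with some probability.

```cpp
#include <cassert>
#include <cmath>
#include <random>
#include <utility>
#include <vector>

// Hypothetical toroidal n x n mask of scalar values in [0, 1).
struct Mask {
  int n;
  std::vector<float> v;
  int index(int x, int y) const { return ((y % n + n) % n) * n + ((x % n + n) % n); }
};

constexpr double kSpatialSigma = 2.1;  // placeholder falloff width (assumption)

// Sum of pair terms between pixel (x, y) and its neighbours within `radius`.
double local_energy(const Mask &m, int x, int y, int radius)
{
  assert(2 * radius + 1 <= m.n);  // window must not wrap onto itself
  const float value = m.v[m.index(x, y)];
  double e = 0.0;
  for (int dy = -radius; dy <= radius; dy++) {
    for (int dx = -radius; dx <= radius; dx++) {
      if (dx == 0 && dy == 0) continue;
      const double dist2 = dx * dx + dy * dy;
      const double dv = std::fabs(value - m.v[m.index(x + dx, y + dy)]);
      e += std::exp(-dist2 / (kSpatialSigma * kSpatialSigma) - std::sqrt(dv));
    }
  }
  return e;
}

// Full recomputation, only needed once at the start (or for verification).
double total_energy(const Mask &m, int radius)
{
  double e = 0.0;
  for (int y = 0; y < m.n; y++)
    for (int x = 0; x < m.n; x++)
      e += local_energy(m, x, y, radius);
  return 0.5 * e;  // every pair was visited from both ends
}

// Try one random swap. Returns the energy change applied (0 if rejected).
double anneal_step(Mask &m, int radius, double temperature, std::mt19937 &rng)
{
  assert(temperature > 0.0);
  std::uniform_int_distribution<int> coord(0, m.n - 1);
  std::uniform_real_distribution<double> unit(0.0, 1.0);
  const int ax = coord(rng), ay = coord(rng), bx = coord(rng), by = coord(rng);
  const int a = m.index(ax, ay), b = m.index(bx, by);
  if (a == b) return 0.0;
  // Only terms touching the two swapped pixels can change. Their mutual
  // pair term is identical before and after, so it cancels in the delta.
  const double before = local_energy(m, ax, ay, radius) + local_energy(m, bx, by, radius);
  std::swap(m.v[a], m.v[b]);
  const double after = local_energy(m, ax, ay, radius) + local_energy(m, bx, by, radius);
  const double delta = after - before;
  // Metropolis rule: always accept improvements, sometimes accept worse states.
  if (delta <= 0.0 || unit(rng) < std::exp(-delta / temperature)) return delta;
  std::swap(m.v[a], m.v[b]);  // rejected: undo the swap
  return 0.0;
}
```

A driver would compute total_energy once, then call anneal_step in a loop while lowering the temperature and adding the returned deltas to a running total. As the temperature approaches zero the rule behaves like the greedy descent criticized in the first bullet.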

There were some more implementation details in the 2019 SIGGRAPH course "My favorite samples":

It might also be interesting to investigate this improved method by Heitz et al:
If you look at the supplemental material, it's impressive how much perceived quality improves at higher sample counts.

@Stefan Werner (swerner) @Lukas Stockner (lukasstockner97) are either of you interested in getting this patch finished for 2.81? I think it would be great to have.

I have an updated version of this patch that I could upload here, sure.
However, I'd like to investigate the updates that Stefan linked above first.

I tried implementing the trick that was mentioned in the presentation (applying Cranley-Patterson rotation within each stratum instead of over the entire sample space), but it resulted in correlation artifacts similar to the microjittering patch.
One way to fix this might be to scramble the sample order in each pixel, but doing that effectively is tricky for non-power-of-two sample counts and also breaks the cache coherency benefits.
In addition to that, I haven't really looked much into Cycles' Sobol sampler yet, maybe applying the per-stratum rotation to something like PMJ might work better.
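
As a rough 1D illustration of the difference between the two rotations, here is one interpretation; it is not the code that was tested above. It assumes the unit interval is split into num_strata equal strata, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Rotation over the whole unit interval: the sample can end up in any stratum.
inline double rotate_global(double sample, double offset)
{
  const double shifted = sample + offset;
  return shifted - std::floor(shifted);
}

// Rotation inside the sample's own stratum: the position within the stratum
// is shifted and wrapped, while the stratum index itself is preserved.
inline double rotate_within_stratum(double sample, double offset, int num_strata)
{
  assert(num_strata > 0 && sample >= 0.0 && sample < 1.0);
  const double scaled = sample * num_strata;
  const double stratum = std::floor(scaled);
  const double local = rotate_global(scaled - stratum, offset);
  return (stratum + local) / num_strata;
}
```

For example, with 4 strata, a sample of 0.3 and an offset of 0.9, the global rotation gives roughly 0.2 (a different stratum), while the per-stratum rotation gives roughly 0.275 and stays in the second stratum.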

Regarding the new paper by Heitz et al., I hacked their provided example samplers into Cycles and it works quite well, but it's clearly not more than a proof-of-concept implementation.
I also started to write a simulated annealing tool that could generate masks for the Cycles Sobol sampler, but so far my masks don't work nearly as well as the provided ones...
Also, as above, extending this to non-power-of-two sample counts might not be as easy as it seems.

@Lukas Stockner (lukasstockner97) can you upload a version of the existing patch that works against the current master? That would make it easier for others to experiment with it.