
Cycles PMJ adaptive sampling working poorly with many bounces
Closed, Resolved · Public · Bug

Description

System Information
Operating system: Windows-10-10.0.19041-SP0 64 Bits
Graphics card: GeForce GTX 950M/PCIe/SSE2 NVIDIA Corporation 4.5.0 NVIDIA 398.35

Blender Version
Broken: version: 2.90.0 Beta, branch: master, commit date: 2020-07-22 14:02, hash: rB0e280b96ca82

Short description of error
Adaptive sampling does not take the minimum number of samples into account, nor the noise threshold, and it additionally leaves square artifacts.
Without adaptive sampling


With adaptive sampling

Exact steps for others to reproduce the error

  • Open the attached file
  • Press F12 to render

Event Timeline

CHET (cheteron) updated the task description.
Germano Cavalcante (mano-wii) renamed this task from "Render bug" to "Render artefacts with adaptive sampling+Intel denoise". Jul 23 2020, 8:47 PM
Germano Cavalcante (mano-wii) changed the task status from Needs Triage to Confirmed.
Germano Cavalcante (mano-wii) updated the task description.
Germano Cavalcante (mano-wii) changed the subtype of this task from "Report" to "Bug".

Can't reproduce with an RTX 2080 Ti.
Please update the GPU's NVIDIA drivers to the latest version (yours seem outdated) and try again.

It's not a driver problem: in Blender version 2.90.0 Beta (branch: master, commit date: 2020-07-27 20:19, hash: 221604cdd663), the noise threshold works correctly.

> Can't reproduce with an RTX 2080 Ti.
> Please update the GPU's NVIDIA drivers to the latest version (yours seem outdated) and try again.

The problem here is that adaptive sampling enables the PMJ sampling pattern, which seems to be doing poorly on this scene compared to Sobol and CMJ.

@Stefan Werner (swerner), maybe you have a guess for why PMJ would fail here.

Brecht Van Lommel (brecht) renamed this task from "Render artefacts with adaptive sampling+Intel denoise" to "Cycles PMJ adaptive sampling working poorly with many bounces". Nov 9 2020, 3:35 PM

Have been investigating this and have a few ideas. So far I can confirm that it is not resorting to random sampling, but I am beginning to think that the sampler may be set up slightly incorrectly, causing the samples to be distributed oddly.

It's the decorrelation for higher dimensions, which I copied from our Sobol sampler, that appears to be causing this.
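For context, a minimal sketch of that decorrelation scheme: hash the per-pixel rng_hash with the dimension and use the result as a Cranley-Patterson shift. The hash body below is the small CMJ hash from the 2.90-era kernel, reproduced approximately; treat this as illustrative rather than the verbatim kernel code.

```cpp
#include <cmath>
#include <cstdint>

/* Small integer hash used by Cycles' CMJ code (approximate form). */
uint32_t cmj_hash_simple(uint32_t i, uint32_t p)
{
  i = (i ^ 61u) ^ p;
  i += i << 3;
  i ^= i >> 4;
  i *= 0x27d4eb2du;
  return i;
}

/* Decorrelate a sample value across dimensions via a toroidal shift. */
float decorrelate(float r, uint32_t rng_hash, uint32_t dimension)
{
  uint32_t h = cmj_hash_simple(dimension, rng_hash);
  float shift = h * (1.0f / 4294967296.0f); /* map the hash to [0, 1) */
  float v = r + shift;
  return v - std::floor(v); /* wrap back into [0, 1) */
}
```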

Thanks for the update. I'll look into that to see what I find.

I have investigated the CPU variant. The samples appear fine with random samples, but on further analysis the cmj_hash_simple hash seems to bias the points, which you can see in the image below when it is applied to a regular grid. I am currently working on fixing this.
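A small harness in the spirit of that test: feed a regular grid of points through the hash-based shift and dump them for plotting. Any visible structure in the output indicates the hash is biasing the set (hash body as in the sketch above).

```cpp
#include <cmath>
#include <cstdint>
#include <cstdio>

static uint32_t cmj_hash_simple(uint32_t i, uint32_t p)
{
  i = (i ^ 61u) ^ p;
  i += i << 3;
  i ^= i >> 4;
  i *= 0x27d4eb2du;
  return i;
}

int main()
{
  const int n = 64; /* 64x64 regular grid of input points */
  for (int y = 0; y < n; y++) {
    for (int x = 0; x < n; x++) {
      /* Shift each grid point by a hash of its index; an unbiased hash
       * should leave the point cloud looking uniform, not structured. */
      uint32_t h = cmj_hash_simple((uint32_t)(y * n + x), 0xdeadbeefu);
      float sx = (x + 0.5f) / n + (h & 0xffffu) * (1.0f / 65536.0f);
      float sy = (y + 0.5f) / n + (h >> 16) * (1.0f / 65536.0f);
      printf("%f %f\n", sx - std::floor(sx), sy - std::floor(sy));
    }
  }
  return 0;
}
```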

OK, I have been working on how this can be improved using the table approach we already have. By shuffling the sample index (so that we get a random-ish order for progressive samples) and also shuffling along the x rows and y columns to reduce the correlation between the sets, nice samples can be obtained. This would also mean we only need one table of sample points instead of several, which would be far better on the CPU. I have done some experimentation with this and managed to generate something like the correlated multi-jittered samples, but usable in a progressive manner; see the image below for the first eight 2D sets of points.

Notice that they keep many of the properties of the original set but are all different. Also, since the sample order is essentially random, the sequence can be used as a progressive set that converges towards the original set in a more representative manner. I still need to optimise the shuffle permutation to get more speed and a nicer sample order. For comparison, using our existing progressive method with the same fixed start table gives the following, which can be relatively clumpy.
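As a rough, hypothetical illustration of the single-table idea (not the actual implementation): every pixel walks one shared table in its own pseudo-random order, so partial prefixes differ per pixel while all pixels converge to the same full set.

```cpp
#include <cstdint>

struct float2 { float x, y; };

/* Valid permutation of [0, 2^k): odd multiplier plus additive seed,
 * masked back into range. The multiplier must be odd to be invertible. */
static uint32_t permute_pow2(uint32_t i, uint32_t mask, uint32_t seed)
{
  return (i * 0x2c1b3c5du + seed) & mask;
}

/* One shared table of stratified points (assumed precomputed); the
 * per-pixel seed reorders the visits instead of selecting among many
 * tables, so only one table needs to be stored. */
float2 progressive_sample(const float2 *table, uint32_t table_mask,
                          uint32_t sample, uint32_t pixel_seed)
{
  return table[permute_pow2(sample, table_mask, pixel_seed)];
}
```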

How can you shuffle the sample index of a progressive sampling pattern without losing stratification? A random subset of a stratified pattern is not stratified, unless there is something specific about the shuffling that preserves the stratification.

Maybe there is something specific about PMJ and a specific shuffling function that could work, but it's not obvious to me how that would be implemented.

I still don't understand why there is an issue here in the first place. Presumably the PMJ algorithm as described in the paper is correct. Is there an error in our implementation, either in how we generate the samples or in how we use them? Are we running into a limitation of the algorithm with regard to the number of dimensions or the number of samples?

I shuffle along the x and y of the original sample set, which has a large number of samples (similar to what is done with the current hash), but I also shuffle using the index of the sample, which reorders the samples. However, I still need to find a decent mapping for this: the current one is based on a left-to-right and top-to-bottom ordering, which works for some sets but not others. Random selection also seems to perform well, but since the order is just a 1D set, I was thinking some kind of shuffled quasi-random set would be good; I need to check this out.

The PMJ algorithm uses shuffling and scrambling in a similar manner to this. The paper uses an Owen scrambling hash created by Brent Burley; I can implement that.
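For reference, a sketch of hash-based Owen scrambling in the style of Burley's "Practical Hash-based Owen Scrambling" (JCGT 2020). The constants below are the ones published in that paper; treat this as illustrative rather than what the patch finally uses.

```cpp
#include <cstdint>

uint32_t reverse_bits(uint32_t x)
{
  x = (x << 16) | (x >> 16);
  x = ((x & 0x00ff00ffu) << 8) | ((x & 0xff00ff00u) >> 8);
  x = ((x & 0x0f0f0f0fu) << 4) | ((x & 0xf0f0f0f0u) >> 4);
  x = ((x & 0x33333333u) << 2) | ((x & 0xccccccccu) >> 2);
  x = ((x & 0x55555555u) << 1) | ((x & 0xaaaaaaaau) >> 1);
  return x;
}

/* Perturbs each bit based only on lower-order bits, which amounts to an
 * Owen scramble when applied to the reversed bit string. */
uint32_t laine_karras_permutation(uint32_t x, uint32_t seed)
{
  x += seed;
  x ^= x * 0x6c50b47cu;
  x ^= x * 0xb82f1e52u;
  x ^= x * 0xc7afe638u;
  x ^= x * 0x8d22f6e6u;
  return x;
}

uint32_t nested_uniform_scramble(uint32_t x, uint32_t seed)
{
  return reverse_bits(laine_karras_permutation(reverse_bits(x), seed));
}
```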

There is an error in our implementation, in that our hashing does not preserve the properties of the sample set very well. Also, we don't really shuffle the sample order; we only select from different sets.

Instead of using the texture lookup, I tried out Keller's Owen-scrambled Sobol sequence. It gives very good results and even runs faster than the current implementation on my CPU (probably due to better caching, as it does not use the texture fetch). There are a few issues that still need resolving, but it appears to be a good candidate. The multi-dimensional sample plots are as follows.


They are not as good as the multi-jittered sets above, but it is an infinite set that can be used without eventually resorting to random sampling. I think the sets could be made better with some tweaking.
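A sketch of how such a shuffled, Owen-scrambled Sobol sample might be drawn for dimension 0, whose Sobol digits are just the bit-reversed index. It reuses the helpers from the previous sketch; higher dimensions would additionally need the Sobol direction vectors, omitted here. This is a rough outline, not Keller's exact construction.

```cpp
#include <cstdint>

/* From the Owen-scrambling sketch above. */
uint32_t reverse_bits(uint32_t x);
uint32_t nested_uniform_scramble(uint32_t x, uint32_t seed);

float sobol_owen_dim0(uint32_t index, uint32_t seed)
{
  /* Shuffle the sample order so pixels draw the sequence in different
   * orders... */
  uint32_t shuffled = nested_uniform_scramble(index, seed);
  /* ...compute the raw Sobol digits (dimension 0 = radical inverse)... */
  uint32_t digits = reverse_bits(shuffled);
  /* ...then Owen-scramble the point itself with a decorrelated seed. */
  uint32_t scrambled = nested_uniform_scramble(digits, seed * 0x9e3779b1u + 1u);
  return scrambled * 0x1p-32f; /* map the 32-bit digits to [0, 1) */
}
```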

It seems that even with the new sampling strategies the results are only slightly improved; the non-progressive scheme still appears far better, though it is not clear exactly why. I have found a few causes for the tile edges looking different: there are correlated sampling issues at the edges of tiles when the integrator seed hash matches the tile hash, i.e. when hash_uint2(seed, 0) equals hash_uint2(x, y), because the two are combined as hash1 ^ hash2 and therefore cancel each other out.
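Why matching hashes cancel: XOR of two equal values is zero, so any pixel whose coordinate hash happens to equal the seed hash ends up with rng_hash == 0 and a degenerate sample stream. A minimal demonstration with a stand-in hash (hash_uint2 itself lives in Cycles' util/hash.h):

```cpp
#include <cstdint>
#include <cstdio>

static uint32_t hash2(uint32_t kx, uint32_t ky) /* stand-in for hash_uint2 */
{
  uint32_t h = kx * 0x9e3779b1u;
  h ^= ky + 0x85ebca77u + (h << 6) + (h >> 2);
  h ^= h >> 15;
  return h;
}

int main()
{
  uint32_t seed_hash = hash2(1234u, 0u);
  uint32_t pixel_hash = seed_hash; /* a pixel whose hash collides */
  printf("0x%08x\n", seed_hash ^ pixel_hash); /* always 0: streams correlate */

  /* Hashing the seed together with the coordinates (e.g. something like
   * hash_uint3(x, y, seed)) avoids this self-cancellation entirely. */
  return 0;
}
```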

So I have found that the correlation is caused by the initialization of the pixel rng_hash with

*rng_hash = hash_uint2(x, y);

replacing this with rand() gives the correct results. I will work on a more suitable fix that gives better repeatability. Here is the CMJ image


and the PMJ image with the random hack mentioned above

both with 256 samples per pixel. As you can see, they work equally well.

One of the renders seems to have denoising enabled, so it's hard to compare?

But if the solution is just fixing hash functions, or ensuring we don't use the same hash function and create correlations, that is hopefully quite straightforward.

The random hack seems to produce quite a bit more noise, am I right?

Yes, there is more noise. I'll re-render the images, but the main point is that there are more reflections than before; previously very few samples reached the later bounces. What I think is causing it is that the hash is generally a uint, but in many cases it is mixed with an int. If the uint is too large it becomes negative when treated as an int, and when it is combined with the dimension the values are subtracted instead of added, causing correlation of some form. This appears to be validated by the fact that rand() returns an int rather than a uint, and so produces no negative values. However, replacing some of these uint/int combinations so they work purely in the uint space seems to make everything worse, not better.
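A minimal demonstration of the signed/unsigned pitfall described above:

```cpp
#include <cstdint>
#include <cstdio>

int main()
{
  uint32_t hash = 0xfffffff0u; /* a large hash value with the top bit set */
  int dimension = 5;

  /* Stored into an int, values above INT_MAX become negative, so adding
   * the dimension behaves nothing like the intended unsigned offset. */
  int as_int = (int)hash;                               /* -16 */
  printf("signed:   %d\n", as_int + dimension);         /* -11 */

  /* Kept unsigned, the same addition simply wraps modulo 2^32. */
  printf("unsigned: %u\n", hash + (uint32_t)dimension); /* 4294967285 */
  return 0;
}
```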

The sampling issues are caused by the PMJ sampler using the same samples for the same dimension in every pixel. By shuffling the samples, this correlation can be avoided, since the same sample pattern is then no longer used at the same dimension for every pixel. There are three choices: shuffle which sample pattern is used per dimension, shuffle the individual component values of each sample, or both. A sketch of the first two follows; index shuffling is sketched after the function list further below.
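A hypothetical sketch of those first two choices, with assumed helper and table names; none of this is the literal patch code.

```cpp
#include <cstdint>

struct float2 { float x, y; };

extern const float2 pmj_tables[8][4096]; /* precomputed sets (assumed) */
uint32_t hash2(uint32_t a, uint32_t b);  /* any decent integer mix */

float2 pmj_sample_decorrelated(uint32_t sample, uint32_t dim,
                               uint32_t pixel_hash)
{
  /* (1) Shuffle which pattern a given dimension uses per pixel... */
  uint32_t set = hash2(pixel_hash, dim) & 7u;
  float2 p = pmj_tables[set][sample & 4095u];

  /* (2) ...and/or scramble the component values with a toroidal shift. */
  uint32_t h = hash2(pixel_hash, dim ^ 0x55555555u);
  p.x += (h & 0xffffu) * (1.0f / 65536.0f);
  p.y += (h >> 16) * (1.0f / 65536.0f);
  if (p.x >= 1.0f) p.x -= 1.0f;
  if (p.y >= 1.0f) p.y -= 1.0f;
  return p;
}
```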

I have now fixed the PMJ sampler and added patch D11682, which includes a whole lot more. It probably needs work to meet the correct standards. The only bits needed to actually fix just PMJ are the updates to

path_rng_1D
path_rng_2D
pmj_sample_1D
pmj_sample_2D

and the functions they drag in for shuffling; a rough sketch of the index-shuffling idea is below, and an updated image rendered with the fixed PMJ sampling follows it.
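A hypothetical sketch of the index-shuffling part (the actual D11682 code differs): XOR-scrambling the sample index with a per-pixel, per-dimension hash is a valid permutation of any power-of-two range, so every pixel still visits each table entry exactly once, just in a different order.

```cpp
#include <cstdint>

struct float2 { float x, y; };

extern const float2 pmj_table[]; /* one precomputed PMJ02 table (assumed) */
uint32_t hash2(uint32_t a, uint32_t b); /* any decent integer mix */

float2 pmj_sample_2D_shuffled(uint32_t sample, uint32_t rng_hash,
                              uint32_t dimension, uint32_t table_mask)
{
  /* table_mask is the table size minus one (size is a power of two). */
  uint32_t seed = hash2(rng_hash, dimension);
  uint32_t i = (sample ^ seed) & table_mask; /* digit-XOR shuffle */
  return pmj_table[i];
}
```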


The full patch also adds a different pixel hash and QMC sampling patterns with various shuffling and scrambling options. These appear to be of similar or better quality, especially when the number of samples is large; I have provided an image below.

Also added in the patch are some new options in the advanced section.

Are all these pattern/hash/shuffle options really needed from a user's point of view? I mean, it's good to find solutions, but I think the best direction for Cycles is to remove sliders and checkboxes.
I wish all the sampling choices could be made under the hood. We're even close to seeing the "just set final quality" workflow...

> Are all these pattern/hash/shuffle options really needed from a user's point of view? I mean, it's good to find solutions, but I think the best direction for Cycles is to remove sliders and checkboxes.
> I wish all the sampling choices could be made under the hood. We're even close to seeing the "just set final quality" workflow...

Don't listen to this guy; the more options the merrier. Sure, artists might never use them, but what if a TD wants to save your production thousands of dollars on a render? There's no reason not to have these under the hood somewhere. I've been using Cycles for 10 years; optimizing renders is my thing and what I get paid to do. It's near impossible for Cycles to know what it should use in every scene; sure, it can take a pretty good guess, but it won't be perfect. We should at least have some TD options (toggleable like the debug options) for fine-tuning renders that artists might not care about.

> Are all these pattern/hash/shuffle options really needed from a user's point of view? I mean, it's good to find solutions, but I think the best direction for Cycles is to remove sliders and checkboxes.
> I wish all the sampling choices could be made under the hood. We're even close to seeing the "just set final quality" workflow...

This is a horrible suggestion. You need that control on larger productions.

Both cases can be addressed by using sensible defaults, so you don't have to change the settings unless you are interested in doing so. I was wondering whether the settings should be identical for all devices, since certain settings would be better (read: faster convergence) on a CPU than on a GPU, or whether it is better for the images to be more similar by using the exact same sampling algorithm. I need to perform some tests to figure this out, and also to find the ideal default settings.

All options need to have some justification for existing; we always look at it on a case-by-case basis. Note that more options here can negatively impact GPU rendering performance, since this random-number sampling is inlined in many places, and extra options can lead to extra register pressure or less efficient use of the instruction cache.

CPU and GPU should use the same sampling algorithm, especially since Cycles supports combined CPU and GPU rendering.