SSS GPU render speed degradation
System Information
CentOS, Nvidia GTX 1070

Blender Version
Broken: 2.79 713027b8325 daily build, 2.8
Worked: 2.79b official

Short description of error
Rendering SSS on the GPU takes more time in the daily build and in 2.8.
This particular scene: 2.79b - 1:12, 2.79 daily build - 1:23.
With more complex shading the gap was even larger: 1:12 on 2.79b vs 1:38 on the daily build, and 1:42 on 2.8.
Rendering on the CPU shows the opposite: it is faster in the daily build.
The settings are important. This behavior is only for:
branched path tracing
subsurface samples = 3
hdri as a light source (multiple importance is on)

What's more, if you set the subsurface value in the shader to 0, you get almost the same render time in both versions, but the 2.79b result has noticeably less noise.
I don't know whether this is relevant or whether I should create a separate report for it.
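To put the timings quoted above in relative terms, here is a small sketch that converts the m:ss values to seconds and computes the slowdown against the 2.79b baseline. The helper names are made up for illustration and are not part of Blender.

```python
def to_seconds(stamp):
    """Convert an 'm:ss' render time such as '1:12' to seconds."""
    minutes, seconds = stamp.split(":")
    return int(minutes) * 60 + int(seconds)


def slowdown_percent(baseline, candidate):
    """Percentage by which candidate is slower than baseline."""
    base = to_seconds(baseline)
    return 100.0 * (to_seconds(candidate) - base) / base


# Timings copied from the report; 2.79b (1:12) is the baseline in each case.
cases = [
    ("this scene, 2.79 daily build", "1:12", "1:23"),
    ("complex shading, 2.79 daily build", "1:12", "1:38"),
    ("complex shading, 2.8", "1:12", "1:42"),
]

for label, baseline, candidate in cases:
    print(f"{label}: {slowdown_percent(baseline, candidate):.1f}% slower")
```

For the reported numbers this works out to roughly 15%, 36% and 42% slower than 2.79b.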

Exact steps for others to reproduce the error
Open the scene.
Hit render.
Repeat in the different versions of Blender.



First question: Are you using "Hybrid" rendering in master?

I mean this (not available in 2.79b):

I tried with and without the CPU enabled (I have a wimpy 10-year-old dual Xeon with a GTX 780 3GB, so the CPU is the weakest part), and here are my results:

Version | Tile Size | CPU enabled | Render Time
Master  | 256x256   | Yes         | 12.34min (!!!)
Master  | 32x32     | Yes         | 1.31min (CPU is still taking too long per tile)
Master  | 16x16     | Yes         | 1.22min (Now the tiles are small enough for the CPU to no longer slow down the whole process)

As you can see, tile size matters a great deal. A change after 2.79b made small tiles no longer a problem on the GPU, so the GPU can now be used together with the CPU, which usually prefers smaller tiles.
Most likely there is no regression here, and your render settings just need a small tile-size adjustment to render even faster than 2.79b.
If you have a halfway decent CPU, you will render a lot faster still.

No, I used GPU-only rendering with a 256x256 tile size.
By the way, I didn't know about the improvement for smaller tiles! I have now tried 32x32 and got 1:18, but that is still a bit slower than 256x256 on 2.79b (1:12).

Depending on your hardware and the scene configuration you might need to fiddle a bit with the tile size to find the sweet spot.
In my case it's almost always 32x32 or 64x64, but that can vary. With your scene it's 16x16 for me, although I didn't test 8x8.
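The search for a sweet spot described above can be automated. This is a minimal sketch, not an official benchmark tool: `sweep_tile_sizes` and its `render` callable are hypothetical names, and the callable is assumed to perform one full render at the given square tile size.

```python
import time


def sweep_tile_sizes(render, sizes, clock=time.perf_counter):
    """Time render(size) once per tile size.

    Returns (best_size, timings), where timings maps each size to the
    elapsed time reported by clock, in the clock's own units.
    """
    timings = {}
    for size in sizes:
        start = clock()
        render(size)  # assumed to block until the render has finished
        timings[size] = clock() - start
    best = min(timings, key=timings.get)
    return best, timings


if __name__ == "__main__":
    # Stand-in workload so the sketch runs outside Blender.
    def dummy_render(size):
        time.sleep(0.001 * (256 // size))

    best, timings = sweep_tile_sizes(dummy_render, [256, 64, 32, 16])
    print("fastest tile size:", best)
```

Inside Blender 2.79 or 2.8, the callable could presumably set `scene.render.tile_x` and `scene.render.tile_y` and then call `bpy.ops.render.render()`. A single run per size is noisy, so repeating the sweep a few times is advisable before drawing conclusions.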

No no, the tile size part is understood, and I'm really happy about that news. On my hardware 32x32 works best, but even at this sweet spot it's still a bit slower than 2.79b with 256x256, GPU only. With hybrid rendering I could match that time exactly (1:12), but that still means CPU plus GPU is needed to equal what the GPU alone did on 2.79b, right?

I tested some more scenes, and I guess it's scene-related and can vary a bit. The speed is about the same. Somehow I had missed the tile size speedup. The light behaves a little differently, though, but I think the task can be closed now.

