This implements Arvo's "Stratified sampling of spherical triangles". Similar to how we sample rectangular area lights, this is sampling triangles over their solid angle. It does significantly improve sampling close to the triangle, but doesn't do much for more distant triangles. So I added a simple heuristic to switch between the two methods. Unfortunately, I expect this to add render time in any case, even when it does not make any difference whatsoever. It'll take some benchmarking with various scenes and hardware to estimate how severe the impact is and if it is worth the change.
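
For readers unfamiliar with the technique, the following is a minimal standalone sketch of Arvo's spherical triangle sampling as described in his paper. It is not the code from this patch: the `Vec3` type, the helper functions and the name `sample_spherical_triangle` are all assumptions made for illustration, and it uses double precision and the standard library trig functions rather than the kernel's fast approximations.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical minimal vector type, for illustration only.
struct Vec3 { double x, y, z; };

static const double kPi = 3.14159265358979323846;

static Vec3 operator+(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
static Vec3 operator-(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static Vec3 operator*(double s, Vec3 a) { return {s * a.x, s * a.y, s * a.z}; }
static double dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
static Vec3 cross(Vec3 a, Vec3 b) {
  return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}
static Vec3 safe_normalize(Vec3 a) {
  const double len = std::sqrt(dot(a, a));
  return len > 0.0 ? (1.0 / len) * a : Vec3{0.0, 0.0, 0.0};
}
static double clamp_unit(double v) { return std::max(-1.0, std::min(1.0, v)); }

struct SphericalTriangleSample {
  Vec3 direction;      // unit direction from the shading point towards the triangle
  double solid_angle;  // solid angle of the triangle; the pdf is 1 / solid_angle
};

// P is the shading point, v0..v2 the triangle vertices, xi1 and xi2 uniform random numbers.
SphericalTriangleSample sample_spherical_triangle(Vec3 P, Vec3 v0, Vec3 v1, Vec3 v2,
                                                  double xi1, double xi2) {
  assert(xi1 >= 0.0 && xi1 <= 1.0 && xi2 >= 0.0 && xi2 <= 1.0);

  // Project the vertices onto the unit sphere around P.
  const Vec3 A = safe_normalize(v0 - P);
  const Vec3 B = safe_normalize(v1 - P);
  const Vec3 C = safe_normalize(v2 - P);

  // Unit normals of the great-circle planes through each pair of vertices.
  const Vec3 n_ab = safe_normalize(cross(A, B));
  const Vec3 n_ac = safe_normalize(cross(A, C));
  const Vec3 n_bc = safe_normalize(cross(B, C));

  // Internal angles of the spherical triangle at A, B and C.
  const double cos_alpha = clamp_unit(dot(n_ab, n_ac));
  const double alpha = std::acos(cos_alpha);
  const double beta = std::acos(clamp_unit(-dot(n_ab, n_bc)));
  const double gamma = std::acos(clamp_unit(dot(n_ac, n_bc)));
  const double solid_angle = alpha + beta + gamma - kPi;

  const double sin_alpha = std::sqrt(std::max(0.0, 1.0 - cos_alpha * cos_alpha));
  const double cos_c = dot(A, B);  // cosine of the arc between A and B

  // Pick a point C_hat on the arc AC so that triangle (A, B, C_hat) has area xi1 * solid_angle.
  const double phi = xi1 * solid_angle - alpha;
  const double s = std::sin(phi), t = std::cos(phi);
  const double u = t - cos_alpha;
  const double v = s + sin_alpha * cos_c;
  const double det = (v * s + u * t) * sin_alpha;
  const double q = det != 0.0 ? clamp_unit(((v * t - u * s) * cos_alpha - v) / det) : 1.0;
  const Vec3 U = safe_normalize(C - dot(C, A) * A);
  const Vec3 C_hat = safe_normalize(q * A + std::sqrt(1.0 - q * q) * U);

  // Pick a point on the arc from B to C_hat.
  const double z = 1.0 - xi2 * (1.0 - dot(C_hat, B));
  const Vec3 W = safe_normalize(C_hat - dot(C_hat, B) * B);
  const Vec3 direction = z * B + std::sqrt(std::max(0.0, 1.0 - z * z)) * W;

  return {direction, solid_angle};
}
```

Because the directions are uniform over the solid angle, the pdf is constant across the triangle, which is where the noise reduction close to the light comes from. The three `acos` calls and the `sin`/`cos` pair are also where the extra per-sample cost sits.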

# Diff Detail

- Repository: rB Blender

I just noticed a bug in this myself - it doesn't work properly with instanced triangles, as it does not apply the transform. Will try and fix this.

I added support for instancing and object motion blur. Deformation motion blur is still missing, and I don't think it was supported before either.

Deformation motion blur is now supported too. It makes quite a difference, since previously objects with deformation motion blur were not sampled as light sources at all.

This is great. Just like with quad solid angle sampling, the improvement will be especially noticeable inside volumes.

| intern/cycles/kernel/kernel_light.h | |
|---|---|
| 781 | You can pass |
| 838 | This might not be a great estimate for shading points near the middle of long thin triangles. The distance from the point to the plane would have some false positives, but still avoids most of the cost I expect. `distance_to_plane = abs(dot(N, A)) / len(N)` |
| 843–848 | I think it would be simpler and faster to share the computation of |
| 861 | triangle area computation could be optimized since it's |
| 873 | Spaces after |
| 875 | I think this formula is wrong. There's two pdf's here that need to be multiplied together. One for sampling a point in the triangle: `pdf_triangle = t*t/(cos_pi * area_post)` And the other for picking a triangle in `pdf_distribution = area_pre * kernel_data.integrator.pdf_triangles` `pdf *= area_pre / area_post;` |
| 892 | Use |
| 900 | We can immediately return here. |
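
The two pdfs from the comment on line 875 can be written out as a small standalone sketch. The function names are made up for illustration, and the parameter meanings are assumptions inferred from the names in the comment: `t` as the distance from the shading point to the sampled point, `cos_pi` as the cosine at the light, `area_pre` and `area_post` as the triangle area before and after the object transform. `pdf_triangles` is taken as given.

```cpp
#include <cassert>

// pdf of the sampled point on the triangle, converted from area measure
// to solid angle measure (formula as quoted in the review comment).
double pdf_triangle(double t, double cos_pi, double area_post) {
  assert(cos_pi > 0.0 && area_post > 0.0);
  return t * t / (cos_pi * area_post);
}

// Probability of picking this triangle from the light distribution
// (formula as quoted in the review comment).
double pdf_distribution(double area_pre, double pdf_triangles) {
  return area_pre * pdf_triangles;
}

// The two events are independent, so the pdfs multiply.
double combined_pdf(double t, double cos_pi, double area_pre, double area_post,
                    double pdf_triangles) {
  return pdf_triangle(t, cos_pi, area_post) * pdf_distribution(area_pre, pdf_triangles);
}
```

Multiplied out, the product is `t*t * pdf_triangles / cos_pi` times `area_pre / area_post`, so the area ratio only matters when the transform changes the triangle's area.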

A new update, taking Brecht's comments into account and a few other improvements.

I'm not sure if there is a perfect heuristic to switch between the two sampling strategies - in certain cases, I could make the line where the switch happens visible as a sudden change in noise - still, at an overall better quality than before the patch.

Looks good to me now.

It's possible to make a transition region where it chooses between the two sampling methods, but I'm not sure if that's worth it in practice.
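
As a rough idea of what such a transition region could look like, here is a standalone sketch that picks a strategy stochastically and evaluates the resulting mixture pdf. Nothing here comes from the patch: the `ratio` input (for example triangle size relative to distance), the `lo`/`hi` bounds and all function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>

// Probability of using solid angle sampling: 0 below lo, 1 above hi,
// with a smoothstep blend in between. The bounds are hypothetical.
double solid_angle_probability(double ratio, double lo, double hi) {
  assert(hi > lo);
  const double x = std::max(0.0, std::min(1.0, (ratio - lo) / (hi - lo)));
  return x * x * (3.0 - 2.0 * x);
}

struct StrategyChoice {
  bool use_solid_angle;  // true: solid angle sampling, false: area sampling
  double remapped_xi;    // random number rescaled to [0, 1) for reuse
};

// Pick a strategy with probability w and rescale xi so it can be reused.
StrategyChoice choose_strategy(double w, double xi) {
  assert(w >= 0.0 && w <= 1.0 && xi >= 0.0 && xi < 1.0);
  if (xi < w) {
    return {true, xi / w};
  }
  return {false, (xi - w) / (1.0 - w)};
}

// pdf of the combined sampler. Both inputs must be expressed in the same
// measure (for example solid angle) and evaluated for the same direction.
double mixture_pdf(double w, double pdf_solid_angle, double pdf_area) {
  return w * pdf_solid_angle + (1.0 - w) * pdf_area;
}
```

The catch is that inside the transition region both pdfs have to be evaluated for every sample, so the region would cost more than either strategy on its own.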

This one's not quite ready yet. I'm seeing odd artifacts when using this on CUDA hardware that don't show up when rendering on the CPU. I can't say yet what's causing them. This needs some investigation.

One more round of improvements. A few optimisations, and a change to the heuristic for switching between sampling strategies. It now looks at the triangle's edge lengths instead of its area, which should hopefully help with long and thin triangles.
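
A minimal sketch of an edge-length based switch might look like the following. It combines the longest edge with the point-to-plane distance suggested in the review comments. The comparison itself, the `threshold` parameter and all names are assumptions for illustration and not necessarily what the patch does.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical minimal vector type, for illustration only.
struct Vec3 { double x, y, z; };

static Vec3 sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static double dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
static Vec3 cross(Vec3 a, Vec3 b) {
  return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}

// Distance from shading point P to the plane of the triangle:
// abs(dot(N, A)) / len(N), with N the unnormalized normal and A = v0 - P.
double distance_to_plane(Vec3 P, Vec3 v0, Vec3 v1, Vec3 v2) {
  const Vec3 N = cross(sub(v1, v0), sub(v2, v0));
  const double len_N = std::sqrt(dot(N, N));
  assert(len_N > 0.0);  // degenerate triangles are not handled in this sketch
  return std::fabs(dot(N, sub(v0, P))) / len_N;
}

// Use the more expensive solid angle sampling only when the longest edge is
// large compared to the distance to the plane. The threshold is hypothetical.
bool use_solid_angle_sampling(Vec3 P, Vec3 v0, Vec3 v1, Vec3 v2, double threshold = 1.0) {
  const Vec3 e0 = sub(v1, v0), e1 = sub(v2, v0), e2 = sub(v2, v1);
  const double longest_sq = std::max(dot(e0, e0), std::max(dot(e1, e1), dot(e2, e2)));
  const double d = distance_to_plane(P, v0, v1, v2);
  // Compare squared quantities to avoid a square root for the edge length.
  return longest_sq > threshold * threshold * d * d;
}
```

Using the longest edge rather than the area means a long, thin triangle still triggers solid angle sampling for nearby shading points, even though its area is small.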

About the cost of the improved sampling:

The BMW benchmark scene, which is lit entirely by a large mesh light, goes from 11m 6s to 11m 50s on my machine, roughly 6.6% slower. Since those are large mesh lights, pretty much every pixel in the scene is using the more expensive sampling path. The benefits of the better sampling are not visible in this scene, however, since most of the noise comes from glossy reflections, not the area light.

Most of the performance penalty appears to come from the trig functions. I did experiment with using SSE to vectorise the calls to normalise(), but that didn't do much other than make the code less readable. If anyone else has suggestions for improvements, I'm all ears.

I don't immediately have a good suggestion to optimize this. It's possible in principle to use SIMD for the cross product, normalize and fast_acos (which is likely the slowest part), but that's not so simple to implement. To me performance seems acceptable.

To be clear, this should be committed after 2.79 is released since we are in bugfix only mode.