Ray offsetting as adopted in Cycles has been reported to cause various artifacts (T43835, T54284, and others). These artifacts stand out when the scene is far from the origin, or when the scale of the scene is much larger or much smaller than 1. Instancing makes the problem worse, because the ray offset, which is calculated for the world position, scale, and axis directions of the instanced object, is transformed into object space during ray-object intersection.
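To make the scale dependence concrete, here is a minimal sketch of how a fixed absolute offset interacts with single-precision spacing. The offset value `1e-5f` and the helper names `ulp_at` and `offset_survives` are assumptions chosen for illustration; they are not taken from Cycles' actual offset routine.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Spacing between adjacent single-precision values at magnitude |x| (one ULP).
inline float ulp_at(float x)
{
  assert(std::isfinite(x));
  const float ax = std::fabs(x);
  return std::nextafter(ax, std::numeric_limits<float>::infinity()) - ax;
}

// True if adding `offset` to coordinate `x` changes the stored float at all.
// (`offset` is a hypothetical fixed epsilon, not the value Cycles uses.)
inline bool offset_survives(float x, float offset)
{
  volatile float moved = x + offset;  // force rounding to single precision
  return moved != x;
}
```

Under these assumptions, a 1e-5 offset moves a point near coordinate 1 but is rounded away near coordinate 1e5, where adjacent floats are 2^-7 (about 0.0078) apart. At the other extreme, the same absolute offset is as large as an entire scene of size 1e-5.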

There was an experiment, D1212, which tried to address these problems by skipping the ray push and instead checking the distance to the triangle against an epsilon during intersection. The result was not satisfactory.

This patch takes a different approach. Instead of a ray offset or an epsilon, the following rules are enforced in the BVH traversal/intersection algorithms to prevent a ray from intersecting the surface it has just departed from:

- A ray departing from a surface primitive does not intersect with the primitive.
- A shadow ray connecting a surface primitive and a light primitive does not intersect with either of the two primitives.
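The two rules can be sketched as an identity check that runs before any geometric intersection test. This is only an illustration: the names `PrimRef`, `RayIds`, `skip_primitive`, and `closest_allowed_hit` are hypothetical, and identifying a primitive by an (object, prim) pair is an assumption rather than the layout used in 'kernel_types.h'.

```cpp
#include <cassert>
#include <vector>

// Hypothetical primitive identity: instance index plus primitive index.
struct PrimRef {
  int object;
  int prim;
};
constexpr PrimRef NO_PRIM = {-1, -1};

inline bool is_valid(const PrimRef &p) { return p.prim >= 0; }
inline bool same_prim(const PrimRef &a, const PrimRef &b)
{
  return a.object == b.object && a.prim == b.prim;
}

struct RayIds {
  PrimRef from;   // primitive the ray departs from (NO_PRIM for camera rays)
  PrimRef light;  // light primitive a shadow ray targets (NO_PRIM otherwise)
};

// Rule check: reject the departing primitive and the targeted light primitive.
inline bool skip_primitive(const RayIds &ray, const PrimRef &candidate)
{
  if (is_valid(ray.from) && same_prim(candidate, ray.from)) return true;
  if (is_valid(ray.light) && same_prim(candidate, ray.light)) return true;
  return false;
}

struct Hit {
  PrimRef prim;
  float t;
};

// Closest candidate that survives the rules; returns false if none does.
inline bool closest_allowed_hit(const RayIds &ray, const std::vector<Hit> &candidates, Hit &out)
{
  bool found = false;
  for (const Hit &h : candidates) {
    assert(h.t >= 0.0f);  // candidates are expected to lie in front of the origin
    if (skip_primitive(ray, h.prim)) continue;
    if (!found || h.t < out.t) {
      out = h;
      found = true;
    }
  }
  return found;
}
```

Because the check compares identities rather than distances, it does not depend on the scale of the scene or on its distance from the origin.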

These rules are self-evident when the primitives are triangles. They also apply to a curve consisting of line segments if each line segment is treated as a separate primitive. For a curve consisting of cardinal curve segments, the patch exploits the fact that a cardinal curve segment is subdivided into piecewise line segments on the fly when searching for a ray-curve intersection. Each subdivided line segment is treated as a separate primitive, which allows a ray to intersect with (be occluded by) the same cardinal curve it departed from while prohibiting it from intersecting at the departure point itself. (It turns out that the ray-curve intersection point is later refined to a point on the real cardinal curve, which already gives enough offset. However, as the subdivision level is raised, the gap between the piecewise linear approximation and the real curve shrinks. On the other hand, the shadow occlusion check is still done against the linear approximation. Therefore, I conclude that excluding a tiny range of curve parameter space containing the line segment while searching for an intersection does good with little harm.)
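As a rough illustration of excluding the departing piece of a subdivided curve segment, the sketch below assumes the segment is parameterized by u in [0, 1] and split into equal pieces. Uniform subdivision and the names `piece_of`, `excluded_range`, and `piece_is_candidate` are assumptions made for the example, not the patch's actual code.

```cpp
#include <algorithm>
#include <cassert>

struct ParamRange {
  float u0;
  float u1;
};

// Index of the linear piece that contains parameter u, for a curve segment
// split into `pieces` equal parts (uniform subdivision is assumed here).
inline int piece_of(float u, int pieces)
{
  assert(pieces > 0);
  const int k = static_cast<int>(u * static_cast<float>(pieces));
  return std::min(std::max(k, 0), pieces - 1);
}

// Parameter range skipped when a ray departs from the curve at u_departure.
inline ParamRange excluded_range(float u_departure, int pieces)
{
  const int k = piece_of(u_departure, pieces);
  const float inv = 1.0f / static_cast<float>(pieces);
  return {k * inv, (k + 1) * inv};
}

// Every piece except the departing one remains a valid intersection candidate,
// so the ray can still be occluded by the rest of the same curve.
inline bool piece_is_candidate(int piece, float u_departure, int pieces)
{
  return piece != piece_of(u_departure, pieces);
}
```

With 8 pieces, a departure at u = 0.3 excludes only the range [0.25, 0.375]. Doubling the number of pieces halves the excluded range.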

What if a ray hits the boundary between two primitives? This patch does nothing for that case, because the probability that a ray falls within the range of numerical error between two adjacent primitives should be very low and would not noticeably contribute to the render result. If that is not the case, the tessellation itself is problematic: the primitives are either several orders of magnitude larger than the current view frustum, or so small that their sizes are comparable to floating point precision errors.

Skipping only the departing primitive breaks down when another primitive overlaps it: the ray bounces unpredictably back and forth between the two primitives, which results in visual artifacts. This patch implements a novel method for coping with the problem. A tight ray start time (tstart) is estimated for each ray before BVH traversal to cull overlapping meshes and remove the artifacts, using the same math as Cycles' internal BVH traversal/ray intersection routines. Custom implementations (and perhaps extra APIs) are required to make this estimation of the ray start time work with external ray tracing libraries such as NVIDIA OptiX and Intel Embree. Even without a ray start time estimate, however, self-intersections are still prevented by the rules above.
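The description does not spell out how tstart is computed, so the following is only one plausible reading, not the patch's algorithm. It assumes the departing primitive is a triangle, estimates tstart as the distance at which the ray crosses that triangle's plane (clamped to zero), and culls any hit at or before it. All names (`Vec3`, `plane_distance`, `estimate_tstart`, `accept_hit`) are hypothetical, and a plain plane equation stands in for Cycles' own intersection math.

```cpp
#include <cassert>

struct Vec3 {
  float x, y, z;
};

inline Vec3 sub(const Vec3 &a, const Vec3 &b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
inline float dot(const Vec3 &a, const Vec3 &b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
inline Vec3 cross(const Vec3 &a, const Vec3 &b)
{
  return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}

// Distance t at which the ray P + t*D meets the plane of triangle (v0, v1, v2).
// Returns false when the ray is parallel to the plane.
inline bool plane_distance(const Vec3 &P, const Vec3 &D,
                           const Vec3 &v0, const Vec3 &v1, const Vec3 &v2, float &t)
{
  const Vec3 n = cross(sub(v1, v0), sub(v2, v0));
  const float denom = dot(n, D);
  if (denom == 0.0f) return false;
  t = dot(n, sub(v0, P)) / denom;
  return true;
}

// Assumed estimate: start the ray where it crosses the departing triangle's
// plane, but never before t = 0.
inline float estimate_tstart(const Vec3 &P, const Vec3 &D,
                             const Vec3 &v0, const Vec3 &v1, const Vec3 &v2)
{
  float t = 0.0f;
  if (!plane_distance(P, D, v0, v1, v2, t)) return 0.0f;
  return t > 0.0f ? t : 0.0f;
}

// Hits at or before tstart are culled, including coincident overlapping geometry.
inline bool accept_hit(float t_hit, float tstart)
{
  assert(tstart >= 0.0f);
  return t_hit > tstart;
}
```

In this reading, a ray origin that numerical error has placed slightly behind the departing triangle gets a small positive tstart, so a second triangle lying in the same plane is rejected instead of being hit at a near-zero distance.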

Comments on struct Ray, struct Intersection, and struct PathState in 'kernel_types.h' explain how this patch works. All the other modifications are just bookkeeping for applying the rules.

The following results demonstrate the effectiveness of this patch.

(Images: two Before / After render comparisons.)

@Brecht Van Lommel (brecht)'s test set of different scales and origins

blender file:

(Image table: Before / After renders at Size 1, Size 1e-3, Size 1e-5, Size 1e3, Size 1e5, Origin 1e3, and Origin 1e5.)

This patch is not intended as just a proof of concept. The following are benchmark results for the demo scenes:

Windows 10 64bit

GPU: NVIDIA GeForce RTX 2080 SUPER

CPU: Intel(R) Core(TM) i7-9700

'Cosmos Laundromat Demo'

| Device | Before | After |
| --- | --- | --- |
| OptiX (tiles 64x64) | 05:25.46 | 05:24.05 |
| CUDA (tiles 64x64) | 07:34.45 | 07:43.73 |
| CPU (tiles 32x32) | 38:09.92 | 38:26.68 |

'Agent 327 Barbershop'

| Device | Before | After |
| --- | --- | --- |
| CUDA (tiles 64x64) | 08:42.48 | 08:53.60 |
| CPU (tiles 16x16) | 24:42.43 | 24:01.31 |

'Spring'

| Device | Before | After |
| --- | --- | --- |
| CPU (tiles 16x16) | 25:03.65 | 25:01.07 |
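As a reading aid, the timings above can be turned into relative changes. The sketch assumes the values are render times in `mm:ss.cc` format; the helper names `to_seconds` and `relative_change_percent` are made up for this example.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Convert a "mm:ss.cc" timing string to seconds; returns -1.0 on a parse error.
inline double to_seconds(const std::string &s)
{
  int minutes = 0;
  double seconds = 0.0;
  if (std::sscanf(s.c_str(), "%d:%lf", &minutes, &seconds) != 2) return -1.0;
  return minutes * 60.0 + seconds;
}

// Relative change from `before` to `after` in percent (positive means slower).
inline double relative_change_percent(const std::string &before, const std::string &after)
{
  const double b = to_seconds(before);
  const double a = to_seconds(after);
  assert(b > 0.0 && a >= 0.0);
  return (a - b) / b * 100.0;
}
```

For the CUDA row of 'Cosmos Laundromat Demo', 07:34.45 to 07:43.73 is 454.45 s to 463.73 s, a change of about +2.0%.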

2019.12.15: All benchmark results are updated.