Page MenuHome

Fix T43835, T54284: Cycles with no ray offsetting
Needs ReviewPublic

Authored by Ha Hyung-jin (robobeg) on Wed, Nov 13, 6:08 PM.

Details

Summary

Ray offsetting adopted in Cycles has been reported to cause various artifacts: T43835, T54284, and etc. These artifacts stand out when the scene is far from the origin or the scale of the scene is too large or too small compared to 1. In the case of instancing, the problem becomes worse because the ray offset, calculated for the world position, scale, and axis directions of the instanced object, is transformed into the object space during ray-object intersection.

There was an experiment D1212, which tried to address these problems by skipping the ray push and instead checking distance to triangle with epsilon during intersection. The result was not satisfactory.

This patch takes a different approach to tackle the problem. Instead of ray offset or epsilon, the following rules are enforced to the bvh traversal/intersection algorithms for preventing a ray from intersecting a surface that it has just departed from:

  1. A ray departing from a surface primitive does not intersect with the primitive.
  2. A shadow ray connecting a surface primitive and a light primitive does not intersect with both the primitives.

These rules are evident if the primitives are triangles. The rules are also applicable to a curve consisting of line segments if each line segment is treated as a separate primitive. In the case of a curve consisting of cardinal curve segments, the fact is utilized that a cardinal curve segment is subdivided into a piecewise line segment on the fly for finding ray-curve intersection. The subdivided line segment is treated as a separate primitive, which allows a ray to intersect with(be occluded by) the same cardinal curve it departed while prohibiting it from intersecting to the same departure point. (It turns out that the ray-curve intersection point is refined later to that on a real cardinal curve, which already gives enough offset. However, as subdivision level is raised, a gap between a piecewise linear approx. and a real curve becomes reduced. On the other hand, shadow occlusion check is still done against the linear approx. Therefore, I conclude that excluding a tiny curve parameter space including the line segment while searching intersection would do good with little harm.)

What if a ray hits a boundary of two primitives? For that case, this patch does nothing because the probability that a ray falls on the range of numerical errors between two adjacent primitives should be very very low and would not contribute to the render result. Otherwise, the tessellation itself is problematic: the primitives are either several orders of magnitude larger than the current view frustum or too small that their sizes are already comparable to floating point precision errors.

A ray skipping a departing primitive alone suffers when there is another primitive overlapping it, which causes unpredictable back and forth of ray between two primitives and results in unobtrusive visual artifacts. This patch implements a novel method for coping with the problem. A tight ray start time(tstart) is estimated for each ray before bvh traversal to cull overlapped meshes and remove artifacts, with the same math used by cycles internal bvh traversal/ray intersection routines. Custom implementations ( and perhaps extra APIs ) are required to make this (estimation of ray start time) work for external path tracing libraries such as NVIDIA Optix and Intel Embree. Without a ray start time estimation, however, self-intersections are still prevented by the rules above.

Comments on struct Ray, struct Intersection, and struct PathState in 'kernel_types.h' explain how this patch works. All the other modifications are just a bookkeeping for applying the rules.

The following results will demonstrate the effectiveness of this patch.

T43835

BeforeAfter

T54284

BeforeAfter

@Brecht Van Lommel (brecht) 's test set of different scales and origins
blender file:

BeforeAfter
Size 1
Size 1e-3
Size 1e-5
Size 1e3
Size 1e5
Origin 1e3
Origin 1e5

This patch is not intended just as a proof of concept. The following are benchmark results of the demo scenes:

Windows 10 64bit
GPU: NVIDIA GeForce RTX 2080 SUPER
CPU: Intel(R) Core(TM) i7-9700

'Cosmos Laundromat Demo'

BeforeAfter
Image
Optix (tiles 64x64)05:25.4605:34.35
CUDA (tiles 64x64)07:34.4507:43.73
CPU (tiles 32x32)38:22.5038:47.47

'Agent 327 Barbershop'

BeforeAfter
Image
CUDA (tiles 64x64)08:42.4808:49.92

'Spring'

BeforeAfter
Image
CPU (tiles 16x16)25:19.9025:21.64

Diff Detail

Repository
rB Blender

Event Timeline

Ha Hyung-jin (robobeg) edited the summary of this revision. (Show Details)Wed, Nov 13, 6:12 PM
Ha Hyung-jin (robobeg) retitled this revision from Fix T43835, T54284: Cycles with no ray offsetting and epsilon to Fix T43835, T54284: Cycles with no ray offsetting and epsilon testing.Wed, Nov 13, 6:35 PM
Ha Hyung-jin (robobeg) edited the summary of this revision. (Show Details)Wed, Nov 13, 7:03 PM
Ha Hyung-jin (robobeg) edited the summary of this revision. (Show Details)Wed, Nov 13, 7:16 PM
Ha Hyung-jin (robobeg) edited the summary of this revision. (Show Details)
Ha Hyung-jin (robobeg) edited the summary of this revision. (Show Details)Wed, Nov 13, 7:20 PM
Ha Hyung-jin (robobeg) edited the summary of this revision. (Show Details)Wed, Nov 13, 7:41 PM
Ha Hyung-jin (robobeg) edited the summary of this revision. (Show Details)Thu, Nov 14, 1:47 AM
Ha Hyung-jin (robobeg) edited the summary of this revision. (Show Details)Thu, Nov 14, 4:16 AM
Sergey Sharybin (sergey) requested changes to this revision.Thu, Nov 14, 11:19 AM

That is exciting to see movement on this front!

Unfortunately, with this change there are some real regressions caught by out regression suit. Just some of them: visibility_particles, ambient_occlusion_only_local, point density_vertices_world, volume_overlap, volume_scatter_albedo.

This revision now requires changes to proceed.Thu, Nov 14, 11:19 AM

That is exciting to see movement on this front!
Unfortunately, with this change there are some real regressions caught by out regression suit. Just some of them: visibility_particles, ambient_occlusion_only_local, point density_vertices_world, volume_overlap, volume_scatter_albedo.

Hi. What do yo mean by the regression suit and how do i check them?

Ok. I found the related page. Thank.
Oops, I found some bugs related to local intersection immediately. Thanks a lot.

@Ha Hyung-jin (robobeg), i am confused a bit. Do you still need help figuring out the regression tests?

There is a page: https://wiki.blender.org/wiki/Tools/Tests/Setup

Can also share HTML report which could give some clues about what's going wrong.

8>>! In D6250#144466, @Sergey Sharybin (sergey) wrote:

That is exciting to see movement on this front!
Unfortunately, with this change there are some real regressions caught by out regression suit. Just some of them: visibility_particles, ambient_occlusion_only_local, point density_vertices_world, volume_overlap, volume_scatter_albedo.

@Ha Hyung-jin (robobeg), i am confused a bit. Do you still need help figuring out the regression tests?
There is a page: https://wiki.blender.org/wiki/Tools/Tests/Setup
Can also share HTML report which could give some clues about what's going wrong.

Thanks but I already ran the tests and reviewed the results. Some bugs are fixed but there still seem lots to do. I just feel like being out of date since I've worked with old blender versions.

Many bugs are fixed.

Remaining tests to pass:
( * : noticeable differences )

  • hair geom reflection​
  • hair geom transmission​
  • hair reflection​
  • visibility particles​

hair basemesh intercept​
hair instancer uv​
hair particle random​
principled hair absorptioncoefficient​
principled hair directcoloring​
principled hair melaninconcentration​
T41143

Remaining tests to pass, with SIMULATE_RAY_OFFSET defined in 'kernel_types.h'

  • hair geom reflection​
  • hair geom transmission​
  • hair reflection​

The other test differences seem to be most likely due to 'ray offsetting' which causes various artifacts in the repository version.

Ha Hyung-jin (robobeg) edited the summary of this revision. (Show Details)Fri, Nov 15, 5:03 PM
Ha Hyung-jin (robobeg) updated this revision to Diff 19682.EditedSun, Nov 17, 5:55 PM

Now I claim that this patch passed all the cycles regression tests.

The patched version with SIMULATE_RAY_OFFSET defined in 'kernel_types.h' passed all the cycles regression tests.

The patched version without SIMULATE_RAY_OFFSET failed the following tests:

test group 1, not using hair BSDF

  • hair basemesh intercept
  • hair instancer uv
  • hair particle random
  • visibility particles
  • T41143

test group 2, using hair BSDF

  • hair geom reflection
  • hair geom transmission
  • hair reflection
  • principled hair absorption coefficient
  • principled hair direct coloring
  • principled hair melanin concentration

In the case of test group 1, the only difference between with and without SIMULATE_RAY_OFFSET is that with SIMULATE_RAY_OFFSET, 'extra' ray offsets are given on top of the basic provision of this patch. Therefore, it can be said the test results only show numerical differences. I also found that test sets for 'hair instancer uv', 'hair particle random', and 'visibility particles' suffered from overlapped particles(meshes), which cause the following differences.

without SIMULATE_RAY_OFFSETwith SIMULATE_RAY_OFFSET

Although ray offsetting has a side effect of covering overlapped meshes, it should not be required that overlapped meshes be rendered 'normally'. The test sets should be revised.

The situation of test group 2 is more complicated. With hair BSDFs in cycles, both ray offsetting and allowing self-intersection are required to produce the same render results, i.e. a ray may be shot backward to make it self-intersect. So with SIMULATE_RAY_OFFSET, the logic for avoiding self-intersection is also turned off for surfaces with hair BSDFs. Let us see 'hair geom reflection' as an example:

without SIMULATE_RAY_OFFSETwith SIMULATE_RAY_OFFSET

The latter is slightly brighter because 'self-intersection' is allowed, but other than that I cannot see any difference in visual quality. Moreover, it may not sustain the hair effect if the object is translated/rotated/scaled/instanced, for the same reason that the ray offsetting fails in the current repository version.

Ha Hyung-jin (robobeg) edited the summary of this revision. (Show Details)Sun, Nov 17, 6:21 PM

Logic for preventing self-intersection within a cardinal curve is revised to be consistent with logic for other primitives.
Epsilon check is introduced between adjacent line segments to make the algorithm more numerically robust. Since this epsilon is defined in the curve parameter space [0,1], it will not cause any harm unlike those defined in the world space or the object space.

Some inlined comments which I've potted while compiling this patch.

Now, for the hair_instancer_uv.blend test. Don't think the issue is only due to intersecting geometry. Here is a render result of 2K image with 1K samples.

masterD6250

To me this looks like a real render artifacts.

As for hair shader on triangle geoemtry the shading difference might actually be expected. This case is somewhat stretching the shader above what it was supposed to be used for.

Did you compare performance (aka render time) before and after applying this patch?

intern/cycles/kernel/kernel_emission.h
218
#else  // __SIMULATE_RAY_OFFSET__
222
#endif  // __SIMULATE_RAY_OFFSET__
intern/cycles/kernel/kernel_light.h
539

Suggest submitting such semantic fixes as a separate patch.
Will be faster to review and apply, and reduce size of this patch.

intern/cycles/kernel/kernel_path_surface.h
396
#endif  // __SIMULATE_RAY_OFFSET__
intern/cycles/kernel/kernel_types.h
697

Shouldn't this be float?

Thanks for spotting my last-minute error! I had to re-render benchmark scenes. Here are the results:

'The Spring'

masterD6250
Image
CPU25:19:9025:05:07

After changing the rule on cardinal curves, the hair shade of D6250 is slightly changed that two results become nearly indistinguishable.

'Barbershop'

masterD6250
CUDA08:42:4808:55:52

Render times are almost the same.

About 'hair_instancer_uv', particles are indeed overlapped. 1000 particles are set to emit from faces of a base sphere, which is polygonized at a low level. In 'hair_particle_random', you can even watch spheres with different colors are overlapped. The 'emit from' property of the current particle system should be revised, I believe.

And that is one weakness of this patch. Unlike ray offsetting, overlapped meshes are rendered really really ugly.
So one of my idea is to provide 'bouncing ray offsetting' as a checkable property of an individual object for quick remedy for mesh overlapping, how about that? As I showed, this patch's method and ray offsetting coexist well and that may help people to be adapted between two methods. ( The exception codes for 'hair bsdf' and SIMULATE_RAY_OFFSET macro will be removed, of course. Those were just for proving my argument. )

Ha Hyung-jin (robobeg) updated this revision to Diff 19766.EditedThu, Nov 21, 8:41 PM
Ha Hyung-jin (robobeg) edited the summary of this revision. (Show Details)

Cycles object property 'offset_bouncing_ray' and light property 'offset_emitting_ray' are added. Their default values are 'False' but if given 'True' for older blender files after the blender minor version number being raised, all previous versions of blender files would be rendered as before with the mesh overlap issue as less noticed as before. ( Self-intersection avoidance is guaranteed now with this patch, but brightness of hair geom test results will become darken a little. ) I believe this is how blender has evolved. The related UI change is not included, and it seems better for blender staffs to consider the UI layout themselves.

Mesh overlap occurs when areas of triangle primitives overlap so that the ray-triangle intersection algorithm cannot determine which one is nearer at each point on the entire overlapped area. This is a path-tracing version of z-fighting, happens only when meshes are placed with near-identical postures, and the render result is not defined over the overlapped area. While ray offsetting is relatively in luck because a ray usually happens to escape from the overlapped area by offsetting, this patch's method (called 'primitive id checking' in the literature) suffers from it directly because a ray does not leave the overlapped area although it skips the departing primitive.

If this patch is accepted, systems in blender that are guaranteed to generate mesh overlapping should be revised, I think. But we buy some time to do that with 'ray offsetting' still in one hand. If this patch is accepted, it would be much easier for me to contribute because my some other works are based on this.

Having such per-object ray offsetting options is something which will turn out unusable in a real production.

A bit stretched example, but think of the hair_instancer_uv file. You'd to have it rendered correctly in the proximity of world origin you need some ray offsetting going on. To render it in a scene which it's animated (via armature deform, just to make things more interesting) far away from origin you'd need to disable ray offsetting (to address the issue this patch is solving). But then you trade one artifacts with another.

In practice, when character consists of many many objects you can not possibly keep all those settings fine-tuned in a per-scene basis.

systems in blender that are guaranteed to generate mesh overlapping should be revised

I am not aware of such systems. Blender doesn't restrict from having objects intersections. It's almost unaviodable in a scene set construction.

I would say that the solution to the original problem should be helping artists to make quality renders with less thinks to worry about and less knobs to be tweaked.
There could be a tweak to this patch. Or could be completely separate approach: normalizing scene internally in Cycles in a way which brings area of interest to the world's origin.

systems in blender that are guaranteed to generate mesh overlapping should be revised

I am not aware of such systems. Blender doesn't restrict from having objects intersections. It's almost unaviodable in a scene set construction.

Generally, overlapping faces already have artifacts in Cycles (and Eevee and the viewport), since we can't reliably determine which is closest. Artist normally avoid them and mesh tools and modifiers don't typically generate them.

So in that sense I think it's a good trade-off to have that case fail here as well. Other types of intersections without overlapping faces are common in production, but not affected by this I think.

I would say that the solution to the original problem should be helping artists to make quality renders with less thinks to worry about and less knobs to be tweaked.

Agree that we should avoid adding a setting here if we can.

There could be a tweak to this patch. Or could be completely separate approach: normalizing scene internally in Cycles in a way which brings area of interest to the world's origin.

If this is automatically determined it would not be temporally stable in animation. And arguably, if there are precision problems then they also exists in other areas of Blender or in the shading node evaluation which would still work in world space. So I'm not sure if there is a good automatic solution for that problem (besides double precision).

Recently I encountered an application which requires not only to render a scene filled with instanced objects correctly with arbitrary scaling but also to deal with objects of overlapped meshes without noticeable artifacts. For that application, my solution was to use this patch's method ( 'primitive id checking' ) with providing 'ray offsetting' as an additional option. I believe that we should put first priority on rendering a normal scene without glitches, but I admit that I am also distracted by mesh overlapping artifacts.

But I think that I finally figured out how to have the advantage of this patch's method while easing most mesh overlapping artifacts that artists would face with in blender. I will report the result soon.

intern/cycles/kernel/kernel_light.h
539

Those fixes were for checking if I handled all the cases where object == OBJECT_NONE indicates it is not instanced, i.e., its transform is pre-applied, not a non-object.

Should I submit those as a separate patch?

intern/cycles/kernel/kernel_types.h
697

Yes! Thanks for finding that.

Ha Hyung-jin (robobeg) updated this revision to Diff 19955.EditedThu, Nov 28, 10:13 PM
Ha Hyung-jin (robobeg) retitled this revision from Fix T43835, T54284: Cycles with no ray offsetting and epsilon testing to Fix T43835, T54284: Cycles with no ray offsetting.

Artifacts caused by mesh overlapping are resolved.

Regression tests results: ( Those hair tests still failed because of numerical differences between with and without ray offsetting. )

hair_instancer_uv, 2K image with 1K samples:

Merit of no ray offsetting is still strong: ( Artifacts in line curves of Size1e-5 worsen since the update adding an epsilon test on curve segment boundaries, before this patch revision. )

T43835T54284Origin1e5Size1e5Size1e-5

A tight ray start time(tstart) is estimated for each ray before bvh traversal to cull overlapped meshes and remove artifacts, with the same math used by cycles internal bvh traversal/ray intersection routines. Custom implementations ( and perhaps extra APIs ) are required to estimate a tight ray start time for external path tracing libraries such as NVIDIA Optix and Intel Embree.

Render time is increased slightly. ( about 30s for CPU rendering of 'The Spring' ) Benchmark results will be updated soon.

Ha Hyung-jin (robobeg) edited the summary of this revision. (Show Details)Fri, Nov 29, 1:01 AM
Ha Hyung-jin (robobeg) edited the summary of this revision. (Show Details)Fri, Nov 29, 1:04 AM
Ha Hyung-jin (robobeg) edited the summary of this revision. (Show Details)
Ha Hyung-jin (robobeg) edited the summary of this revision. (Show Details)Fri, Nov 29, 1:12 AM
Ha Hyung-jin (robobeg) edited the summary of this revision. (Show Details)Fri, Nov 29, 1:29 PM

Commented with some nitpicks on the OptiX side of things. Would prefer not to consume more payload registers (those come at a cost and accessing pointers through them especially), but it may not be possible to avoid that.

It seems like it is not necessary to store all of src_prim, src_object and src_type in the ray though. The primitive index (src_prim_index) should uniquely identify any primitive, so just storing and comparing that should be enough? I also don't believe it would hurt much to just look up src_type in place from __prim_type where necessary (since it's only really used for curves), but haven't benchmarked. Doing so could make it possible to avoid passing in the ray pointer through the payload and instead just pass through a primitive index that should be avoided during intersection.

intern/cycles/kernel/kernel_types.h
1024

OBJECT_NONE

intern/cycles/kernel/kernels/optix/kernel_optix.cu
262

This anyhit program is never executed for anything but triangles (curves are added to the AS with the OPTIX_GEOMETRY_FLAG_DISABLE_ANYHIT and all traces that force an anyhit use a different anyhit program).

308

This is not necessary: The intersection program can never be called for a primitive that is not a curve, since that is the only existing custom primitive type and triangles by definition cannot go through this.

321

Some spaces sneaked in here and below =P

Thank @Patrick Mours (pmoursnv) for your advice. I have practically zero knowledge on OptiX.

On the cycles bvh side, src_object is needed for an instanced object and src_prim_index alone does not identify it. Although src_prim can be looked up with src_prim_index, different prim_indices map to a same prim, which is coupled with object id to identify a primitive. Without storing src_prim, table lookup will be repeated to find it later. Also src_type now stores a flag at its highest bit, which indicates whether the object is pre-transformed/instanced or not. This information is needed later to calculate a tight ray start time for a new ray direction for culling overlapping meshes. Although all these info may be found later except src_object, my intent was to reduce repeated table lookup and recycle already found info on the cycles side.

Now, I have a question. Does OptiX currently provide APIs for implementing ray_update_tstart()? The calulated ray start time ( or epsilon ) by ray_update_tstart() must reflect the math inside OptiX bvh traversal algorithms accurately to cull overlapping meshes and offset the ray from the departing primitive consistently due to 'floating point determinism'.

It looked like ray_update_tstart was only called in places where the ray was already set up, so the type data was available and could e.g. be passed in as another argument to the function. But really I was only searching for ways to avoid additional payload registers.

And you are right, I was looking at it from the Cycles OptiX implementation and there the primitive index is a one-to-one mapping, but that is not the case for Cycles own BVH. It may be possible to take advantage of this for OptiX though. I can experiment a bit to see if there would be any advantage.

As for a triangle_update_tstart implementation for OptiX: OptiX has no such API currently. And it is made more complicated by the fact that intersection precision may vary depending on which GPU generation is used and how the AS was build (e.g. a motion blur AS can behave differently from a non-motion one). I also don't think it would be easy to provide such an API because of how the hardware works, but I'll raise this with the team here to check.

Implementation of triangle_update_tstart() in OptiX may be easier than it looks. Currently object inv transform matrices needed are all supplied from outside, not calculated inside the function. (The calculation code inside is for contingency and also serves as specification.)

On the other hand, since a ray has all information to calculate triangle_update_tstart() now, it may also be possible to calculate it inside OptiX bvh traversal modules, not exposing the value outside.

External path tracing libraries now skip triangle_update_tstart() because the function does not work with them at all.
Some OptiX code is modified not to check unnecessary conditions. Optimization for OptiX is still left to experts.

Ha Hyung-jin (robobeg) updated this revision to Diff 19995.EditedSun, Dec 1, 3:34 PM
Ha Hyung-jin (robobeg) marked an inline comment as done.

For an AVX2-optimized CPU kernel, bvh traversal routines are set to use triangle_intersect() temporarily instead of triangle_intersect8() for rays with tstart updated.

Numerical difference between float calculation with and without AVX2 is somehow expected and a proper solution would be to implement an AVX2 version of triangle_update_tstart() for primitives supposed to be handed over to triangle_intersect8(). The problem is that among obvh intersection routines, some use triangle_intersect8() while others still use triangle_intersect().

Ha Hyung-jin (robobeg) updated this revision to Diff 20023.EditedMon, Dec 2, 8:44 PM

In addition to triangle_update_tstart(), motion_triangle_update_tstart() is implemented to calculate a tight ray start time for motion triangle primitives.
I have completely forgotten triangle vertex motion.

Ha Hyung-jin (robobeg) updated this revision to Diff 20055.EditedTue, Dec 3, 10:51 PM

For external path tracing libraries not supporting a feature for estimation of a tight ray start time at the moment, ray_offset() is restored as a fallback. However, self-intersection is still guaranteed to be prevented by this patch for them. Only functions ray_offset() and ray_update_tstart() need to be modified when those libraries are ready to support the feature.

Now this patch has achieved: prevention of self-intersection of ray-triangle, no glitches caused by ray offsetting, and reduced artifacts of overlapping meshes on a par with ray offsetting. From this patch, artists will get only benefits with no extra burdens.

For AVX2-optimized CPU kernels, bvh traversal routines are set to use triangle_intersect() temporarily instead of triangle_intersect8() for rays with a tight ray start time estimated, because two functions return slightly different results while bvh routines use both functions intermixed. One quick solution is to choose a greater value after estimating both tight ray start times with/without AVX2 optimzation but it increases unnecessary overhead. Besides, I also found some negative zero issues in triangle_intersect8(). I plan to deal with these as a separate patch.

Meanwhile, the patched version now outperforms the master version with blender benchmark scenes even with AVX2-optimized CPU kernels. Benchmark results will be updated soon.