Cycles: disable fast math flags, only use a subset.
Closed, Public

Authored by Brecht Van Lommel (brecht) on Sep 7 2017, 2:07 PM.

Details

Summary

See these pages for details on the meaning of the flags:
https://gcc.gnu.org/wiki/FloatingPointMath
https://msdn.microsoft.com/en-us/library/aa289157(vs.%2071).aspx

Empty BVH nodes are set to NaN, which must be preserved all the way to the
tnear <= tfar test so that the test evaluates to false for empty nodes. This
needs strict semantics and careful argument ordering for min() and max(), so
that the second argument is used if either of the arguments is NaN.
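
The ordering requirement can be illustrated with a scalar sketch. This is not the Cycles traversal code: the helper names, the one-axis slab test and the kEmptyBound constant are hypothetical, and the scalar min/max only mimic the "second argument wins on NaN" rule. It assumes the code is compiled without fast math flags, since otherwise the compiler may assume NaN never occurs.

```cpp
#include <limits>

// Hypothetical marker for an empty node bound.
const float kEmptyBound = std::numeric_limits<float>::quiet_NaN();

// Any comparison with NaN is false, so these return the SECOND
// argument whenever either argument is NaN.
inline float min_second_on_nan(float a, float b) { return (a < b) ? a : b; }
inline float max_second_on_nan(float a, float b) { return (a > b) ? a : b; }

// Simplified one-axis slab test. node_near/node_far are the node's entry
// and exit distances along the ray, [t0, t1] is the ray interval.
// The node value is passed second, so a NaN bound reaches the final test.
inline bool slab_hit(float node_near, float node_far, float t0, float t1)
{
    const float tnear = max_second_on_nan(t0, node_near);
    const float tfar = min_second_on_nan(t1, node_far);
    return tnear <= tfar;  // false if either side is NaN
}

// Same test with the arguments swapped: the NaN is replaced by the ray
// interval and an empty node is wrongly reported as hit.
inline bool slab_hit_swapped(float node_near, float node_far, float t0, float t1)
{
    const float tnear = max_second_on_nan(node_near, t0);
    const float tfar = min_second_on_nan(node_far, t1);
    return tnear <= tfar;
}
```

With the ray interval [0, 10], slab_hit(kEmptyBound, kEmptyBound, 0, 10) rejects the empty node, while slab_hit_swapped() with the same inputs accepts it.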

Fixes T52635: crash in BVH traversal with SSE4.1.

Diff Detail

Repository
rB Blender

Only tested with macOS + clang (Xcode 8.1) so far, where I found no measurable performance impact.

Linux + GCC 6.3 also shows no performance regression.

On Windows I'm getting a 6% slowdown with this patch. I can't find more fine-grained compiler flags to solve that.

We can risk continuing to use /fp:fast there, since MSVC is perhaps less aggressive in its optimizations and doesn't seem to give us any issues in BVH traversal. It would have been nice to avoid that though.

> On Windows I'm getting a 6% slowdown with this patch. I can't find more fine-grained compiler flags to solve that.

I'm not sure what kind of control you are looking for. You can change the behavior for certain parts of the code from the default with a construct like this in MSVC:

#pragma float_control(precise, on, push)  
// Code that uses /fp:precise mode  
#pragma float_control(pop)

I was looking for flags like -fno-signed-zeros and -fno-rounding-math. If these were available we could enable them selectively and figure out which ones have a significant impact on performance and are hopefully safe to use.

Figuring out which specific code gets slowed down by using fp:precise instead of fp:fast could be done with those pragmas, and then we could manually do the optimizations, but it's too much work.

Ah, in that case I think you're looking for _controlfp ( https://msdn.microsoft.com/en-us/library/e9b52ceh.aspx ). I have to admit that I knew it existed but have never used it.

I don't expect performance improvements from changing the float control state on modern CPUs. It used to be that NaN/Inf slowed things down, but I believe that's no longer the case.

What an option like -fno-rounding-math does is a bit different: it lets the compiler assume the default rounding mode is always used. Without that it can't always constant fold an expression like a * b, because the result would be different depending on the rounding mode which can change at runtime.
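
A small sketch of why the rounding mode matters for a product like a * b. The helper name is hypothetical, and the volatile variables are only there to force the multiplication to happen at runtime, between the two fesetround() calls, instead of being folded at compile time.

```cpp
#include <cfenv>

// Multiplies x * y under the given rounding mode (FE_UPWARD, FE_DOWNWARD,
// FE_TONEAREST, ...) and restores the previous mode afterwards.
inline float mul_rounded(int mode, float x, float y)
{
    volatile float a = x;  // volatile: operands are loaded at runtime
    volatile float b = y;
    const int old_mode = std::fegetround();
    std::fesetround(mode);
    volatile float r = a * b;  // volatile store pins the multiply here
    std::fesetround(old_mode);
    return r;
}
```

The exact product of 0.1f * 0.1f is not representable as a float, so mul_rounded(FE_UPWARD, 0.1f, 0.1f) and mul_rounded(FE_DOWNWARD, 0.1f, 0.1f) return neighboring but different values. A compiler that folds the expression at compile time has to pick one of them, which is only valid if it may assume the default rounding mode.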

I am all for getting rid of fast math; it has only caused issues lately.

Maybe we can leave MSVC as-is for now, and get rid of fast math after we see what exactly causes such a slowdown?

This revision is now accepted and ready to land. Sep 8 2017, 2:51 PM

I'll commit this keeping /fp:fast for MSVC.

This revision was automatically updated to reflect the committed changes.