
Cycles: CUDA renderings are approximately 30% slower in 2.71-testbuild1
Closed, Resolved (Public)

Description

System Information
Windows 7, x64, gtx 580, nvidia drivers 332.21

Blender Version
Broken: 2.71-testbuild1
Worked: 2.70a

Short description of error
Cycles is approximately 30% slower when rendering in 2.71 compared to 2.70

I used the Mike Pan benchmark (modified; the version I used is attached), rendered from the command line with: blender -b BMW-MikePan.blend -o C:\tmp\ -f 1

2.70a rendered in 55 seconds
2.71 rendered in 73 seconds

This is consistent with real world scenes.

I am unable to pinpoint exactly where it happened, but it is somewhere between revision 25ad7b4 and revision cc439f6.
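For reference, the "approximately 30%" figure can be checked from the two timings reported above. This is only a minimal sketch of the arithmetic; the function name `slowdown_percent` is made up for illustration and is not part of Blender.

```python
def slowdown_percent(baseline_seconds: float, new_seconds: float) -> float:
    """Return how much longer the new render takes, as a percentage of the baseline."""
    if baseline_seconds <= 0:
        raise ValueError("baseline_seconds must be positive")
    return (new_seconds - baseline_seconds) / baseline_seconds * 100.0


# Timings reported above for the GTX 580: 2.70a took 55 s, 2.71-testbuild1 took 73 s.
print(f"{slowdown_percent(55, 73):.1f}% slower")
```

With 55 s and 73 s this comes out at roughly 33%, in line with the title of the report.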

Details

Type
Bug

Event Timeline

Sergey Sharybin (sergey) triaged this task as Normal priority. May 26 2014, 10:50 AM

Confirmed this slowdown

WIN 7 64Bit
GTX 650
BMW1M-MikePan-Scene
Tiles@256x256

blender-2.70-f93bc76-win64-vc12 (10.04.2014)
2min 7sec

blender-2.70-fce731a-win64 (26.05.2014)
2min 56sec

Brecht Van Lommel (brecht) raised the priority of this task from Normal to Confirmed, Medium. May 26 2014, 2:22 PM

I can confirm a similar slowdown on Windows with a 650M GT, but not on OS X. I'm not sure what is going on yet, and I don't have a setup on Windows where I can build with CUDA 6.0 at the moment.

I did try building with the CUDA 6.5 early access release and it seems to render a few percent faster than 2.70. So if we can't find a solution for 6.0, at least we can hope the next CUDA toolkit release solves it.

@Brecht Van Lommel (brecht). On Windows, for 2.71 I do not have a problem building sm20, 21, 30, 35 with CUDA 5.0 and sm50 with CUDA 6.0.

I think this will save us potentially a lot of reports.

Ok, found some further clues on OS X. The official testbuild also shows the slowdown, but my own cmake build does not, while a scons build does. So this may just be a cmake/scons difference, committed a fix in {b33d83bf51e8}.

Please test own builds or tomorrow's builder.blender.org builds to see if this is fixed now.
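For anyone comparing builds, one way to time the same headless render against several Blender binaries is sketched below. It uses only the `-b`, `-o` and `-f` options from the command line quoted in the report. The helper names and any paths passed in are placeholders for illustration, not part of Blender.

```python
import subprocess
import time
from typing import Dict, Sequence


def blender_render_command(blender_path: str, blend_file: str,
                           output_dir: str, frame: int = 1) -> list:
    """Build the headless render command used in the report: blender -b FILE -o DIR -f N."""
    return [blender_path, "-b", blend_file, "-o", output_dir, "-f", str(frame)]


def time_command(command: Sequence[str]) -> float:
    """Run a command to completion and return the wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(list(command), check=True)
    return time.perf_counter() - start


def compare_builds(builds: Dict[str, str], blend_file: str, output_dir: str) -> Dict[str, float]:
    """Time one render per build; `builds` maps a label to the path of a Blender executable."""
    return {
        label: time_command(blender_render_command(path, blend_file, output_dir))
        for label, path in builds.items()
    }
```

Calling `compare_builds` with a dictionary of labelled executable paths returns wall-clock seconds per build. This includes scene loading and kernel loading, so it will not match the render time Blender itself reports exactly.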

bmw.blend, 100 samples, Windows 7 x64, Geforce 540M (scons).

Blender 2.70a: 99s
b33d83b: 115s

That's still a 16% slowdown, but I think that's what we have to expect with CUDA Toolkit 6.0.

Windows 7 64bit BMW1M-MikePan.blend @ 200 samples, 256x256 tilesize
Commandline blender -b -o -f 1

With my old and fastest Blender Build 2.69.11 for sm_50 / GTX750ti (maxregcount 40)
00:50.98

With Blender 2.70 since __launch_bounds__
01:14.05

Blender 2.70.5 b33d83b
00:56.98

Blender 2.70.5 b33d83b with MAX_REGISTERS 40 for sm_50
00:52.70
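The four timings above are easier to compare as ratios against the fastest build. A small sketch follows; the `MM:SS.ff` parsing and the labels are written for this comment's numbers only.

```python
def parse_render_time(text: str) -> float:
    """Convert a 'MM:SS.ff' render time such as '01:14.05' into seconds."""
    minutes, seconds = text.split(":")
    return int(minutes) * 60 + float(seconds)


# Timings quoted above (GTX 750 Ti, BMW1M-MikePan.blend, 200 samples, 256x256 tiles).
timings = {
    "2.69.11, maxregcount 40": "00:50.98",
    "2.70 with launch bounds": "01:14.05",
    "2.70.5 b33d83b": "00:56.98",
    "2.70.5 b33d83b, MAX_REGISTERS 40": "00:52.70",
}

baseline = parse_render_time(timings["2.69.11, maxregcount 40"])
for label, text in timings.items():
    seconds = parse_render_time(text)
    print(f"{label}: {seconds:.2f} s ({seconds / baseline:.2f}x the baseline)")
```

By this measure b33d83b is about 12% slower than the old build at default settings, and about 3% slower with the register limit set to 40.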

I have tested it on a current project...

2.70a : 2:17.82
b33d83b : 2:21.92

I think this sort of slowdown is acceptable...

The only problem now is that there is excessive RAM usage: 2.70a was using 632mb of RAM compared to b33d83b using 1359mb. I will make another bug report when I find the cause of this RAM increase.

Thanks for quick patch!

@Carlo Andreacchio (candreacchio) Is this what blender reports as ram usage or what GPU-Z or some other util reports?
If it is the latter you might render a default cube in both releases and you might find a very much larger amount of ram even for simple scenes.

If this is the case, this ultimately comes from Cycles' complexity and the general properties of GPU architectures. Increasing the max registers will most likely drop this number, but it also carries a performance penalty in most cases.
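One commonly cited reason a higher register limit can cost performance is that fewer threads fit on each multiprocessor at once. The back-of-envelope sketch below illustrates only that effect. The defaults (65536 registers and 2048 resident threads per multiprocessor, warps of 32 threads) are assumed figures for an sm_50-class GPU and should be checked against NVIDIA's documentation. Register allocation granularity, shared memory and per-block limits are ignored, and this is not how Cycles or the CUDA toolkit computes anything.

```python
def max_resident_threads(registers_per_thread: int,
                         registers_per_sm: int = 65536,   # assumed hardware figure
                         max_threads_per_sm: int = 2048,  # assumed hardware figure
                         warp_size: int = 32) -> int:
    """Rough upper bound on threads resident per multiprocessor for a given register budget."""
    if registers_per_thread <= 0:
        raise ValueError("registers_per_thread must be positive")
    # Threads that fit if registers were the only limit.
    limited_by_registers = registers_per_sm // registers_per_thread
    # Threads are scheduled in whole warps, so round down to a multiple of warp_size.
    warps = min(limited_by_registers, max_threads_per_sm) // warp_size
    return warps * warp_size


for regs in (32, 40, 63):
    print(f"{regs} registers/thread -> at most {max_resident_threads(regs)} resident threads")
```

Under these assumptions, raising the limit from 40 to 63 registers per thread cuts the bound from 1632 to 1024 resident threads. Whether that hurts in practice depends on the kernel.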

@Brecht Van Lommel (brecht) I think a new bug report for this issue is best ?

@Martijn Berger (juicyfruit) Already created a new bugreport with a test case, tested via GPU-Z

@Carlo Andreacchio (candreacchio)
Ok, in that case I think we (you) should open a separate issue. This issue is indirectly related to a possible slowdown if we change the max-registers, but it will be a different slowdown than the one this report is about (a combination of new CUDA and a build system bug).

GTX660M BMW1M-MikePan.blend
before this patch 3:39:11
after this patch 2:58:02
official 270a 2:36:56

blender-2.70-f93bc76-win64-vc12 (10.04.2014)
2min 7sec

blender-2.70-06a05e4-win64 (27.05.2014)
2min 22sec

Marking this resolved now, this is about as good as I can do as I tried a lot of things to get this faster with CUDA 6 (using cmake). Now that scons gives the same results as cmake the slowdown matches the performance based on which we made the decision to switch to CUDA 6. I think CUDA 6.5 will help further but that's not something we can use for this release yet.

Hi, is it possible to download the latest version somewhere else?

Thanks, I read all the posts. I have to say I don't understand why the CUDA builds of all versions after 2.70a are so slow. I have a GeForce 750 Ti OC. My test shot needed only 1 min 41 sec with my old version, but even with the latest build it took more than 2 minutes. What happened? The speed improvement of my 750 compared to my old 550 Ti dropped by nearly one third, which is not good. If this is not fixed I will use two different versions, one for GPU and one for smoke renderings. Is it not possible to update the old, faster CUDA 5.0 kernel?

A lot of features were added since 2.70, mainly deformation motion blur, fire and smoke, but also baking. Additional code makes our CUDA kernel slower. And we switched to the CUDA Toolkit 6.0, which comes with a slowdown unfortunately.

There is nothing we can do about this, at least not for Blender 2.71. We also don't have 10 different GPUs at hand, so we cannot test with every card. As Brecht said, switching to the upcoming CUDA Toolkit 6.5 might improve the performance again, but for now we cannot fix this.

My kernel_sm_50.cubin is from Thursday, 20. March 2014, 12:30:25, filesize: 1.186 kb. It's a lot faster.

Yes, that was before all the big changes.

2.71 is the first release that officially supports sm_50 cards. So this is just the way it is for now. We can try to improve the performance later..

Thanks for the fast reply and for your work. You have done an amazing job with Cycles. But I have to say I will not switch to version 2.71 until this is fixed, because when I use it, it feels like I wasted my money on a new card and lost a lot of calculation speed.

But maybe there is a problem with the deformation motion blur. I don't use motion blur in my scene settings, yet if I turn off the deformation blur option it renders a bit faster, even though I turned motion blur off in the global settings. Maybe that is one of the problem areas. So thanks for your work, and hopefully this gets solved in later builds, maybe in 2.72.

I absolutely understand the problem, and we will try to solve this later.

But we are close to the 2.71 release and cannot do risky changes anymore. As said, the upcoming CUDA Toolkit 6.5 from nvidia will hopefully improve the performance again. But we have to wait until that gets released. :)

Thomas Dinges (dingto) added a subscriber: alex. Edited May 27 2014, 3:24 PM

@- (alexgerman123): Brecht committed an improvement, please check a new build from the Buildbot tomorrow (we only build once a day). https://developer.blender.org/rB55e4454db8edac58c3d64271d84263e5bb5e9c29

thanks i will do;)

Just tested it on a real world scene... lots of trees and foliage... Following results from a GTX 580

Blender 2.70a -- 2:30.40 rendertime -- 1351mb ram usage
Blender 2.71-testbuild2 -- 2:10.55 rendertime -- 1431mb ram usage

Speed is faster now thanks to the optimizations throughout this release.

Thanks guys!

thanks the Testbuild Two is acceptable.. Nice work !!! my test scene only took 2 sec more.. that is ok..;)

Just a quick note.
I am not a native speaker, but I think we should try to leave the acceptability to the people who have to accept the change in the end.
It might stem from a mistranslation or misalignment with English, but to me sentences like "this is totally unacceptable" sound somewhat aggressive, as if the person saying it is the person who has the power to accept or not accept a change in Blender.

Any and all regressions are bad, and as far as I know all developers hate performance regressions with a passion, but sometimes decisions need to be made that negatively impact performance. This is not fun and probably never will be. But I think in the end it will have to be up to the people doing the actual work to decide if it is "acceptable".

just my 2 cents. And sorry if it is off topic

Sorry for the off topic, but I have to say something. Sorry for my words, but from my point of view Blender will not break through into the industry with the tools inside Blender itself; it can only conquer the market with Cycles. The key is to be faster and of higher quality than other render tools. For a lot of people I know, and for myself, that is the only reason to make the change to Blender. So from my point of view it would not be good to drop the performance of some of the tools, because a true artist doesn't need a special new button for creating a very good new physics simulation and so on. Every studio works with different tools to get the best out of every piece of software. Blender's Cycles engine is the fastest VFX render engine I have known in my entire life, and I have been doing this for more than 17 years. I know exactly what I am talking about. The key is speed and quality inside Cycles. That is the most important point for a lot of people to make the switch...

And the Cycles development is awesome. Thanks for every single line of code from every developer, and special thanks to Brecht.

....

Sorry for my words, but that is my opinion. Seven artists I know who work in studios told me that Cycles is the only reason to change right now. Sorry to say it, but that is true, and Cycles with Blender is very good in that case.