
Blender 2.81/2.82 - OpenCL - Significantly slower render times
Closed, Resolved | Public | Bug


Description

System Information
Operating system: Windows 10 - AMD drivers 19.10.1
Graphics card: RX Vega 64

Blender Version
Broken: 2.81 - eefd806afc15 & 2.82 - 23e1fb365b65
Worked: 2.80 - official build

Short description of error
Cycles OpenCL: rendering the default BMW scene from Blender's benchmark section in the latest 2.81/2.82 experimental builds is significantly slower than in 2.80.

Exact steps for others to reproduce the error
Car Demo scene from: https://www.blender.org/download/demo-files/
Render time by tile size:
Blender  256    128    64     32     Build
2.80     1m28s  1m30s  1m39s  1m56s  official
2.81     2m34s  2m34s  2m43s  3m20s  eefd806afc15
2.82     2m35s  2m34s  2m43s  3m25s  23e1fb365b65
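To put the regression in relative terms, the timings can be turned into slowdown factors against 2.80. This is only a sketch: the helper names are made up for illustration, the durations are written in the 1m28s form, and the four numeric columns are assumed to be tile sizes.

```python
import re

def to_seconds(text):
    """Convert a duration such as '1m28s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+)s", text)
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + int(seconds)

# Timings copied from the report; columns assumed to be tile sizes.
tile_sizes = [256, 128, 64, 32]
timings = {
    "2.80": ["1m28s", "1m30s", "1m39s", "1m56s"],
    "2.81": ["2m34s", "2m34s", "2m43s", "3m20s"],
    "2.82": ["2m35s", "2m34s", "2m43s", "3m25s"],
}

def slowdown(version, baseline="2.80"):
    """Return {tile_size: ratio} of render time relative to the baseline."""
    return {
        tile: to_seconds(t) / to_seconds(b)
        for tile, t, b in zip(tile_sizes, timings[version], timings[baseline])
    }

for version in ("2.81", "2.82"):
    ratios = slowdown(version)
    print(version, {tile: f"{ratio:.2f}x" for tile, ratio in ratios.items()})
```

With 256 tiles, 2.81 takes 154 s against 88 s for 2.80, a factor of 1.75.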

More details and tests by others reporting a similar issue are in this Blender Artists thread:
https://blenderartists.org/t/very-slow-gpu-rendering-in-cycles-opencl-blender-2-81/1186239

Event Timeline

Yes, I have noticed this also. The reason is that a lot of nodes/complexity (4D noise) were added to Cycles. The AMD compiler doesn't do an excellent job at reusing registers, so less work can be done in parallel, which slows down rendering dramatically. This needs further investigation.
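As a rough illustration of why register usage limits parallelism, here is a simplified occupancy model. It is a sketch under stated assumptions, not a measurement: the limits (256 vector registers per thread, at most 10 wavefronts per SIMD) are the commonly cited figures for GCN-class hardware, allocation granularity is ignored, and the register counts in the example are hypothetical, not taken from the Cycles kernel.

```python
def waves_per_simd(vgprs_per_thread, vgpr_budget=256, max_waves=10):
    """Estimate how many wavefronts fit on one SIMD at once.

    vgpr_budget and max_waves are assumed GCN-style limits; real
    hardware also rounds register allocations up to a granularity,
    which this simplified model ignores.
    """
    if vgprs_per_thread <= 0:
        raise ValueError("register count must be positive")
    # A kernel needing more than the budget would spill; treat it as one wave.
    return max(1, min(max_waves, vgpr_budget // vgprs_per_thread))

# Hypothetical register counts, for illustration only.
for vgprs in (24, 64, 128):
    print(f"{vgprs:3d} VGPRs per thread -> {waves_per_simd(vgprs)} waves per SIMD")
```

In this model, a kernel that grows from 64 to 128 registers per thread halves the number of wavefronts available to hide memory latency, even if the extra code paths are never executed for a given scene.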

Brecht Van Lommel (brecht) lowered the priority of this task from 90 to High. Nov 11 2019, 4:01 PM

After benchmarking the BMW scene on 23564583a4988778b4c43496fd21818b286f6ba1 and 45d4c925799e94c6d442a9a9066af2d3305724e1, which are the commits before and after the new noise functions were added, there isn't any regression.

Since the Noise node is the only node in the BMW scene that we touched during GSoC, the regression must be due to the increased size of the kernel, as Brecht described before, perhaps because of the Voronoi node.

I will bisect the code and try to find the source of the regression.
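For readers unfamiliar with bisecting, the idea is a binary search over the commit history, which is what `git bisect` automates. A minimal sketch, assuming a linear list of commits ordered oldest to newest, a known-good first entry, a known-bad last entry, and a hypothetical `is_bad` check (for example, "BMW render time exceeds a threshold"):

```python
def first_bad(commits, is_bad):
    """Return the first commit for which is_bad() is true.

    Assumes commits[0] is good, commits[-1] is bad, and that every
    commit after the first bad one is also bad.
    """
    lo, hi = 0, len(commits) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi]

# Hypothetical commit names and render times in seconds, for illustration.
render_seconds = {
    "c0": 88, "c1": 89, "c2": 88, "c3": 90,
    "c4": 89, "c5": 154, "c6": 155, "c7": 154,
}
commits = list(render_seconds)
culprit = first_bad(commits, lambda c: render_seconds[c] > 120)
print(culprit)
```

Each step halves the remaining range, so even a few hundred commits need fewer than ten test renders.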

Just confirmed that the new Voronoi is the source of the regression. What can we do about this?

@Omar Emara (OmarSquircleArt) first check whether commenting out the case 4: in svm_node_tex_voronoi speeds it up. If it does, we could put this case behind a special NODE_FEATURES construction.

My experience is that small changes can sometimes have a huge impact on OpenCL render times. It depends not on what is used, but on what is compiled.

Last week we did several performance tests, and it seems that the Vega architecture is really falling behind. The Polaris architecture seems to hold up better. I don't have any data about Navi at the moment.

The following results are for the BMW scene on an AMD RX480:

  • 2.80 (3:10)
  • 2.81 (5:48)
  • 2.81 excluding 4d voronoi (4:05)
  • 2.81 excluding 4d voronoi+mus+noise (4:05)
  • 2.81 excluding 4d voronoi+3d smooth (3:52)
  • 2.81 excluding 4d voronoi+3d (3:52)
  • 2.81 excluding 4d voronoi+3d smooth + 2d smooth (3:50)
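To make the list above easier to compare, the share of the regression that each exclusion wins back can be computed from the m:ss values. This is a back-of-the-envelope sketch; the function names are made up, and only the timings come from the list.

```python
def mmss(text):
    """Convert an 'm:ss' duration such as '3:10' to seconds."""
    minutes, seconds = text.split(":")
    return int(minutes) * 60 + int(seconds)

baseline = mmss("3:10")   # 2.80
regressed = mmss("5:48")  # 2.81

# Timings copied from the RX480 list above.
variants = {
    "excluding 4d voronoi": "4:05",
    "excluding 4d voronoi+3d smooth": "3:52",
    "excluding 4d voronoi+3d smooth + 2d smooth": "3:50",
}

def recovered(seconds):
    """Fraction of the 2.80 -> 2.81 regression removed by a variant."""
    return (regressed - seconds) / (regressed - baseline)

for name, time_text in variants.items():
    print(f"{name}: {recovered(mmss(time_text)):.0%} of the regression recovered")
```

Excluding only the 4D Voronoi case already accounts for roughly two thirds of the slowdown; the further exclusions add comparatively little.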

Tagging with 2.81 since this fix really should go in the release.

Brecht Van Lommel (brecht) removed Jeroen Bakker (jbakker) as the assignee of this task. Edited Nov 13 2019, 3:42 PM
Brecht Van Lommel (brecht) lowered the priority of this task from High to 50.

I think this is sufficiently resolved now for the 2.81 release.

During the 2.82 cycle we can look at getting back a bit more of the performance since it's still not fully the same.

@Jeroen Bakker (jbakker), not sure if you wanted to be assigned this follow up task, up to you.

Just wanted to say thanks for the work you are doing. I can confirm that render time is much better on my setup as well.

2.80 - 1m28s
2.81 - 2m34s
Nov14 builds
2.81 - 1m46s

I would like to confirm that the issue is still present on official 2.81.

Windows 10
GPU: Radeon VII (19.11.3).
CPU: Ryzen 3800X

I've tested rendering using the Classroom scene (256x256 tiles):
2.80 (official) 2m40s
2.81 (official) 3m20s

Thank you so much for this task and making this a priority! Maybe my new GPU won't have been a mistake after all <3

Jeroen Bakker (jbakker) triaged this task as Low priority.
Jeroen Bakker (jbakker) changed the subtype of this task from "Report" to "To Do".

Due to the changes done we are not in line with 2.80. There is still one task open: creating OpenCL 2.0 contexts, which will improve memory-intensive renders (Barbershop). I will mark this issue as TODO and close it when the patch is finished.

Brecht Van Lommel (brecht) raised the priority of this task from Low to High. Feb 21 2020, 1:48 PM

Marking as high priority since this is something we want to address for the 2.83 release. @Jeroen Bakker (jbakker) has submitted a patch to use OpenCL 2.0, which will help, but it currently requires some manual configuration; I'll try to find an automatic solution for that.

Brecht Van Lommel (brecht) changed the subtype of this task from "To Do" to "Bug". Tue, Mar 24, 3:06 PM
Brecht Van Lommel (brecht) claimed this task.

From my testing this is resolved now. If others still find significant performance regressions please let me know.

For reference, here are my performance measurements:

I have this set up now to test CPU, CUDA, Optix and OpenCL performance, so we should be able to catch issues like this early in the future.

Unfortunately I can NOT confirm this issue is solved.

I'm using a Vega 64 GPU (also tested on 2 other Vega 56 GPUs I have in my build). The times below are for the Vega 64.

  1. Windows 10, fresh install (last weekend)
  2. Radeon drivers 20.3.1

BMW27_gpu.blend file pulled from the blender.org site. No changes to any settings.

Times are the average of 3 runs, excluding the first run, since that one also includes kernel compilation.
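The measurement procedure can be sketched as a small harness that discards a warm-up run and averages the rest. This is an illustration, not the script actually used: `benchmark` and its parameters are hypothetical names, and the render is passed in as a callable so the harness itself does not depend on Blender.

```python
import statistics
import time

def benchmark(render, timed_runs=3, warmup_runs=1, clock=time.perf_counter):
    """Average the wall-clock time of render() over timed_runs.

    The first warmup_runs executions are run but not counted, since
    they would also include the OpenCL kernel compile.
    """
    samples = []
    for run in range(warmup_runs + timed_runs):
        start = clock()
        render()
        elapsed = clock() - start
        if run >= warmup_runs:
            samples.append(elapsed)
    return statistics.mean(samples)

# Example with a stand-in workload. A real render could instead wrap
# something like subprocess.run(["blender", "-b", "BMW27_gpu.blend", "-f", "1"],
# check=True), assuming blender is on the PATH and the file is local.
average = benchmark(lambda: sum(range(100_000)))
print(f"average of timed runs: {average:.4f}s")
```

Passing the clock as a parameter keeps the harness testable with a fake timer.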

Blender 2.80 - 1m27s
Blender 2.82a - 1m54s
Blender 2.83 - 1m58s (today's build - 94b8166a8b05)

rB94b8166a8b05 does not include the fix yet, it will be in the next build.

Classroom scene - file dated 2019-06-13 (pulled from https://www.blender.org/download/demo-files/)

Changes: set to GPU and 256x256 tiles

2.80 - 3m12s
2.82a - 5m34s
2.83 - 7m15s... (build - 94b8166a8b05)

Fingers crossed, and eager to test. I will test again tomorrow and report back.

Thanks, here are results, confirming the fix.

BMW
Blender 2.80 - 1m27s
Blender 2.83 - 1m23s (fixed version)

Classroom
Blender 2.80 - 3m12s
Blender 2.83 - 2m57s (fixed version)

Nice to get even faster renders. I don't see any visual differences between the versions.

Again thanks for the work.