
2.8 Builds are limited to 16 threads of processing
Closed, Resolved (Public)


Operating system: Windows 10 and Ubuntu 18.04 - 32 Core Threadripper 2990WX 64GB ECC memory
Graphics card: NVIDIA Quadro P2000
Last build tested: Jan 17 e57ee5934a30, Jan 17 ad707115d5bc
Multi-threaded editing features in both the 64-bit Windows and 64-bit Linux builds appear to be limited to 16 threads. For example, sculpting with the smooth brush runs the first 16 threads at 100% utilization while leaving the other 75% of the CPU (48 of 64 threads) idle. Other features, such as mesh editing, seem to have the same restriction.
This issue was observed using htop on Ubuntu 18.04 and CPUID on Windows. Both provide a real-time display of per-thread CPU utilization.
To reproduce the problem, open the bmw27_cpu benchmark file on Windows.
Sculpting bmw27_cpu with the smooth brush works in any build, but uses only the first 16 threads, as described above. Other mesh editing operations show the same limitation.

Event Timeline

Brecht Van Lommel (brecht) triaged this task as Confirmed, Medium priority.

I can confirm this.

This report has led to correcting an important bug in Cycles rendering, but my intention was to report that my Threadripper never attempts to utilize more than 16 threads for any multi-threaded tasks.

Cycles, as of build 036ec5cae4f7 (Feb. 12), correctly uses all 64 threads of the AMD 32-core chip. However, features such as Eevee and Workbench rendering, sculpt, compositor, and fluid simulation all make heavy use of the first 16 sequential threads of the CPU and ignore the rest of the chip. This is also the case for any of the older 2.79 builds.

Please observe this limitation by running the included file on an AMD 32-core Threadripper system while monitoring CPU utilization, and do the following:

  1. Perform a Sculpt operation on the object.
  2. Render an Animation, using Eevee.
  3. Render an Animation, using Workbench.
  4. Render animation with Cycles CPU, and note that this is the only function which uses all 64 threads.

Sorry - I should have flagged this as "Open" in my previous comment.

@Sergey Sharybin (sergey), I can confirm this on Ubuntu. Commenting out numaAPI_RunProcessOnNode in threads.c avoids the problem.

To me it seems as if the affinity is somehow inherited by threads. If that's the case, I'm not sure what the right solution is. It seems rather tricky to figure out all the places that might start a thread, some of them outside our control.
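The inheritance described above can be checked with a small script. This is a minimal sketch, not Blender code: it assumes Linux (where `os.sched_setaffinity` is available and a new thread starts with a copy of its creator's affinity mask), and the helper names `affinity_seen_by_new_thread` and `demo_inheritance` are made up for illustration.

```python
import os
import threading


def affinity_seen_by_new_thread():
    """Start a thread and return the CPU set that thread reports for itself."""
    result = []
    # On Linux, pid 0 means "the calling thread", so this reads the new thread's mask.
    t = threading.Thread(target=lambda: result.append(os.sched_getaffinity(0)))
    t.start()
    t.join()
    return result[0]


def demo_inheritance():
    """Pin the calling thread to one CPU, then see what a new thread inherits."""
    original = os.sched_getaffinity(0)
    restricted = {min(original)}  # a single CPU we are already allowed to use
    try:
        os.sched_setaffinity(0, restricted)
        child = affinity_seen_by_new_thread()
    finally:
        # Always restore the original mask so the caller is left unchanged.
        os.sched_setaffinity(0, original)
    return restricted, child


if __name__ == "__main__":
    restricted, child = demo_inheritance()
    print("creator pinned to:", sorted(restricted))
    print("new thread sees:  ", sorted(child))
```

If the two printed sets match, any thread started after the creating thread was pinned is confined to the same CPUs, including threads started by code that knows nothing about the pinning.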

@Brecht Van Lommel (brecht), for our thread pool it's a fairly simple fix. But that is a good point about threads spawned by someone else.
I guess a half-decent solution would be to not set affinity for the main thread, but to set it in the thread pools (both Blender and Cycles) to:

  • Make sure data is localized per node/CPU group
  • Allow use of more than 64 threads on Windows

That wouldn't work if Cycles indirectly initializes threads (maybe in OIIO/OSL). Not sure if that's happening, and if so I don't think we can have any decent solution.
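The thread-pool idea above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual Blender or Cycles implementation: it assumes Linux, it fakes "nodes" by splitting the allowed CPUs into contiguous groups instead of querying the real NUMA topology, and `split_into_groups` and `run_pinned_workers` are hypothetical names.

```python
import os
import threading


def split_into_groups(cpus, num_groups):
    """Split CPUs into at most num_groups contiguous sets (stand-ins for NUMA nodes)."""
    cpus = sorted(cpus)
    num_groups = max(1, min(num_groups, len(cpus)))
    size = -(-len(cpus) // num_groups)  # ceiling division
    return [set(cpus[i:i + size]) for i in range(0, len(cpus), size)]


def run_pinned_workers(num_workers, num_groups):
    """Leave the main thread unpinned; each worker pins itself to one group."""
    main_before = os.sched_getaffinity(0)
    groups = split_into_groups(main_before, num_groups)
    seen = [None] * num_workers

    def worker(index):
        # pid 0 targets the calling thread on Linux, so only this worker is pinned.
        os.sched_setaffinity(0, groups[index % len(groups)])
        seen[index] = os.sched_getaffinity(0)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    main_after = os.sched_getaffinity(0)
    return main_before, main_after, seen, groups


if __name__ == "__main__":
    before, after, seen, groups = run_pinned_workers(num_workers=4, num_groups=2)
    print("main thread unchanged:", before == after)
    for i, cpus in enumerate(seen):
        print(f"worker {i} pinned to {sorted(cpus)}")
```

Because the main thread keeps its full mask, threads started from it by third-party libraries would inherit an unrestricted mask. Threads that a pinned worker starts itself would still inherit that worker's group, which is the caveat raised about OIIO/OSL.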

@Sergey Sharybin (sergey), Please, let me repeat that Cycles currently works fine. All 64 threads are very active with a Cycles CPU render on the Threadripper. It is all the other functionality within Blender that has the 16 thread limitation.

@Chris Clawson (meloware), I understand that. Those are all related issues, rooted in the fact that we've tried to give the fastest core to the main thread. Cycles does special trickery to distribute its threads across all cores. A similar thing can be done for threads on the Blender side. But we have no control over threads created by external libraries, like OpenJpeg and OpenEXR. Also, Cycles uses OIIO/OSL, which might be creating threads as well, and those we also have no control over.

Brecht Van Lommel (brecht) raised the priority of this task from Confirmed, Medium to Confirmed, High. Mar 4 2019, 6:44 PM