Cycles: Create a new hybrid (CPU + Intel Xeon Phi) computed OMP (OpenMP) device
Needs Review, Public

Authored by Milan Jaros (jar091) on Dec 4 2016, 11:35 PM.

Details

Summary

The new OMP device can use any hardware that supports OpenMP. It can combine more than one device (Intel Xeon Phi + CPU; TODO: CPU + CUDA). The new OMP device is faster than the original CPU device, but only in final rendering with bigger tiles. We can achieve more than a 2x speed-up with a hybrid system of CPU and KNC coprocessors.
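
As an illustration of the general idea only (this is not code from the patch), an OpenMP device can spread the pixels of one large tile across all threads with a parallel loop. The names `shade_pixel` and `render_tile` are hypothetical stand-ins for the real kernel entry points.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical per-pixel work standing in for a path-tracing kernel call.
static float shade_pixel(int x, int y)
{
    return 0.5f * static_cast<float>(x) + static_cast<float>(y);
}

// Render one tile, distributing its rows across OpenMP threads.
// Build with -fopenmp (GCC) or -qopenmp (Intel); without the flag the
// pragma is ignored and the loop runs serially with the same result.
std::vector<float> render_tile(int width, int height)
{
    std::vector<float> pixels(static_cast<std::size_t>(width) * height);
#pragma omp parallel for schedule(dynamic)
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            pixels[static_cast<std::size_t>(y) * width + x] = shade_pixel(x, y);
        }
    }
    return pixels;
}
```

Dynamic scheduling is chosen here on the assumption that rows differ in cost, as they typically do in path tracing.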

I also fixed some bugs for the Intel compiler and replaced assert with kernel_assert.
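
A minimal sketch of why a kernel-specific assert macro helps: it can be compiled out on targets where the host `assert` is not usable. The definition below is an assumption for illustration, not the one in the Cycles sources, and `fetch` is a hypothetical helper.

```cpp
#include <cassert>

// Assumption: __KERNEL_OFFLOAD__ marks a coprocessor build, where the
// check is compiled out; on the host it forwards to the standard assert.
#ifdef __KERNEL_OFFLOAD__
#  define kernel_assert(cond) ((void)0)
#else
#  define kernel_assert(cond) assert(cond)
#endif

// Hypothetical kernel helper that validates its index before reading.
inline float fetch(const float *data, int index, int size)
{
    kernel_assert(index >= 0 && index < size);
    return data[index];
}
```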

Diff Detail

Repository
rB Blender
Milan Jaros (jar091) retitled this revision to Cycles: Create a new hybrid (CPU + Intel Xeon Phi) computed OMP (OpenMP) device. Dec 4 2016, 11:35 PM
Milan Jaros (jar091) updated this object.
Milan Jaros (jar091) set the repository for this revision to rB Blender.
Milan Jaros (jar091) added a project: Cycles.
Sergey Sharybin (sergey) requested changes to this revision. Dec 5 2016, 10:17 AM

Can you explain more about the performance of the new OMP device on CPU? Is it faster when using just a CPU, or does the performance improvement only happen with a co-processor type of device?

intern/cycles/CMakeLists.txt
196

It is a bit weird to have some places calling device offload and in other places omp. What is the difference here?

intern/cycles/bvh/bvh_node.h
24 ↗(On Diff #7940)

BVH is not executed on a device, so it is weird to have __KERNEL_OFFLOAD__ here.

Also, note that preprocessor directives are indented like this:

#ifndef FOO
#    include "bar.h"
#endif

intern/cycles/device/CMakeLists.txt
37

Indentation and extra trailing whitespace.

I also don't see this file. Mind double-checking that it is properly added to the diff?

intern/cycles/render/session.cpp
840

Any reason to break indentation?

intern/cycles/util/util_debug.h
20

Worth mentioning why it is disabled.

158

#endif  /* __KERNEL_OFFLOAD__ */

intern/cycles/util/util_optimization.h
121

Same as above.

This revision now requires changes to proceed. Dec 5 2016, 10:17 AM

This patch also seems to be for an older version of the code, in latest master there have been a bunch of changes to the device code so this will need to be updated for that.

It would indeed be interesting to hear more details about the performance: the CPU and Xeon Phi specs, which scene you tested, and what the render time was.

Milan Jaros (jar091) added a comment. Edited Dec 6 2016, 8:12 AM

I forgot to add some files. I apologize for that. I used the tag v2.78a. I will fix it today :)

The speedup is approx. 1.10x (OpenMP vs. CPU POSIX), but only if you set the tile size equal to the render resolution. The speed of the original CPU device depends on the tile size (which is a disadvantage in some cases).
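
To make the tile-size condition concrete: when the tile size equals the resolution, the whole frame is a single tile. The hypothetical helper below (not part of the patch) counts how many tiles a given setting produces.

```cpp
// Number of tiles needed to cover an image, rounding up on each axis.
int tile_count(int width, int height, int tile_w, int tile_h)
{
    int tiles_x = (width + tile_w - 1) / tile_w;
    int tiles_y = (height + tile_h - 1) / tile_h;
    return tiles_x * tiles_y;
}
```

For a 1920x1080 frame, a 1920x1080 tile gives one tile, while 32x32 tiles give 60 x 34 = 2040 tiles. With a single tile, a device has to parallelize inside the tile to keep all cores busy; the assumption here is that this is the case the OpenMP device targets.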

I renamed OMP Device to OpenMP Device. I used "master" from this week. I added the missing files. The new device works with GCC and with the Intel Compiler. I had to disable QBVH and SSE for the Intel Compiler. The offload mode has a problem with util_debug.h (a static class member causes an undefined reference; I disabled it with a pragma).

I will send benchmarks later.

I added replies to Sergey's inline comments.

intern/cycles/CMakeLists.txt
196

Offload is only for coprocessors. I renamed omp to OpenMP and it is for CPUs and coprocessors.
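
The distinction can be sketched as follows, assuming the Intel compiler's offload pragma is the mechanism used for KNC and that `__KERNEL_OFFLOAD__` selects that path (both are assumptions, and `scale_buffer` is a hypothetical function). The same OpenMP loop runs on the host by default and is shipped to the coprocessor only in an offload build.

```cpp
#include <vector>

// Scale a buffer in place. In an offload build with the Intel compiler the
// block is sent to coprocessor 0; otherwise it runs on the host CPU.
void scale_buffer(std::vector<float> &buffer, float factor)
{
    float *data = buffer.data();
    int n = static_cast<int>(buffer.size());
#ifdef __KERNEL_OFFLOAD__
#  pragma offload target(mic:0) inout(data : length(n))
#endif
    {
#pragma omp parallel for
        for (int i = 0; i < n; i++) {
            data[i] *= factor;
        }
    }
}
```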

intern/cycles/device/CMakeLists.txt
37

I added the missing file.

intern/cycles/render/session.cpp
840

The source code was changed in master.

intern/cycles/util/util_debug.h
158

I have a problem with the static instance (undefined reference). I had to disable this class.
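
For context, a common cause of this kind of linker error in C++ is a static data member that is declared in the class but whose definition is not compiled for every target. Whether that is the cause here is not stated in the thread; the class below is a hypothetical stand-in, not the one in util_debug.h.

```cpp
// Hypothetical settings class with a static instance.
class DebugSettings {
public:
    static DebugSettings &get() { return instance; }
    int verbosity = 0;

private:
    static DebugSettings instance;  // declaration only, no storage yet
};

// Definition: must be compiled into exactly one translation unit that is
// linked for every target. If it is missing for one target, the linker
// reports an undefined reference to DebugSettings::instance.
DebugSettings DebugSettings::instance;
```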

intern/cycles/util/util_optimization.h
121

The Intel Xeon Phi does not support SSE. I had to disable it.
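
A minimal sketch of guarding an SSE code path so that it falls back to scalar code on a target without SSE. It assumes `__MIC__` is the macro the Intel compiler defines for KNC builds; `DEMO_USE_SSE2` and `add4` are hypothetical names, not identifiers from util_optimization.h.

```cpp
// Enable the SSE path only when the compiler targets SSE2 and the build
// is not for the coprocessor.
#if defined(__SSE2__) && !defined(__MIC__)
#  include <emmintrin.h>
#  define DEMO_USE_SSE2
#endif

// Add two 4-float vectors, using SSE when available, scalar code otherwise.
void add4(const float *a, const float *b, float *out)
{
#ifdef DEMO_USE_SSE2
    _mm_storeu_ps(out, _mm_add_ps(_mm_loadu_ps(a), _mm_loadu_ps(b)));
#else
    for (int i = 0; i < 4; i++) {
        out[i] = a[i] + b[i];
    }
#endif
}
```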

Milan Jaros (jar091) removed rB Blender as the repository for this revision.

Fix whitespace, fix Intel optimization in blenkernel, optimize the path trace loop

Hi there, I just ordered a Xeon Phi coprocessor for testing. It will be shipped here shortly after Easter. Can you enlighten me whether the Phi is supported in Blender already, or do I have to compile a version myself? Sorry to say, I am not a developer; this is all modern Chinese to me.

Hi Dieter, could you specify the type of Xeon Phi (KNC or KNL)? I tested both :) Thanks. Milan

It's a Xeon Phi 31S1P (BC31S1P). Got it cheap on eBay. I just want to see and try this CPU.

I uploaded a new diff file, which can be used with current master (commit: bb8f7784).

Hi Dieter, you can find the unofficial build here: https://code.it4i.cz/blender/builds