The heap on Windows is single-threaded, so it lags behind Linux in allocation-heavy multithreaded scenarios; BVH building is a prime example.
Results of a CPU render on an AWS c5d.18xlarge instance:
| Scene | Total Render Time: MSVC Heap | Total Render Time: TBB_Allocator | Scene Prep: MSVC Heap | Scene Prep: TBB_Allocator | Total Improvement (seconds) | Total Improvement (%) | Prep Improvement (seconds) | Prep Improvement (%) |
- Required a TBB version bump for the hooking to work on recent Windows versions.
- Static versions of tbbmalloc/tbbmalloc_proxy do not seem to work (static TBB is frowned upon anyway, but that's a worry for a future diff), so the dynamic versions were used.
- Windows only at this point. Given that the Linux allocator is multithreaded by default, I'm not expecting wonders there.
To compare runs with the allocator enabled and disabled, set the environment variable TBB_MALLOC_DISABLE_REPLACEMENT=1 to disable it.
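For a quick comparison outside of a full render, an allocation-heavy micro-benchmark along these lines might help. This is only a sketch and not part of this diff: `run_alloc_benchmark` is a hypothetical helper, the block sizes and batch size are arbitrary, and it assumes the test binary loads tbbmalloc_proxy so that `malloc`/`free` are actually hooked.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdlib>
#include <thread>
#include <vector>

// Hypothetical helper (not from this diff): every thread repeatedly allocates
// and frees small blocks, loosely imitating the allocation pattern of a
// parallel BVH build. Returns the elapsed wall-clock time in seconds.
double run_alloc_benchmark(unsigned num_threads, std::size_t iterations)
{
  const auto start = std::chrono::steady_clock::now();

  std::vector<std::thread> workers;
  workers.reserve(num_threads);
  for (unsigned t = 0; t < num_threads; ++t) {
    workers.emplace_back([iterations, t]() {
      std::vector<void *> blocks;
      blocks.reserve(64);
      for (std::size_t i = 0; i < iterations; ++i) {
        // Vary the request size so the allocator sees more than one size.
        const std::size_t size = 16 + ((i + t) % 8) * 32;
        void *p = std::malloc(size);
        if (p == nullptr) {
          break;
        }
        // Touch the block so the allocation cannot be optimized away.
        static_cast<volatile char *>(p)[0] = 1;
        blocks.push_back(p);
        // Free in batches of 64 to keep several blocks live at once.
        if (blocks.size() == 64) {
          for (void *b : blocks) {
            std::free(b);
          }
          blocks.clear();
        }
      }
      for (void *b : blocks) {
        std::free(b);
      }
    });
  }
  for (std::thread &w : workers) {
    w.join();
  }

  const std::chrono::duration<double> elapsed =
      std::chrono::steady_clock::now() - start;
  return elapsed.count();
}
```

Calling `run_alloc_benchmark(std::thread::hardware_concurrency(), 1000000)` from `main` and printing the result, once normally and once with TBB_MALLOC_DISABLE_REPLACEMENT=1 set, should give two timings to compare. Expect the gap to depend heavily on core count.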
For easy testing, drop these files into the tbb/lib folder