Page MenuHome

Multires: Multithread unsubdivide grid extraction
Needs ReviewPublic

Authored by Pablo Dobarro (pablodp606) on Tue, Sep 1, 5:30 PM.

Details

Summary

This multithreads the main loop of unsubdivide grids extraction from the
original mesh. On a Threadripper 3990X, this loop goes from 0.369013 to
0.017537 when unsubdividing suzane with 6 subdivisions applied (2M
vertices).

The entire operation still takes around 16 seconds as the slowest part
is dissolving the vertices to generate the base mesh (it takes almost 11
out of 16 seconds), so the speedup in the unsubdivide and rebuilds
subdivision operator is not that noticeable.

Diff Detail

Repository
rB Blender
Branch
unsubdivide-extract-multithreaded (branched from master)
Build Status
Buildable 9971
Build 9971: arc lint + arc unit

Event Timeline

Pablo Dobarro (pablodp606) requested review of this revision.Tue, Sep 1, 5:30 PM
Pablo Dobarro (pablodp606) created this revision.

From a glance generally looks fine. Is the peak memory usage stays the same?

Please run clang-format on the patch.

Is the peak memory usage stays the same?

How can I measure that?

How can I measure that?

You analyse the algorithm. Are there extra allocations to main per-thread contexts? Things like this.

Also, now after re-reading the description there are even more questions. From my understanding the timing is breaking down as following:

  • 11 sec to dissolve vertices.
  • 0.36 sec to extra grid data
  • 3.5 sec to do something else
  • Total is 16 sec

Almost as if it doesn't worth adding threading complexity and the actual work is to be focused in the hotspot.
But since the work is done already, I don't mind if this code is threaded. Maybe double-check with @Brecht Van Lommel (brecht).

source/blender/blenkernel/intern/multires_unsubdivide.c
1003–1004

Seems these can become const.

Add comments from the below about mapping from base to original vertex index mapping.

Also make sure the definition of base and original is present.

1010

base_loop_offset. With description about what it is.

1095–1115

You can move map initialization to own functions, keeping pointers outside of initialization always const.

But since the work is done already, I don't mind if this code is threaded. Maybe double-check with @Brecht Van Lommel (brecht).

I don't mind having this multithreaded.

Pablo Dobarro (pablodp606) marked 3 inline comments as done.
  • Review Update

If I add const to the vertex maps in the struct there is a warning in multires_unsubdivide_vertex_maps_free, is this ok?

If I add const to the vertex maps in the struct there is a warning in multires_unsubdivide_vertex_maps_free, is this ok?

Oh, well. No, warning is no good.

I still think we should have const qualifier for data which is not supposed to be modified after initialization. But then the "legacy" semantic of C style free gets in a way. I think is better to keep const and cast it away in the MEM_freeN() call. In a longer term maybe even use const void* in the MEM_freeN() function. This will be more intuitive with C++, where

const int* a = new int[10];
delete[] a;

is an absolutely valid code.

@Brecht Van Lommel (brecht) , @Sybren A. Stüvel (sybren) what do you think?

Oh, well. No, warning is no good.

I agree.

I still think we should have const qualifier for data which is not supposed to be modified after initialization.

I also agree here.

But then the "legacy" semantic of C style free gets in a way. I think is better to keep const and cast it away in the MEM_freeN() call. In a longer term maybe even use const void* in the MEM_freeN() function. This will be more intuitive with C++

Such a cast should IMO be accompanied by a comment that explains why it's okay, as generally you don't want to cast constness away. I'd be fine with changing MEM_freeN() to get const void*.

The patch description reads:

On a Threadripper 3990X, this loop goes from 0.369013 to 0.017537

Always include units with the numbers (unless they are actually unitless).