The intention of this task is to plan how to tackle the stroke performance issue that Sculpt Mode has. This is the issue we are trying to solve:
The plan is to avoid the slowdowns that freeze the UI during the stroke. If the computer hardware cannot keep up with the mesh deformation, the stroke should lag consistently instead of freezing for several seconds.
This should not focus on:
- Dyntopo performance (should be handled in a separate task as it has some specific performance issues).
- Making large brush strokes and mesh filters faster.
These are the attempts that were made. There were some successful experiments, but none of them provides the performance Sculpt Mode should have on a high-end computer:
Control paint operator events rate
- Dynamic Stroke Spacing D8508: This is the solution with the most noticeable effect so far. For this to work, high-frequency pen tablet events need to be available on all platforms. This patch is also worth merging for reasons other than performance, as it allows reducing the spacing of some brushes that need it.
- Paint stroke step queue D5676: This is not meant to improve performance (it may even make performance worse), but to improve UX by avoiding locking the UI. The idea was to use it for 2D painting, where it works fine, but in 3D projection painting and Sculpt Mode it still has some issues.
- Ignore INBETWEEN_MOUSE_EVENTS in grab brushes (in master since 2.81) D5429: This also had a noticeable effect on the performance of these brushes.
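As an illustration of the last item, ignoring in-between events can be thought of as collapsing every run of consecutive move events into its most recent one, since a grab brush only needs the latest cursor position. The following is a minimal sketch of that idea in Python, not the code from D5429; the `Event` type, its `kind` values and `coalesce_moves` are hypothetical names made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    # Hypothetical event record: kind is "press", "move" or "release".
    kind: str
    x: float
    y: float


def coalesce_moves(events):
    """Keep only the last event of every run of consecutive "move" events.

    Press and release events are always kept, in their original order.
    """
    result = []
    for ev in events:
        if ev.kind == "move" and result and result[-1].kind == "move":
            # A newer move replaces the pending one instead of queuing behind it.
            result[-1] = ev
        else:
            result.append(ev)
    return result


pending = [
    Event("press", 0.0, 0.0),
    Event("move", 1.0, 0.0),
    Event("move", 2.0, 0.0),
    Event("move", 3.0, 1.0),
    Event("release", 3.0, 1.0),
]
steps = coalesce_moves(pending)
```

With the input above, the brush would be evaluated once for the move to (3.0, 1.0) instead of three times. This only makes sense for brushes whose result depends on the current cursor position alone, which is why it was limited to grab brushes.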
Modify PBVH scheduler settings
- Change Multires to limit the PBVH leaf size using vertices D8454: This reduces the leaf limit for Multires. It improved performance on high-end computers (Julien's workstation). Before this patch, Multires was lagging with 80k vertices and 3 subdivision levels.
- Change PBVH leaf limit size D8442: This has a noticeable effect on performance. Stroke performance usually improves when the limit size is lowered, but this makes tools that modify all nodes slower (like elastic deform or the mesh filters). When the limit size is increased, stroke performance gets worse, but tools that modify the whole mesh are faster.
- PBVH Scheduler performance design task: T72943
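The leaf limit trade-off described above can be reasoned about with a toy cost model. The sketch below is only an assumption about why the trade-off appears, not a measurement and not Blender code: it treats the mesh as a 1D array of vertices split into consecutive leaves, assumes every touched leaf is iterated in full, and charges a fixed overhead per touched leaf. The function names and the `per_node_overhead` and `per_vertex_cost` values are invented for the example.

```python
def nodes_and_vertices_touched(total_verts, leaf_limit, brush_start, brush_verts):
    """Return (touched leaves, vertices iterated) for a brush in a 1D toy mesh.

    Vertices 0..total_verts-1 are split into consecutive leaves of at most
    leaf_limit vertices. The brush affects brush_verts vertices starting at
    brush_start, and every touched leaf is iterated in full.
    """
    first_leaf = brush_start // leaf_limit
    last_leaf = (brush_start + brush_verts - 1) // leaf_limit
    touched_nodes = last_leaf - first_leaf + 1
    end_vertex = min((last_leaf + 1) * leaf_limit, total_verts)
    iterated = end_vertex - first_leaf * leaf_limit
    return touched_nodes, iterated


def estimated_cost(nodes, verts, per_node_overhead=50.0, per_vertex_cost=1.0):
    # Hypothetical cost units: scheduling overhead per leaf plus work per vertex.
    return nodes * per_node_overhead + verts * per_vertex_cost


TOTAL = 1_000_000
for leaf_limit in (10_000, 1_000):
    brush = nodes_and_vertices_touched(TOTAL, leaf_limit, 12_345, 500)
    whole = nodes_and_vertices_touched(TOTAL, leaf_limit, 0, TOTAL)
    print(leaf_limit, estimated_cost(*brush), estimated_cost(*whole))
```

Under these assumptions, a small brush gets cheaper with a lower limit because fewer unaffected vertices are iterated, while a tool that touches every leaf only pays more per-leaf overhead. The real scheduler behaviour may differ, which is what T72943 is meant to investigate.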
I tested Sculpt Mode on the following computers, all of them running Ubuntu:
- Mini-PC media center: i3, 8GB, integrated graphics.
- Gaming Laptop: i7 9750H, GTX 1650, 16GB
- Workstation: Threadripper 3990X, Radeon Pro VII, 128GB
This is my experience so far:
- Performance does not scale: The workstation and the mini-PC have the same stroke performance and lag problem. In the case of the mini-PC, performance is clearly limited by the GPU, as the viewport lags with high poly meshes without doing any strokes.
- Removing the normal recalculation completely before drawing the PBVH does not have any noticeable effect. If we can find a fix that only works for flat shading (so we can move these updates somewhere else), it should be acceptable.
- I did a refactor during the 2.82 development to unify all the loops for the Clay Strips brush as suggested in T68873. It didn't have any noticeable performance effect.
- In 2.81 the scheduler was changed to TBB. Before that change, Sculpt Mode was noticeably faster with multithreaded sculpting disabled. This option is no longer available.
- This bug fix removed a heavy computation per vertex in Clay Strips. Even after the fix, the stroke lag is still there.
- Stroke lag is generally higher in Multires than in meshes, when we should expect Multires to be faster than meshes (fewer cache misses).
- On the gaming laptop with the EEVEE PBVH drawing enabled, stroke performance is almost the same (if not better) in EEVEE than in Workbench. I didn't test this on the workstation because I can't guarantee that the new GPU is working correctly. This makes me think that this problem is not related to viewport drawing.
- The default Sculpt Vertex Colors paint brush does not need normal updates, area normal sampling or updates of the node bounding boxes, but it is also affected by the stroke lag issue.
- The Brush cursor is using a PBVH raycast and a multithreaded task to sample the area normal, but it is not affected by the lag, no matter how fast you move the cursor over the mesh.
Performance reference benchmark
Even though this is not easy to define, we should have a fixed performance reference benchmark for sculpting on supported devices. This should allow people to report performance issues with different setups instead of just assuming that performance is bad because Blender is not specialized software, so we can have more data to debug this issue (the same way that, if we notice that moving elements in edit mode on a 20 vertices mesh lags, we know it is a bug).
As an initial proposal, I would expect that with a mesh of 500k to 750k vertices, Sculpt Mode should never lag regardless of the stroke size or stroke speed, on any computer that meets the minimum requirements to run Blender.
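To make reports comparable, such a benchmark could record the duration of every stroke step and separate a stroke that lags consistently from one that freezes. The following is a minimal sketch of that measurement, assuming a hypothetical `step` callable that applies one stroke step. It does not use any Blender API, and the `freeze_factor` threshold is an arbitrary placeholder, not a proposed value.

```python
import statistics
import time


def time_steps(step, count):
    """Call step(i) count times and return the duration of each call in seconds."""
    durations = []
    for i in range(count):
        start = time.perf_counter()
        step(i)
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations, freeze_factor=10.0):
    """Report the median and worst step time and flag freezes.

    A stroke is flagged as freezing when its slowest step takes more than
    freeze_factor times the median step (hypothetical criterion).
    """
    median = statistics.median(durations)
    worst = max(durations)
    return {
        "median_s": median,
        "worst_s": worst,
        "freezes": worst > freeze_factor * median,
    }


# Stand-in for a real stroke step: a small amount of CPU work.
measured = time_steps(lambda i: sum(range(1000)), 20)
report = summarize(measured)
```

A slow but steady stroke, where every step takes about the same time, would not be flagged, while a stroke with a single step far above the median would. That matches the goal stated at the top of this task: lag that is consistent is acceptable on weak hardware, freezes are not.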