Opening even remotely complex scenes with many objects takes much longer than before #89194

Closed
opened 2021-06-16 11:50:37 +02:00 by Rainer Trummer · 31 comments

System Information
Operating system: Windows-10-10.0.19041-SP0 64 Bits
Graphics card: Quadro RTX 4000/PCIe/SSE2 NVIDIA Corporation 4.5.0 NVIDIA 461.33

Blender Version
Broken: version: 3.0.0 Alpha, branch: master, commit date: 2021-06-15 14:45, commit ID: fcc844f8, BLI: use explicit task isolation, no longer part of parallel operations
Worked: 3.0.0 Alpha, branch: master, commit date: 2021-06-15 14:45, hash: b3f0dc2907

Short description of error
Most of my more complex scenes that opened just fine with the commit ID b3f0dc29, hash: b3f0dc2907 (which is the one right before the broken one) do not ever get displayed, Blender just freezes for an unknown amount of time. This is true for scenes that contain many mesh objects, which are linked into a main file. Small scenes still open just fine.

As far as I can tell, Blender does complete loading of the scene, but then just drops to 0% activity in the Task Manager, and never updates the Viewport. It also does not accept any input, it's just frozen.

When trying to put together a debug file, I noticed that loading times increased a lot between the two mentioned commits (or even 2.93). With the provided demo file I measure roughly a doubling in load times when loading the DemoCrash_Mainfile.blend between 2.93 and current master. Note that this demo file only linkes four simple mesh objects plus 36.000 empties.

Exact steps for others to reproduce the error
Open this file here
File_Open_Regression.zip
and unzip the content. Then open DemoCrash_Mainfile.blend and measure loading times.

On my system:
Blender 2.93: roughly 25 seconds for DemoCrash_Mainfile.blend
Blender master: roughly 50 seconds for DemoCrash_Mainfile.blend

opening DemoCrash.blend, which contains most of the data, takes less than 10 seconds in both cases.

**System Information** Operating system: Windows-10-10.0.19041-SP0 64 Bits Graphics card: Quadro RTX 4000/PCIe/SSE2 NVIDIA Corporation 4.5.0 NVIDIA 461.33 **Blender Version** Broken: version: 3.0.0 Alpha, branch: master, commit date: 2021-06-15 14:45, commit ID: `fcc844f8`, BLI: use explicit task isolation, no longer part of parallel operations Worked: 3.0.0 Alpha, branch: master, commit date: 2021-06-15 14:45, hash: `b3f0dc2907` **Short description of error** Most of my more complex scenes that opened just fine with the commit ID `b3f0dc29`, hash: `b3f0dc2907` (which is the one right before the broken one) do not ever get displayed, Blender just freezes for an unknown amount of time. This is true for scenes that contain many mesh objects, which are linked into a main file. Small scenes still open just fine. As far as I can tell, Blender does complete loading of the scene, but then just drops to 0% activity in the Task Manager, and never updates the Viewport. It also does not accept any input, it's just frozen. When trying to put together a debug file, I noticed that loading times increased a lot between the two mentioned commits (or even 2.93). With the provided demo file I measure roughly a doubling in load times when loading the `DemoCrash_Mainfile.blend` between 2.93 and current master. Note that this demo file only linkes four simple mesh objects plus 36.000 empties. **Exact steps for others to reproduce the error** Open this file here [File_Open_Regression.zip](https://archive.blender.org/developer/F10175161/File_Open_Regression.zip) and unzip the content. Then open `DemoCrash_Mainfile.blend` and measure loading times. On my system: Blender 2.93: roughly 25 seconds for `DemoCrash_Mainfile.blend` Blender master: roughly 50 seconds for `DemoCrash_Mainfile.blend` opening `DemoCrash.blend`, which contains most of the data, takes less than 10 seconds in both cases.
Author

Added subscriber: @RainerTrummer

Added subscriber: @RainerTrummer
Rainer Trummer changed title from Opening even remotely complex scenes with many objects never completes to Opening even remotely complex scenes with many objects takes much longer than before 2021-06-16 15:32:06 +02:00
Brecht Van Lommel was assigned by Rainer Trummer 2021-06-18 10:08:35 +02:00
Brecht Van Lommel removed their assignment 2021-06-18 21:18:52 +02:00

Added subscriber: @brecht

Added subscriber: @brecht

I can see that buildbot builds of 2.93 and master have a performance difference. But I can not confirm with my own build that fcc844f8 is what makes the difference.

Bisecting is not working well here since the load times vary quite a bit even for the same commits, and trying multiple times to build fcc844f8 and b3f0dc29 I did not find a significant difference.

If any other developer can confirm this, it would be good to run a profiler or break in a debugger to see where the extra time is spent.

I can see that buildbot builds of 2.93 and master have a performance difference. But I can not confirm with my own build that fcc844f8 is what makes the difference. Bisecting is not working well here since the load times vary quite a bit even for the same commits, and trying multiple times to build fcc844f8 and b3f0dc29 I did not find a significant difference. If any other developer can confirm this, it would be good to run a profiler or break in a debugger to see where the extra time is spent.
Member

Added subscriber: @LazyDodo

Added subscriber: @LazyDodo
Member

i didn't have a final 2.93 copy, so used 2.92 to benchmark against

42s to open in 2.92 (official release)
57s to open in 3.0 (todays master)

perf in 2.92 isn't stellar either for such a simple scene..

{F10180653, width=100%}

So of that 57s , 42s is being spend in BLI_FindString and 15 in id_sort_by_name haven't looked why that is, but it feels excessive

i didn't have a final 2.93 copy, so used 2.92 to benchmark against 42s to open in 2.92 (official release) 57s to open in 3.0 (todays master) perf in 2.92 isn't stellar either for such a simple scene.. {[F10180653](https://archive.blender.org/developer/F10180653/image.png), width=100%} So of that 57s , 42s is being spend in `BLI_FindString` and 15 in `id_sort_by_name` haven't looked why that is, but it feels excessive
Member

Added subscriber: @HooglyBoogly

Added subscriber: @HooglyBoogly
Member

I wonder if e4ca6b93ad might be a culprit then

I wonder if e4ca6b93ad might be a culprit then
Author

Greatly appreciate everyone investigating this! From my side I can only confirm that reverting the commit I mentioned resolves my problems, but of course the source could be somewhere else entirely. Really difficult to narrow down how to reproduce the problem. For me, production scenes in the company don‘t open within 10 minutes, at that point I kill the Blender process in the task manager. 2.93 opens the same scene in 45 seconds.

Is it possible that this is a Windows platform only issue? I wasn‘t testing under Linux in a while now, so differences here are totally possible.

Greatly appreciate everyone investigating this! From my side I can only confirm that reverting the commit I mentioned resolves my problems, but of course the source could be somewhere else entirely. Really difficult to narrow down how to reproduce the problem. For me, production scenes in the company don‘t open within 10 minutes, at that point I kill the Blender process in the task manager. 2.93 opens the same scene in 45 seconds. Is it possible that this is a Windows platform only issue? I wasn‘t testing under Linux in a while now, so differences here are totally possible.

In #89194#1179211, @RainerTrummer wrote:
Greatly appreciate everyone investigating this! From my side I can only confirm that reverting the commit I mentioned resolves my problems, but of course the source could be somewhere else entirely. Really difficult to narrow down how to reproduce the problem. For me, production scenes in the company don‘t open within 10 minutes, at that point I kill the Blender process in the task manager. 2.93 opens the same scene in 45 seconds.

It seems like you reported two different bugs then, one about slower loading of files for which you provided a repro case, and one about Blender hanging for which you did not.

> In #89194#1179211, @RainerTrummer wrote: > Greatly appreciate everyone investigating this! From my side I can only confirm that reverting the commit I mentioned resolves my problems, but of course the source could be somewhere else entirely. Really difficult to narrow down how to reproduce the problem. For me, production scenes in the company don‘t open within 10 minutes, at that point I kill the Blender process in the task manager. 2.93 opens the same scene in 45 seconds. It seems like you reported two different bugs then, one about slower loading of files for which you provided a repro case, and one about Blender hanging for which you did not.

Changed status from 'Needs Triage' to: 'Needs User Info'

Changed status from 'Needs Triage' to: 'Needs User Info'
Member

could be the same bug to, just with larger files, O(n^2) solution to a problem could easily exhibit that behaviour

could be the same bug to, just with larger files, [O(n^2)](https://twitter.com/BruceDawson0xB/status/1120381406700429312) solution to a problem could easily exhibit that behaviour

The task isolation change is unlikely to introduce O(n^2) behavior, but is likely to introduce a deadlock.

The task isolation change is unlikely to introduce O(n^2) behavior, but is likely to introduce a deadlock.
Member

there's no threading at play here, only the main thread is active, expand_doit_library is being a tad optimistic by calling is_yet_read which loops though all objects ....... for every object added, and that gets expensive as the number of objects grows.

there's no threading at play here, only the main thread is active, `expand_doit_library` is being a tad optimistic by calling `is_yet_read` which loops though all objects ....... for every object added, and that gets expensive as the number of objects grows.

Yes, no threading at play for the attached .blend, yet a threading related commit identified as the cause for another production .blend. That supports there being two bugs here.

Yes, no threading at play for the attached .blend, yet a threading related commit identified as the cause for another production .blend. That supports there being two bugs here.
Author

It seems like you reported two different bugs then, one about slower loading of files for which you provided a repro case, and one about Blender hanging for which you did not.

@brecht I initially thought Blender would freeze entirely when trying to open our production scenes in the company. When I later tried to put together a simple scene where developers could reproduce the problem, I could not find a simple case where loading would freeze entirely. However then I noticed the increase in loading times. My thinking then was, that maybe Blender did not really freeze on my production scenes either, it propably just takes an excessively long time to load. I never waited for the process long enough to complete though, after max 10-15 minutes I stopped the attempt. Since the production scenes are for a racing team competing in World Championships, I have no possibilities to share the original affected files.

Hope this clarifies how the bug report was phrased like this?

> It seems like you reported two different bugs then, one about slower loading of files for which you provided a repro case, and one about Blender hanging for which you did not. @brecht I initially thought Blender would freeze entirely when trying to open our production scenes in the company. When I later tried to put together a simple scene where developers could reproduce the problem, I could not find a simple case where loading would freeze entirely. However then I noticed the increase in loading times. My thinking then was, that maybe Blender did not really freeze on my production scenes either, it propably just takes an excessively long time to load. I never waited for the process long enough to complete though, after max 10-15 minutes I stopped the attempt. Since the production scenes are for a racing team competing in World Championships, I have no possibilities to share the original affected files. Hope this clarifies how the bug report was phrased like this?

Right, I understand how it would look like it's the same issue, but to me it seems likely that's it not.

If you are able to run Blender with the production file in a debugger, it would be interesting to see where it is hanging (after a few minutes). That could give us a good clue for the cause.

Right, I understand how it would look like it's the same issue, but to me it seems likely that's it not. If you are able to run Blender with the production file in a debugger, it would be interesting to see where it is hanging (after a few minutes). That could give us a good clue for the cause.

I found a repro case for the deadlock. That is hopefully fixed now, but there is still a performance issue. More details in D11603: BLI: use explicit task isolation, no longer part of parallel operations.

I found a repro case for the deadlock. That is hopefully fixed now, but there is still a performance issue. More details in [D11603: BLI: use explicit task isolation, no longer part of parallel operations](https://archive.blender.org/developer/D11603).
Author

Hi @brecht your fix indeed resolved the scene opening problems for me!

I've benchmarked opening times now on different scenarios on my system, always freshly restarting Blender and then trying to open the scene (cycle repeated at least once). Here are the results (all measured on Windows 10, all files reside on a network drive and link other data in):

production scene that initially refused to open:
current master: ~45-50 seconds
2.93 release: ~47-48 seconds

another less complex production scene, which before also refused to open:
current master: ~17-22 seconds
2.93 release: ~16-18 seconds

provided debug file, read from a local ssd drive:
current master: ~33-38 seconds
2.93 release: ~26-28 seconds

conclusion: 2.93 release tends to be a little faster indeed. Interestingly, it also shows a smaller variance in loading times when you repeat the cycle. The time increase does not look exponential at first glance.
However, for me the main problem of this bug report is fixed, the increase in loading times isn't at a critical level. I leave it up to you if you want to close the report or continue searching for performance improvements.

Hi @brecht your fix indeed resolved the scene opening problems for me! I've benchmarked opening times now on different scenarios on my system, always freshly restarting Blender and then trying to open the scene (cycle repeated at least once). Here are the results (all measured on Windows 10, all files reside on a network drive and link other data in): **production scene that initially refused to open:** current master: ~45-50 seconds 2.93 release: ~47-48 seconds **another less complex production scene, which before also refused to open:** current master: ~17-22 seconds 2.93 release: ~16-18 seconds **provided debug file, read from a local ssd drive:** current master: ~33-38 seconds 2.93 release: ~26-28 seconds conclusion: 2.93 release tends to be a little faster indeed. Interestingly, it also shows a smaller variance in loading times when you repeat the cycle. The time increase does not look exponential at first glance. However, for me the main problem of this bug report is fixed, the increase in loading times isn't at a critical level. I leave it up to you if you want to close the report or continue searching for performance improvements.

Changed status from 'Needs User Info' to: 'Needs Triage'

Changed status from 'Needs User Info' to: 'Needs Triage'

Added subscriber: @mont29

Added subscriber: @mont29

It's up to @mont29 if he wants to consider this a bug and investigate performance regressions in .blend file loading, or not.

It's up to @mont29 if he wants to consider this a bug and investigate performance regressions in .blend file loading, or not.

Sorting libdata is indeed going to increase a bit loading time, not sure how much it accounts here...

More interested in the graph from @LazyDodo here, string lookup is indeed always very expensive, that's why we use hashes most of the time, guess that's another case to try it on. Will switch the task as TODO and check on this later.

Sorting libdata is indeed going to increase a bit loading time, not sure how much it accounts here... More interested in the graph from @LazyDodo here, string lookup is indeed always very expensive, that's why we use hashes most of the time, guess that's another case to try it on. Will switch the task as TODO and check on this later.
Member

Given it is a sorted list, i took a quick stab (temp hack in P2221) at implementing a binary search for is_yet_read which did speed it up a little (took ~ 5 sec off) but it moved to hot spot to BLI_listbase_count which should be trivial as well to figure out , but some lib work came along and i had to shelf this, feel free to run with it.

Heads up: while the binary search seems to work, I have not proven its correctness yet, so be weary if you re-use that code

Given it is a sorted list, i took a quick stab (temp hack in [P2221](https://archive.blender.org/developer/P2221.txt)) at implementing a binary search for `is_yet_read ` which did speed it up a little (took ~ 5 sec off) but it moved to hot spot to `BLI_listbase_count` which should be trivial as well to figure out , but some lib work came along and i had to shelf this, feel free to run with it. Heads up: while the binary search seems to work, I have not proven its correctness yet, so be weary if you re-use that code
Member

Added subscriber: @lichtwerk

Added subscriber: @lichtwerk
Member

Changed status from 'Needs Triage' to: 'Needs Developer To Reproduce'

Changed status from 'Needs Triage' to: 'Needs Developer To Reproduce'
Member

Will switch the task as TODO and check on this later.

Changing status to reflect this then.

> Will switch the task as TODO and check on this later. Changing status to reflect this then.

Changed status from 'Needs Developer To Reproduce' to: 'Confirmed'

Changed status from 'Needs Developer To Reproduce' to: 'Confirmed'

Rather confirmed, this has a patch now...

Rather confirmed, this has a patch now...

Changed status from 'Confirmed' to: 'Resolved'

Changed status from 'Confirmed' to: 'Resolved'
Bastien Montagne self-assigned this 2021-07-27 16:41:20 +02:00

Committed in db4fe8e322

Committed in db4fe8e322

Added subscriber: @hadrien

Added subscriber: @hadrien
Sign in to join this conversation.
No Label
Interest
Alembic
Interest
Animation & Rigging
Interest
Asset Browser
Interest
Asset Browser Project Overview
Interest
Audio
Interest
Automated Testing
Interest
Blender Asset Bundle
Interest
BlendFile
Interest
Collada
Interest
Compatibility
Interest
Compositing
Interest
Core
Interest
Cycles
Interest
Dependency Graph
Interest
Development Management
Interest
EEVEE
Interest
EEVEE & Viewport
Interest
Freestyle
Interest
Geometry Nodes
Interest
Grease Pencil
Interest
ID Management
Interest
Images & Movies
Interest
Import Export
Interest
Line Art
Interest
Masking
Interest
Metal
Interest
Modeling
Interest
Modifiers
Interest
Motion Tracking
Interest
Nodes & Physics
Interest
OpenGL
Interest
Overlay
Interest
Overrides
Interest
Performance
Interest
Physics
Interest
Pipeline, Assets & IO
Interest
Platforms, Builds & Tests
Interest
Python API
Interest
Render & Cycles
Interest
Render Pipeline
Interest
Sculpt, Paint & Texture
Interest
Text Editor
Interest
Translations
Interest
Triaging
Interest
Undo
Interest
USD
Interest
User Interface
Interest
UV Editing
Interest
VFX & Video
Interest
Video Sequencer
Interest
Virtual Reality
Interest
Vulkan
Interest
Wayland
Interest
Workbench
Interest: X11
Legacy
Blender 2.8 Project
Legacy
Milestone 1: Basic, Local Asset Browser
Legacy
OpenGL Error
Meta
Good First Issue
Meta
Papercut
Meta
Retrospective
Meta
Security
Module
Animation & Rigging
Module
Core
Module
Development Management
Module
EEVEE & Viewport
Module
Grease Pencil
Module
Modeling
Module
Nodes & Physics
Module
Pipeline, Assets & IO
Module
Platforms, Builds & Tests
Module
Python API
Module
Render & Cycles
Module
Sculpt, Paint & Texture
Module
Triaging
Module
User Interface
Module
VFX & Video
Platform
FreeBSD
Platform
Linux
Platform
macOS
Platform
Windows
Priority
High
Priority
Low
Priority
Normal
Priority
Unbreak Now!
Status
Archived
Status
Confirmed
Status
Duplicate
Status
Needs Info from Developers
Status
Needs Information from User
Status
Needs Triage
Status
Resolved
Type
Bug
Type
Design
Type
Known Issue
Type
Patch
Type
Report
Type
To Do
No Milestone
No project
No Assignees
7 Participants
Notifications
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Dependencies

No dependencies set.

Reference: blender/blender#89194
No description provided.