
Adding a large number of objects much slower in 2.80
Closed, Archived (Public)

Description

System Information
Operating system: Arch Linux
Graphics card: GTX Titan

Blender Version
Broken: 2.80.74 (9c5d54bfaf48, binary from blender.org)
Worked: 2.79.7 (Arch Linux package)

Short description of error

Adding a large number of objects takes much more time in 2.80 than in 2.79.

Exact steps for others to reproduce the error

With the attached script (which just calls bpy.ops.mesh.primitive_cube_add in a loop with random locations):
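(The attachment itself isn't reproduced here; the following is only a rough sketch of what such a script might look like, with the loop size, timing format and coordinate ranges assumed from the output below.)

import random
import time

import bpy

# Add cubes via the operator and print a cumulative timing every 100 objects,
# plus the ratio to the previously printed timing.
start = time.time()
prev = None

for i in range(1001):
    if i % 100 == 0:
        elapsed = time.time() - start
        ratio = elapsed / prev if prev else elapsed
        print('%5d %.3fs (%.3fx)' % (i, elapsed, ratio))
        prev = elapsed
    bpy.ops.mesh.primitive_cube_add(
        location=(random.uniform(-10, 10),
                  random.uniform(-10, 10),
                  random.uniform(-10, 10)))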

$ blender -b -P thousands_of_objects.py 
Blender 2.79 (sub 7) (hash a29446da526d built 2019-03-27 18:12:23)
Read prefs: /home/paulm/.config/blender/2.79/config/userpref.blend
    0 0.001s (0.001x)
  100 0.027s (42.151x)
  200 0.066s (2.413x)
  300 0.112s (1.705x)
  400 0.177s (1.582x)
  500 0.256s (1.444x)
  600 0.361s (1.410x)
  700 0.474s (1.314x)
  800 0.610s (1.286x)
  900 0.735s (1.205x)
 1000 0.876s (1.192x)

Blender quit
paulm@cmstorm 17:18:/data/examples/blender/python_api$ ~/software/blender-git/blender -b -P thousands_of_objects.py 
Blender 2.80 (sub 74) (hash a4bc6aca0ef9 built 2019-07-08 16:50:26)
Read prefs: /home/paulm/.config/blender/2.80/config/userpref.blend
found bundled python: /home/paulm/software/blender-git/2.80/python
    0 0.001s (0.001x)
  100 0.149s (204.909x)
  200 0.421s (2.837x)
  300 0.734s (1.742x)
  400 1.062s (1.446x)
  500 1.404s (1.322x)
  600 1.752s (1.248x)
  700 2.112s (1.205x)
  800 2.502s (1.185x)
  900 2.874s (1.148x)
 1000 3.295s (1.146x)

Blender quit

So 2.80 takes roughly 3.8x as much time overall, although its growth from step to step is slightly less steep than 2.79's.

Details

Type
Bug

Event Timeline

Germano Cavalcante (mano-wii) lowered the priority of this task from Needs Triage by Developer to Confirmed, Low. (Edited) Jul 9 2019, 5:45 PM

Calling an operator inside such a large loop is not good practice.
You should think of better ways to achieve the same result.
Since this is a regression I will confirm it as a bug; however, we have many other bugs to keep track of, so it will have low priority.

Brecht Van Lommel (brecht) closed this task as Archived. Jul 9 2019, 5:49 PM
Brecht Van Lommel (brecht) claimed this task.

We can definitely improve performance in many ways, but I do not consider this a bug at all. We make no performance guarantees about using operators as a programming API to add thousands of objects.

Okay, I used the cube add operator in the test as it was convenient, figuring it would show the behaviour I was seeing when creating meshes. It turns out the difference is much smaller in the latter case, but it is still quite a big slowdown in 2.8 compared to 2.7 (and higher memory usage). See the attached script, which uses foreach_set() as the fastest way I know to create geometry from a bpy script (a sketch of the approach follows after the table below). This again creates a lot (20,000) of mesh objects in a loop, simulating an import script that does more or less the same. The mesh objects are simple quads, to focus on the overhead of object creation. Here are some results (note: different Blender versions than above, as this is on my home system):

Version                 | Time     | Total mem  | Peak mem
2.79.7                  | 37.592 s | 266.581 MB | 266.614 MB
2.80.74 (65b2cc2301af)  | 49.672 s | 595.699 MB | 864.208 MB
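For reference, a minimal sketch of the foreach_set() approach (not the exact attached script; the per-object quad data, naming scheme and object count here are assumptions):

import bpy

# One unit quad per mesh: 4 vertices as a flat 'co' array and one 4-corner polygon.
verts = [0.0, 0.0, 0.0,  1.0, 0.0, 0.0,  1.0, 1.0, 0.0,  0.0, 1.0, 0.0]
loops = [0, 1, 2, 3]

for i in range(20000):
    mesh = bpy.data.meshes.new('mesh_%06d' % i)   # unique names, see the naming observation below
    mesh.vertices.add(4)
    mesh.vertices.foreach_set('co', verts)
    mesh.loops.add(4)
    mesh.loops.foreach_set('vertex_index', loops)
    mesh.polygons.add(1)
    mesh.polygons.foreach_set('loop_start', [0])
    mesh.polygons.foreach_set('loop_total', [4])
    mesh.update()

    obj = bpy.data.objects.new('obj_%06d' % i, mesh)
    # 2.80: link into the scene's master collection; 2.79 used scene.objects.link() instead
    bpy.context.scene.collection.objects.link(obj)

bpy.ops.wm.memory_statistics()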

Some observations:

  • Total time for creating the meshes in 2.8 is 32% higher. I understand the remark about "no performance guarantees" when using the API like this, but this report is more about the performance regression in 2.8 than about the slowness of the API.
  • The increase in memory usage in 2.8 is bizarre at 2.23x higher. Peak memory usage is even worse at 3.24x higher. The peak memory usage also seems excessive for such a simple scene: 864 MB for 20,000 objects corresponds to 43,200 bytes per mesh (which holds 4 vertices and 1 quad). I should perhaps force a Python GC at the end of the script (see the snippet after this list), as I'm not sure how much garbage remains that gets counted.
  • The scaling behaviour when adding objects is much better than with the operator, which is great. (This doesn't follow from the table, by the way, but from the output of the script.)
  • The script uses unique mesh and object names, to avoid the overhead of the unique naming algorithm. I never really looked into the performance of that, but for 2.7 using the same object and mesh name in the script (forcing Blender to uniqueify the name) slows down the script by a factor of 1.34x. Wow, never thought it would be that bad. For 2.8 it's a factor of 1.27x, slightly better. Something I'll keep in mind next time I'm generating lots of geometry from a script.
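As a hypothetical end-of-script addition (not part of the attached script), forcing a collection before querying the memory statistics would keep leftover Python garbage out of the reported totals:

import gc

# Drop any unreferenced Python objects created by the script before
# Blender's memory statistics are printed.
gc.collect()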

The actual real-life case this is all coming from was a 55GB .obj file somebody gave me, exported from Rhino and containing 120,000 OBJ groups. Blender did not manage to import it before running out of memory, even on a system with 96GB RAM :) So I started doing some (micro-)benchmarking and figured I'd report it. In general, I would expect performance and memory regressions from 2.7 to 2.8 to be of interest to you guys. Again, I understand that the Python API is not optimized for all kinds of wacky large-scene creation, but since the current importers are (still, cf. GSoC) based on the same API, regressions are felt across the board for large scenes. Plus, the Python API is in most cases the only way to do this kind of thing, as extending Blender with C/C++ takes significant effort to get to know the code base and way of doing things (a C/C++ plugin API would be nice ;-)).

By the way, for some reason the output of the bpy.ops.wm.memory_statistics() call that prints the memory stats gets printed before the other output in 2.8 when redirecting to a file on Linux, even though it is the last statement in the script. Did the stdout/stderr buffering change in 2.8?
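One guess (purely an assumption on my part): the memory statistics are printed from C directly to stdout, while Python's print() output is block-buffered when redirected to a file, so the two streams interleave out of order. Explicitly flushing before the call might restore the expected order:

import sys
import bpy

# Flush Python-side buffered output before the operator prints via C stdio,
# so the redirected log keeps the lines in script order.
sys.stdout.flush()
bpy.ops.wm.memory_statistics()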

Improving performance with many objects is one of the items listed in T63728: Data, Assets & I/O Module.

But we don't organize that through bug reports; it's not efficient to analyze or explain to everyone why something takes a certain amount of time or memory in some specific case, as long as it's within something like a 2x or 3x range.

I'd bet the performance differences here are mostly due to the 'linking to collection' step, which is much more expensive than the 'linking to scene' equivalent we had in 2.79. The code in 2.80 is not necessarily fully optimal there yet, but mainly, the 2.8 system is much more complex and requires many more checks and caching operations when linking (especially BKE_main_collection_sync(), which is run every time an object is added and becomes more and more expensive as the number of existing objects grows)…