
Adding a large number of objects much slower in 2.80
Closed, Archived (Public)

Description

System Information
Operating system: Arch Linux
Graphics card: GTX Titan

Blender Version
Broken: 2.80.74 (9c5d54bfaf48, binary from blender.org)
Worked: 2.79.7 (Arch Linux package)

Short description of error

Adding a large number of objects takes much more time in 2.80 than in 2.79.

Exact steps for others to reproduce the error

With the attached script (which just calls bpy.ops.mesh.primitive_cube_add in a loop with random locations):
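(The attachment itself isn't reproduced here; the following is only a rough sketch of what such a script might look like, with the loop size, timing format and coordinate ranges assumed from the output below.)

import random
import time

import bpy

# Add cubes via the operator and print a cumulative timing every 100 objects,
# plus the ratio to the previously printed timing.
start = time.time()
prev = None

for i in range(1001):
    if i % 100 == 0:
        elapsed = time.time() - start
        ratio = elapsed / prev if prev else elapsed
        print('%5d %.3fs (%.3fx)' % (i, elapsed, ratio))
        prev = elapsed
    bpy.ops.mesh.primitive_cube_add(
        location=(random.uniform(-10, 10),
                  random.uniform(-10, 10),
                  random.uniform(-10, 10)))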

$ blender -b -P thousands_of_objects.py 
Blender 2.79 (sub 7) (hash a29446da526d built 2019-03-27 18:12:23)
Read prefs: /home/paulm/.config/blender/2.79/config/userpref.blend
    0 0.001s (0.001x)
  100 0.027s (42.151x)
  200 0.066s (2.413x)
  300 0.112s (1.705x)
  400 0.177s (1.582x)
  500 0.256s (1.444x)
  600 0.361s (1.410x)
  700 0.474s (1.314x)
  800 0.610s (1.286x)
  900 0.735s (1.205x)
 1000 0.876s (1.192x)

Blender quit
paulm@cmstorm 17:18:/data/examples/blender/python_api$ ~/software/blender-git/blender -b -P thousands_of_objects.py 
Blender 2.80 (sub 74) (hash a4bc6aca0ef9 built 2019-07-08 16:50:26)
Read prefs: /home/paulm/.config/blender/2.80/config/userpref.blend
found bundled python: /home/paulm/software/blender-git/2.80/python
    0 0.001s (0.001x)
  100 0.149s (204.909x)
  200 0.421s (2.837x)
  300 0.734s (1.742x)
  400 1.062s (1.446x)
  500 1.404s (1.322x)
  600 1.752s (1.248x)
  700 2.112s (1.205x)
  800 2.502s (1.185x)
  900 2.874s (1.148x)
 1000 3.295s (1.146x)

Blender quit

So 2.80 takes roughly 3.8x as much time overall, although its growth from step to step is slightly less steep than 2.79's.

Details

Type
Bug

Event Timeline

Germano Cavalcante (mano-wii) lowered the priority of this task from Needs Triage by Developer to Confirmed, Low. (Edited) Jul 9 2019, 5:45 PM

Calling an operator inside such a large loop is not good practice.
You should think of better ways to achieve the same result.
Since this is a regression I will confirm it as a bug; however, we have many other bugs to keep track of, so it will have low priority.

Brecht Van Lommel (brecht) closed this task as Archived. Jul 9 2019, 5:49 PM
Brecht Van Lommel (brecht) claimed this task.

We can definitely improve performance in many ways, but I do not consider this a bug at all. We make no performance guarantees about using operators as a programming API to add thousands of objects.

Okay, I used the cube add operator in the test as it was convenient, figuring it would show the behaviour I was seeing when creating meshes. It turns out the difference is much smaller in the latter case, but it is still quite a big slowdown in 2.8 compared to 2.7 (and higher memory usage). See the attached script, which uses foreach_set() as the fastest way I know to create geometry from a bpy script (a sketch of the approach follows after the table below). This again creates a lot (20,000) of mesh objects in a loop, simulating an import script that does more or less the same. The mesh objects are simple quads, to focus on the overhead of object creation. Here are some results (note: different Blender versions than above, as this is on my home system):

Version                 | Time     | Total mem  | Peak mem
2.79.7                  | 37.592 s | 266.581 MB | 266.614 MB
2.80.74 (65b2cc2301af)  | 49.672 s | 595.699 MB | 864.208 MB
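For reference, a minimal sketch of the foreach_set() approach (not the exact attached script; the per-object quad data, naming scheme and object count here are assumptions):

import bpy

# One unit quad per mesh: 4 vertices as a flat 'co' array and one 4-corner polygon.
verts = [0.0, 0.0, 0.0,  1.0, 0.0, 0.0,  1.0, 1.0, 0.0,  0.0, 1.0, 0.0]
loops = [0, 1, 2, 3]

for i in range(20000):
    mesh = bpy.data.meshes.new('mesh_%06d' % i)   # unique names, see the naming observation below
    mesh.vertices.add(4)
    mesh.vertices.foreach_set('co', verts)
    mesh.loops.add(4)
    mesh.loops.foreach_set('vertex_index', loops)
    mesh.polygons.add(1)
    mesh.polygons.foreach_set('loop_start', [0])
    mesh.polygons.foreach_set('loop_total', [4])
    mesh.update()

    obj = bpy.data.objects.new('obj_%06d' % i, mesh)
    # 2.80: link into the scene's master collection; 2.79 used scene.objects.link() instead
    bpy.context.scene.collection.objects.link(obj)

bpy.ops.wm.memory_statistics()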

Some observations:

  • Total time for creating the meshes in 2.8 is 32% higher. I understand the remark about "no performance guarantees" when using the API like this, but this report is more about the performance regression in 2.8 than about the slowness of the API.
  • The increase in memory usage in 2.8 is bizarre at 2.23x higher. Peak memory usage is even worse at 3.24x higher. The peak memory usage also seems excessive for such a simple scene: 864 MB for 20,000 objects corresponds to 43,200 bytes per mesh (which holds 4 vertices and 1 quad). I should perhaps force a Python GC at the end of the script (see the snippet after this list), as I'm not sure how much garbage remains that gets counted.
  • The scaling behaviour when adding objects is much better than with the operator, which is great. (This doesn't follow from the table, by the way, but from the output of the script.)
  • The script uses unique mesh and object names, to avoid the overhead of the unique naming algorithm. I never really looked into the performance of that, but for 2.7 using the same object and mesh name in the script (forcing Blender to uniqueify the name) slows down the script by a factor of 1.34x. Wow, never thought it would be that bad. For 2.8 it's a factor of 1.27x, slightly better. Something I'll keep in mind next time I'm generating lots of geometry from a script.
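As a hypothetical end-of-script addition (not part of the attached script), forcing a collection before querying the memory statistics would keep leftover Python garbage out of the reported totals:

import gc

# Drop any unreferenced Python objects created by the script before
# Blender's memory statistics are printed.
gc.collect()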

The actual real-life case this is all coming from was a 55GB .obj file somebody gave me, exported from Rhino and containing 120,000 OBJ groups. Blender did not manage to import it before running out of memory, even on a system with 96GB RAM :) So I started doing some (micro-)benchmarking and figured I'd report it. In general, I would expect performance and memory regressions from 2.7 to 2.8 to be of interest to you guys. Again, I understand that the Python API is not optimized for all kinds of wacky large-scene creation, but since the current importers are (still, cf. GSoC) based on the same API, regressions are felt across the board for large scenes. Plus, the Python API is in most cases the only way to do this kind of thing, as extending Blender with C/C++ takes significant effort to get to know the code base and way of doing things (a C/C++ plugin API would be nice ;-)).

By the way, for some reason the output of the bpy.ops.wm.memory_statistics() call that prints the memory stats gets printed before the other output in 2.8 when redirecting to a file on Linux, even though it is the last statement in the script. Did the stdout/stderr buffering change in 2.8?
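One guess (purely an assumption on my part): the memory statistics are printed from C directly to stdout, while Python's print() output is block-buffered when redirected to a file, so the two streams interleave out of order. Explicitly flushing before the call might restore the expected order:

import sys
import bpy

# Flush Python-side buffered output before the operator prints via C stdio,
# so the redirected log keeps the lines in script order.
sys.stdout.flush()
bpy.ops.wm.memory_statistics()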

Improving performance with many objects is one of the items listed in T63728: Data, Assets & I/O Module.

But we don't organize that through bug reports; it's not efficient to analyze or explain to everyone why something takes a certain amount of time or memory in some specific case, as long as it's within something like a 2x or 3x range.

I'd bet the performance differences here are mostly due to the 'linking to collection' step, which is much more expensive than the 'linking to scene' equivalent we had in 2.79. The code in 2.80 is not necessarily fully optimal there yet, but mainly, the 2.8 system is much more complex and requires many more checks and caching operations when linking (especially BKE_main_collection_sync(), which is run every time an object is added and becomes more and more expensive as the number of existing objects grows)…