
Speedup: OldNewMap implementation for file loading
Needs Review · Public

Authored by Jacques Lucke (JacquesLucke) on Wed, Dec 5, 9:08 PM.

Details

Summary

The old method has felt a bit suspicious to me ever since I first saw it.
Before, the mapping was stored as a plain array, for two main reasons:

  • Insertion happens much more often than lookup (why?).
  • Data is often looked up in insertion order, so a lasthit variable was stored to speed up the next lookup.

While both assumptions are true, I don't think they are a valid reason to use an array instead of a hash map.

Some numbers (mostly based on one file from Spring but I also checked other files, the ratios remain similar):

  • There are roughly 4 times more insertions than lookups (for Spring files, usually 2-4 million insertions and 500,000-1,000,000 lookups).
  • # of lookups in 1 step: 360,000 [Edit: might have counted this incorrectly, but the focus is on the other numbers anyway]
  • # of lookups in <10 steps: 63,000
  • # of lookups in <100 steps: 25,000
  • # of lookups in <1000 steps: 40,000
  • # of lookups in <10000 steps: 50,000
  • # of lookups in <100000 steps: 15,000
  • # of lookups in >100000 steps: 20

What I learn from this is that optimizing only for the common case is the wrong approach here.
Even if 90% of lookups were extremely fast, that does not help when the remaining cases need more than 1000 steps on average.

This patch implements a simple hash map based on the approach used for Python dicts since Python 3.6 (although this implementation lacks many of the features).

Note: if someone decides to add back the lasthit approach, that is still possible, because the data structure preserves insertion order (and could also be ordered in any other way)!
However, I don't think that is necessary. Lookup time is no longer the significant factor, because there are 4x more insertions than lookups.

Another benefit of this patch is that it reduces code complexity, imo, because it gets rid of lasthit. That was an implementation detail which had leaked into different parts of the loading code.

For actual numbers, see comment below.

Diff Detail

Repository
rB Blender
Branch
oldnewmap_speedup (branched from blender2.8)
Build Status
Buildable 2596
Build 2596: arc lint + arc unit

Event Timeline

  • better probing

I took the same numbers I used in D4037 as base line.

Blender 2.8:
  35,022,073,098
  35,014,875,744
  35,026,930,037

With this patch:
  31,951,994,412
  31,957,468,618
  32,016,310,143

D4037 + this patch:
  30,471,050,491
insert: 0.2% -> 1.3%
lookup: 9.5% -> 0.5%

This is the command I used for testing: valgrind --tool=callgrind ./blender /shared/spring/svn/scenes/03-alpha/03_035_A/03_035_A.anim.blend > /dev/null

This seems fine, but needs *really* careful testing to ensure there are no data corruption bugs.

source/blender/blenloader/intern/readfile.c
311–312

No need to use a char instead of int unless you really need to save memory.

314

This macro seems unnecessary.

Even though it looks fine using a hash here, it's worth measuring times for a blend file that does a lot of linking and one that doesn't, since library linking uses a binary search.

  • cleanup

I originally used char not to save memory, but to indicate that the variable stores the exponent rather than the actual size.
But I can just use int as well.

I tested a couple of different files.
Afaik the Spring files use a lot of linked-in data, so all my original tests covered that case.

I just tested some files that use no linking (e.g. a file with lots of cubes, with some parent relations). The performance improvement on that file was measurable but not significant.
On files with only few objects, this patch has pretty much no impact at all.