Blender freezes in multi-threaded tasks since recent rB98123ae91680, on windows - atomic ops issue?
Closed, Resolved (Public)

Description

System Information
Win 7x64, Nvidia GTX 580

Blender Version
Broken: blender-2.77.0-git.b72aef9-AMD64
Working: blender-2.77.0-git.898d040-AMD64

Blender freezes while trying to do vertex snapping with subdivision modifier enabled both in object and edit mode.

  1. Create Suzanne
  2. Add Subsurface modifier, duplicate mesh
  3. Try to manipulate meshes with vertex snapping enabled

Sometimes I can reproduce this instantly, sometimes not, so try opening the attached file; with it, Blender freezes every time.

Event Timeline

Denis Belov (dihotom) set Type to Bug.
Denis Belov (dihotom) created this task.
Denis Belov (dihotom) raised the priority of this task from to Needs Triage by Developer.

Cannot reproduce that here… @Campbell Barton (campbellbarton) you just opened the file and did some snapped transform in Object or Edit mode, and got the freeze?

Anyway, if this commit causes issues, it can be reverted, gave nearly no speed gain anyway…

I don't like to revert optimizations (no matter how small) :(
@Bastien Montagne (mont29) if it did not freeze the first time, try again; there are times when it works.
(Just snap to vertices in Object mode. Edit mode also freezes.)

I tried it several times of course, with both release and debug builds. Am on linux though, not sure on which OS Campbell reproduced it.

I did find an error in the new code that could create issues and committed a fix, please give it a try. :)

Oops, by "try again" I meant closing and reopening Blender (but you must have already tried that).

At first, it seemed that was fixed. But the problem came back on the second try :( (no fix)

Still cannot reproduce at all…

Might be related to T48437, can you please try and see if you can reproduce it?

Yes I can reproduce it.
And the race condition also occurs in the loop "while (UNLIKELY(previter != olditer))".

I found a strange thing - after the end of the loop execution, suddenly, out of nowhere, it runs again without passing by the expected sequence of the function.

(I do not understand this atomic thing however)

Grumph… atomic means the operation is done in 'a single step' from the CPU's point of view, i.e. you cannot have thread 1 start an atomic operation, then thread 2 modify one of its operands, then thread 1 finish the operation.

Those atomic ops are implemented in all modern CPUs, and are much cheaper than using regular thread synchronization primitives like mutex or spinlock.

That looping func is a way to perform an operation that does not exist among the atomic primitives; the idea is to:

  1. read current value of the shared data we want to modify (32bit data, reading is assumed atomic, i.e. you cannot read part of the value, then get it changed by another thread, then read the remaining part).
  2. do the operation and store its value in a local variable (this can take any amount of time, since it only uses local or read-only variables).
  3. do an atomic CAS to set the shared variable we want to modify.
  4. Repeat as long as the value returned by CAS is not the same as the one we stored at the beginning (meaning the shared variable has been modified by another thread in-between).

The atomic CAS (compare and swap) compares the value of the data to modify with a given 'reference', sets the former to the new value only if it equals the reference value, and then returns the old value of the modified data.
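
The four steps above can be sketched in portable C11 as follows. This is only an illustration, not the code from the commit under discussion: `cas_u32` and `atomic_max_u32` are hypothetical names, and "store the maximum" is just an example of an operation with no dedicated atomic primitive. The helper is written so that CAS returns the previous value, as described above.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical helper: compare-and-swap that returns the value *v held
 * before the call. The swap happened only if that value equals `old`. */
static uint32_t cas_u32(_Atomic uint32_t *v, uint32_t old, uint32_t new_val)
{
    uint32_t expected = old;
    /* On failure, C11 writes the current value of *v into `expected`;
     * on success, `expected` still equals `old`. Either way it ends up
     * holding the previous value of *v. */
    atomic_compare_exchange_strong(v, &expected, new_val);
    return expected;
}

/* Hypothetical example operation: atomically store max(*v, candidate).
 * Returns the value *v held just before the successful update. */
static uint32_t atomic_max_u32(_Atomic uint32_t *v, uint32_t candidate)
{
    uint32_t old, new_val, prev;
    do {
        old = atomic_load(v);                          /* step 1: read shared value */
        new_val = (candidate > old) ? candidate : old; /* step 2: compute locally */
        prev = cas_u32(v, old, new_val);               /* step 3: try to publish */
    } while (prev != old);                             /* step 4: retry if *v changed */
    return old;
}
```

The loop only repeats when another thread changed the shared value between steps 1 and 3, so without contention it runs exactly once.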

So in theory, this is perfectly safe and no deadlock should happen. Actually, there is no actual deadlock possible here, since there is no lock - I’d rather think of an infinite loop due to something messed up in the msvc version of our atomic primitives.

I would suspect some stupid conversion mismatch between signed and unsigned integers (though afaik uint32 < INT_MAX should not be an issue here :| ).
Can you please try to replace line 76 of intern/atomic/intern/atomic_ops_msvc.h file with that one, and check again?

	return InterlockedCompareExchange((long *)v, *(long *)(&_new), *(long *)(&old));
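
As a rough check of the signed/unsigned suspicion: the line above reinterprets the 32-bit unsigned operands as `long`, which is 32 bits wide with MSVC. The sketch below uses `int32_t` as a stand-in for that type (an assumption, so it behaves the same on other platforms) and hypothetical helper names. It suggests that reinterpreting the bits changes neither the stored pattern nor the result of an equality comparison, which is all CAS relies on.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers: reinterpret the same 32 bits as signed or unsigned.
 * int32_t stands in for MSVC's 32-bit long. */
static int32_t bits_as_i32(uint32_t u)
{
    int32_t s;
    memcpy(&s, &u, sizeof s); /* copies the bit pattern, no value conversion */
    return s;
}

static uint32_t bits_as_u32(int32_t s)
{
    uint32_t u;
    memcpy(&u, &s, sizeof u);
    return u;
}

/* Returns 1 if comparing the reinterpreted values for equality gives the
 * same answer as comparing the original unsigned values. */
static int same_after_reinterpret(uint32_t a, uint32_t b)
{
    return (bits_as_i32(a) == bits_as_i32(b)) == (a == b);
}
```

Values above INT32_MAX turn negative when viewed as signed, but two operands still compare equal exactly when their bits match.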

(I'm still trying to understand all the explanation ...)

However, I made the change you requested (line 76 of atomic_ops_msvc.h), and the problem persists :(

Guess I’ll have to go and debug this myself on my win VM (provided blender still runs on it), looks like our win atomics are broken somehow (unless I am missing something else, maybe we'd need some kind of memory fence here, not sure why or where though)…

Bastien Montagne (mont29) renamed this task from Blender freezes while trying to do vertex snapping with subdivision modifier enabled. to Blender freezes in multi-threaded tasks since recent rB98123ae91680, on windows - atomic ops issue?.
Bastien Montagne (mont29) claimed this task.