BLI_task_parallel_range derives the number of tasks from the number
of items.
In BLI_bvhtree_overlap the number of items is always between 2 and
16, so the call always runs on a single thread.
This patch makes it run multi-threaded by setting the maximum number
of items per thread to 1.
I expected a larger performance improvement, but in my tests the
cloth collision system (which calls this function) went from
0.80 fps to 0.88 fps.
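
The single-thread behaviour described above can be illustrated with a
small sketch. It is not Blender's actual scheduling code: the helper
name sketch_task_count, the per-task threshold parameter and the
example threshold of 32 are all assumptions made for illustration. It
only shows why a range of 2 to 16 items collapses to one task under a
large per-task threshold, and spreads over several tasks once that
threshold is 1.

```c
#include <stddef.h>

/* Hypothetical helper (not part of Blender): number of tasks a range of
 * `items` is split into when every task should receive at least
 * `items_per_task` items and no more than `threads` tasks may run. */
static size_t sketch_task_count(size_t items, size_t threads, size_t items_per_task)
{
    if (items == 0 || threads == 0) {
        return 0;
    }
    if (items_per_task == 0) {
        items_per_task = 1; /* Guard against division by zero. */
    }

    size_t tasks = items / items_per_task;
    if (tasks == 0) {
        tasks = 1; /* Too few items to split: everything runs in one task. */
    }
    if (tasks > threads) {
        tasks = threads; /* Never create more tasks than threads. */
    }
    return tasks;
}
```

With an assumed threshold of 32 and 8 threads, 16 items yield a single
task; with a threshold of 1, the same 16 items yield 8 tasks.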