This is a WIP and not meant for review yet, but the last time I uploaded a build with just a paste I ran into problems, so here is the full diff of my patched build.
The point of this patch is to reduce kernel recompiles when using the OpenCL split kernel while keeping good performance (in fact, performance with this patch is higher than with master).