The current shadow system has bad defaults and too many limitations (light leaks, no support for duplis).
I propose to drop ESM and VSM in favor of classic PCF (percentage-closer filtering) and to rely on soft-shadow jitter to hide the banding.
Use 16-bit depth by default with the smallest bias possible (see slides link).
This would reduce the rendering and filtering cost.
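
As a rough illustration of the PCF idea above, here is a minimal CPU-side sketch. It assumes a shadow map stored as a 2D list of normalized depths; the function name, the kernel radius, and the default bias of two 16-bit depth steps are placeholders chosen for the example, not values from the proposal.

```python
# Size of one quantization step in a normalized 16-bit depth buffer.
DEPTH_STEP_16 = 1.0 / 65535.0


def pcf_shadow(depth_map, u, v, receiver_depth, bias=2 * DEPTH_STEP_16, radius=1):
    """Return the lit fraction in [0, 1] for the texel (u, v).

    Each tap is a binary depth comparison; the results are averaged,
    which is what distinguishes PCF from filtering the depths themselves.
    """
    height = len(depth_map)
    width = len(depth_map[0])
    lit = 0
    taps = 0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # Clamp taps to the map so edge texels reuse their neighbours.
            x = min(max(u + dx, 0), width - 1)
            y = min(max(v + dy, 0), height - 1)
            if receiver_depth - bias <= depth_map[y][x]:
                lit += 1
            taps += 1
    return lit / taps
```

With a 3x3 kernel the result can only take ten distinct values, which is the banding that the soft-shadow jitter is expected to hide.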
Choosing a flat cubemap packing instead of the tetrahedron projection would remove a conversion step. To make this work, we would have to render a 1px border around each target face.
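
To make the flat packing concrete, the sketch below maps a direction to a cube face and then to a texel in an atlas where each face tile carries a 1px border. Face selection follows the usual OpenGL major-axis convention; the 3x2 tile layout and both function names are assumptions made for this example only.

```python
def cube_face_uv(x, y, z):
    """Map a non-zero direction to (face, u, v) with u, v in [0, 1].

    Faces are numbered +X, -X, +Y, -Y, +Z, -Z as 0..5.
    """
    ax, ay, az = abs(x), abs(y), abs(z)
    if ax >= ay and ax >= az:
        face, ma = (0 if x > 0 else 1), ax
        sc, tc = (-z if x > 0 else z), -y
    elif ay >= az:
        face, ma = (2 if y > 0 else 3), ay
        sc, tc = x, (z if y > 0 else -z)
    else:
        face, ma = (4 if z > 0 else 5), az
        sc, tc = (x if z > 0 else -x), -y
    return face, 0.5 * (sc / ma + 1.0), 0.5 * (tc / ma + 1.0)


def atlas_texel(face, u, v, face_res, border=1, cols=3):
    """Return the atlas texel for (face, u, v) in a hypothetical 3x2 layout.

    Each tile is face_res + 2 * border texels wide; the border ring is
    skipped here and is only reached by filter taps near a face edge.
    """
    tile = face_res + 2 * border
    col = face % cols
    row = face // cols
    px = col * tile + border + min(int(u * face_res), face_res - 1)
    py = row * tile + border + min(int(v * face_res), face_res - 1)
    return px, py
```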
Idea: We could discriminate lights based on their screen space size.
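
One possible reading of this idea is to pick a shadow map resolution from the projected size of the light's influence sphere. The sketch below assumes a perspective camera and a spherical influence radius; the function names and the 64 to 512 resolution range are hypothetical.

```python
import math


def projected_radius_px(radius, distance, fov_y, screen_height):
    """Approximate on-screen radius in pixels of a sphere of influence."""
    if distance <= radius:
        # Camera is inside the sphere: treat the light as covering the screen.
        return float(screen_height)
    return radius / (distance * math.tan(fov_y / 2.0)) * screen_height / 2.0


def shadow_resolution(size_px, lowest=64, highest=512):
    """Pick the smallest power-of-two resolution covering size_px, clamped."""
    res = lowest
    while res < size_px and res < highest:
        res *= 2
    return res
```

Lights that cover few pixels would then get a small map, and the threshold could also be used to skip shadows entirely for very small lights.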
For dupli-lights, we would have to create a tagging system with object ids and time-stamps.
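
A minimal sketch of what such a tagging system could look like, assuming each dupli-light can be given a stable object id and that the frame counter serves as the time-stamp. The class and method names are hypothetical.

```python
class ShadowTagger:
    """Track which light ids were seen this frame using time-stamps."""

    def __init__(self):
        self.frame = 0
        self.last_seen = {}  # object id -> frame number of the last tag

    def begin_frame(self):
        self.frame += 1

    def tag(self, object_id):
        """Mark a light as used this frame; return True if it is new."""
        is_new = object_id not in self.last_seen
        self.last_seen[object_id] = self.frame
        return is_new

    def end_frame(self):
        """Drop lights not tagged this frame and return their ids."""
        stale = [oid for oid, seen in self.last_seen.items() if seen != self.frame]
        for oid in stale:
            del self.last_seen[oid]
        return stale
```

A new id would trigger a shadow map allocation, and a stale id would release its slot.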
Estimate: 3 weeks