This patch proposes a new implementation of the Array Modifier that does not use BMesh. Modifiers that use BMesh perform very poorly.
The new implementation is identical in features and results. It runs more than 100 times faster (it really does) when the merge option is not selected, and around 10 to 20 times faster with the merge option checked. These improvements were measured without OpenMP; multi-threading could gain a further factor of 2 or 4, and it is much easier to implement on the simple loops of direct derived mesh processing.
This patch also includes a proposed implementation of doubles detection. It is inspired by the algorithm used in the "Remove Doubles" operator, with a few differences: it uses separate sorted arrays for target and source and, I think, performs slightly better. This map_doubles() function could be added to cdderivedmesh.c and made available to other modifiers.
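To illustrate the idea of sweeping two separately sorted arrays, here is a minimal standalone sketch. It is not the code from the patch: the name map_doubles_sketch(), the SortVert helper, the x + y + z sort key and the conservative 3 * dist window are all assumptions made for this example, and it uses plain coordinate arrays rather than derived mesh data.

```c
#include <stdlib.h>

/* Hypothetical helper: a vertex index paired with its sort key (x + y + z). */
typedef struct SortVert {
    int index;
    float sum;
} SortVert;

static int sortvert_cmp(const void *a, const void *b)
{
    const float sa = ((const SortVert *)a)->sum;
    const float sb = ((const SortVert *)b)->sum;
    return (sa > sb) - (sa < sb);
}

/* Build an array of (index, key) pairs sorted by key. Caller frees it. */
static SortVert *build_sorted(const float (*co)[3], int num)
{
    SortVert *sv = malloc(sizeof(*sv) * (size_t)(num > 0 ? num : 1));
    if (sv == NULL) {
        return NULL;
    }
    for (int i = 0; i < num; i++) {
        sv[i].index = i;
        sv[i].sum = co[i][0] + co[i][1] + co[i][2];
    }
    qsort(sv, (size_t)num, sizeof(*sv), sortvert_cmp);
    return sv;
}

/* For every source vertex, store in r_map the index of the closest target
 * vertex lying within `dist`, or -1 if there is none.
 * Returns the number of source vertices that found a match,
 * or -1 if memory allocation failed. */
int map_doubles_sketch(const float (*source)[3], int num_source,
                       const float (*target)[3], int num_target,
                       float dist, int *r_map)
{
    SortVert *ss = build_sorted(source, num_source);
    SortVert *st = build_sorted(target, num_target);
    if (ss == NULL || st == NULL) {
        free(ss);
        free(st);
        return -1;
    }

    const float dist_sq = dist * dist;
    /* Two points within `dist` differ by at most 3 * dist in x + y + z,
     * so only targets inside this key window need an exact distance test. */
    const float sum_range = 3.0f * dist;
    int first = 0;
    int found = 0;

    for (int i = 0; i < num_source; i++) {
        const float *sco = source[ss[i].index];
        float best_sq = 0.0f;
        int best = -1;

        /* Sources are visited in increasing key order, so a target that is
         * below the window now is below it for all later sources too. */
        while (first < num_target && st[first].sum < ss[i].sum - sum_range) {
            first++;
        }
        for (int j = first; j < num_target && st[j].sum <= ss[i].sum + sum_range; j++) {
            const float *tco = target[st[j].index];
            const float dx = sco[0] - tco[0];
            const float dy = sco[1] - tco[1];
            const float dz = sco[2] - tco[2];
            const float d_sq = dx * dx + dy * dy + dz * dz;
            if (d_sq <= dist_sq && (best == -1 || d_sq < best_sq)) {
                best_sq = d_sq;
                best = st[j].index;
            }
        }
        r_map[ss[i].index] = best;
        if (best != -1) {
            found++;
        }
    }

    free(ss);
    free(st);
    return found;
}
```

After the two sorts, the lower end of the window only ever moves forward, so most source vertices are compared against just a handful of targets. A caller would pass the two coordinate arrays, a merge distance and an int array with one slot per source vertex, then read the target index (or -1) for each source vertex from that array.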
The new implementation of the array modifier also uses CDDM_merge_verts() from cdderivedmesh.c, which up until now was only called by the mirror modifier.
I understand there is some risk in overhauling such a rock-solid old workhorse as the array modifier, but a 100-fold gain is a lot, and I believe it is definitely worth it: it can be the difference between a 10-second wait and a 0.1-second result.