Global undo is slow on scenes with many objects. A big reason is that undo writes and reads all datablocks in the file (except from linked libraries). If we can only handle the ones that actually changed, performance would be much improved in common scenarios.
This proposal outlines how to achieve this in 3 incremental steps. Only 1 or 2 of these steps would already provide a significant speedup.
At the time of writing there is no concrete plan to implement this, but I've had this idea for a while and it seemed useful to write down in case someone likes to pick it up.
1. Read only changed datablocks
Linked library datablocks are already preserved on undo pop. We can do the same thing for all non-changed datablocks, and only read the changed ones.
We can detect which datablocks were changed by checking that their diff is empty. Addition and removal of datablocks also needs to be detected. We need to be careful to exclude runtime data like LIB_TAG_DOIT from writing, to avoid detecting too many datablocks as changed. Most changes to runtime data go along with an actual change to the datablock though.
Since unchanged datablocks may point do changed datablocks, we need to reload datablocks in the exact same memory location. We can safely assume the required memory size stays the same, because if it didn't those other datablocks' pointers would be changed as well. In principle other datablocks should not point to anything but the datablock itself, but there may be exceptions that need to be taken care of (e.g. pointers to [armature] bones hold by [object] pose data...).
Main fields other than datablock lists may need some attention too.
2. Optimize post undo/redo updates
The most expensive update is likely the dependency graph rebuild. We can detect if any depsgraph rebuild was performed between undo pushes, and if so mark that undo step as needing a depsgraph rebuild. This means the undo step is no more expensive than the associated operation.
This requires the dependency graph to be preserved through undo in scene datablocks, similar to how images preserve image buffers. This would then also preserve GPU data stored in the dependency graph. If the dependency graph supports partial rebuilds in the future, that could be taken advantage of.
There may be other expensive updates, to be revealed by profiling.
3. Write only changed datablocks
While undo/redo is usually the most expensive, the undo push that happens after each operation also slows down as scene complexity increases. If we know which datablocks changed, we can only save & diff those.
This could be deduced from dependency graph update tags. However there are likely still many operations where tags are missing. If it is not done correctly, then instead of a missing refresh there is a more serious bug of not undoing all changes. So this needs verification throughout the Blender code for correct tagging, before it can be enabled.
Debug builds could verify correctness by testing the diff is empty whenever there was no tag.