Page MenuHome

Undo - Current Status and Important Fixes Needed
Confirmed, NormalPublicDESIGN

Assigned To
None
Authored By
Bastien Montagne (mont29)
Dec 15 2020, 5:35 PM
Tokens
"Love" token, awarded by kursadk."Love" token, awarded by Ali6."Like" token, awarded by Maged_afra."Like" token, awarded by MetinSeven."Love" token, awarded by Anubis."Love" token, awarded by gilberto_rodrigues."Love" token, awarded by Raimund58."Like" token, awarded by TheRedWaxPolice."Love" token, awarded by Schamph.

Description

While investigating T82851: Undo bug while sculpting I realized that current general undo code has several issues, some being fairly critical and not trivial to solve (as is, too involved for a mere bugfix).

Here is current view of the situation (as I understand it), and some draft proposals to fix things and hopefully reach a better general state for undo code.

Current Status and Issues

1- Some undo steps will never be undone

Some undo modes (Sculpt and Image ones afaict) have a specific way to store and process their undo steps, which results in the fact that when undoing, they actually need to process the next step, and not the one that is passed to them (see sculpt_undosys_step_decode_undo e.g.).

This is all good as long as we stay in a same undo step type, but as soon as the step is from another type (like a global memfile one e.g.), its apply function is fully unaware of this, and therefore that first sculpt undo step never gets processed (undone).

2 - Some undo modes have useless redundant and critically buggy code

Again, Sculpt and Image undo code has loops over previous/next steps that are both useless, and critically buggy as they assume said previous/next steps are of the same undotype.

I'd assume this is inherited from the time where undo stacks where separated, but in a single common undo stack this is both useless (since main undo code like BKE_undosys_step_undo_with_data_ex already ensures that all previous/next steps have been processed in proper order), and obviously broken (due to lack of undo step type checking).

3 - Unclear half-not-implemented fully-not-working support of 'absolute' undo steps

Definition:

  • A absolute step can be evaluated as its own and generate a valid data state, regardless of any other steps in-between it and the current data.
  • A relative step depends on the neighbor steps' results, so going back to a specific step requires to sequentially apply all intermediary steps in-between it and the current data.

Currently, general undo system (in undo_system.c) assumes all steps to be relative (see e.g. the first loop in BKE_undosys_step_undo_with_data_ex). There are comments and pieces of code that seem to hint that absolute steps were supported, or intended to be supported at some point.

In addition, while memfile steps used to be absolute, it's no longer the case with new system for them either.

4 - Apparently unused GPencil undo code

GPencil undo code is tagged in ed_undo_step_impl as needing update to be properly integrated in new general undo stack, however it seems to never be used effectively. See T84703: Modernize (or Remove) GPencil Undo System for details.

...And more?

There are probably more issues more or less related to issues mentioned above (did not check our reports related to edits/memfile undos mismatches e.g.), but I think addressing those three points would already bring things to a saner state.

Proposals

1 - Some undo steps will never be undone

Think main undo system should deal with those cases. Would do it that way:

  • All step_decode callbacks should only ever process the step that is given to them.
  • We need a tag for undo types that require the step-after-target-one to be processed in undo case, so that main undo system knows how to deal with them, including within a mixed type of steps situation.
    • E.g. when you undo from a sculpt step to a memfile one, you actually need to process both the sculpt undo step (exception case) and the memfile one.

2 - Some undo modes have useless redundant and critically buggy code

Those extra looping pieces of code should simply be removed imho. Afaict they are dead code from earlier versions of undo system?
Tried to get them actually do something in sculpt_undosys_step_decode_undo() in debug session, and in all cases those actually never do anything, besides setting step-to-be-processed as the next one instead of the target one e.g.

3 - Unclear half-not-implemented fully-not-working support of 'absolute' undo steps

Suggestion is to simply always assume all steps are relative ones. This will simplify logic and make code easier to follow.

This does imply some extra work when undoing a lot of steps at once (through the undo history menu e.g.), however I think this situation is not that common? And one could argue that on very heavy files like production ones, doing fifty relative undo steps affecting only a few data-blocks may very well be much quicker than re-reading the whole .blend file from memory, with all the implied depsgraph rebuild, updates, etc.

Event Timeline

Bastien Montagne (mont29) changed the task status from Needs Triage to Confirmed.Dec 15 2020, 5:35 PM
Bastien Montagne (mont29) created this task.
Bastien Montagne (mont29) changed the subtype of this task from "Report" to "Design".

This task is mainly to see if other devs having some good knowledge of undo area see missing issues, or invalid analysis of those, and/or have other ideas about how to address them, before I start spending time in actual code refactor...

@Campbell Barton (campbellbarton) am especially eager to have your feedback here, since iirc you were the last one to work on general undo system (among other things, merging all stacks into a single one?). ;)

Clarifications

  • re: Some undo modes have useless redundant and critically buggy code and Unclear half-not-implemented fully-not-working support of 'absolute' undo steps.

    I would like to see more examples of this... mainly because it's unclear to me where the issues are.

    Global undo used to be absolute In that you could read an undo step and it would load most data (except for images, handled separately, maybe some other rare exceptions).

Regarding Proposal

1 - Some undo steps will never be undone

This is fuzzy to me, if we need to handle special cases, OK, but I'm not sure which steps are not undone and the kinds of bugs this is causing currently.

2 - Some undo modes have useless redundant and critically buggy code

Can we confirm that this is only sculpt mode? - if that's the case. Then we're on the same page. Getting sculpt modes undo working at all was quite difficult given how it handles undo steps internally.

3 - Unclear half-not-implemented fully-not-working support of 'absolute' undo steps

+1 to remove absolute undo steps. There are times speed things up, but my sense is that undo-history is not a common way to access undo. And we can accept some performance loss to keep the internal logic stable + working.

Note that we _could_ always support ways of skipping some undo steps as an optimization later (if we want). For example - loading undo steps into the same edit-mesh is not very useful. However I dont see this as a priority... having working predictable undo is so much more important.


Comments...

  • Generally the proposal seems fine, although I'd like to be aware of bugs. In general my impression is undo is working well in that it's not crashing and users are not regularly running into problems. That's not to say there are no problems, just that overall it's reliable. OTOH, it's possible I miss some issues and am unaware of some bugs/problems.
  • I'd highly recommend using tests for any development in this area. It's far too easy to accidentally break things and not realize, especially when this is interactions with different undo systems.

    Some breakages only show up in quite obscure cases... undo/redo between editors, switching modes... etc.

    See: ../lib/tests/ui_simulate/run.py, for the current undo tests.

    Suggest to:
    • Double check the current tests cover the areas you intend to change/develop.
    • Create tests that expose existing bugs these changes intend to fix (handy for TDD too)

I can help with these tests if needed, if you can point me to bugs in undo, I don't mind to turn them into tests we can use to resolve issues, or if you want a hand in getting them setup, that's fine too.

I can mostly only repeat why I already say in task itself...

Clarifications

No open bug afaik, but image undo code seems to use the exact same 'use next step instead of given one' process as sculpt mode, so would expect exact same issue of missing first undo step...

  • re: Some undo modes have useless redundant and critically buggy code and Unclear half-not-implemented fully-not-working support of 'absolute' undo steps.

    I would like to see more examples of this... mainly because it's unclear to me where the issues are.

Again, why are internal undo/redo code of image and sculpt (sculpt_undosys_step_decode_undo etc.) looping over undo steps, when main system (BKE_undosys_step_undo_with_data_ex) already does it? Looks like dead code from before merge of the different undo stacks to me?

Regarding Proposal

1 - Some undo steps will never be undone

This is fuzzy to me, if we need to handle special cases, OK, but I'm not sure which steps are not undone and the kinds of bugs this is causing currently.

See T82851: Undo bug while sculpting. First sculpt step stored after a Memfile one will never be undone. This is not visible from user most of the time since switching to Sculpt mode generates it's own, initial 'do nothing' sculpt step, but if there are memfile undo steps within a sculpt session e.g., then you'll have that bug.

That's why I want to make main BKE_undosys_step_undo_with_data_ex aware of this requirement instead, so that it can deal properly with this case, and mode-specific undo code can just blindly always process the step passed to it, and nothing else. This is the only safe way to handle an heterogeneous undo stack imho. mode-specific callbacks should never have to access any other step than the one they are given.

2 - Some undo modes have useless redundant and critically buggy code

Can we confirm that this is only sculpt mode? - if that's the case. Then we're on the same page. Getting sculpt modes undo working at all was quite difficult given how it handles undo steps internally.

Again, think this is at least sculpt and image. At the very least, they both share the same process in sculpt_undosys_step_decode_undo and image_undosys_step_decode_undo, iterating over undo steps blindly assuming those are of their own type, without any checks (and afaict without any valid reason to do so anymore).

Comments...

  • Generally the proposal seems fine, although I'd like to be aware of bugs. In general my impression is undo is working well in that it's not crashing and users are not regularly running into problems. That's not to say there are no problems, just that overall it's reliable. OTOH, it's possible I miss some issues and am unaware of some bugs/problems.
  • I'd highly recommend using tests for any development in this area. It's far too easy to accidentally break things and not realize, especially when this is interactions with different undo systems.

    Some breakages only show up in quite obscure cases... undo/redo between editors, switching modes... etc.

    See: ../lib/tests/ui_simulate/run.py, for the current undo tests.

    Suggest to:
    • Double check the current tests cover the areas you intend to change/develop.
    • Create tests that expose existing bugs these changes intend to fix (handy for TDD too)

I can help with these tests if needed, if you can point me to bugs in undo, I don't mind to turn them into tests we can use to resolve issues, or if you want a hand in getting them setup, that's fine too.

Indeed, starting with new tests seems the best thing to do here, will do that first.

Added new view3d_sculpt_with_memfile_step test demonstrating issue from T82532/T82851 in rBL62534

I can assure you guys that using undo history menu to undo multiple steps at once is an extremely common use case. Why? Because of 2 reasons

1). Undoing and then redoing between the same steps to check whether a change you made is a good one is pretty much a universal workflow, no one teaches you to work like this you learn this yourself, given just how universal this is you can almost call it human nature.

2). Sometimes or rather most of the times comparing the state of something (be it a mesh or scene etc etc...) immediately before and after a change is just not a fair comparison to make, it's like comparing apples to oranges , to make it a more fair comparison you need to make more than one change possibly even 10 to 20 changes before you can finally compare the current state with what it was many, many undo steps ago.

@StrictlyIncreasing (StrictlyIncreasing) it's not about removing ability to undo multiple steps at once, it's about removing half-finished, not-really-working that tries to optimize that specific scenario, but is not really working anyway currently.

FYI, quick-slapped a (horrible!) investigating patch to address the sculpt-step-not-properly-undone case,

1diff --git a/source/blender/blenkernel/intern/undo_system.c b/source/blender/blenkernel/intern/undo_system.c
2index 078c93532d9..fb2e80fccf2 100644
3--- a/source/blender/blenkernel/intern/undo_system.c
4+++ b/source/blender/blenkernel/intern/undo_system.c
5@@ -207,7 +207,24 @@ static void undosys_step_decode(
6 }
7
8 UNDO_NESTED_CHECK_BEGIN;
9- us->type->step_decode(C, bmain, us, dir, is_final);
10+ if (dir < 0) {
11+ if (us->type == BKE_UNDOSYS_TYPE_SCULPT) {
12+ if (us->next && us->next->type == BKE_UNDOSYS_TYPE_SCULPT) {
13+ us->next->type->step_decode(C, bmain, us->next, dir, is_final);
14+ }
15+ }
16+ else if (us->type != BKE_UNDOSYS_TYPE_SCULPT && us->next &&
17+ us->next->type == BKE_UNDOSYS_TYPE_SCULPT) {
18+ us->next->type->step_decode(C, bmain, us->next, dir, is_final);
19+ us->type->step_decode(C, bmain, us, dir, is_final);
20+ }
21+ else {
22+ us->type->step_decode(C, bmain, us, dir, is_final);
23+ }
24+ }
25+ else {
26+ us->type->step_decode(C, bmain, us, dir, is_final);
27+ }
28 UNDO_NESTED_CHECK_END;
29
30 #ifdef WITH_GLOBAL_UNDO_CORRECT_ORDER
31@@ -696,8 +713,8 @@ bool BKE_undosys_step_undo_with_data_ex(UndoStack *ustack,
32
33 /* Handle accumulate steps. */
34 if (ustack->step_active) {
35- UndoStep *us_iter = ustack->step_active;
36- while (us_iter != us) {
37+ UndoStep *us_iter = ustack->step_active->prev;
38+ while (us_iter && us_iter != us) {
39 /* TODO:
40 * - skip successive steps that store the same data, eg: memfile steps.
41 * - or steps that include another steps data, eg: a memfile step includes text undo data.
42diff --git a/source/blender/editors/sculpt_paint/sculpt_undo.c b/source/blender/editors/sculpt_paint/sculpt_undo.c
43index 9677152cf7e..d837e7c5c5d 100644
44--- a/source/blender/editors/sculpt_paint/sculpt_undo.c
45+++ b/source/blender/editors/sculpt_paint/sculpt_undo.c
46@@ -1467,6 +1467,8 @@ static void sculpt_undosys_step_decode_undo(struct bContext *C,
47 SculptUndoStep *us)
48 {
49 SculptUndoStep *us_iter = us;
50+ sculpt_undosys_step_decode_undo_impl(C, depsgraph, us_iter);
51+ return;
52 while (us_iter->step.next && (us_iter->step.next->type == us_iter->step.type)) {
53 if (us_iter->step.next->is_applied == false) {
54 break;
.

It does fix the sculpt issue, but is obviously not a proper clean solution at all, and also break some other tests on the road... But think it demonstrate the problem on a code side of things.

I have had a case with undoing text edits inside the text editor also causes massive wait times with heavy scenes. I am wondering if it is possible to decouple the text editor changes from the actual scenes. This kind of slow down is extra unproductive.

Thanks