Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Undo/Redo #105

Closed
3 of 5 tasks
raineorshine opened this issue Oct 16, 2019 · 40 comments · Fixed by #811
Closed
3 of 5 tasks

Undo/Redo #105

raineorshine opened this issue Oct 16, 2019 · 40 comments · Fixed by #811
Assignees
Labels
feature New feature or request

Comments

@raineorshine
Copy link
Contributor

raineorshine commented Oct 16, 2019

Add Undo and Redo functionality.

Requirements:

  • Review 2-3 undo/redo libraries for Redux and provide reasoning for or against.
  • Add undo and redo buttons to toolbar with appropriate svg icons.
  • Undo history should persist in state, local, and remote stores (See: Data Storage).
  • Work with iterative loading (Thought Cache & Iterative Loading #703).
  • Undo groups of actions at once.

Groups

Each undo should be associated with one or more actions.

Consider the following sequence of actions:

  1. newThought
  2. existingThoughtChange1
  3. existingThoughtChange2
  4. existingThoughtChange3

These should be undoable in a single undo.

Another example, based on the following state:

  • a
    • b [cursor]
      • c
  1. existingThoughtChange1
  2. existingThoughtChange2
  3. existingThoughtChange3
  4. cursorDown
  5. existingThoughtChange4
  6. existingThoughtChange5
  7. existingThoughtChange6
  8. cursorUp

Activating undo should undo 5-8, i.e. navigation and contiguous edits.

Actions

Undoable

The following action types should be undoable:

archiveThought
bumpThoughtDown
cursorBack (navigation)
cursorBeforeSearch (navigation)
cursorDown (navigation)
cursorForward (navigation)
cursorHistory (navigation)
cursorUp (navigation)
deleteAttribute
deleteData
deleteEmptyThought
deleteThought
existingThoughtChange
existingThoughtDelete
existingThoughtMove
expandContextThought
indent
moveThoughtDown
moveThoughtUp
newThought
newThoughtSubmit
outdent
searchLimit
setAttribute
setCursor (navigation)
setFirstSubthought
settings
splitThought
subCategorizeAll
subCategorizeOne
toggleAttribute
toggleCodeView
toggleContextView
toggleHiddenThoughts
toggleSplitView
toolbarOverlay
tutorial
tutorialChoice
tutorialNext
tutorialPrev
tutorialStep

Non-undoable

Patches should not be generated for the following actions and they should be ignored completely by the undo/redo history:

alert
authenticate
clear
clearQueue
dragHold
dragInProgress
editing
editingValue
error
invalidState
loadLocalState
loadLocalThoughts
modalComplete
modalRemindMeLater
prependRevision
render
search
selectionChange
setResourceCache
showModal
status
toggleSidebar
undoArchive (will be replaced with new undo/redo logic)
updateSplitPosition
unknownAction
updateThoughts (complicated)
@raineorshine raineorshine modified the milestone: New User Oct 16, 2019
@GugaGongadze
Copy link
Contributor

There are two main approaches one can take when considering adding a redo/undo functionality to their app.

It depends if your app is state-heavy or action-heavy.

A state-heavy app is one that has a lot of information saved during the app's lifetime, but the reducers for the actions are thin and not computationally heavy.

An action-heavy app is one that has a relatively thin state, but the reducers for the action do some heavy filtering, mapping, etc.

Because of this, there are two types of libraries for implementing undo/redo in Redux.

Ones that save a sequence of state versions in past, present and future arrays and ones that save a sequence of actions taken in history array and the current version in the present field.

Considering Em has a relatively thin state and the actions for the shortcuts outlined in the issue are computation-heavy, I would suggest going with the first option and using a library that tracks state version changes rather than actions.

The most popular library for that approach is redux-undo, which has 2.4k stars and offers some nice additional functionalities:

  • Apply the undo/redo functionality to a specific part of the state.
  • Jump more than one step into the past or the future.
  • Limit the size of your history(so your Redux store does not get bloated).

@raineorshine
Copy link
Contributor Author

Thank you! I appreciate the explanation.

With the approach of saving a sequence of state versions, does it save a diff of each state after a checkpoint, or does it save the entire state? thoughtIndex and contextIndex are the heaviest parts of em's state by far, which may have tens of thousands of entries each. If only a small change is made, it would be important to only save the diff rather than the full state each step.

@GugaGongadze
Copy link
Contributor

It saves the whole state to avoid the merging overhead. This is time-efficient but, not that space-efficient if you have a rather large state.

To counterpart this you can use some of the features outlined above: Only make specific reducers undoable, limit how many steps get saved, etc.

Or just use the other approach and save only the actions. But then again, if your reducer logic is computation heavy, you need to be conscious of that as well.

@raineorshine
Copy link
Contributor Author

It saves the whole state to avoid the merging overhead. This is time-efficient but, not that space-efficient if you have a rather large state.

Yeah, I think with the size of thoughtIndex and contextIndex we will have to go with a space-efficient solution.

To counterpart this you can use some of the features outlined above: Only make specific reducers undoable, limit how many steps get saved, etc.

These aren't viable options for us, since we need full undo/redo for all the commands listed in the issue.

Or just use the other approach and save only the actions. But then again, if your reducer logic is computation heavy, you need to be conscious of that as well.

This seems like the right path to me. I'm okay having a slight performance hit on the actual undo/redo execution if it is more space-efficient.

Does this approach rely on inverse actions? It seems like that would be the only way to go backwards through history. Otherwise you would have to go forwards from the beginning of the history each time to get to the desired state.

@GugaGongadze
Copy link
Contributor

GugaGongadze commented Jan 11, 2020

Does this approach rely on inverse actions? It seems like that would be the only way to go backwards through history. Otherwise you would have to go forwards from the beginning of the history each time to get to the desired state.

Yes, it replays the entire history from the beginning. The thing about the reverse actions is that it becomes non-trivial for complex ones. Simple INCREMENT-like actions that have only type field can have a counterpart DECREMENT action, but what about the actions that take in some kind of value?

That's why it is better advised to save states in history. Replaying everything can become too heavy at some point.

@raineorshine
Copy link
Contributor Author

The thing about the reverse actions is that it becomes non-trivial for complex ones. Simple INCREMENT-like actions that have only type field can have a counterpart DECREMENT action, but what about the actions that take in some kind of value?

Good point. I can see that getting complex, especially with new actions being added.

That's why it is better advised to save states in history. Replaying everything can become too heavy at some point.

I have point out again that the state in em contains tens of thousands of free text entries. Even with a small number of undo steps, such as 20, that seems untenable from a storage performance perspective. Given the IndexedDB limit, it effectively reduces how much can be stored locally by a factor of the number of history steps that are supported.

(FWIW, server-side caching will have to be implemented at some point which will reduce the state size, but that requires a large change of infrastructure and has wider security implications.)

It seems like the best approach would be to save the diff of each state change. That would be both space efficient and more computationally efficient than replaying the full action. What are your thoughts about this approach? Is there a functional downside? Why don't existing undo/redo frameworks do this?

I would potentially be open to the conventional action replay approach to start with, and then consider optimizing in the future with a more performant approach like the above in the future.

Thanks for addressing these specific concerns given the heavy state of em.

@GugaGongadze
Copy link
Contributor

GugaGongadze commented Jan 15, 2020

Storing diffs has 2 main problems with efficiency:

  1. Diffs are new objects with metadata that describes how to apply them. Storing the states doesn't need this.
  2. Diffs are somewhat expensive to calculate and then you have to deep-merge it as well.

Yes, let's go with the more conventional action replay approach first.

@GugaGongadze GugaGongadze mentioned this issue Mar 3, 2020
@raineorshine raineorshine added feature New feature or request and removed project labels Mar 21, 2020
@anmolarora1
Copy link
Contributor

anmolarora1 commented Jul 1, 2020

@raineorshine I believe that there are a couple of options for implementing this. The approach of diffs sounds the best one until we think about its possible complications, as discussed above. Here's some more discussion on that.

Based on my understanding, these are the two options that might be work considering -

  1. An action-replay based approach similar to undox (something like the PR implemented) but with a slight change. Instead of maintaining the full-history of actions, we can limit the number of undo/redo operations n and always store the state reference of the state at current - n actions. That way, we'll have to replay a max of n actions to get to the current state. We'll be using twice the memory but saving quite a bit of performance-cut.
    If limiting the number of undo/redo ops isn't an option, we can go with a hybrid approach where we store the state references at regular intervals, say after every 20 actions.

  2. Implement inverse actions for every action - Tedious and error-prone but efficient once implemented correctly

For any of these to work correctly, the reducers need to be pure functions without any side effects.

@raineorshine
Copy link
Contributor Author

Thanks for reviewing the approaches!

@raineorshine I believe that there are a couple of options for implementing this. The approach of diffs sounds the best one until we think about its possible complications, as discussed above.

I’m not convinced the two complications above are valid. I don’t know yet whether this is relevant for the solution you are proposing, but I don’t wish to rule out diffs yet, for the following reasons:

Storing diffs has 2 main problems with efficiency:

  1. Diffs are new objects with metadata that describes how to apply them. Storing the states doesn't need this.

I don’t believe this is relevant given that storing the entire state is not an option.

  1. Diffs are somewhat expensive to calculate and then you have to deep-merge it as well.

Immer has very fast deep object diffing. Our state does not have a lot of depth.

Here's some more discussion on that.

The points about references and gc in this comment are valid for in-memory references but I don’t believe it applies to persisted history. We need a solution that supports serialization and persistence of the history.

Based on my understanding, these are the two options that might be work considering -

  1. An action-replay based approach similar to undox (something like the PR implemented) but with a slight change.

I like the grouping functionality.

Instead of maintaining the full-history of actions, we can limit the number of undo/redo operations n and always store the state reference of the state at current - n actions. That way, we'll have to replay a max of n actions to get to the current state. We'll be using twice the memory but saving quite a bit of performance-cut.

Unfortunately this won’t allow persistent history and won’t work with the new iterative loading mechanism where not all state is in memory at the same time.

  1. Implement inverse actions for every action - Tedious and error-prone but efficient once implemented correctly

I am still quite interested in this approach, as it would be nice to be able to “revert” the state one action or group of actions at a time, from present to past, without requiring any snapshots or replays.

However my thinking is that if we are going to use inverse actions, why not simply generate an inverse patch for each regular action? Immer can do this easily. Then we have a general solution which works for all actions. We would just need to devise a mechanism to handle the grouping.

For any of these to work correctly, the reducers need to be pure functions without any side effects.

Agreed.

On an unrelated note, I am concerned about how undo/redo will work with the new iterative loading. What’s happens when part of the state that is being undone is no longer in memory? We’ll need to think carefully about that. We don’t want to choose an approach that is incompatible with iterative loading, though I regret the additional complexity this involves.

Looking forward to hearing what you think!

@anmolarora1
Copy link
Contributor

anmolarora1 commented Jul 2, 2020

I’m not convinced the two complications above are valid. I don’t know yet whether this is relevant for the solution you are proposing, but I don’t wish to rule out diffs yet, for the following reasons:

Storing diffs has 2 main problems with efficiency:

  1. Diffs are new objects with metadata that describes how to apply them. Storing the states doesn't need this.

I don’t believe this is relevant given that storing the entire state is not an option.

  1. Diffs are somewhat expensive to calculate and then you have to deep-merge it as well.

Immer has very fast deep object diffing. Our state does not have a lot of depth.

@raineorshine The complexity with the diffs is manually maintaining the additional payload/metadata, which describes how to apply them.
As you pointed out, immer does that out of the box with patches. So that's worth considering.

Instead of maintaining the full-history of actions, we can limit the number of undo/redo operations n and always store the state reference of the state at current - n actions. That way, we'll have to replay a max of n actions to get to the current state. We'll be using twice the memory but saving quite a bit of performance-cut.
If limiting the number of undo/redo ops isn't an option, we can go with a hybrid approach where we store the state references at regular intervals, say after every 20 actions.

Unfortunately this won’t allow persistent history and won’t work with the new iterative loading mechanism where not all state is in memory at the same time.

This should work if go with the hybrid approach of storing state references after a certain number of actions.

  1. Implement inverse actions for every action - Tedious and error-prone but efficient once implemented correctly

I am still quite interested in this approach, as it would be nice to be able to “revert” the state one action or group of actions at a time, from present to past, without requiring any snapshots or replays.

However my thinking is that if we are going to use inverse actions, why not simply generate an inverse patch for each regular action? Immer can do this easily. Then we have a general solution which works for all actions. We would just need to devise a mechanism to handle the grouping.

I looked at immer patching & inverse-patching, and they seem like a good solution. The only caveat would be that we’ll have to refactor all our reducers to use immer.
We'll need grouping to ensure that we don’t end up with a bloated history of patches. 
And it can be handled by something like this. There can be some performance implications but it's still worth giving a shot. To try it out for one of our core reducers and analyze the performance can be a good starting point.

On an unrelated note, I am concerned about how undo/redo will work with the new iterative loading. What’s happens when part of the state that is being undone is no longer in memory? We’ll need to think carefully about that. We don’t want to choose an approach that is incompatible with iterative loading, though I regret the additional complexity this involves.

That doesn't seem like a problem if iterative loading is gonna be based on pure functions/reducers. The patches/action-replays would store the payloads as well, which should be sufficient to rebuild the state at any given point in time.
If there are any specific complexities that I'm missing out on, it'd help if you can point them out.

@raineorshine
Copy link
Contributor Author

@raineorshine The complexity with the diffs is manually maintaining the additional payload/metadata, which describes how to apply them.
As you pointed out, immer does that out of the box with patches. So that's worth considering.

Maybe it won't be so hard with immer:

state = applyPatches(state, changes)

Instead of maintaining the full-history of actions, we can limit the number of undo/redo operations n and always store the state reference of the state at current - n actions. That way, we'll have to replay a max of n actions to get to the current state. We'll be using twice the memory but saving quite a bit of performance-cut.
If limiting the number of undo/redo ops isn't an option, we can go with a hybrid approach where we store the state references at regular intervals, say after every 20 actions.

Unfortunately this won’t allow persistent history and won’t work with the new iterative loading mechanism where not all state is in memory at the same time.

This should work if go with the hybrid approach of storing state references after a certain number of actions.

By state references, do you mean in-memory references? What happens when the page is refreshed? Maybe you have a serialization scheme in mind that I am not aware of.

  1. Implement inverse actions for every action - Tedious and error-prone but efficient once implemented correctly

I am still quite interested in this approach, as it would be nice to be able to “revert” the state one action or group of actions at a time, from present to past, without requiring any snapshots or replays.
However my thinking is that if we are going to use inverse actions, why not simply generate an inverse patch for each regular action? Immer can do this easily. Then we have a general solution which works for all actions. We would just need to devise a mechanism to handle the grouping.

I looked at immer patching & inverse-patching, and they seem like a good solution. The only caveat would be that we’ll have to refactor all our reducers to use immer.

Okay, good to know. How would the reducers use immer? Would it be possible to generate the patches directly from the state in a middleware rather than changing all the reducers?

We'll need grouping to ensure that we don’t end up with a bloated history of patches. 
And it can be handled by something like this. There can be some performance implications but it's still worth giving a shot. To try it out for one of our core reducers and analyze the performance can be a good starting point.

Yes, this seems like the right approach.

On an unrelated note, I am concerned about how undo/redo will work with the new iterative loading. What’s happens when part of the state that is being undone is no longer in memory? We’ll need to think carefully about that. We don’t want to choose an approach that is incompatible with iterative loading, though I regret the additional complexity this involves.

That doesn't seem like a problem if iterative loading is gonna be based on pure functions/reducers. The patches/action-replays would store the payloads as well, which should be sufficient to rebuild the state at any given point in time.
If there are any specific complexities that I'm missing out on, it'd help if you can point them out.

Yeah, it'd be good for me to be more specific here. Let me give this some more thought. Thanks!

@anmolarora1
Copy link
Contributor

Instead of maintaining the full-history of actions, we can limit the number of undo/redo operations n and always store the state reference of the state at current - n actions. That way, we'll have to replay a max of n actions to get to the current state. We'll be using twice the memory but saving quite a bit of performance-cut.
If limiting the number of undo/redo ops isn't an option, we can go with a hybrid approach where we store the state references at regular intervals, say after every 20 actions.

Unfortunately this won’t allow persistent history and won’t work with the new iterative loading mechanism where not all state is in memory at the same time.

This should work if go with the hybrid approach of storing state references after a certain number of actions.

By state references, do you mean in-memory references? What happens when the page is refreshed? Maybe you have a serialization scheme in mind that I am not aware of.

By state references, I mean 'deep clones' of state after every fixed of actions.
The state can be preserved during refreshes using the persist flag, which helps to keep a copy of the state in the local storage.

  1. Implement inverse actions for every action - Tedious and error-prone but efficient once implemented correctly

I am still quite interested in this approach, as it would be nice to be able to “revert” the state one action or group of actions at a time, from present to past, without requiring any snapshots or replays.
However my thinking is that if we are going to use inverse actions, why not simply generate an inverse patch for each regular action? Immer can do this easily. Then we have a general solution which works for all actions. We would just need to devise a mechanism to handle the grouping.

I looked at immer patching & inverse-patching, and they seem like a good solution. The only caveat would be that we’ll have to refactor all our reducers to use immer.

Okay, good to know. How would the reducers use immer?
The reducers can be written in immer like this.

Would it be possible to generate the patches directly from the state in a middleware rather than changing all the reducers?
Afaik, that would require us to do the same state computation twice. Some more digging can be done though, to explore other possible options

We'll need grouping to ensure that we don’t end up with a bloated history of patches. 
> And it can be handled by something like this. There can be some performance implications but it's still worth giving a shot. To try it out for one of our core reducers and analyze the performance can be a good starting point.

Yes, this seems like the right approach.

Great! Should we start with this or explore the options of doing it in the middleware first?

@shresthabijay I believe you've had some experience with immer. It'd be good to get your thoughts on this

@anmolarora1
Copy link
Contributor

anmolarora1 commented Jul 4, 2020

Okay, good to know. How would the reducers use immer?
The reducers can be written in immer like this.

Would it be possible to generate the patches directly from the state in a middleware rather than changing all the reducers?
Afaik, that would require us to do the same state computation twice. Some more digging can be done though, to explore other possible options

@raineorshine I believe we can use a store-enhancer instead of a middleware. I can probably try out a basic implementation and create a draft PR. What do you think?

@raineorshine
Copy link
Contributor Author

By state references, I mean 'deep clones' of state after every fixed of actions.

Okay, thanks for clarifying. Let's refer to them as clones. References usually refer to shallow copies or pointers in Javascript, not clones.

Deep cloning state is going to increase the memory consumption significantly. If you think this is a viable option, you will need to provide a convincing argument with specific reasons as to why it is a better approach than inverse patches, which do not require cloning the state.

The state can be preserved during refreshes using the persist flag, which helps to keep a copy of the state in the local storage.

Is this a flag you are making, or part of a library? Keep in mind that we need to persist undo/redo using our existing data infrastructure, i.e. to dexie and remote, via the syncQueue mechanism.

Okay, good to know. How would the reducers use immer?
The reducers can be written in immer like this.

So will the patches somehow be generated from these calls to immer.produce, and then aggregated? Help me understand that part a little bitter.

Would it be possible to generate the patches directly from the state in a middleware rather than changing all the reducers?
Afaik, that would require us to do the same state computation twice. Some more digging can be done though, to explore other possible options.

Yes, let's investigate this further. Changing all of the reducers touches a lot of code and makes them more tightly coupled to the undo/redo logic which I would prefer to avoid.

Great! Should we start with this or explore the options of doing it in the middleware first?

We still have some investigation to do. Once we have what looks like a viable solution, you should write up a short proposal that describes how it will meet each of the requirements.

@raineorshine
Copy link
Contributor Author

@raineorshine I believe we can use a store-enhancer instead of a middleware. I can probably try out a basic implementation and create a draft PR. What do you think?

No, I understand the desire to get into the code, but a task like this requires a more rigorous proposal that demonstrates the viability of the approach. Otherwise we end up tinkering with code and risking a dead end with an approach that does not meet the requirements from the beginning.

@anmolarora1
Copy link
Contributor

@raineorshine I believe we can use a store-enhancer instead of a middleware. I can probably try out a basic implementation and create a draft PR. What do you think?

No, I understand the desire to get into the code, but a task like this requires a more rigorous proposal that demonstrates the viability of the approach. Otherwise we end up tinkering with code and risking a dead end with an approach that does not meet the requirements from the beginning.

Sure, that makes sense. In order to keep our changes minimal and avoid refactoring our existing users, I propose writing a store-enhancer that looks like this -

const patchReducerEnhancer = createStore => (
  reducer,
  initialState,
  enhancer
) => {
  const patchReducer = (state, action) => {
    const newState = produce(state, draft => {
      return reducer(draft, newState)
    }, (patches, inversePatches) => {
      // store patches and inversePatches to the state
    })
    return newState
  }
  return createStore(patchReducer, initialState, enhancer)
}

While making use of our existing reducers, we can store the patches and inversePatches for every undoable action in the state. The actions and reducers for handling undo and redo can be added just like any other actions/reducers that'll read
patch/inversePatch values from the state and update them accordingly. If that seems like a viable approach, we can think about handling the grouping of actions. What are your thoughts?

@raineorshine
Copy link
Contributor Author

Yes! That looks workable. Now we can make a plan for grouping, including how we specify which actions are grouped together, and how patches are applied as a group.

We can be somewhat less detailed with the persistence mechanism. Since iterative loading is not yet in dev, we may want to consider even adding undo/redo persistence in a separate PR. However we should think ahead enough to demonstrate beyond reasonable doubt that our system will support persistence.

@anmolarora1
Copy link
Contributor

Yes! That looks workable. Now we can make a plan for grouping, including how we specify which actions are grouped together, and how patches are applied as a group.

We'll need a list of undoable actions and the list of such groups of actions that need to be treated as a single, undoable action.

  1. We'll only store the patches for the actions which belong to the list
  2. Instead of keeping the patches in a flat array, we can have a 2-dimensional array of patches - each item of the array would represent an undoable action with a shape like this -[{actionType: string, patch: immerPatch}]

For each action, we can check the last item of the array and decide if the current action should be added to the previous group or represent a new group.

@raineorshine
Copy link
Contributor Author

Thanks!

Question: How should we handle non-undoable actions mixed in with undoable actions? This deserves some thought. It would be less than ideal for a user to be blocked from additional undos due to a non-undoable action at the top of the history.

Please describe how/when the undo history will be persisted via sync.

the list of such groups of actions that need to be treated as a single, undoable action.

The grouping is algorithmic, so a simple list won't suffice. How do you suggest this be represented? Please provide justification and any alternatives considered.

  1. We'll only store the patches for the actions which belong to the list
  2. Instead of keeping the patches in a flat array, we can have a 2-dimensional array of patches - each item of the array would represent an undoable action with a shape like this -[{actionType: string, patch: immerPatch}]

An array of arrays, or an array of objects? 2-dimensional array refers to an array of arrays, but your example looks like an array of objects. Let me know if I'm missing something.

For each action, we can check the last item of the array and decide if the current action should be added to the previous group or represent a new group.

That makes sense at a high level. Would you provide a more detailed, step-by-step account of the grouping mechanism?

@raineorshine
Copy link
Contributor Author

raineorshine commented Jul 6, 2020

I updated the OP with undoable and non-undoable actions.

A couple more requirements things came to mind:

Atomicity

Groups of patches should be applied atomically, i.e. it should be guaranteed that all are applied or none at all. This rules out dispatching individual undo actions for each patch. My guess is that you will need to use something like reducerFlow to apply several at once.

Syncing Updates

We are going to need to extract the thoughtIndexUpdates and contextIndexUpdates from the patch in order to then sync the changes with the server. Thoughts and contexts are sync'd individually, so it will not be possible to apply the patch to the remote state as a whole the same way we can to the in-memory Redux state.

@raineorshine
Copy link
Contributor Author

updateThoughts

updateThoughts is trickier than I thought, because it adds updates to the syncQueue. Since the syncQueue will likely be clear by the time the user calls undo, there will be nothing in the sync queue to revert. On the other hand, when the user attempts to redo an updateThoughts action, it may need to populate the syncQueue as usual.

This is also an issue not just for updateThoughts being dispatched, but also when it is called directly.

However we handle updateThoughts, we will need to make sure it does not conflict with any manual syncing that we will be doing to keep Dexie and Firebase synchronized with the state after an undo/redo.

@anmolarora1
Copy link
Contributor

Upon further analysis, I found out that immer patches store a full state clone (at least for our app) within each patch, and thus, are very inefficient for our use-case. So we need to have a more efficient way of generating patches/diffs that doesn't devour the storage.
JSON-Patch is one of the few options that I tried and it seems suitable for getting the job done - The patches are produced in accordance with the RFC6902 standard - hence much more accurate and capture only the diff and the path

@anmolarora1
Copy link
Contributor

updateThoughts

updateThoughts is trickier than I thought, because it adds updates to the syncQueue. Since the syncQueue will likely be clear by the time the user calls undo, there will be nothing in the sync queue to revert. On the other hand, when the user attempts to redo an updateThoughts action, it may need to populate the syncQueue as usual.

This is also an issue not just for updateThoughts being dispatched, but also when it is called directly.

However we handle updateThoughts, we will need to make sure it does not conflict with any manual syncing that we will be doing to keep Dexie and Firebase synchronized with the state after an undo/redo.

@raineorshine As discussed, we'd start with local undo/redo in the first phase, and extend the support for syncing with local and remote storage in the subsequent phases.

@anmolarora1
Copy link
Contributor

Upon further analysis, I found out that immer patches store a full state clone (at least for our app) within each patch, and thus, are very inefficient for our use-case. So we need to have a more efficient way of generating patches/diffs that doesn't devour the storage.
JSON-Patch is one of the few options that I tried and it seems suitable for getting the job done - The patches are produced in accordance with the RFC6902 standard - hence much more accurate and capture only the diff and the path

@raineorshine Our initial decision was to store the diffs for every action and combine them into a single diff as soon as an action related to the cursor movement was encountered. This was to ensure that if a user closes the browser window or app for some reason, the syncing mechanism stores the diffs (uptil his last action) to local and remote storage

I tried out this approach with the existingThoughtChange action, typing at a normal pace of ~30-40 words/min, and the heap memory shot up to ~900 MB levels. It comes down to two-digit numbers as soon as the typing stops, but the performance while typing wasn't close to good. So computing and storing the diffs on every action doesn't seem like a viable option, due to its performance constraints.
That said, I've thought of a workaround to handle this -
We can store the state clone of the last cursor action in the state when a cursor action is fired. We ignore all the actions fired post that and calculate a diff (combined diff) only when the next cursor action gets fired, and store it in the state. That diff would represent one undo operation. At the same time, we'd also update the value of the state clone.
If the app or the current window is closed, we'd still have a clone of the last state synced with the storage, and an undo operation would simply involve setting the current state value to the last state clone.

This alone, would still not be efficient if the user is just navigating through the thoughts because we'd be computing the diffs (and storing them) on every cursor action. To handle that, we can add a throttling mechanism.

Please share your thoughts.

@raineorshine
Copy link
Contributor Author

@raineorshine Our initial decision was to store the diffs for every action and combine them into a single diff as soon as an action related to the cursor movement was encountered. This was to ensure that if a user closes the browser window or app for some reason, the syncing mechanism stores the diffs (uptil his last action) to local and remote storage

We did discuss this, although I recall coming to a different conclusion. I had presented two options towards the end of our call:

  1. Store patches for each action, combining them when a new "group" was to be started.
  2. Combine the new patch on every action if it is part of the same group.

We had both agreed that the second option was preferable. That said, the performance issue you described probably applies to both approaches.

I tried out this approach with the existingThoughtChange action, typing at a normal pace of ~30-40 words/min, and the heap memory shot up to ~900 MB levels. It comes down to two-digit numbers as soon as the typing stops, but the performance while typing wasn't close to good. So computing and storing the diffs on every action doesn't seem like a viable option, due to its performance constraints.

Darn. So can you say for sure if it's more of a computational bottleneck, or memory bottleneck? If memory, that might deserve more exploration. Unless there is a memory leak, I don't see how typing would increase memory over time, assuming we utilize approach (2) as described above, which modifies the last patch repeatedly instead of appending new patches.

We can store the state clone of the last cursor action in the state when a cursor action is fired.

I have mentioned this before, but we cannot store the entire state, as it is too larger to persist to the local or remote db. A reference may not take much memory in the Redux state, but it is too larger once serialized for persistent storage. We have to store only patches.

We ignore all the actions fired post that and calculate a diff (combined diff) only when the next cursor action gets fired, and store it in the state.

I should note that the cursor navigation is not the defining action of an undo boundary. Take the following sequence of actions for example:

  • moveThoughtDown
  • indent
  • subcategorizeOne

These represent three distinct undoable states, although no cursor actions are involved.

Please share your thoughts.

I'm first curious about the answer and possible research into my above question regarding the performance bottleneck cause. If JSON diffs are too performance intensive, then we will have to explore other options. It would be great for you to do further investigation and brainstorming to assess potential options. Thanks!

@anmolarora1
Copy link
Contributor

@raineorshine Our initial decision was to store the diffs for every action and combine them into a single diff as soon as an action related to the cursor movement was encountered. This was to ensure that if a user closes the browser window or app for some reason, the syncing mechanism stores the diffs (uptil his last action) to local and remote storage

We did discuss this, although I recall coming to a different conclusion. I had presented two options towards the end of our call:

  1. Store patches for each action, combining them when a new "group" was to be started.
  2. Combine the new patch on every action if it is part of the same group.

We had both agreed that the second option was preferable. That said, the performance issue you described probably applies to both approaches.

I believe this is where I got wrong because I tried out the first approach, which included storing a patch for every action and combining them when a new group was to be started. It was a misunderstanding at my end and I apologize for that.

I tried out this approach with the existingThoughtChange action, typing at a normal pace of ~30-40 words/min, and the heap memory shot up to ~900 MB levels. It comes down to two-digit numbers as soon as the typing stops, but the performance while typing wasn't close to good. So computing and storing the diffs on every action doesn't seem like a viable option, due to its performance constraints.

Darn. So can you say for sure if it's more of a computational bottleneck, or memory bottleneck? If memory, that might deserve more exploration. Unless there is a memory leak, I don't see how typing would increase memory over time, assuming we utilize approach (2) as described above, which modifies the last patch repeatedly instead of appending new patches.

I tried approach (2) and the memory is not a bottleneck with that. I tried it out with ~100 thoughts and there were no noticeable performance issues so far.

We ignore all the actions fired post that and calculate a diff (combined diff) only when the next cursor action gets fired, and store it in the state.

I should note that the cursor navigation is not the defining action of an undo boundary. Take the following sequence of actions for example:

  • moveThoughtDown
  • indent
  • subcategorizeOne

These represent three distinct undoable states, although no cursor actions are involved.

Agreed! In that case, we need to identify all such undo boundaries. However, we can simply start with the cursor actions as the first use case, and keep our approach flexible enough to allow easy addition/modification of such boundaries.

Or else, I can try to find out all the use-cases that I can, have them noted in the issue, and then start working.

Please share your thoughts.

I'm first curious about the answer and possible research into my above question regarding the performance bottleneck cause. If JSON diffs are too performance intensive, then we will have to explore other options. It would be great for you to do further investigation and brainstorming to assess potential options. Thanks!

As I mentioned above, the performance looks good with the second approach that we agreed upon (the one that I forgot 😅 ).
However, I noticed that moving the cursor through the thoughts significantly impacts the memory, even without the patch-generator enhancer in place. Albeit not related to this issue but it's worth our attention.

@raineorshine
Copy link
Contributor Author

I tried approach (2) and the memory is not a bottleneck with that. I tried it out with ~100 thoughts and there were no noticeable performance issues so far.

Great!

We ignore all the actions fired post that and calculate a diff (combined diff) only when the next cursor action gets fired, and store it in the state.

I should note that the cursor navigation is not the defining action of an undo boundary. Take the following sequence of actions for example:

  • moveThoughtDown
  • indent
  • subcategorizeOne

These represent three distinct undoable states, although no cursor actions are involved.

Agreed! In that case, we need to identify all such undo boundaries. However, we can simply start with the cursor actions as the first use case, and keep our approach flexible enough to allow easy addition/modification of such boundaries.

I actually think it's fairly deterministic. Some pseudocode:

On each action...

patch = createPatch(action)
if ((isNavigation(action) && isNavigation(prevAction)) || 
    (isExistingThoughtChange(action) && isExistingThoughtChange(prevAction))) {
  mergeWithPrevPatch(patch)
}
else {
  appendPatch(patch)
}

On undo/redo...

if (isNavigation(action)) {
  re/undo() // undo navigation patch
}
re/undo() // undo action patch

Is this correct?

As I mentioned above, the performance looks good with the second approach that we agreed upon (the one that I forgot 😅 ).

So glad! Forgetting is a much easier problem to solve 😁

However, I noticed that moving the cursor through the thoughts significantly impacts the memory, even without the patch-generator enhancer in place. Albeit not related to this issue but it's worth our attention.

Noted. We'll be able to do meaningful performance testing again after #503.

@anmolarora1
Copy link
Contributor

I tried approach (2) and the memory is not a bottleneck with that. I tried it out with ~100 thoughts and there were no noticeable performance issues so far.

Great!

We ignore all the actions fired post that and calculate a diff (combined diff) only when the next cursor action gets fired, and store it in the state.

I should note that the cursor navigation is not the defining action of an undo boundary. Take the following sequence of actions for example:

  • moveThoughtDown
  • indent
  • subcategorizeOne

These represent three distinct undoable states, although no cursor actions are involved.

Agreed! In that case, we need to identify all such undo boundaries. However, we can simply start with the cursor actions as the first use case, and keep our approach flexible enough to allow easy addition/modification of such boundaries.

I actually think it's fairly deterministic. Some pseudocode:

On each action...

patch = createPatch(action)
if ((isNavigation(action) && isNavigation(prevAction)) || 
    (isExistingThoughtChange(action) && isExistingThoughtChange(prevAction))) {
  mergeWithPrevPatch(patch)
}
else {
  appendPatch(patch)
}

I think we should go for an inverse approach where we merge every action except the ones that define the undo/redo separation boundaries. 
The reason - there are some actions besides the navigation actions that shouldn’t create a new patch, like the setCursor action which just changes the dataNonce, editingValue, clearQueue (not sure about this one), error with null value, to name a few. These should either be ignored or merged together.
So, the pseudocode might look like - 



if (isBoundary(currentAction) && !isBoundary(prevAction)) {
    appendPatch(patch)
} else
    mergeWithPrevPatch(patch)



Alternatively, we can ignore such actions because calculating diffs for actions like the editingValue might be a gratuitous operation. Please correct me if I'm wrong here, and let me know what you think.

@anmolarora1
Copy link
Contributor

@raineorshine I've been able to get the basic undo/redo actions working. However, the cursor positioning seems trickier than we thought. Afaik, we're not updating the cursorOffset in the state when a thought/lexeme is changed. Also, we're not using the state values to position the cursor in the Editable component. So triggering the render manually after each undo/redo action won't help. Please correct me if I'm wrong.

If that's correct, however, we'd have to address that concern first as a separate issue and make sure that the redux store represents the state of the app at any given point in time.

@raineorshine
Copy link
Contributor Author

@raineorshine I've been able to get the basic undo/redo actions working. However, the cursor positioning seems trickier than we thought.

Let's refer to the browser selection as "browser selection" or "caret" so as not to confuse it with state.cursor.

Also, we're not using the state values to position the cursor in the Editable component. So triggering the render manually after each undo/redo action won't help. Please correct me if I'm wrong.

Actually, we set the caret position here: https://github.com/cybersemics/em/blob/dev/src/components/Editable.tsx#L214

If that's correct, however, we'd have to address that concern first as a separate issue and make sure that the redux store represents the state of the app at any given point in time.

Agreed

@anmolarora1
Copy link
Contributor

@raineorshine I've been able to get the basic undo/redo actions working. However, the cursor positioning seems trickier than we thought.

Let's refer to the browser selection as "browser selection" or "caret" so as not to confuse it with state.cursor.

That makes sense

Also, we're not using the state values to position the cursor in the Editable component. So triggering the render manually after each undo/redo action won't help. Please correct me if I'm wrong.

Actually, we set the caret position here: https://github.com/cybersemics/em/blob/dev/src/components/Editable.tsx#L214

Okay, let me see if we can use this one on every undo/redo

@anmolarora1
Copy link
Contributor

Also, we're not using the state values to position the cursor in the Editable component. So triggering the render manually after each undo/redo action won't help. Please correct me if I'm wrong.

Actually, we set the caret position here: https://github.com/cybersemics/em/blob/dev/src/components/Editable.tsx#L214

Okay, let me see if we can use this one on every undo/redo

In order to position the caret correctly on an undo or redo op, we need to re-render the thought that matches the cursor value. In our current code, we're updating the caret position based on the isEditing value and cursorOffset.

When an undo operation is triggered, albeit the cursor gets updated, these values ('isEditing' & 'cursorOffset') don't, and hence, a re-render is not triggered.

In order to position the caret correctly in case of undo/redo operations, we'd need to ensure that it is controlled directly by the state.

@raineorshine
Copy link
Contributor Author

In order to position the caret correctly on an undo or redo op, we need to re-render the thought that matches the cursor value. In our current code, we're updating the caret position based on the isEditing value and cursorOffset.

That makes sense.

When an undo operation is triggered, albeit the cursor gets updated, these values ('isEditing' & 'cursorOffset') don't, and hence, a re-render is not triggered.

Since isEditing is calculated from cursorBeforeEdit, wouldn't it get updated when cursorBeforeEdit is changed in the undo operation?

In order to position the caret correctly in case of undo/redo operations, we'd need to ensure that it is controlled directly by the state.

Okay

@anmolarora1
Copy link
Contributor

When an undo operation is triggered, albeit the cursor gets updated, these values ('isEditing' & 'cursorOffset') don't, and hence, a re-render is not triggered.

Since isEditing is calculated from cursorBeforeEdit, wouldn't it get updated when cursorBeforeEdit is changed in the undo operation?

You're right. It's being calculated from cursorBeforeEdit and gets re-calculated after the undo operations but the value remains the same, hence, no re-render. As discussed, we're undoing the any newThought actions along with the existingThoughtChange actions to get the caret on the last navigated thought. So an undo would always result in a state where the cursor's last thought cannot be empty -

cursor: [{ value: "value", rank: r }]

The cursorBeforeEdit, however, never matches that state (so isExisting doesn't change on undo). Currently, the cursorBeforeEdit only stores the values of every new thought, that looks like this

cursorBeforeEdit: [{ value: "", rank: r}]

We're updating it based on this condition in setCursor, which is being called by newThought.

I'm not quite sure what cursorBeforeEdit represents but theoretically, it should keep a track of all the previous values of the cursor.
After an existingThoughtChange action, it should be updated to represent the current cursor. This comment says we are intentionally avoiding that. But to get the isExisting working correctly (if it's not currently), we might have to reconsider this approach or find an alternative one.

Looking forward to your thoughts on this!

@raineorshine
Copy link
Contributor Author

As discussed, we're undoing the any newThought actions along with the existingThoughtChange actions to get the caret on the last navigated thought.

That's correct.

So an undo would always result in a state where the cursor's last thought cannot be empty

After creating a new thought and editing it, yes. Technically the user could delete an empty thought and then hit undo, which should result in the cursor restored to the empty thought.

cursor: [{ value: "value", rank: r }]

The cursorBeforeEdit, however, never matches that state (so isExisting doesn't change on undo). Currently, the cursorBeforeEdit only stores the values of every new thought, that looks like this

cursorBeforeEdit: [{ value: "", rank: r}]

I see! cursor and editingValue are up-to-date, but cursorBeforeEdit has not changed yet.

I'm not quite sure what cursorBeforeEdit represents but theoretically, it should keep a track of all the previous values of the cursor.

Due to a limitation in the ContentEditable component (which Bijay is fixing currently), the caret gets reset when the component gets re-rendered. This forced me very early on to avoid re-rendering the cursor thought during editing to avoid reseting the caret. Thus, components are connected to cursorBeforeEdit rather than cursor. cursor and editingValue change dynamically while the values changes, but cursorBeforeEdit only changes when it's safe to reset the caret, namely on blur or activating a command.

When Bijay fixes ContentEditable there is a chance that we will be able to re-render the component any time without messing up the caret, and thus eliminate the need for cursorBeforeEdit. This would be massively simplifying. It will require changing code in many places, but at that point it may be as simple as replacing cursorBeforeEdit with cursor everywhere.

After an existingThoughtChange action, it should be updated to represent the current cursor. This comment says we are intentionally avoiding that.

We cannot update cursorBeforeEdit on existingThoughtChange since it would re-render the component while the user is typing (albeit throttled).

But to get the isExisting working correctly (if it's not currently), we might have to reconsider this approach or find an alternative one.

If there is a way to set aside the caret issue for now until we know if Bijay has a solution to cursorBeforeEdit, that would be fine! You are welcome to check in with him via GitHub or Gitter. Unfortunately I do think his work on this is not on dev, but tied up in the Flat Render PR. It is worth asking about though.

@anmolarora1
Copy link
Contributor

As discussed, we're undoing the any newThought actions along with the existingThoughtChange actions to get the
Due to a limitation in the ContentEditable component (which Bijay is fixing currently), the caret gets reset when the component gets re-rendered. This forced me very early on to avoid re-rendering the cursor thought during editing to avoid reseting the caret. Thus, components are connected to cursorBeforeEdit rather than cursor. cursor and editingValue change dynamically while the values changes, but cursorBeforeEdit only changes when it's safe to reset the caret, namely on blur or activating a command.

When Bijay fixes ContentEditable there is a chance that we will be able to re-render the component any time without messing up the caret, and thus eliminate the need for cursorBeforeEdit. This would be massively simplifying. It will require changing code in many places, but at that point it may be as simple as replacing cursorBeforeEdit with cursor everywhere.

But to get the isExisting working correctly (if it's not currently), we might have to reconsider this approach or find an alternative one.

If there is a way to set aside the caret issue for now until we know if Bijay has a solution to cursorBeforeEdit, that would be fine! You are welcome to check in with him via GitHub or Gitter. Unfortunately I do think his work on this is not on dev, but tied up in the Flat Render PR. It is worth asking about though.

@shresthabijay As @raineorshine pointed out, I suppose that you're working on the ContentEditable issue - that is, keeping the component in sync with the cursor. How does it look so far? Do you think it'd be possible to dissect it from the Flat Render PR and push its code anytime soon?

@shresthabijay
Copy link
Contributor

@shresthabijay As @raineorshine pointed out, I suppose that you're working on the ContentEditable issue - that is, keeping the component in sync with the cursor. How does it look so far? Do you think it'd be possible to dissect it from the Flat Render PR and push its code anytime soon?

My solution prevents caret to reset when Editable is re-rendered. I will create a PR for that.

@shresthabijay
Copy link
Contributor

@anmolarora1 I have sent the PR #823. This will allow Editable to re-render without causing focusOffset to reset. Right now editing still doesn't cause re-render of Editable because of all the logic in place. This may require changing code in many places. @raineorshine can help you with that.

@raineorshine
Copy link
Contributor Author

Right now editing still doesn't cause re-render of Editable because of all the logic in place. This may require changing code in many places. @raineorshine can help you with that.

@shresthabijay Can you be more specific? Are you referring to the cursor and cursorBeforeEdit distinction?

@shresthabijay
Copy link
Contributor

shresthabijay commented Aug 6, 2020

@shresthabijay Can you be more specific? Are you referring to the cursor and cursorBeforeEdit distinction?

cursorBeforeEdit is being used instead of cursor. We may need to change those back to cursor eliminating use of cursorBeforeEdit.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants