Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Iterable $dfs #6664

Merged
merged 6 commits into from
Sep 26, 2024
Merged

Iterable $dfs #6664

merged 6 commits into from
Sep 26, 2024

Conversation

zurfyx
Copy link
Member

@zurfyx zurfyx commented Sep 24, 2024

$dfs is handy but not very flexible. The fact that it computes the entire array every time means that it's either all or nothing.

A common use case for a DFS is data format conversion, and data conversion formats that are modular will delegate the processing of a part of the tree to the responsible modules. An iterable version of $dfs makes it possible to stop and resume the iteration without paying the redundant cost.

Iterable is not just limited to this one use case, I'm confident that this small addition to the utils library is not a one-off.

This is how the application of this looks like in the wild (not a one-liner but I also don't think we want to overoptimize in one use case).

const initialDepth = $getDepth(node);
let iterator = $dfsIterator(initialNode);
let currentDFSNode;
while ((currentDFSNode = iterator.next().value) != null) {
  const currentNode = currentDFSNode.node;
  if (delegateSubtreeProcessing) {
    delegate(currentNode);

    const [nextNode, depthDiff] =
      $getNextSiblingOrParentSibling(currentNode);
    if (
      nextNode === null ||
      currentDFSNode.depth + depthDiff <= nodeDepth
    ) {
      break;
    }
    iterator = $dfsIterator(nextNode);
  }
}

Copy link

vercel bot commented Sep 24, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
lexical ✅ Ready (Inspect) Visit Preview 💬 Add feedback Sep 25, 2024 8:05pm
lexical-playground ✅ Ready (Inspect) Visit Preview 💬 Add feedback Sep 25, 2024 8:05pm

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Sep 24, 2024
@zurfyx zurfyx changed the title Iterable DFS Iterable $dfs Sep 24, 2024
Copy link

github-actions bot commented Sep 24, 2024

size-limit report 📦

Path Size
lexical - cjs 29.85 KB (0%)
lexical - esm 29.66 KB (0%)
@lexical/rich-text - cjs 38.3 KB (0%)
@lexical/rich-text - esm 31.55 KB (0%)
@lexical/plain-text - cjs 36.9 KB (0%)
@lexical/plain-text - esm 28.86 KB (0%)
@lexical/react - cjs 40.08 KB (0%)
@lexical/react - esm 32.91 KB (0%)


if (node !== null && node.is(end)) {
nodes.push({depth, node});
export function $getNextSiblingOrParentSibling(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Intentionally exported, this 0 added cost function is a handy utility to skip a subtree or (unrelated) help find the next sibling that is not directly connected to the same parent.

/**
* $dfs iterator. Tree traversal is done on the fly as new values are requested with O(1) memory.
* @param startNode - The node to start the search, if ommitted, it will start at the root node.
* @param endNode - The node to end the search, if ommitted, it will find all descendants of the startingNode.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* @param endNode - The node to end the search, if ommitted, it will find all descendants of the startingNode.
* @param endNode - The node to end the search, if omitted, it will find all descendants of the startingNode.

endNode?: LexicalNode,
): DFSIterator {
const start = (startNode || $getRoot()).getLatest();
const end =
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps this could be left optional and check depth < startDepth if undefined? Would save some traversing


if (node !== null && node.is(end)) {
nodes.push({depth, node});
export function $getNextSiblingOrParentSibling(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like exporting this, seems very useful for when you are sure you don't want to traverse all subtrees. It could probably use its own tests since it's exported though

* @param endNode - The node to end the search, if ommitted, it will find all descendants of the startingNode.
* @returns An iterator, each yielded value is a DFSNode. It will always return at least 1 node (the start node).
*/
export function $dfsIterator(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do wonder what kind of trouble users will get themselves into when they mutate the document and continue iterating

Copy link
Member Author

@zurfyx zurfyx Sep 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@etrepum good observation, how much do we want to do to prevent misuse? I'm inclined to make it flexible and not just read-only. I believe we can control the flow by listening to editor updates when the iteration is started but I do foresee this traversal can be useful even in write mode to tweak some nodes as you go as long as you understand what you're doing...

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should try very hard, we don't really have a way to do it even if we wanted to, but there ought to be some documentation about what you should or shouldn't do and what the semantics are. I'm writing this without another close look at the code but we should also be intentional about the next state in the iteration, e.g. we should probably know enough to find what the next key is going to be before we return the node from the iterator so that if the user replaces the given node the iterator doesn't halt, or if they insert something after that the new node doesn't end up in the iterator.

@zurfyx zurfyx added this pull request to the merge queue Sep 26, 2024
Merged via the queue into main with commit 62c51a9 Sep 26, 2024
41 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants