Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use data-diffKey to pass list update hints #74

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

kemitchell
Copy link

This tiny PR adds a single if statement to the same(a, b) function used to find good list-reorder candidates in lists of elements. The new check identifies matches when a.dataset.diffkey === b.dataset.diffkey.

This has the effect of decoupling diff-hint keys from id values. A couple nice benefits:

  1. No pressure to make diff-hint keys globally unique. For diff optimization, the only constraint is that diff-hint keys should be unique within the element list.

  2. No risk of collision with other users of the id namespace, like <a href="#x">jump</a> targets.

I'm not particularly attached to data-diffkey as the attribute name, but it's short, self-documenting, and easy to access with element.dataset.diffkey.

Big thanks to @juliangruber for the recent refactor. Very clear.

@yoshuawuyts
Copy link
Member

Thanks for this PR! - I'm not sure this is the right resolution tho. In choo 6 every API we ship is backed by a web standard (or our interpretation of one). This means that the friction between the web platform, and the framework APIs is heavily reduced. The flip side of this is that we can't come up with new APIs, but in a way that's good.

I think the problems you outlined can be solved by creating a key generator with a different seed for each list. E.g. something like:

var createSeed = require('my-generator')
var seed = createSeed()   // create a new ID generator
var id = seed.next()      // get a new ID
var id = seed.next())     // get a new ID
var id = seed.next())     // get a new ID

I hope it makes sense why I'd favor a different approach than this PR - let me know if you need help with building a seed generator; always happy to help ✨ Cheers!

@yoshuawuyts yoshuawuyts closed this Jul 2, 2017
@kemitchell
Copy link
Author

@yoshuawuyts Thanks for your note.

I'm not quite sure I understand what you mean by adding no APIs. FWIW, <p data-prop=x> and element.dataset are spec'd in the living standard as "custom data attributes". Custom data attributes are here. DOM dataset is here. The former says:

Custom data attributes are intended to store custom data, state, annotations, and similar, private to the page or application, for which there are no more appropriate attributes or elements.

That describes diff algorithm hinting. ids need to be globally unique, and get exposed to users as hashes. That makes them inappropriate for diff hints that need not, and often naturally won't be, globally unique. Diff hints are also private implementation details that shouldn't be exposed to users. If you set a bunch of ids and folks start using them to link to specific parts of your pages, and you change your id-generation or diffing approach, you break their links.

A PRNG or similar is a fine way to generate names that are unlikely to collide, if you have a good source of seeds that you know are unique. It makes the name picking problem seem smaller---you have to pick unique seeds, not unique ids for every child---but doesn't avoid it. You could do the same with prefixing: ${seed}- plus some key unique in the list, but not globally.

The use case I have where this breaks down renders a kind of Merkle tree. Each branch and leaf in the tree comes with a cryptographic hash of its content. The content is some, but not all of the state that needs to be rendered at branches and leaves.

Hashes are guaranteed to correspond to content. The same content can appear at multiple positions in the tree. As a result, hashes aren't unique. If the same content recurs within the tree, so will the hash. So setting <div id=${hash}> is out. "Decorating" the hash with position or other information from the context, rather than the specific data to render at that point in the document, overspecializes the identifier, making it useless for diff optimization.

I suspect other apps dealing with data that's inherently tree-like, with deep nesting, will have the same problem. Those are also use cases where the performance limits of the functional-reactive come up fast. You have to cheat those limits with rerendering and diff optimizations.

@juliangruber
Copy link
Contributor

For me this far this was not a problem, as I'm using nanocomponent's implicit ID assignment for unique keys.

In my understanding that ID is already a concept for giving an Element a unique identifier and thus we're not introducing a new one. data-X needs to be remembered specifically.

From an implementation standpoint - in nanomorph - it wouldn't make any difference where it's the ID attribute or the dataset or anything else.

I can see this being an issue if you use a different abstraction than nanocomponent, with maybe it's own different concepts. Could you use that scheme for your merkle tree rendering? You would need to create one nano instance for each leaf, even if two leaves share the same content.

@juliangruber
Copy link
Contributor

I'm +1 for changing this if we can prove this is not flexible enough with a real world use case, and maybe rendering merkle trees does that.

@kemitchell
Copy link
Author

@juliangruber, thanks for your messages. I will have to take a look at nanocomponent. Hopefully this evening.

@kemitchell
Copy link
Author

@juliangruber: Had a look at nanocomponent. Looks like it creates a unique id per component by appending a monotonically increasing, global counter (INDEX) to a new Date()-seeded nonce. Does that actually ensure that all id attributes in a page are unique, if the same component is rendered multiple times to the view? Won't all "instances" or "renderings" of a single component have the same id?

In other words, if ids and components are 1:1, and components can be rendered >1 times on a page, won't id attributes recur?

The hypothetical app I've been thinking of---not my project, but tree-like---is a Git tree viewer. Two components: Tree and Blob. Tree renders its own data, plus any other Trees and Blobs within it. Will all Trees get the same nanocomponent id?

@kemitchell
Copy link
Author

By the way, would one of y'all mind reopening this? Helps me keep track of it.

@juliangruber
Copy link
Contributor

Due to DOM limitations, one instance of a component can only be rendered exactly once at a time. If you want to show multiple of a component type, create multiple instances -> they all have their unique IDs.

By the way, would one of y'all mind reopening this? Helps me keep track of it.

@yoshuawuyts can

@kemitchell
Copy link
Author

kemitchell commented Jul 6, 2017

@juliangruber, ah, I see. That solves the id uniqueness problem.

Thinking in terms of a Git tree renderer, with Tree and Blob components. First pass state:

Root
  Tree
    Tree
      Blob
  Tree
    Blob
    Tree
      Blob
    Tree
      Blob

All new elements:

Root
  Tree (id=ncid-NONCE-1)
    Tree (...2)
      Blob (3)
  Tree (4)
    Blob (5)
    Tree (6)
      Blob (7)
    Tree (8)
      Blob (9)

Then a state change adds another tree and blob:

Root * => modified
  Tree
    Tree
      Blob
  Tree *
    Blob
    Tree
      Blob
    Tree *
      Blob *
    Tree
      Blob

Assuming optimal _update() methods throughout, does that produce?:

Root
  Tree (1) (same)
    Tree (2) (same)
      Blob (3) (same)
  Tree (10)
    Blob (5)
    Tree (6)
      Blob (7)
    Tree (11)
      Blob (12)
    Tree (8)
      Blob (9)

Edit: Fix id numbers in last fenced block.

@yoshuawuyts yoshuawuyts reopened this Jul 6, 2017
@kemitchell
Copy link
Author

I'm going to cook up a little demo app and check. Hopefully the result is what I listed above, and not something like this, which defeats the purpose of the id === id check:

Root
  Tree (1) (same)
    Tree (2) (same)
      Blob (3) (same)
  Tree (10)
    Blob (11)
    Tree (12)
      Blob (13)
    Tree (14)
      Blob (15)
    Tree (16)
      Blob (17)

@juliangruber
Copy link
Contributor

this has more to do with how you manage component instances in JS land I think, because that's really where you ensure same ID gets matched with same ID, ie if you remove a node it gets removed from your array of nodes, etc.

@kemitchell
Copy link
Author

@juliangruber: Absolutely.

I've played around a bit with nanocomponent, and I think I see this clearly from the top down now.

If you use nanocomponent in functional-reactive/React style, where you rerender the entire UI tree on each state change, with new-versus-old-state checks like _update(props) // boolean to short-circuit rerendering when possible, then ids based on a global counter will work against same() and the list-reorder optimization of nanomorph. You'll instantiate new components for all children during each rerender of each dirty-state component, creating a totally new set of unique child id attributes. New, unique id attributes will never match existing id attributes in the rendered tree, so same() always returns false, and you get naive behavior in nanomorph's updateChildren().

You can cheat this by stepping away from the functional-reactive style in your components, caching elements from the last render pass and using DOM methods to mutate to what you need, based on how state has changed. This approach takes you back toward an older, more stateful component approach, the one a lot of us know from Backbone.

Not all the way back, but to a middle ground. Your components become function (oldState, newState, lastTree) // new tree, instead of function (state) // new tree. Both your "has my data changed" logic and your rendering logic have access to prior state and tree. Handling the updates in-place lets you reuse component elements with unique per-component-instance ids, so nanomorph's same() can still find matches.

The simple example in nanocomponent's README show this approach. Rerendering involves changing the existing DOM element, held with a pointer, via DOM methods. Not creating a new element. You couldn't write that component's logic in a pure functional language.

Where does that leave us?

I will probably run a patch to nanomorph for my own project, adding the else if statement for data-diffkey checks. The benefits are real for me, and I don't think my application is conducive to the more stateful component approach.

I still think the change is good for nanomorph generally, since folks will be interested to use in the FRP style, even if choo style ends up being more stateful.

I'd love to find out how you benchmark choo front-end perf, and try a well-known example with key-based diff hints. I suspect the benefit of better diff optimization would outweigh the cost of an additional property tree-wide in many use cases, but that's the kind of thing you have to try not to think about, and just bench.

@juliangruber
Copy link
Contributor

Excellent analysis, @kemitchell!

What do you think about making the ID attribute configurable, so you can in fact set it depending on the data - as in react - and then use the more functional style.

It's interesting how similar nano seems to react components yet using instances instead of data identifiers shows a huge conceptual difference. I wonder if this is a conscious decision or rather a (happy) accident of design? @yoshuawuyts

@kemitchell
Copy link
Author

@juliangruber Making the identify key configurable---either id or something else that you choose, but not both---would work for all uses cases I have in mind.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants