Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pluggable "Graph Coloring"/"Graph Walking" Algorithm #159

Open
nathanhammond opened this issue Apr 9, 2024 · 7 comments
Open

Pluggable "Graph Coloring"/"Graph Walking" Algorithm #159

nathanhammond opened this issue Apr 9, 2024 · 7 comments

Comments

@nathanhammond
Copy link
Member

Seems like everybody else here is talking about UI, so let's switch to build systems!

In a "live" (dev mode) build system graph it's plausible to create enough reactive components such that processing the reactive graph itself takes up a huge portion of the overall time. (Some builds are hundreds of compute hours, crammed into minutes of wall clock time.) One strategy that can been implemented in user space with this proposal is "fractal" signals, where sets of signals are coalesced and checked first, and then underlying signals second. (Think: folder, file, function, statement.)

Another strategy, which I don't see a way to accomplish, is optimistic subtree/subgraph traversal: "All of this work ends up in a signal, the output is likely to be consistent, but we can't prove that it doesn't matter until completing an expensive transpile + typecheck step. Proceed to do other things below based on these hypothesized sets of data at a particular signal node, while waiting on nodes higher in the graph to confirm."

This increases complexity in the "preventing glitches" step because you are effectively intentionally introducing numerous simultaneous forks based upon predictable outcomes and then slowly (in compute-hour time, not wall-clock time) working to collapse them into a single consistent graph. The goal here is to pull forward underlying work as much as possible, even if there is risk that it needs to be redone. It's branch prediction, at the Signal graph level, optionally informed by something that looks similar to Excel's memoized calculation chain.

In particular, in a build, many tasks often re-converge to a "consistent" state at a later node in the graph because as developers we write libraries with changes that are intentionally designed to be immaterial to all child nodes (SemVer!). But we can't prove that without doing a lot of work. This is different from DOM-tree and UI-related graphs which ultimately end up with fewer nodes that have multiple parents, and fewer points of convergence.

Low-fidelity example of how many steps in a build system may appear, with many orders of magnitude fewer nodes:

flowchart TD
  root --> library1 & library2 & library3

  library1 --> a & b & c;
  a --> a1 & a2 & a3 --> a';
  b --> b1 & b2 & b3 --> b';
  c --> c1 & c2 & c3 --> c';
  a' & b' & c' --> library1.out

  library2 --> x & y & z;
  x --> x1 & x2 & x3 --> x';
  y --> y1 & y2 & y3 --> y';
  z --> z1 & z2 & z3 --> z';
  x' & y' & z' --> library2.out

  library3 --> i & ii & iii;
  i --> i1 & i2 & i3 --> i';
  ii --> ii1 & ii2 & ii3 --> ii';
  iii --> iii1 & iii2 & iii3 --> iii';
  i' & ii' & iii' --> library3.out

  library1.out & library2.out & library3.out --> app
Loading

The distribution of tasks across a fleet of build boxes and getting those environments set up early (fixed time costs) while doing our best to create a full closure over the set of cached assets possibly needed for performing a final computation (even if the first attempt is mispredicted) can still be a significant performance win.

The core of this can be modeled as "N forks, work queue to settle the graph, plus invalidate and reschedule in cases where we mispredict." While probably also layering fractal signals on top. 😅


Builds also regularly hit the "set-inside-computed" pattern in order to do things for cyclic dependencies which is both unfortunate and unavoidable.

@littledan
Copy link
Member

littledan commented Apr 9, 2024

Thanks for the context! It's interesting to hear about these use cases and algorithms.

If the algorithm were made appropriately optimistic for build systems, do you think an API like this would work for them?

Does optimistic evaluation ever come up in UIs?

Builds also regularly hit the "set-inside-computed" pattern in order to do things for cyclic dependencies which is both unfortunate and unavoidable.

😭😭😭

@nathanhammond
Copy link
Member Author

"Coarsening Fractal" Calculation

You can skip processing these portions of the signal graph because I, the application, know that this other signal (with coarser granularity, but presumed full closure over children) has not changed.

To implement this I need to be able to define what amounts to a set of "parent" signals to subgraphs. The finer signal graph only gets entered at points where the coarser signal graph fails to match. That can't really be reduced to "coarser layer as signal input to finer fractal layer"; it's instead a new node "above" (and not participating in) a higher fidelity graph.

This would generally be automatically constructed for all subgraphs that look like the chart I included in the above post, but can also be constructed for non-converging overlays, as long as the underlying Signal graph is "resumeable" from particular nodes. That's actually complicated though: you need to be able to tell the graph walking algorithm, "I know you don't need to walk these, I don't have time to explain how I know, just trust me."

There's also a tradeoff here on layer count and bonus bookkeeping, but that's my problem, not the primitive's. I also don't believe that this can be done as a JS-engine generic optimization problem, it generally requires some knowledge of the domain of application to be able to construct the (presumed) full closure Signal.

Implied Feature Request: Signal Graph Coarsening

A way to specify that some Signal in a separate overlay graph is a "full closure" of a subgraph in a higher-fidelity Signal graph. This means being able to specify the root and the set of child nodes that represent the leaves of the otherwise-processed full-fidelity graph.

The test for full closure could be as simple as a lot of file hashes pulled from a git index (cheap), whereas the granularity of full fidelity graphs could be at the "type check this statement" level (expensive).

This type of construction is easily subject to user error by not actually being a full closure, but reducing expensive computations into approximations that are presumed to be full closures is basically the entire point of build systems so it's an "accepted" risk from the beginning.

@nathanhammond
Copy link
Member Author

nathanhammond commented Apr 9, 2024

Optimistic Work Division

These 12 files changed, I know that they're inputs to these N Signal nodes, and we're just going to kick off all of those simultaneously. We'll find out if any of those nodes' parents are affected later and recompute if we absolutely have to.

We don't want to walk the graph in full topological order if we can correctly guess that entering the graph in multiple places simultaneously on a dozen different build farm computers (or even just preparing them) will result in faster wall clock time. This is predictive of the work that has to be done, and can be wasteful of compute. It's an application concern to minimize wasted compute.

While waiting on the Signals to propagate we can complete processing whole sections of the build, aiming for eventually consistent at a particular generation number.


Does optimistic evaluation ever come up in UIs?

This basically requires multiple threads in order to get any ROI.

In my experience with Glimmer (I'm Ember community) I don't recall ever seeing anybody even asking for it, much less an implementation of it. Trying to hit a 16ms deadline means that generally you're trying to be as cheap as possible through the entire graph. That's tangentially related to @NullVoxPopuli's #151 where simply deciding "this work is cheaper than the bookkeeping of a Signal" makes more of an optimization difference.

Since JS can spawn processes and can be multi-threaded in DOM, optimizations like this are plausibly valuable in user space—but those boundaries to parallelism are expensive. I'm not convinced that "divide, marshal, send, unmarshal, calculate, marshal, send, unmarshal, merge, rerun mispredicts" will ever make sense for UI. I'm not certain of a user-space-driven use-case for this; it'd almost have to be an API-tunable behind-the-scenes optimization where the engine does parallelism for us.

(Given that precondition, it should probably not be included in any proposal other than a passing "we thought about it, and if we did it, it would probably follow this strategy.")

@backbone87
Copy link

I dont claim to fully understand the strategies proposed here, but from what I understand is that this should be an app level opt-in instead of an inherit feature of the proposal? The thing that the proposal should provide in those cases is the ability to track the state associated with those "optimistic computations".

Just scribbling down some ideas:

const a = new Signal.State(0); // input

const signal = new Signal.Eventually(
  /* initial value */ 0,
  /* cheap route */ (prevValue) => {
    const [value, confidence] = cheap(prevValue, a.get());
 
    switch(confidence) {
      case 'definite':
        // use this value, no expensive route
        return value; 
      case 'optimistic':
        // start expensive route, use given value in the mean time
        throw new Signal.Eventually.Optimistic(value, /* args */);
    }
  },
  /* expensive route */ async (optimisticValue, /* args */) => {
    return new Promise((resolve) => setTimeout(resolve, 10_000));
  }
)

I think that could even be implemented in the current proposal just in terms of State + Computed.
Would this somehow enable (some of) the described use cases?

@dead-claudia
Copy link
Contributor

Curious question: how much of this intersects with #165?

I didn't really dive that deep in trying to wrap my head around this, so apologies if my suspicion seems wildly off. Just seemed like some of my idea rhymes a lot with yours.

Also, I will note yours is more related to async signals (which do have a use case, even outside build systems!) than sync ones (this proposal currently). And async Computed signals in particular (with cancellation) would prove very handy. It could especially help in taking React Router's data stuff and splitting it from the router-specific bits.

@shaylew
Copy link
Collaborator

shaylew commented Apr 21, 2024

@nathanhammond Do you have ideas for what sorts of APIs these sorts of use cases end up wanting, or thoughts on which decisions in the proposal as it stands preclude certain use cases you're interested in?

I'm not sure how you'd want to express the graph-coarsening transformations here. Watchers do at least give you the ability to react to "might-have-changed-ness" of a Computed, before rerunning its dependencies to ascertain their really-changed-ness.

Is Salsa's "durability" system relevant here, or is it too coarse/too much of a special case?

That's actually complicated though: you need to be able to tell the graph walking algorithm, "I know you don't need to walk these, I don't have time to explain how I know, just trust me."

I've prototyped some systems where the "dirtying" (what happens if something upstream changed-definitively or might-have-changed) and "verification" (determine if this node needs to rerun or is truly unchanged) methods are overridable. There are definitely things you can express that way, and it means things like the version number trick can be left to user space. The tricky bits are:

  • What global guarantees (...if any) can you preserve in the face of that user-space power?
  • Do the additional hooks slow down core algorithms, allow user code to observe questionable intermediate states, or make it (even) harder to get predictable orderings of callbacks?

It might be worth us working out what the maximally pluggable system would look like, just to see if we can still salvage good behavior from it, because some of these specific bells and whistles are unlikely to make the cut individually but might still be implementable in userland on top of a more expressive core.

@doeixd
Copy link

doeixd commented Apr 22, 2024

@nathanhammond

Something like this: https://www.pzuraq.com/blog/on-signal-relays might allow for a more coarse representation of the reactive graph.

@doeixd doeixd mentioned this issue Apr 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants