Skip to content

🌊 Streams: Selectors for derived samples#213638

Merged
flash1293 merged 17 commits intoelastic:mainfrom
flash1293:flash1293/derived-store
Apr 1, 2025
Merged

🌊 Streams: Selectors for derived samples#213638
flash1293 merged 17 commits intoelastic:mainfrom
flash1293:flash1293/derived-store

Conversation

@flash1293
Copy link
Contributor

@flash1293 flash1293 commented Mar 7, 2025

Simplified massively from first state and just plugging in reselect in places where that's suitable (here to calculate the currently relevant sample documents).

Also does a drive-by layout fix.

Introduces a new xstate helper for derived data.

In most cases, the actor and state machine model of xstate is great, but for derived data using pure functions, the semantics of the useMemo hook with defined dependencies is often easier to understand and eliminates the risk of forgetting to update the derived data correctly in some cases.

It's about using the right tool for the right job - you don't need to choose between the dependency list of useMemo and the actor model of xstate, you can use what fits the case, without compromising performance.

This is the API:

const myActorContext = withMemoizedSelectors(
  createActorContext(myMachine),
  {
    derivedView: createSelector(
      [
        (ctx: MyContextType) => {
          return ctx.dependency1;
        },
        (ctx: MyContextType) =>
          ctx.dependency2,
      ],
      (dependency1, dependency2) => {
        return // expensive calculation only running when necessary
      }
    ),
  },
  (context) => (context.subMachine ? [context.subMachine] : []) // optional subscribe to changes of submachines as well
);


// in react use useMemoizedSelector hook
// this will cause the component to rerender if the selector is returning a new value
myActorContext.useMemoizedSelector('derivedView')

This is using reselect to declare the dependencies similar to a react useMemo hook - the actual selector will only run if the dependencies change, leading to similar semantics as useMemo, with the additional benefit that if the value is used in multiple places, it's still just calculated once. The component calling withMemoizedSelectors only re-renders if the value returned by the selector changes. The selector itself only re-runs if one of the declared dependencies changes.

Everything is type-safe by capturing the types of the reselect selector object via inferred type param and using it in the useMemoizedSelector type.

@flash1293 flash1293 changed the title Xstate derived store Xstate with memoized selectors Mar 12, 2025
@flash1293 flash1293 added release_note:skip Skip the PR/issue when compiling release notes Team:obs-onboarding Observability Onboarding Team backport:version Backport to applied version labels Feature:Streams This is the label for the Streams Project v9.1.0 v8.19.0 labels Mar 12, 2025
@flash1293 flash1293 marked this pull request as ready for review March 12, 2025 22:35
@flash1293 flash1293 requested a review from a team as a code owner March 12, 2025 22:35
@elasticmachine
Copy link
Contributor

Pinging @elastic/obs-ux-logs-team (Team:obs-ux-logs)

@tonyghiani tonyghiani self-requested a review March 24, 2025 10:02
@tonyghiani
Copy link
Contributor

When we first talked about this I liked the idea of using reselect for this purpose, but I had something much simpler in mind. Maybe you went throught the same mental steps, so please correct me if I'm missing something.

Using reselect, results are cached unless any of the specified dependencies change.
With that in mind, we shouldn't need an additional context wrapping around the machine provider, but in general the whole abstraction could be removed.

The approach I thought you had in mind was simpler, purely based on defining selectors and use them along with useSelector from xstate. For instance, in the first case to derive previewDocuments, we could just have

// ./state_management/simulation_state_machine/selectors.ts
const EMPTY_ARRAY: [] = []

export const selectPreviewDocuments = createSelector(
  [
    (snapshot: SimulationActorSnapshot) => snapshot.context.samples,
    (snapshot: SimulationActorSnapshot) => snapshot.context.previewDocsFilter,
    (snapshot: SimulationActorSnapshot) => snapshot.context.simulation?.documents,
  ],
  (samples, previewDocsFilter, documents) => {
    return (
      (previewDocsFilter && documents
        ? filterSimulationDocuments(documents, previewDocsFilter)
        : samples) || EMPTY_ARRAY
    );
  }
);

// On any component needing the preview documents
const previewDocuments = useSimulatorSelector(selectPreviewDocuments);

This wouldn't need to create an additional layer on top of the machine with all the related subscription, but we can derive the value we need directly on usage site, relying on the stronger selection mechanism by reselect.
For the use cases we have now, I believe it would make runtime computation cheaper and DX faster given we need to only define the selector.

I didn't try all of this, let me know if you want me to give it a shot or you tried this already 👌

@flash1293
Copy link
Contributor Author

@tonyghiani Good call - I didn't test this, I will try. I think it won't work as-is because reselect is also memoizing on the argument and it's probably stable? Basically it's missing this line I had to sneak in there: https://github.com/elastic/kibana/pull/213638/files#diff-6323649842f84a1364b2e0be97b01fe708826d3ea4759452cf02a54fc060a5a3R84

But that's most likely fixable with a very lightweight wrapper - great call for sure! Definitely simplifies a bit the setup and it's nice that it only calculates the derived view if there is actually a consumer currently rendered somewhere.

I will look into this and update the PR.

@tonyghiani
Copy link
Contributor

I think it won't work as-is because reselect is also memoizing on the argument and it's probably stable?

This is right, in that case might be worth to not use the stable reference of the snapshot but directly pass the context as the selector input:

// ./state_management/simulation_state_machine/selectors.ts
const EMPTY_ARRAY: [] = []

export const selectPreviewDocuments = createSelector(
  [
    (context: SimulationContext) => context.samples,
    (context: SimulationContext) => context.previewDocsFilter,
    (context: SimulationContext) => context.simulation?.documents,
  ],
  (samples, previewDocsFilter, documents) => {
    return (
      (previewDocsFilter && documents
        ? filterSimulationDocuments(documents, previewDocsFilter)
        : samples) || EMPTY_ARRAY
    );
  }
);

// On any component needing the preview documents
const previewDocuments = useSimulatorSelector(snapshot => selectPreviewDocuments(snapshot.context));

This should work as the context object will update once anything is assigned to the itself from on a transition

@flash1293
Copy link
Contributor Author

flash1293 commented Mar 24, 2025

const previewDocuments = useSimulatorSelector(snapshot => selectPreviewDocuments(snapshot.context));

Now that you put it there - it won't work in practice because a change in a sub machine is

  • Not creating a new context reference for the top level machine (so memoization with cause it to skip)
  • Not triggering a re-evaluation of useSimulatorSelector in the first place (so it will not even be called)

@tonyghiani
Copy link
Contributor

tonyghiani commented Mar 24, 2025

Now that you put it there - it won't work in practice because a change in a sub machine is

  • Not creating a new context reference for the top level machine (so memoization with cause it to skip)

I forgot to add a step, my bad!
In this work you moved in the previewDocuments derived value from the simulation actor to the StreamEnrichmentContext.

What I mean is that the selector could act directly against the simulation actor, using the useSimulatorSelector directly. In this case the hook should trigger a re-evaluation when its internal state changes, which would be enough to recompute the the previewDocuments.
I didn't try it as I said, but the useSimulatorSelector defined here could make the trick.

That's already used in some places, not sure if would fit well with the selector but it's worth trying 😄

@flash1293
Copy link
Contributor Author

Oh, right, that should work. I'll put it together like this, thanks 👍

@flash1293 flash1293 changed the title Xstate with memoized selectors Selectors for derived samples Mar 28, 2025
@flash1293
Copy link
Contributor Author

@tonyghiani ready for another look

@flash1293 flash1293 changed the title Selectors for derived samples 🌊 Streams: Selectors for derived samples Mar 28, 2025
Copy link
Contributor

@tonyghiani tonyghiani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love this change, using reselect makes it much cleaner and we don't need to make it more complex with wrappers on top of the machine 👌

@flash1293 flash1293 enabled auto-merge (squash) March 31, 2025 05:48
@flash1293 flash1293 merged commit c5e0b05 into elastic:main Apr 1, 2025
9 checks passed
@kibanamachine
Copy link
Contributor

Starting backport for target branches: 8.x

https://github.com/elastic/kibana/actions/runs/14192518199

@elasticmachine
Copy link
Contributor

💛 Build succeeded, but was flaky

Failed CI Steps

Test Failures

  • [job] [logs] Jest Tests #14 / should render empty component if missing workflow_user value
  • [job] [logs] Jest Tests #20 / Templates renders empty templates correctly

Metrics [docs]

Module Count

Fewer modules leads to a faster build time

id before after diff
streamsApp 424 425 +1

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

id before after diff
streamsApp 392.1KB 392.1KB -62.0B

Page load bundle

Size of the bundles that are downloaded on every page load. Target size is below 100kb

id before after diff
streamsApp 9.6KB 9.6KB +60.0B

History

kibanamachine pushed a commit to kibanamachine/kibana that referenced this pull request Apr 1, 2025
Simplified massively from first state and just plugging in reselect in
places where that's suitable (here to calculate the currently relevant
sample documents).

Also does a drive-by layout fix.

~Introduces a new xstate helper for derived data.~

~In most cases, the actor and state machine model of xstate is great,
but for derived data using pure functions, the semantics of the
`useMemo` hook with defined dependencies is often easier to understand
and eliminates the risk of forgetting to update the derived data
correctly in some cases.~

~It's about using the right tool for the right job - you don't need to
choose between the dependency list of useMemo and the actor model of
xstate, you can use what fits the case, without compromising
performance.~

~This is the API:~
```ts
const myActorContext = withMemoizedSelectors(
  createActorContext(myMachine),
  {
    derivedView: createSelector(
      [
        (ctx: MyContextType) => {
          return ctx.dependency1;
        },
        (ctx: MyContextType) =>
          ctx.dependency2,
      ],
      (dependency1, dependency2) => {
        return // expensive calculation only running when necessary
      }
    ),
  },
  (context) => (context.subMachine ? [context.subMachine] : []) // optional subscribe to changes of submachines as well
);

// in react use useMemoizedSelector hook
// this will cause the component to rerender if the selector is returning a new value
myActorContext.useMemoizedSelector('derivedView')
```

~This is using reselect to declare the dependencies similar to a react
useMemo hook - the actual selector will only run if the dependencies
change, leading to similar semantics as useMemo, with the additional
benefit that if the value is used in multiple places, it's still just
calculated once. The component calling `withMemoizedSelectors` only
re-renders if the value returned by the selector changes. The selector
itself only re-runs if one of the declared dependencies changes.~

~Everything is type-safe by capturing the types of the reselect selector
object via inferred type param and using it in the `useMemoizedSelector`
type.~

(cherry picked from commit c5e0b05)
@kibanamachine
Copy link
Contributor

💚 All backports created successfully

Status Branch Result
8.x

Note: Successful backport PRs will be merged automatically after passing CI.

Questions ?

Please refer to the Backport tool documentation

kibanamachine added a commit that referenced this pull request Apr 1, 2025
# Backport

This will backport the following commits from `main` to `8.x`:
- [🌊 Streams: Selectors for derived samples
(#213638)](#213638)

<!--- Backport version: 9.6.6 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sorenlouv/backport)

<!--BACKPORT [{"author":{"name":"Joe
Reuter","email":"johannes.reuter@elastic.co"},"sourceCommit":{"committedDate":"2025-04-01T09:47:31Z","message":"🌊
Streams: Selectors for derived samples (#213638)\n\nSimplified massively
from first state and just plugging in reselect in\nplaces where that's
suitable (here to calculate the currently relevant\nsample
documents).\n\nAlso does a drive-by layout fix.\n\n~Introduces a new
xstate helper for derived data.~\n\n~In most cases, the actor and state
machine model of xstate is great,\nbut for derived data using pure
functions, the semantics of the\n`useMemo` hook with defined
dependencies is often easier to understand\nand eliminates the risk of
forgetting to update the derived data\ncorrectly in some
cases.~\n\n~It's about using the right tool for the right job - you
don't need to\nchoose between the dependency list of useMemo and the
actor model of\nxstate, you can use what fits the case, without
compromising\nperformance.~\n\n~This is the API:~\n```ts\nconst
myActorContext = withMemoizedSelectors(\n
createActorContext(myMachine),\n {\n derivedView: createSelector(\n [\n
(ctx: MyContextType) => {\n return ctx.dependency1;\n },\n (ctx:
MyContextType) =>\n ctx.dependency2,\n ],\n (dependency1, dependency2)
=> {\n return // expensive calculation only running when necessary\n }\n
),\n },\n (context) => (context.subMachine ? [context.subMachine] : [])
// optional subscribe to changes of submachines as well\n);\n\n\n// in
react use useMemoizedSelector hook\n// this will cause the component to
rerender if the selector is returning a new
value\nmyActorContext.useMemoizedSelector('derivedView')\n```\n\n~This
is using reselect to declare the dependencies similar to a
react\nuseMemo hook - the actual selector will only run if the
dependencies\nchange, leading to similar semantics as useMemo, with the
additional\nbenefit that if the value is used in multiple places, it's
still just\ncalculated once. The component calling
`withMemoizedSelectors` only\nre-renders if the value returned by the
selector changes. The selector\nitself only re-runs if one of the
declared dependencies changes.~\n\n~Everything is type-safe by capturing
the types of the reselect selector\nobject via inferred type param and
using it in the
`useMemoizedSelector`\ntype.~","sha":"c5e0b05454b124949de754556ec8ba5289445ab3","branchLabelMapping":{"^v9.1.0$":"main","^v8.19.0$":"8.x","^v(\\d+).(\\d+).\\d+$":"$1.$2"}},"sourcePullRequest":{"labels":["release_note:skip","Team:obs-ux-logs","backport:version","Feature:Streams","v9.1.0","v8.19.0"],"title":"🌊
Streams: Selectors for derived
samples","number":213638,"url":"https://github.com/elastic/kibana/pull/213638","mergeCommit":{"message":"🌊
Streams: Selectors for derived samples (#213638)\n\nSimplified massively
from first state and just plugging in reselect in\nplaces where that's
suitable (here to calculate the currently relevant\nsample
documents).\n\nAlso does a drive-by layout fix.\n\n~Introduces a new
xstate helper for derived data.~\n\n~In most cases, the actor and state
machine model of xstate is great,\nbut for derived data using pure
functions, the semantics of the\n`useMemo` hook with defined
dependencies is often easier to understand\nand eliminates the risk of
forgetting to update the derived data\ncorrectly in some
cases.~\n\n~It's about using the right tool for the right job - you
don't need to\nchoose between the dependency list of useMemo and the
actor model of\nxstate, you can use what fits the case, without
compromising\nperformance.~\n\n~This is the API:~\n```ts\nconst
myActorContext = withMemoizedSelectors(\n
createActorContext(myMachine),\n {\n derivedView: createSelector(\n [\n
(ctx: MyContextType) => {\n return ctx.dependency1;\n },\n (ctx:
MyContextType) =>\n ctx.dependency2,\n ],\n (dependency1, dependency2)
=> {\n return // expensive calculation only running when necessary\n }\n
),\n },\n (context) => (context.subMachine ? [context.subMachine] : [])
// optional subscribe to changes of submachines as well\n);\n\n\n// in
react use useMemoizedSelector hook\n// this will cause the component to
rerender if the selector is returning a new
value\nmyActorContext.useMemoizedSelector('derivedView')\n```\n\n~This
is using reselect to declare the dependencies similar to a
react\nuseMemo hook - the actual selector will only run if the
dependencies\nchange, leading to similar semantics as useMemo, with the
additional\nbenefit that if the value is used in multiple places, it's
still just\ncalculated once. The component calling
`withMemoizedSelectors` only\nre-renders if the value returned by the
selector changes. The selector\nitself only re-runs if one of the
declared dependencies changes.~\n\n~Everything is type-safe by capturing
the types of the reselect selector\nobject via inferred type param and
using it in the
`useMemoizedSelector`\ntype.~","sha":"c5e0b05454b124949de754556ec8ba5289445ab3"}},"sourceBranch":"main","suggestedTargetBranches":["8.x"],"targetPullRequestStates":[{"branch":"main","label":"v9.1.0","branchLabelMappingKey":"^v9.1.0$","isSourceBranch":true,"state":"MERGED","url":"https://github.com/elastic/kibana/pull/213638","number":213638,"mergeCommit":{"message":"🌊
Streams: Selectors for derived samples (#213638)\n\nSimplified massively
from first state and just plugging in reselect in\nplaces where that's
suitable (here to calculate the currently relevant\nsample
documents).\n\nAlso does a drive-by layout fix.\n\n~Introduces a new
xstate helper for derived data.~\n\n~In most cases, the actor and state
machine model of xstate is great,\nbut for derived data using pure
functions, the semantics of the\n`useMemo` hook with defined
dependencies is often easier to understand\nand eliminates the risk of
forgetting to update the derived data\ncorrectly in some
cases.~\n\n~It's about using the right tool for the right job - you
don't need to\nchoose between the dependency list of useMemo and the
actor model of\nxstate, you can use what fits the case, without
compromising\nperformance.~\n\n~This is the API:~\n```ts\nconst
myActorContext = withMemoizedSelectors(\n
createActorContext(myMachine),\n {\n derivedView: createSelector(\n [\n
(ctx: MyContextType) => {\n return ctx.dependency1;\n },\n (ctx:
MyContextType) =>\n ctx.dependency2,\n ],\n (dependency1, dependency2)
=> {\n return // expensive calculation only running when necessary\n }\n
),\n },\n (context) => (context.subMachine ? [context.subMachine] : [])
// optional subscribe to changes of submachines as well\n);\n\n\n// in
react use useMemoizedSelector hook\n// this will cause the component to
rerender if the selector is returning a new
value\nmyActorContext.useMemoizedSelector('derivedView')\n```\n\n~This
is using reselect to declare the dependencies similar to a
react\nuseMemo hook - the actual selector will only run if the
dependencies\nchange, leading to similar semantics as useMemo, with the
additional\nbenefit that if the value is used in multiple places, it's
still just\ncalculated once. The component calling
`withMemoizedSelectors` only\nre-renders if the value returned by the
selector changes. The selector\nitself only re-runs if one of the
declared dependencies changes.~\n\n~Everything is type-safe by capturing
the types of the reselect selector\nobject via inferred type param and
using it in the
`useMemoizedSelector`\ntype.~","sha":"c5e0b05454b124949de754556ec8ba5289445ab3"}},{"branch":"8.x","label":"v8.19.0","branchLabelMappingKey":"^v8.19.0$","isSourceBranch":false,"state":"NOT_CREATED"}]}]
BACKPORT-->

Co-authored-by: Joe Reuter <johannes.reuter@elastic.co>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backport:version Backport to applied version labels Feature:Streams This is the label for the Streams Project release_note:skip Skip the PR/issue when compiling release notes Team:obs-onboarding Observability Onboarding Team v8.19.0 v9.1.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants