Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SSR Cache RFC #16

Closed
wants to merge 2 commits into from
Closed

SSR Cache RFC #16

wants to merge 2 commits into from

Conversation

adam-26
Copy link

@adam-26 adam-26 commented Jan 23, 2018

Create RFC for SSR cache

Initial commit

```js

interface CacheStrategy {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

interface CacheStrategy<T> {
  getCacheState(...): T | undefined
  render(..., cacheState: T, ...): string
}

?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will update, thanks!

Make the cache strategy interface type safe
@aickin
Copy link

aickin commented Feb 2, 2018

I'm generally a fan of allowing cache hooks in SSR, and I'm excited to see this proposal. Thanks for it!

I'm a bit concerned about the level of the API, though. I tend to think that a component author is the best authority on whether or not a component can be cached, rather than the developer who is calling renderToString, so I would prefer a standardized lifecycle method, perhaps like your getCacheKey example.

I understand that getCacheKey can be implemented in user space with this proposal through cache strategies, but I think that's a recipe for getting multiple, incompatible new lifecycle methods defined by different cache strategy authors. What happens if I'm using component A from library A which expects a cache strategy with a getCacheKey lifecycle method at the same time I'm using component B from library B which expects a cache strategy with a componentWillCache lifecycle method?

@gaearon
Copy link
Member

gaearon commented Feb 2, 2018

Could components do this instead (without pluggable strategies)?

For example, what if components had optional static hooks like getCachedRenderedString() and saveRenderedStringToCache()? (Naming is strawman)

If getCachedRenderedString() exists, it gets called with props. If it returns a string, that string is used instead of rendering the component. Otherwise, component is rendered, and saveRenderedStringToCache() is called with the props and the string.

Sure, third-party components wouldn't know what to do, but IMO you wouldn't want them to implement caching at all—you'd do this at a higher level for your own components.

Am I fundamentally misunderstanding the problem?

@gaearon
Copy link
Member

gaearon commented Feb 2, 2018

Alternatively there could be a <CachingBoundary> component.

<CachingBoundary hash={hash} saveToCache={...} loadFromCache={...}>
  <RouteHandler />
</CachingBoundary>

On the client it would just render the children, on the server it would be treated specially.

@streamich
Copy link

This is not exactly related to this PR, but to SSR in general. Would be nice to have React natively support async data fetching on the server side, something like what react-resolver does.

@gaearon
Copy link
Member

gaearon commented Feb 2, 2018

The discussion on data fetching is a very complex one and I ask to refrain from it from this RFC, or it will get completely derailed. :-)

@aickin
Copy link

aickin commented Feb 3, 2018

For example, what if components had optional static hooks like getCachedRenderedString() and saveRenderedStringToCache()? (Naming is strawman)

If getCachedRenderedString() exists, it gets called with props. If it returns a string, that string is used instead of rendering the component. Otherwise, component is rendered, and saveRenderedStringToCache() is called with the props and the string.

This is very close to what I'm proposing but slightly different. I think there are two important concerns here which need to be separate because they will be decided by different people:

  1. Is one rendering of ComponentA that happened in the past the same as the rendering of ComponentA that we are about to do and should we therefore reuse the markup from last time?
  2. We just rendered ComponentA and need to somehow store the results so that we can use it next time.

I think concern 1 is the domain of the component author. It is almost identical to the answer that componentShouldUpdate answers when rendering to DOM (or Native, etc.): is this new render the same as or different from the last?

I think concern 2 is the domain of the web app author: should I put the cache in memory on each web server? Or maybe a shared Redis cluster? Or in DynamoDB? MS Access?

The straw man you proposed (getCachedRenderedString() and saveRenderedStringToCache()) conflates these two concerns, I think, and requires the component author to think about how the cache is stored and persisted. I would prefer a component lifecycle method like getCacheKey(props) that returns a string to use as lookup key in the cache and use for adding to the cache. This would answer the question in concern 1.

Then I would advocate for a CacheStrategy object that is passed into renderToString that has events for adding to and looking up from the cache. This could be implemented for many different backends (memory, Redis, etc). This would answer the question in concern 2.

Full disclosure: I implemented something like this in react-dom-stream as an experiment: https://www.npmjs.com/package/react-dom-stream#experimental-feature-component-caching

My proposal wouldn't allow for the template feature that @adam-26 wants. I am not as excited by that particular use case, to be frank, but I may not be understanding why it's awesome, so I'm more than happy to hear more about it.

The discussion on data fetching is a very complex one and I ask to refrain from it from this RFC, or it will get completely derailed. :-)

100% agreed. I think that discussion, while important, is totally unrelated to the decisions we make here.

@adam-26
Copy link
Author

adam-26 commented Feb 5, 2018

@aickin, I like the concept of introducing a new lifecycle method to get a cache key for a component, it makes adding caching to an application really simple. And clearly separating the responsibilities of rendering and caching definitely reduces complexity. From an implementation perspective, I'm guessing this would be a simpler version of what @gaearon suggested, a single static hook static getCacheKey(props) {...}? No need to create a component instance if its cached.

The original motivation behind the <Template> was to support the ability to inject dynamic content into cached content. A simple example would be, rendering a cached <Header> component and injecting the currently authenticated users <Avatar>. I find that when adding cache to an application its often simpler (in terms of deciding what to cache) to cache with exceptions.

I'm not sure what your opinions on such an approach are, but I think this could be supported with a getCacheKey() static hook method by supporting nested cached components, and it would be much simpler than attempting to use the Template approach. For this to succeed, the following would be required:

  1. when the parent component is rendering to be cached, the unmasked context of the parent and child component would need to be compared to ensure the context has not been changed - if the context has changed then nested component caching can not be supported, display a warning.
  2. returning false from the child components getCacheKey() static hook method prevents the component from being cached (in which case its rendered on every request)
  3. The props to be passed to the nested cache component would also need to be cached with the parent component cache - and passed to the nested component on future renders (when rendering from the cache). There may need to be some restrictions on what PropTypes are supported in this scenario.. I'll need to spend some more time thinking about this - it's probably the most complicated requirement of nested caching.

If you'd like me to provide a more detailed explanation of these points, please let me know.

This may be a little off topic as its regarding implementation specifics, but for requests that use multiple cached components for rendering, I would prefer that cache keys are batched and only a single request is made to the cache provider.

Looking forward to your feedback.

@tpreusse
Copy link

tpreusse commented Feb 26, 2018

User Space Solution

import React from 'react'
import PropTypes from 'prop-types'

let getHtml
if (!process.browser) {
  const ReactDOMServer = require('react-dom/server')
  const cache = require('lru-cache')({
    max: 1000 * 1000 * 200, // 200mb
    length: d => d.length
  })

  getHtml = (key, children) => {
    if (cache.has(key)) {
      return cache.get(key)
    }
    const html = ReactDOMServer.renderToStaticMarkup(children())
    cache.set(key, html)
    return html
  }
}

const SSRCachingBoundary = ({cacheKey, children}) => getHtml
  ? <div dangerouslySetInnerHTML={{
    __html: getHtml(cacheKey, children)
  }} />
  : <div>{children()}</div>

SSRCachingBoundary.propTypes = {
  cacheKey: PropTypes.string.isRequired,
  children: PropTypes.func.isRequired
}

export default SSRCachingBoundary

Usage

export default ({data}) => (
  <Frame>
    <Loader loading={data.loading} error={data.error} render={() => (
      <SSRCachingBoundary cacheKey={data.article.hash}>
        {() => renderBigTree(data.article)}
      </SSRCachingBoundary>
    )} />
  </Frame>
)

real world example

Drawbacks

  • an extra div
  • no (automatic) context
  • dynamic requires need to be excluded from webpack with externals

Probably the solution could be improved upon by passing the cache and a render to html function via context on the server side. Is there something like unstable_renderSubtreeIntoContainer or portals that could be used easily on the server?

@Minivera
Copy link

I have a few questions related to this RFC and my particular use case which, I think, could add to this discussion. To start, I'll add a quick explanation of my use case. You can skip it if necessary, but I think it explains better what I would like to achieve from a caching standpoint using this RFC.

Basically, I am currently building a page builder using React. I compile the react structure to JSON, save it inside my database and reload it through a GraphQl API on the frontend for rendering.

With this system, I allow the users to create a template (Like a master page in ASP), widgets (Reusable components) and pages. These three component may contain DataSource, which links to the API or the Redux store.

To simplify some of the processes, we have made some considerations :

  1. Unless data has changed (In which case we will invalidate the Redis cache for that specific data), we can consider that, for the same query, the API will always return the same data.
  2. Redux will only be used to keep the general state of the application (If a modal is opened, the offset of a paginated list...) and as hooks to our monitoring strategies.
  3. There will be no statefull components (Except third-party ones)

This keeps the rendering simple and predictable and I can easily cache queries.

To get back to the RFC, I am very interested in the capabilities of templates and how they could allow me to cache the renderer structure. Since my users can pretty much do anything, I need an easy way to cache on a per component basis and inject data in them.

Currently, I use a bunch of HOC to hydrate my components with data from various sources and connect them with their props. Since I don't know what my structure may look like, it's easier to use these than try to create logic in all my available components.

Using templates, how would I be able to know if a children is templated and cached? For example, I have a connectToDataSource HOC that gives a data prop to the children who can then read from it and replace templated props (String literals) with the data there. How would the child get access to that data prop if it is templated so that it can render itself using these templates?

Again with templates, my dataSource component can render the same child(ren) multiple times depending on the data, for example if I had three children with six data entries, it would render this :

  • Child 1 (Entry 1)
  • Child 2 (Entry 2)
  • Child 3 (Entry 3)
  • Child 1 (Entry 4)
  • Child 2 (Entry 5)
  • Child 3 (Entry 6)

How would that be handled with templates? Each of these children would be cached with a template and I would need to choose and hydrate each of them. Would I use a {children} template string that receives the children cached string? (In that case, would the rendering go from leaf to root rather than root to leaf? (I am unsure which one it is right now))

One of my strategies for cache was to cache queries. In my query components, I would use the props as indexes in my redis store and check if an entry less than 24 hours old exists and if yes, take that entry instead. If using an apollo graphQL query, apollo already exposes a function that fetches the data from the API before going to renderToStream, so its fine there. That would probably get into conflict with the caching due to the async nature of queries and cache only "loading" components. Would I need to patch the apollo function that runs through the structure and fetch all data before caching or could this be handled with some kind of wait mechanism to cache loaded components?

Finally, how would caching strategies that need to fetch the cached data from an async location be handled using this method? I could hardly see the rendering wait for the cache to resolve or huge cache be loaded in memory for that (Like in my case where I would be able to cache an theoretically infinite amount of components).

In any case, I can't wait to see where this RFC will go.

@raphaeleidus
Copy link

Coming from a rails background I am used to relying heavily on Partial Caching for delivering high performance. Right now react is way behind in not offering this out of the box, that being said the templatization aspect is amazing, this would take React to being one of the best solutions for SSR

@hueyhe
Copy link

hueyhe commented Oct 27, 2020

I'm very interested in this RFC. Is this topic still active? Recently I'm working on a migration from using ejs template rendering to using React server-side rendering. The overall performance is what I'm concerned about most. A caching strategy could be a great solution.

@gaearon
Copy link
Member

gaearon commented Aug 18, 2021

The problem itself is definitely still of interest. But quite a few things changed in the landscape which make this RFC pretty outdated. We moved to Hooks as the primary API. The SSR is significantly redesigned to support true streaming (waiting for data to load). Server Components (#188) become an important architectural consideration. The exact way they are layered with new SSR is still a question and there is further research needed there. The data fetching story has first-class support for IO sources like disk, databases, etc, built with Suspense adapters. A cache would need to be integrated with these since there would need to be a way to specify what invalidates it (e.g. disk change, timestamp change, and so on). To reiterate, this is an important design area that we're interested in, but the landscape has changed so much that this proposal does not satisfy the needs.

@gaearon gaearon closed this Aug 18, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants