Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cache redesign part 1: Don't remove expired objects #739

Closed
artlowel opened this issue Jul 1, 2020 · 3 comments · Fixed by #961
Closed

Cache redesign part 1: Don't remove expired objects #739

artlowel opened this issue Jul 1, 2020 · 3 comments · Fixed by #961
Assignees
Labels
bug code task Difficulty: High Estimated at more than 16 hours e/70 Estimate in hours high priority performance / caching Related to performance, caching or embedded objects
Milestone

Comments

@artlowel
Copy link
Member

artlowel commented Jul 1, 2020

References

This ticket corresponds to issues 1, 2, 3, 4 and 6 from #718.

That ticket was too big to tackle at once so we split it up in to smaller, more manageable chunks of work.

The first two were deemed the most critical to address ASAP. As they leave dspace 7 fundamentally broken.

The others in this ticket will be addressed at the same time because their solutions are very closely tied to the solution for the first two.

Description

When an object's cache time to live (TTL) has elapsed. It is automatically removed from the cache

It is unclear for a component developer that this has happened and how to deal with it when it does.

As a consequence if you leave a dspace 7 page open in your browser, without doing anything for more than 15 minutes, more and more things will start to break. If you wait long enough the app may stop working entirely, depending on the page. This is a bug that can not be left unresolved.

Responses and objects are cached in separate parts of the store and have separate TTL.

But both the response and the object are necessary when we need the object in a component. If you mix requests with long and short TTLs for the same objects these can get out of sync, where the response is still valid, but the object isn't anymore. The result is hard to diagnose issues in the UI. These occur often when working with the submission and mydspace due to the very short cache times in use there.

Due to this issue a request made by component A can break another unrelated component B simply because the response for A contains the an object B was using. This is also a high prio issue to address.

An object can be removed from the cache before it's been used once.

If you use a very short TTL (a few ms) it's perfectly possible that the object is already removed by the time you first try to render it, in the best case this results in pages that sometimes work, sometimes don't (if the processing takes slightly too long), or in the worst case: infinite loops of requests.

RemoteDataObjects are constructed every time they're needed.

To construct them we combine info from the response and the object cache, have some logic to determine the state, but that logic basically duplicates what we already do when parsing a request, but doesn't work exactly te same way, which leads to issues: e.g. a RemoteDataObject isn't deemed to be successful if it has no payload, but if the response was a 204 that's incorrect. Constructing them when needed also leads to long, hard to debug methods, and takes up resources.

We can't take advantage of an embedded lists of objects

because lists aren't cached in the object cache, but stored as part of the response. Only lists that were requested using their self link can be cached at the moment.

Proposed solution

To deal with these issues I propose we do the following:

  • We remove the TTL from the object cache, we store it only with the response.
    • Objects already have a link back to the request id they came with. We can use that to find the response and check if it's still valid.
    • So checking an object's validity will be identical to checking the validity of the response it came in with.
  • We store the response as a RemoteDataObject
    • The way we use response info is by translating it to a RemoteDataObject first, so we're better off just storing it that way immediately.
    • That way we remove the duplicate logic for determining the state, everything will happen in the reducer.
    • Combined with the point above that means: the state of the object in the cache = the state of the response it came in on = the state of the RemoteDataObject.
    • The only thing that needs to happen when it is requested is to resolve the link to the object in the cache, which can happen the same way it happens for all other links.
  • We no longer remove expired objects from the cache automatically
    • instead we add an extra state to RemoteDataObjects: stale.
    • An object's state will change to stale when its validity is checked and its expiration date has passed.
    • It can also be set to stale when we know it is invalid for other reasons, more below.
    • Stale objects won't disappear from any component they're being shown in.
    • The developer for a component can choose to use the stale state to let the user know, but they're not obliged to.
    • For performance reasons, objects won't automatically be set to stale if their TTL expires because that would mean keeping track of hundreds of timers. The state only changes when it is retrieved or otherwise invalidated.
    • As a consequence, if you're e.g. looking at an item page, and leave it open for a few hours without interacting, nothing will break (unlike now) but the cache won't set that item to stale because we have no cause to check its validity. Clicking the full item page link would, because it means it is retrieved from the cache once again and checked.
  • Lists will be stored in the object cache as well.
    • Lists have self links in the rest response, so there is no good reason not to cache them.
    • Currently those self links are attached to the pageinfo object inside the paginated list, but that doesn't correspond to the response.
    • We should move the _links section to the list itself.
    • Each object inside the list is already cached separately in the object cache. We'll model that by adding a page link to the paginated list's _links section. It will contain HALLinks to every object in the list.
  • Dataservice methods can automatically re-retrieve stale objects
    • We'll add a tap() to the bottom of any dataservice method that represents a GET request,
    • Inside we'll check if the RemoteDataObject being returned has switched to stale
    • If it has, we'll send a new request to the same URL.
    • Doing it in a tap() will ensure that this re-request will only happen if something is still subscribed to the result, i.e. when it's still being used.
    • You'll be able to disable this behavior when calling the dataservice method, but it will be enabled by default.

This will take an estimated 60 - 80h

@artlowel artlowel mentioned this issue Jul 1, 2020
@artlowel artlowel added bug code task Difficulty: High Estimated at more than 16 hours performance / caching Related to performance, caching or embedded objects labels Jul 1, 2020
@artlowel artlowel changed the title Cache redesign part 1: don't remove expired objects Cache redesign part 1: Don't remove expired objects Jul 1, 2020
@tdonohue tdonohue added this to the 7.0beta4 milestone Jul 1, 2020
@tdonohue tdonohue assigned tdonohue and artlowel and unassigned tdonohue Jul 1, 2020
@tdonohue
Copy link
Member

tdonohue commented Jul 1, 2020

@heathergreerklein and I discussed this yesterday and have decided to schedule first part for beta4 & assign to @artlowel. Work can begin as soon as beta3 is complete (i.e. likely as soon as next week)

@tdonohue
Copy link
Member

As discussed in today's meeting, this ticket & #740 are both assigned to @artlowel for the Atmire team to complete (in parallel to the effort to upgrade to Angular 10, see #787 ). This ticket is deemed as high priority (as we cannot start other large tasks like theming until completed). @lieven-atmire estimated both these Cache redesign tickets will be completed by/around the first week of Dec.

@artlowel
Copy link
Member Author

artlowel commented Dec 3, 2020

Time spent so far: 91 hours

@artlowel artlowel mentioned this issue Dec 14, 2020
6 tasks
kosarko pushed a commit to ufal/dspace-angular that referenced this issue Dec 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug code task Difficulty: High Estimated at more than 16 hours e/70 Estimate in hours high priority performance / caching Related to performance, caching or embedded objects
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants