Skip to content
This repository has been archived by the owner on Jun 20, 2024. It is now read-only.

Unshared blockstore #150

Merged
merged 3 commits into from
Jun 22, 2023
Merged

Unshared blockstore #150

merged 3 commits into from
Jun 22, 2023

Conversation

willscott
Copy link
Collaborator

This removes the complexity of notifier / shared context between different fetches.
If blocks from a car are not in the order they are requested from the gateway, blocks will be immediately queried to fill any missing gaps.

  • I expect this will lead to a significantly higher number of block fetches. i don't know exactly how many.
  • I'm interested to see if this changes our observed response sizes

@willscott willscott changed the title [WIP] unshared blockstore Unshared blockstore Jun 22, 2023
@willscott
Copy link
Collaborator Author

@aschmahmann - any objections to running this in prod for the weekend? we don't see a meaningful increase in block requests as a result of this change.

@aschmahmann
Copy link
Contributor

aschmahmann commented Jun 22, 2023

Nope. We're not really serving traffic to anyone so even prod is effectively experimentation. If so might want to deploy sooner so it can be reverted before the weekend in case there's a problem. Saturn folks (e.g. @guanzo) might care about prod timing

@aarshkshah1992 aarshkshah1992 merged commit edb5cd4 into main Jun 22, 2023
29 checks passed
@aarshkshah1992 aarshkshah1992 deleted the test/unshared-blockstore branch June 22, 2023 13:43
@aarshkshah1992 aarshkshah1992 mentioned this pull request Jun 22, 2023
} else {
return nil, nil, errors.New("failed to get notifier")
}
bstore, err := NewCacheBlockStore(1024)
Copy link
Collaborator

@lidel lidel Jun 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤔 @willscott @aarshkshah1992 @aschmahmann

Just a small ask that if we make breaking(?) change like this, we should also break (rename) metrics, just to be safe. Otherwise, we end up with confusing Grafana history making it impossible to reason about the system 😅

It also avoids situations where metric means different things when a different backend is used (block vs car).

To illustrate, the meaning of blockstore_cache_hit and blockstore_cache_requests changed, it is no longer "global hit/miss", but "hit per single http request". BUT, on _existing_Grafana it is graphed in the same place, and looks like we've only lost about ~10% of cache hits (but these are no longer across all requests, but per every single request):

2023-06-22_19-43

@lidel
Copy link
Collaborator

lidel commented Jun 22, 2023

in case this helps next week, this change exploded memory usage and prod seems to OOM and get restarted when that happens:

2023-06-22_20-21

willscott added a commit that referenced this pull request Jul 18, 2023
guanzo pushed a commit that referenced this pull request Jul 19, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants