Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ring: add GetWithOptions method to adjust per call behavior #620

Closed
wants to merge 3 commits into from

Conversation

56quarters
Copy link
Contributor

What this PR does:

This change adds a new method that accepts 0 or more Option instances that modify the behavior of the call. These options can (currently) be used to adjust the replication factor for a particular key or use buffers to avoid excessive allocation.

Which issue(s) this PR fixes:

Part of grafana/mimir#9944

Checklist

  • Tests updated
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

This change adds a new method that accepts 0 or more `Option` instances
that modify the behavior of the call. These options can (currently) be
used to adjust the replication factor for a particular key or use buffers
to avoid excessive allocation.

Part of grafana/mimir#9944
ring/ring.go Outdated Show resolved Hide resolved
@56quarters
Copy link
Contributor Author

56quarters commented Dec 18, 2024

I wasn't able to avoid the option functions escaping to the heap. Benchmarking the two methods shows that GetWithOptions is about 14% slower. I'm going to leave these as mostly separate methods and work on adding tests to GetWithOptions. After that I'll mark this as ready for review and we can start experimenting with higher replication factors in Mimir.

$ benchstat old.txt new.txt 
goos: linux
goarch: amd64
pkg: github.com/grafana/dskit/ring
cpu: Intel(R) Core(TM) i9-10885H CPU @ 2.40GHz
                                                         │   old.txt    │                new.txt                 │
                                                         │    sec/op    │    sec/op      vs base                 │
Ring_Get/without_zone_awareness-16                         863.2n ± ∞ ¹   1009.0n ± ∞ ¹        ~ (p=1.000 n=1) ²
Ring_Get/without_zone_awareness,_not_enough_instances-16   654.2n ± ∞ ¹    758.2n ± ∞ ¹        ~ (p=1.000 n=1) ²
Ring_Get/with_zone_awareness-16                            1.138µ ± ∞ ¹    1.252µ ± ∞ ¹        ~ (p=1.000 n=1) ²
Ring_Get/one_excluded_zone-16                              855.0n ± ∞ ¹    962.0n ± ∞ ¹        ~ (p=1.000 n=1) ²
geomean                                                    861.0n          979.7n        +13.80%

@56quarters
Copy link
Contributor Author

After adding unit tests for this change, it turns out to not be compatible with zone-awareness. There's an assumption in the ring that only a single instance per zone is returned during quorum checks. The result of this is that there are Ring.Get() calls that should fail when not enough zones are available that would now succeed since we return multiple instances per zone and the existing code only checks the number of instances, not zones.

@56quarters
Copy link
Contributor Author

I'm going to close this PR. The changes to the ring required for grafana/mimir#9944 are more complicated than expected. I'll put the various concerns into a design doc for some discussion with the Mimir team and other database teams as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants