Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Eliminate use of stdlib rand #13

Closed
wants to merge 2 commits into from

Conversation

kentquirk
Copy link

@kentquirk kentquirk commented Jun 30, 2023

I've been using this project in one of mine as a way to track traceIDs that have traversed our sampling proxy. In doing some profiling under heavy load, we've realized that there's a significant performance block through the use of the default global random number generator provided by Go's standard library. The global math.rand takes a lock on every invocation and can be needlessly expensive.

PR #12 addresses that issue as well as others, but the submitter has been asked to break it up into smaller pieces. In particular, it modifies Encode and Decode for better performance, but as my use case doesn't use those, I haven't touched them here.

In a similar fashion to that, this PR uses a fast, well-distributed hashing algorithm (wyhash) as a pseudorandom number generator, and attaches it to the filter object so that no additional locks are needed. The particular version of wyhash used here is different from that in #12 and was mainly chosen because I have experience with it and am already using it elsewhere in my project.

For now, we're going to depend on our own fork, but I'd much rather see the issue addressed upstream, either by merging this, #12, or some other variant.

Another option would be to allow an RNG matching an interface to be injected rather than depending on the global one. This would also make certain kinds of testing easier.

This PR includes:

  • Move the RNG into a member of the filter
  • Implement Intn and CoinFlip functions on it
  • Write some probabalistic tests for them to verify they're reasonably well-distributed
  • Call them in place of math.rand functions
  • Replace custom bitbanging in getNextPow2 with Go's standard library for it

Thank you for this library!

Copy link
Owner

@panmari panmari left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution, this is looking pretty good! Please also post a benchmark comparison in the PR.

return &Filter{
buckets: buckets,
count: 0,
bucketIndexMask: uint(len(buckets) - 1),
rng: &rng,
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inline here as &wyhash.Rng(time.Now().UnixNano())

// means the bias is on the order of 10^-13. For our use case, that's well below
// the noise floor.
func (cf *Filter) Intn(n int) int {
// we need to make sure it's strictly positive, so mask off the sign bit
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Casting to uint would make this more straight-forward.

// purposes since n is on the order of 10^6 and our rng is 63 bits (10^19); this
// means the bias is on the order of 10^-13. For our use case, that's well below
// the noise floor.
func (cf *Filter) Intn(n int) int {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lowercase, no need to make this public


// Coinflip returns either i1 or i2 randomly by examining the least significant
// bit of the RNG.
func (cf Filter) Coinflip(i1, i2 uint) uint {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lowercase, no need to make this public.

}

// Coinflip returns either i1 or i2 randomly by examining the least significant
// bit of the RNG.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment nit:

Coinflip returns either i1 or i2 randomly with about equal chance.

The rest is an implementation detail.

@panmari
Copy link
Owner

panmari commented Sep 18, 2023

I've been doing some digging as well. The runtime internally has a fastrand which is thread local, i.e. doesn't use any mutex. Here's how the change would look like: fcadf94

There's an open question if the dependency on a runtime internal doesn't make code too brittle.

@panmari
Copy link
Owner

panmari commented Sep 22, 2023

Thanks for the suggestion. I want to avoid adding additional dependencies, hence I'm going with #15 as alternative to the std lib random.

I'd still be interested in merging the following change:

Replace custom bitbanging in getNextPow2 with Go's standard library for it

Could you split this off into a separate PullRequest?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants