ARROW-10551: [Rust] Fix unreproducible benches by seeding random number generator #8635
Conversation
jorgecarleitao
left a comment
@vertexclique thanks a lot for the PR.
Could you please explain in the PR description:
- how is the current code broken?
- how is this PR solving it
- why do we need to take another dependency (i.e. why doesn't `random` work)?
It is really difficult for me to even understand what is the issue in the first place.
I am checking / testing this PR out locally.
In my measurements, I think one issue, as @vertexclique has explained, is that different random numbers are being used between runs (and between threads), and thus the actual computations performed change from run to run. Seeding the random number generator so it always produces the same sequence of random numbers is definitely a classic way to reduce such variability and seems like a good idea to me. Seeding the random number generators can be done with the existing `rand` dependency. Here are some measurements I ran on my machine: Master @ f7027b4, ARROW-10551-fix-unreproducible-benches, and Master @ f7027b4 w/ a seeded rng.
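To make the seeded approach concrete, here is a minimal, dependency-free sketch. In the real benches one would reach for the `rand` crate (e.g. a `StdRng` seeded from a constant); to stay self-contained, this sketch substitutes a hand-rolled SplitMix64 generator, and the `BENCH_SEED` constant and `bench_data` helper are illustrative names, not code from this PR:

```rust
/// Illustrative fixed seed; any constant works, as long as every
/// run and every thread uses the same one.
const BENCH_SEED: u64 = 42;

/// SplitMix64: a tiny deterministic PRNG, standing in here for a
/// seedable generator such as rand's StdRng.
struct SplitMix64 {
    state: u64,
}

impl SplitMix64 {
    fn new(seed: u64) -> Self {
        SplitMix64 { state: seed }
    }

    fn next_u64(&mut self) -> u64 {
        self.state = self.state.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

/// Generate benchmark input deterministically from the fixed seed.
fn bench_data(len: usize) -> Vec<u64> {
    let mut rng = SplitMix64::new(BENCH_SEED);
    (0..len).map(|_| rng.next_u64()).collect()
}

fn main() {
    // Two independent "runs" see byte-identical input, so any
    // difference in benchmark timings is not caused by the data.
    assert_eq!(bench_data(1024), bench_data(1024));
    println!("seeded data is reproducible across runs");
}
```

With this shape, the input to every benchmark iteration is fixed once at setup and identical on every machine and run.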
What I am trying to understand is why this is a concern: in the benchmarks, the data is generated only once, during setup, and not on every iteration of the benchmark. Thus, any randomness will be frozen for every iteration of the benchmark. If the argument is that two runs of the same benchmark do not agree due to randomness, then I do not understand how having the same seed per thread or different seeds per thread changes this behavior: the seed will still be set when the process starts and will be different per run. EDIT: @alamb beat me by seconds (and with evidence)
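The distinction being debated above can be illustrated: a seed chosen at process start (which is effectively what thread-local, automatically seeded generators give you) still differs between runs, while only a compile-time constant makes two runs replay the same sequence. A dependency-free sketch using a toy xorshift generator; the names `per_run_seed` and `CONST_SEED` are illustrative, not from the PR:

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Toy xorshift64 PRNG (state must be non-zero).
fn xorshift64(state: &mut u64) -> u64 {
    let mut x = *state;
    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    *state = x;
    x
}

/// Seed chosen when the process starts: frozen for the duration of
/// one run, but different on every run, so two runs of the same
/// benchmark still see different input data.
fn per_run_seed() -> u64 {
    SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .expect("clock before epoch")
        .as_nanos() as u64
        | 1 // keep non-zero for xorshift
}

/// Compile-time constant: identical on every run and every machine.
const CONST_SEED: u64 = 0xA11B_0000_0000_2020;

fn first_values(seed: u64, n: usize) -> Vec<u64> {
    let mut s = seed;
    (0..n).map(|_| xorshift64(&mut s)).collect()
}

fn main() {
    // Rebuilding the generator from the constant always replays the
    // same sequence -- this is what makes runs comparable.
    assert_eq!(first_values(CONST_SEED, 8), first_values(CONST_SEED, 8));
    println!("per-run seed this time: {}", per_run_seed());
}
```

In other words, "frozen during setup" removes variability *within* a run, but only a constant seed removes variability *between* runs.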
So my personal suggestion is to change all the benches to use a seeded rng.
Can you use https://github.com/vertexclique/zor/blob/master/zor if you are on Linux?
I don't think it should be different per run per benchmark; that's why it is varying.
Yeah, I didn't want to write a const compiled seed and have people forget to use the same seed every time. My implementation also does some extra, faster stuff compared to rand. But I am OK with that too, as long as we don't forget it in the benches 😄
I am running on macOS -- I looked at zor, and it looks like a good tool for making results reproducible by clearing kernel caches, etc. I wonder if you can try using the seeded random approach with zor to see if that is as reproducible.
That looks cool, but I am not enough of a numerical-algorithms expert to evaluate the randomness properties of that algorithm.
will do.
Don't worry, I will check whether it behaves as you reported, and I will change to a constant seed with a seeded rng.
@alamb I am unsure about the seedable_rng behavior right now. But let's go with this one. You can check the results here: |
Force-pushed from 6b72627 to 494bd8d
Force-pushed from 494bd8d to 12ae5dc
Anything blocking in this one with the current status quo?
alamb
left a comment
Looks good to me!
Thanks again for this @vertexclique -- it is much appreciated
This PR fixes all unreproducible benchmarks and tests.
Currently, the code reseeds after 32 MB of data, and every thread gets different randomness, so no generated data is the same as any other and results are totally different across use cases.
This PR introduces code with a fixed seed for all threads at all times, so every thread is at the same point in the random sequence at any given moment of a single benchmark.
`random` uses `thread_rng`; for the reason mentioned in 1, that does not work, and `thread_rng` also xors the seed with a per-thread value. So it is mostly random at any given time, but not consistent across concurrent use such as benchmarks. The new dependency is the bastion runtime's utility crate, so the runtime is using it already.
Seeing that we are not using rand much there, we could actually remove rand with this PR too.
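The "fixed seed for all threads" idea from the description can be sketched as follows. This is a dependency-free illustration (the `FIXED_SEED` constant and helper names are made up for this sketch, not the crate's actual API): each thread seeds its own generator from one shared constant, so every thread produces identical data, unlike `thread_rng`, which seeds each thread differently:

```rust
use std::thread;

/// One constant shared by every thread on every run.
const FIXED_SEED: u64 = 42;

/// SplitMix64 as a stand-in for a seedable generator.
fn splitmix64(state: &mut u64) -> u64 {
    *state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
    let mut z = *state;
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    z ^ (z >> 31)
}

/// Each caller builds its own generator from the shared seed.
fn seeded_data(len: usize) -> Vec<u64> {
    let mut s = FIXED_SEED;
    (0..len).map(|_| splitmix64(&mut s)).collect()
}

fn main() {
    // Two threads, each with its own generator seeded from the same
    // constant: they generate identical benchmark input.
    let a = thread::spawn(|| seeded_data(256)).join().unwrap();
    let b = thread::spawn(|| seeded_data(256)).join().unwrap();
    assert_eq!(a, b);
    println!("all threads saw identical data");
}
```

Because every thread replays the same sequence, concurrent benchmarks no longer diverge based on which thread generated their input.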