
Add seed parameter #335 (Draft)

gustaphe wants to merge 5 commits into master
Conversation

@gustaphe (Contributor) commented Sep 21, 2023

Partial solution for #201. Since the parameter needs to be serializable, I opted for just a seed number. seed! is not super well documented, so Int may not be the perfect type, but it's pretty handy. Negative seeds appear to be invalid, so I could use a negative value as a stand-in for "nothing", since Union{Int, Nothing} ruined serialization. (A sketch follows the checklist below.)

  • PR
  • Complete tests
  • Documentation
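
A minimal sketch of the approach described above; the Parameters struct and the maybe_seed! helper are simplified illustrations, not BenchmarkTools' actual definitions:

using Random

mutable struct Parameters
    samples::Int
    seed::Int   # -1 is the "no seed" sentinel; a plain Int serializes cleanly, unlike Union{Int, Nothing}
end

# Reset the global RNG once per benchmark run, but only when a seed was given.
maybe_seed!(p::Parameters) = p.seed >= 0 && Random.seed!(p.seed)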

@gustaphe (Contributor, Author) commented Sep 21, 2023

  • Don't rely on 1.7 kwarg syntax

(I thought about this and forgot to fix it)

@gustaphe (Contributor, Author):

There is a problem in that this only works if the benchmarks take the same number of samples. I don't know whether this should be enforced or marked as a caveat: "The same seed guarantees the same sequence of random numbers, but the benchmarks may still consume an unequal number of random numbers."

In practice, consistency increases even if it's not guaranteed to be perfect.

@gdalle (Collaborator) commented Sep 23, 2023

Probably a dumb question, but why not give a seed to each benchmarkable instead?

@gustaphe (Contributor, Author):

I guess one could. But the seed needs to be set before an entire benchmark run, not before each sample. And if it's not for a group, you can just call seed! before running the benchmark.

An alternative is to add a hook to run, like run(group; setup=(seed!(1234))) (see the sketch below); I don't have a strong opinion.
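
A hedged sketch of that alternative; the setup keyword and the wrapper function are hypothetical, not an existing BenchmarkTools API:

using Random
using BenchmarkTools

# Hypothetical wrapper: invoke a setup callback once before the whole group
# runs, rather than before each sample (which is what the per-benchmark
# setup expression does).
function run_with_setup(group::BenchmarkGroup; setup = () -> nothing, kwargs...)
    setup()  # e.g. () -> Random.seed!(1234)
    return run(group; kwargs...)
end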

@gdalle (Collaborator) commented Sep 26, 2023

Given this recent Discourse thread, I think a seed for each benchmarkable would also make sense:

https://discourse.julialang.org/t/making-benchmark-outputs-statistically-meaningful-and-actionable/104256/

What do you think @gustaphe ?

@gustaphe (Contributor, Author):

I didn't think that would work, but it took me like a second to convince myself it would. If nobody's made a PR like that by Saturday, I will.

@gdalle (Collaborator) commented Sep 29, 2023

Wait, isn't it already possible to set the seed in the setup code of each benchmarkable?

@gustaphe (Contributor, Author) commented Oct 1, 2023

> Wait, isn't it already possible to set the seed in the setup code of each benchmarkable?

No. If you set the seed in the setup, it gets reset for every sample, not just at the start of the benchmark run. Consider

julia> b = @benchmarkable sleep(rand([0, 0.5])) setup=(Random.seed!(1234))
Benchmark(evals=1, seconds=5.0, samples=10000)

julia> r = run(b)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  1.531 μs … 48.249 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     1.739 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   1.836 μs ±  1.147 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

         ▃▆██▅▄▃▁
  ▂▂▂▃▃▅▇█████████▇▆▅▄▄▃▃▃▃▃▃▃▃▂▃▂▃▃▂▃▂▂▂▂▂▂▂▂▂▂▂▁▁▁▂▂▁▂▁▁▁▂ ▃
  1.53 μs        Histogram: frequency by time        2.57 μs <

 Memory estimate: 192 bytes, allocs estimate: 5.

vs

julia> b = @benchmarkable sleep(rand([0, 0.5])) setup=(Random.seed!(1235))
Benchmark(evals=1, seconds=5.0, samples=10000)

julia> r = run(b)
BenchmarkTools.Trial: 10 samples with 1 evaluation.
 Range (min … max):  501.700 ms … 502.117 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     501.726 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   501.802 ms ± 156.542 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

  █           ▃
  █▇▇▁▇▁▁▁▁▁▁▁█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▇▁▁▁▁▁▁▇ ▁
  502 ms           Histogram: frequency by time          502 ms <

 Memory estimate: 192 bytes, allocs estimate: 5.

Each of these runs only ever uses a single result from rand. What you want is a series of random numbers, but in a repeatable sequence.
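
For contrast, a sketch of the intended behavior: seeding once before the whole run makes the samples draw from a repeatable sequence, instead of every sample seeing the same single value:

julia> using Random, BenchmarkTools

julia> b = @benchmarkable sleep(rand([0, 0.5]));

julia> Random.seed!(1234); r1 = run(b);  # seed once, before the run

julia> Random.seed!(1234); r2 = run(b);  # r2 draws the same random sequence as r1
                                         # (up to the number of samples taken)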

@gustaphe (Contributor, Author) commented Oct 1, 2023

I would guess that the most useful way to use this parameter is to supply it as run(::BenchmarkGroup; seed=something), but this really does look like the best implementation.

gustaphe changed the title from "[WIP] Add seed to BenchmarkGroup" to "Add seed parameter" on Oct 1, 2023
@@ -85,6 +85,7 @@ You can pass the following keyword arguments to `@benchmark`, `@benchmarkable`,
- `gcsample`: If `true`, run `gc()` before each sample. Defaults to `BenchmarkTools.DEFAULT_PARAMETERS.gcsample = false`.
- `time_tolerance`: The noise tolerance for the benchmark's time estimate, as a percentage. This is utilized after benchmark execution, when analyzing results. Defaults to `BenchmarkTools.DEFAULT_PARAMETERS.time_tolerance = 0.05`.
- `memory_tolerance`: The noise tolerance for the benchmark's memory estimate, as a percentage. This is utilized after benchmark execution, when analyzing results. Defaults to `BenchmarkTools.DEFAULT_PARAMETERS.memory_tolerance = 0.01`.
- `seed`: A seed to which the global RNG is reset before every benchmark run, if it is non-negative. This ensures that comparing two benchmarks gives actionable results even when the running time depends on random numbers. Defaults to `BenchmarkTools.DEFAULT_PARAMETERS.seed = -1` (indicating no seed reset).
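
For reference, a hedged usage sketch of the parameter this hunk documents, assuming the API proposed in this PR:

julia> using BenchmarkTools

julia> @benchmark sleep(rand([0, 0.5])) seed=1234  # global RNG reset to 1234 once per run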
Review comment (Collaborator):

I wonder if this can have surprising side effects, since it changes the global RNG. It's a good thing that seed = -1 disables it by default; I'm just trying to imagine a way this could come back to bite us.

@Seelengrab (Contributor) commented Oct 10, 2023

Negative seeds will soon be allowed (JuliaLang/julia#51416), so it would be better to use a different value, e.g. nothing, as the "no seed given" sentinel.

@rfourquet commented Oct 10, 2023

I'm not clear on the details here, but instead of asking for a seed, why not copy the state of the global RNG at the beginning (rng_copy = copy(Random.default_rng())) and then restore that state before each benchmark run (copy!(Random.default_rng(), rng_copy))? This is much faster than seeding the RNG, and could even be done by default. If the user then wants to try a specific seed, seed!(seed) can be called manually at the beginning.

EDIT: this is what @testset does, BTW.
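
Concretely, the suggestion amounts to the following, using only Random's documented copy/copy! methods:

using Random

rng_copy = copy(Random.default_rng())    # snapshot the global RNG state once, up front

# ...then, before each benchmark run:
copy!(Random.default_rng(), rng_copy)    # restore the snapshot; cheaper than seed!(...)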

@Seelengrab (Contributor):

I'd go one step further and let the user provide the entire RNG object they want to use, defaulting to a state-resetting copy of the default RNG.
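
A sketch of that variant; the rng keyword and the runner function are hypothetical, not BenchmarkTools API:

using Random

# Hypothetical runner: the caller may pass any Xoshiro state; by default we
# snapshot the current global RNG. The global state is reset from that object
# before the run, so repeated runs see the same random sequence.
function run_seeded(f; rng::Random.Xoshiro = copy(Random.default_rng()))
    copy!(Random.default_rng(), rng)  # reset the task-local RNG from the given state
    return f()
end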

@gdalle (Collaborator) commented Oct 30, 2023

@gustaphe what do you think of this proposal?

@gustaphe (Contributor, Author):

I've been waiting for a good time to read up on this and try it out, but haven't found one. The obstacles I see are:

  • Serialization: as is, benchmark objects are serialized by some method I don't quite understand, so using an integer worked out of the box, but the other approaches (possibly even the suggestion of using nothing as a sentinel) don't.
  • Is there a good way to supply an RNG object and have it reset every time? It's no use if this is just an RNG that is set once and then evolves throughout the experiment.

Like I said, I don't know whether these are reasonable objections; I just haven't had time to think carefully about them.

@Seelengrab (Contributor) commented Nov 1, 2023

Serializing e.g. an Xoshiro should be perfectly possible: it just needs to serialize the internal state and then reconstruct the RNG object from that state. I'm kind of surprised this isn't already implemented in Base. Resetting it every time just means copying the existing RNG object before the benchmark and using the copy for the benchmark. I'd definitely make that toggleable, though.
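
A sketch of the state round-trip described above, using Random's copy/copy!; on-disk serialization would store and restore the same internal state:

using Random

rng = Xoshiro(42)
x1 = rand(rng)

snapshot = copy(rng)       # capture the full internal state
x2 = rand(rng)             # advance the original
copy!(rng, snapshot)       # restore the captured state
@assert rand(rng) == x2    # the restored RNG reproduces the same draw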

gdalle marked this pull request as draft on January 17, 2024