
Fix the global Random.seed in runtests.jl #2381

Merged · 2 commits into master · Oct 30, 2024
Conversation

@mhauru (Member) commented Oct 29, 2024

Based on local testing, this should [EDIT: not] close #2371.

EDIT: After conversation with @penelopeysm, agreed that this should not close #2371, since when the tests fail, the discrepancies are sometimes in fact quite large, which seems suspicious. Still worth fixing the seed and making this deterministic.
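For reference, a minimal sketch of what fixing the global seed at the top of runtests.jl could look like (the seed value and placement here are illustrative, not necessarily what this PR uses):

```julia
# runtests.jl (sketch): seed the global RNG once, before any testsets run,
# so repeated test runs on the same Julia version and hardware draw the
# same random stream. The seed value 468 is illustrative only.
using Random

Random.seed!(468)

@show rand()  # identical across runs with the same Julia version
```

This only pins the stream within one Julia version; as discussed below, cross-version stability needs a different RNG.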

@mhauru mhauru requested a review from penelopeysm October 29, 2024 14:31

codecov bot commented Oct 29, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 86.39%. Comparing base (f6fdc91) to head (e6623d5).
Report is 2 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #2381   +/-   ##
=======================================
  Coverage   86.39%   86.39%           
=======================================
  Files          22       22           
  Lines        1573     1573           
=======================================
  Hits         1359     1359           
  Misses        214      214           


@coveralls commented Oct 29, 2024

Pull Request Test Coverage Report for Build 11592008026

Details

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage remained the same at 78.685%

Totals Coverage Status
Change from base Build 11590681486: 0.0%
Covered Lines: 1233
Relevant Lines: 1567


@yebai (Member) commented Oct 29, 2024

the discrepancies are sometimes in fact quite large, which seems suspicious. Still worth fixing the seed and making this deterministic.

IIRC, we have a few test models whose posteriors have a large variance, which can cause challenges for MCMC samplers.

@torfjelde (Member) commented Oct 29, 2024

IIRC, we have a few test models whose posteriors have a large variance, which can cause challenges for MCMC samplers.

Specifically, all the DynamicPPL.TestUtils.DEMO_MODELS are examples of the Inverse-Gamma-Normal model, chosen because it has a closed-form posterior, so it's easy to compare samples to the ground truth. In hindsight, this was a bad idea. The reason is that the InverseGamma prior has infinite variance 😬 Once we condition on some observations, the variance becomes finite, but because we only condition on a few observations (either 1 or 2 in all our "demo models"), the variance will still be huge. This in turn increases the probability of one of the samplers just completely falling on its butt 😕 This is why having some random (but low-probability) cases failing "horribly" isn't too surprising, in particular not for MH.

There are a few ways we can try to make this better, e.g. choose a different base model for testing, or just use more observations in the posterior (though this increases the computational burden of the tests).
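To make the variance point concrete, here is a hedged sketch of the conjugate update, assuming the classic gdemo-style prior `InverseGamma(2, 3)` with a known mean of zero (the exact demo-model parameters may differ):

```julia
# Sketch: why the Inverse-Gamma-Normal demo posterior stays heavy-tailed.
# Assumed model: s2 ~ InverseGamma(2, 3), x_i | s2 ~ Normal(0, sqrt(s2)).
# The conjugate update with n observations is
#   s2 | x ~ InverseGamma(a + n/2, b + sum(x .^ 2) / 2),
# and InverseGamma(a, b) has finite variance only when a > 2.

prior_a, prior_b = 2.0, 3.0
x = [1.5]                      # a single observation, as in the demo models

post_a = prior_a + length(x) / 2
post_b = prior_b + sum(abs2, x) / 2

# Variance of InverseGamma(a, b): b^2 / ((a - 1)^2 * (a - 2)) for a > 2
invgamma_var(a, b) = a > 2 ? b^2 / ((a - 1)^2 * (a - 2)) : Inf

@show invgamma_var(prior_a, prior_b)  # Inf: prior variance is undefined
@show invgamma_var(post_a, post_b)    # 15.125: finite, but still large
```

With a = 2 the prior variance is infinite; one observation only pushes it to a = 2.5, which is finite but heavy-tailed enough for a sampler to occasionally produce extreme draws.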

@torfjelde (Member)

Also, regarding fixing the seeds: note that this won't be consistent across hardware or Julia versions. For that we'd have to use StableRNGs.jl, but that in turn comes with additional computational overhead 😕 However, I do agree that maybe fixing the seeds can be helpful.
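For comparison, a sketch of what the StableRNGs.jl alternative could look like (assumes StableRNGs is added to the test dependencies; the seed is illustrative):

```julia
# StableRNGs.jl provides an RNG whose stream is guaranteed stable across
# Julia versions and platforms, unlike the default task-local RNG, at some
# cost in speed.
using StableRNGs

rng = StableRNG(468)  # illustrative seed
x = rand(rng)         # same value on any Julia version / hardware
# The rng would then be passed explicitly to sampling calls, e.g.
# sample(rng, model, sampler, n_samples).
```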

@mhauru (Member, Author) commented Oct 30, 2024

That's good to know, both about StableRNGs and the large variance of the demo model. I would still like to merge this though, because it definitely removes a lot of indeterminacy, even if StableRNGs would be better still.

There are a few ways we can try to make this better, e.g. choose a different base model for testing, or just use more observations in the posterior (though this increases the computational burden of the tests).

I would like to avoid the latter. I would rather aim to reduce the amount of sampling we do in our tests; that would make local development in particular easier.

@penelopeysm (Member) left a comment


Would it be possible to add the call to seed! slightly closer to the point of use? Specifically, I'm looking at test/mcmc/abstractmcmc.jl and inserting them after lines 120, 161, and 185.

We would still have the issue with seed! not being consistent across Julia versions, but at least we wouldn't get different results depending on which tests are run.

(I was pleasantly surprised to find that the PRNG is actually consistent across operating systems, at least! I didn't test architecture, though)

@mhauru (Member, Author) commented Oct 30, 2024

@testset actually already resets the seed for us:

julia> module MWE

       using Test, Random

       @testset "A" begin
          @show rand()
       end
       @testset "B" begin
          @show rand()
       end

       end
rand() = 0.05361041735414662
Test Summary: | Time
A             | None  0.0s
rand() = 0.05361041735414662
Test Summary: | Time
B             | None  0.0s
Main.MWE

However, without fixing the seed at the start globally, the local seed set by @testset will be different in different Julia sessions.

@penelopeysm does that mitigate the issue you worry about?

@penelopeysm (Member)

Oh, yes!

That does correspondingly suggest that we might have too many calls to seed! in other parts of the test, but that's far less pressing.

@mhauru (Member, Author) commented Oct 30, 2024

Yup:

$ rg "seed\!" test | wc -l
      70

I would leave fixing that for later though, since it's mostly a style/clean-up question. Any other changes you want to see here @penelopeysm?

@penelopeysm (Member)

Nope, feel free to go ahead :) And thanks for diagnosing it!

@penelopeysm (Member)

GitHub's mobile app doesn't seem to let me approve, so take this comment as an approval.

@mhauru mhauru merged commit 5426eca into master Oct 30, 2024
69 of 70 checks passed
@mhauru mhauru deleted the mhauru/fix-test-seed branch October 30, 2024 14:03
Development

Successfully merging this pull request may close these issues.

AdvancedMH external-sampler tests fail intermittently with 2 threads