Use distributions in ML backends for better Poisson approximation #268
Conversation
@matthewfeickert something similar for pytorch?
I'm doing them all. :) Just after dinner.
awesome! ✨
@lukasheinrich @kratsg So at the moment the backend tests are failing because of the agreement between answers. Printing the values and the std for … and … So I guess the question is: should we allow a std of greater than …? The ML backends all agree with each other very well.
I think I'd be fine with that. Let's cut it as close to the vest as possible with a tol of 5e-5? @kratsg and I are wrestling with a similar issue in the interp code. At some point we might need to figure out how to really compare this properly (by e.g. forcing single precision in numpy?)
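A cross-backend comparison of the sort being discussed could look like the following minimal sketch; the array names and values are purely illustrative, not the PR's actual test data:

```python
import numpy as np

# hypothetical per-backend results; the values are made up for illustration
numpy_result = np.array([0.7623327, 0.5000000])
tensorflow_result = np.array([0.7623114, 0.5000000])

# force single precision before comparing, then apply the proposed tolerance
np.testing.assert_allclose(
    numpy_result.astype(np.float32),
    tensorflow_result.astype(np.float32),
    rtol=0,
    atol=5e-5,
)
```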
Force-pushed from 67083d6 to 14122ca
@lukasheinrich @kratsg Given that this PR removes the need for using the Normal approximation of the Poisson distribution, is there any reason to keep the self.pois_from_norm = kwargs.get('poisson_from_normal', False) in the NumPy backend anymore? Or can I remove that everywhere?
Remove. The only reason it was there was to ameliorate differences and make it easier to compare cross-backend.
Force-pushed from 14122ca to d9493da
I seem to have broken something in the CI. The test suite all passes and then it core dumps. 😢 Things all pass fine on my local machine with Python 3.6.6, so I'll need to follow up on this. I'm guessing it has something to do with the huge numbers involved in
np.exp((np.log(lam) * n) - lam - gammaln(n + 1.))
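For context, this log-space evaluation of the Poisson p.m.f. can be sketched as follows (a minimal example assuming NumPy and SciPy; `lam` and `n` are illustrative values, not the PR's actual code):

```python
import numpy as np
from scipy.special import gammaln
from scipy.stats import poisson

lam, n = 30.0, 25.0  # illustrative rate and observed count

# evaluate the Poisson p.m.f. in log space for numerical stability,
# with n! generalized to Gamma(n + 1) so n may be continuous
pmf = np.exp(n * np.log(lam) - lam - gammaln(n + 1.0))

# at integer n this agrees with scipy's Poisson p.m.f.
assert np.isclose(pmf, poisson.pmf(n, lam))
```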
yeah +1 on removing the normal approx
Force-pushed from 420e1cd to dac3303
@kratsg @lukasheinrich If you have any thoughts on this, let me know.
Running out of memory. You're allocated some amount of memory to run Python and you've exceeded it.
Yeah, that part I understood. Sorry, I should have been more explicit. Do you have any ideas on why changing this approximation seems to have caused (what I'm assuming is) NumPy to push the memory usage past the allowed limit?
@matthewfeickert can you try adding each backend one by one? It might be only a single one that is misbehaving.
Force-pushed from 3aba184 to 691ea56
For reasons not fully understood, using TensorFlow Probability causes Travis CI to use too much memory and a core dump occurs. So until this is understood all use of tfp should be reverted to a combination of tf.distributions and tf.contrib.distributions (for the Poisson). This is really annoying, but it is at least a temporary fix that doesn't break all functionality and gains. However, even with this, enough memory is still being eaten up that I needed to break the notebook tests off into their own parallel run of pytest. For additional reference on this issue c.f. pytest-dev/pytest#3527 and travis-ci/travis-ci#9153.
@lukasheinrich @kratsg With the exception of the coverage, everything is passing now. This is a large and strange PR, sorry about that, and it is not perfect in that there will need to be additional fixes in the future to actually get TensorFlow Probability working. Let me know your thoughts and if you want additional changes.
This may want to wait until PR #251 is in, to make the rebase easier, rather than this one going first. That being said, if there are no changes needed it would be nice to have this in for tomorrow's talk.
Resolves issue of ResourceWarning: unclosed file <_io.TextIOWrapper name='<your path here>/pyhf/pyhf/data/spec.json' mode='r' encoding='UTF-8'>. Additionally apply autopep8.
If pytest runs all tests at once then it exceeds the memory allotted to it by Travis CI. To deal with this, tests/test_notebooks is run by itself in a second run of pytest. This is a bit strange, but seems to work. As a result of this and the order in which pytest runs the tests, the docstring tests finish just before the import tests run. This means that the last backend tested is still in memory, and so the NumPy backend needs to get set as that is what is needed for the test_import.py tests.
For reasons not fully understood, using TensorFlow Probability causes Travis CI to use too much memory and a core dump occurs. So until this is understood all use of tfp should be reverted to a combination of tf.distributions and tf.contrib.distributions (for the Poisson). This is really annoying, but at least a temporary fix that doesn't break all functionality and gains. c.f.
- pytest-dev/pytest#3527
- travis-ci/travis-ci#9153
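As a rough sketch of what this revert implies (TensorFlow 1.x-era APIs; `lam` and `n` are illustrative tensors, not the PR's actual code):

```python
import tensorflow as tf

lam = tf.constant(30.0)  # illustrative Poisson rate
n = tf.constant(25.0)    # illustrative observed count

# tf.contrib.distributions for the Poisson, tf.distributions for the Normal
poisson = tf.contrib.distributions.Poisson(rate=lam)
normal = tf.distributions.Normal(loc=lam, scale=tf.sqrt(lam))

with tf.Session() as sess:
    print(sess.run(poisson.prob(n)))  # Poisson p.m.f. at n
    print(sess.run(normal.prob(n)))   # Normal approximation, for comparison
```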
Calling the set default backend before and after ensures that the backend gets reset even if it just came off of a docstring test outside of the tests dir. As a result, remove the explicit calls to set the NumPy backend in test_import.py.
Mention that the continuous approximation is done using n! = Gamma(n+1). Additionally use :code: and :math: qualifiers on different terms to give specific highlighting in the rendered docs.
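For reference, the continuous form referred to here follows directly from the log-space expression used in the code:

```latex
f(n;\lambda) = \frac{\lambda^{n} e^{-\lambda}}{\Gamma(n+1)}
\qquad\Longrightarrow\qquad
\log f(n;\lambda) = n \log\lambda - \lambda - \log\Gamma(n+1)
```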
Force-pushed from 975805b to 94b6e8e
Rebased against master to prepare to merge with PR #280
This reverts commit cbb899f.
Thanks for the support on getting this in, @kratsg. 👍
Description
Resolves #267
This essentially removes the use of a Normal approximation to the Poisson distribution (poisson_from_normal=True) and instead uses either a tensor distribution or the approximation that makes use of n! = Gamma(n+1) to put the Poisson distribution's p.m.f. in a numerically stable continuous approximation.

This PR also eliminates some warnings that arise from files in the tests not getting properly closed. This came up when I was trying to debug memory issues in Travis.
autopep8 is applied in various places where I was working.

Checklist Before Requesting Approver