Adding the Poisson distribution #15814
leepface left a comment:
Believe these are the maven-checks errors you're seeing. You may also need to rebase for the other errors that look unrelated.
presto-main/src/main/java/com/facebook/presto/operator/scalar/MathFunctions.java
4a26de4 to 9213486
presto-main/src/test/java/com/facebook/presto/operator/scalar/TestMathFunctions.java
0ab7b2a to 3311a86
A point to notice: as a follow-up to this discussion, I should also change inverse_chisquare_cdf to disallow p=1 (right now it allows it and returns Inf as expected, but that causes inconsistent behavior between it and inverse_normal_cdf).
(I may wish to move this to int from bigint, need to verify further. https://commons.apache.org/proper/commons-math/javadocs/api-3.5/org/apache/commons/math3/distribution/IntegerDistribution.html)
Both the checkstyle error and the test failures are related. Please fix.
Will do, thanks.
On Wed, Apr 28, 2021, 22:45 Rongrong Zhong wrote:
Both checkstyle error and test failures are related. Please fix
9b5c9c9 to c1f94d9
@rongrong - diff was fixed, and all tests now came back green. It's back to you for review :)
929026e to 01c3bac
Thanks @rongrong, I've now updated the text in the diff. I've also expanded the description of inverse_poisson_cdf a bit so it's clearer which value it returns.
Hey @rongrong - this diff includes all the fixes we've discussed. Could you please review and let me know if it's good to merge, or if there are other steps to take?
Thanks @rongrong - made the fixes, diff is now ready for merging :)
Thank you, @talgalili, for the contribution.
Thanks @mbasmanova for the help in merging this. :)
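For readers wondering which value an inverse Poisson CDF returns, the convention for integer-valued distributions is the smallest k such that P(X <= k) >= p. The sketch below illustrates that convention with a direct summation. It is not the merged implementation: the class and method names are hypothetical, restricting p to [0, 1) is an assumption based on the p=1 discussion above, and the summation is only suitable for moderate lambda.

```java
// Hypothetical illustration of the inverse-CDF convention; not Presto code.
class InversePoissonSketch {
    // Returns the smallest k >= 0 with P(X <= k) >= p, for X ~ Poisson(lambda).
    // Assumption: p is restricted to [0, 1), because p = 1 has no finite answer.
    static int inverseCdf(double lambda, double p) {
        if (!(lambda > 0)) {
            throw new IllegalArgumentException("lambda must be greater than 0");
        }
        if (!(p >= 0 && p < 1)) {
            throw new IllegalArgumentException("p must be in the interval [0, 1)");
        }
        // Direct summation of the probability mass function.
        // Only suitable for moderate lambda, where exp(-lambda) does not underflow.
        double term = Math.exp(-lambda); // P(X = 0)
        double sum = term;               // P(X <= 0)
        int k = 0;
        while (sum < p) {
            k++;
            term *= lambda / k;          // P(X = k) from P(X = k - 1)
            sum += term;
            if (term == 0.0 && k > lambda) {
                break;                   // tail underflowed; the sum cannot grow further
            }
        }
        return k;
    }
}
```

With lambda = 1, the cumulative probabilities are roughly 0.368, 0.736 and 0.920 for k = 0, 1, 2, so p = 0.5 maps to k = 1 and p = 0.9 maps to k = 2.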
        @SqlType(StandardTypes.INTEGER) long value)
{
    checkCondition(value >= 0, INVALID_FUNCTION_ARGUMENT, "value must be a non-negative integer");
    checkCondition(lambda > 0, INVALID_FUNCTION_ARGUMENT, "lambda must be greater than 0");
Since the library API is already doing this check, I say just do a try/catch and throw user_error
Interesting. Since this is the method used by all the distribution functions (e.g. normal, beta, chi-square, binomial), do you think it should be changed there as well?
If so, could you please help with this change? (I'm not experienced in Java, so I want to make sure I understand what you're proposing and which error will be thrown.)
Just do something like:
try { ... } catch (NotStrictlyPositiveException notStrictlyPositiveException) { throw new PrestoException(GENERIC_USER_ERROR, ...); }
Look at StandardErrorCodes.java and other files to see the pattern they follow.
And yeah it will be good to do for all of them.
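The try/catch suggestion in this thread can be sketched as follows. This is a minimal, self-contained illustration and not the code that was merged: `UserFacingException` and `LibraryPoisson` are hypothetical stand-ins for `PrestoException` and for commons-math's `PoissonDistribution` (whose constructor throws `NotStrictlyPositiveException` when the mean is not positive), and `IllegalArgumentException` plays the role of the library exception.

```java
// Hypothetical stand-in for PrestoException; not the real Presto class.
class UserFacingException extends RuntimeException {
    UserFacingException(String message, Throwable cause) {
        super(message, cause);
    }
}

// Hypothetical stand-in for a library distribution that validates its own arguments.
class LibraryPoisson {
    private final double lambda;

    LibraryPoisson(double lambda) {
        // Rejects zero, negative and NaN means, as a library constructor might.
        if (!(lambda > 0)) {
            throw new IllegalArgumentException("mean must be strictly positive: " + lambda);
        }
        this.lambda = lambda;
    }

    // P(X <= k) by direct summation; adequate for small k and moderate lambda.
    double cumulativeProbability(int k) {
        if (k < 0) {
            return 0.0;
        }
        double term = Math.exp(-lambda); // P(X = 0)
        double sum = term;
        for (int i = 1; i <= k; i++) {
            term *= lambda / i;          // P(X = i) from P(X = i - 1)
            sum += term;
        }
        return Math.min(sum, 1.0);
    }
}

class PoissonCdfWrapper {
    // Lets the library do the argument check and translates its exception
    // into a user-facing error, instead of duplicating the check up front.
    static double poissonCdf(double lambda, int value) {
        try {
            return new LibraryPoisson(lambda).cumulativeProbability(value);
        }
        catch (IllegalArgumentException e) {
            throw new UserFacingException("lambda must be greater than 0", e);
        }
    }
}
```

In this sketch a negative `value` simply yields 0 rather than an exception, so an explicit `value >= 0` check like the one in the diff is still what would reject it.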
{
    checkCondition(value >= 0, INVALID_FUNCTION_ARGUMENT, "value must be a non-negative integer");
    checkCondition(lambda > 0, INVALID_FUNCTION_ARGUMENT, "lambda must be greater than 0");
    PoissonDistribution distribution = new PoissonDistribution(lambda);
Is the lambda going to be generally fixed in a query? If so, you should find a way to avoid new object creation to improve memory perf.
Also interesting. As I wrote in reply to your other comment, this is the method used by all the distribution functions (e.g. normal, beta, chi-square, binomial). Do you think it should be changed there as well?
If so, could you please help with this change or propose how to do it?
Thanks in advance.
I don't have a good suggestion here :( I looked in the code but can't find other examples. I will look further and comment here if I find something.
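One possible shape for avoiding repeated object creation when lambda is constant is to remember the last distribution built and reuse it while lambda stays the same. The sketch below shows only that memoization idea under stated assumptions: `PoissonSketch` and `LastLambdaCache` are hypothetical names, the cache is not thread-safe, and how such a cache would be held by a Presto scalar function is exactly the open question in this thread.

```java
// Hypothetical stand-in for an immutable distribution object.
class PoissonSketch {
    final double lambda;
    private final double expNegLambda; // computed once per instance

    PoissonSketch(double lambda) {
        if (!(lambda > 0)) {
            throw new IllegalArgumentException("lambda must be greater than 0");
        }
        this.lambda = lambda;
        this.expNegLambda = Math.exp(-lambda);
    }

    // P(X <= k) by direct summation; adequate for small k and moderate lambda.
    double cumulativeProbability(int k) {
        if (k < 0) {
            return 0.0;
        }
        double term = expNegLambda;
        double sum = term;
        for (int i = 1; i <= k; i++) {
            term *= lambda / i;
            sum += term;
        }
        return Math.min(sum, 1.0);
    }
}

// Single-entry cache: rebuilds the distribution only when lambda changes.
// Not thread-safe; each caller would need its own instance.
class LastLambdaCache {
    private PoissonSketch cached; // last distribution built, or null
    int constructions;            // counter that makes the reuse observable

    PoissonSketch get(double lambda) {
        if (cached == null || cached.lambda != lambda) {
            cached = new PoissonSketch(lambda);
            constructions++;
        }
        return cached;
    }
}
```

If every row of a query passes the same lambda, `get` constructs one object for the whole run. If lambda varies per row, the cache gives no benefit and adds a comparison per call.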
Adding the Poisson distribution, which is central to many statistical procedures (https://en.wikipedia.org/wiki/poisson_distribution) (#15798)
Test plan (adding unit-tests)
Following the diff template of: https://github.com/prestodb/presto/pull/11981/files
== RELEASE NOTES ==
General Changes
(like the beta_cdf: https://prestodb.io/docs/current/release/release-0.215.html?highlight=beta_cds)