-
-
Notifications
You must be signed in to change notification settings - Fork 24
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Pareto k diagnostics and Pareto smoothing #283
Conversation
Please let me know once this PR is ready for review. |
I've now added tests for the I suppose tests for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR looks quite good already. I hope my comments help to further improve it. Thank you for working on this!
R/gpd.R
Outdated
|
||
#' @rdname GPD | ||
#' @export | ||
dgpd <- function(x, mu = 0, sigma = 1, k = 0, log = FALSE) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be sensible to make all the functions of the GPD vectorized across mu, sigma, and k?
Als the names are pretty short. How about (d|p...)generalized_pareto?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After thinking about these functions, I realised that only dgpd
is used internally for the calculation of the marginal posterior in gpdpost
. extraDistr
already provides GPD (using a C++ implementation), so I'm not sure posterior needs to have reimplementations of them. @avehtari can you see a clear use for having the GPD functions (q*
, p*
, r*
) directly available via posterior or could we just have dgpd
for internal use and only export gpdpost
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For now, I have removed most of these functions from the PR, leaving only the necessary ones for point estimate diagnostics and pareto smoothing. I have also changed the remaining to internal functions, not exported.
|
||
#' @rdname pareto_khat | ||
#' @export | ||
pareto_khat.rvar <- function(x, ...) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mjskay could you have a look and check if this pattern for rvars is the most sensible approach (same for pareto_smooth below)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I took a look, and I'm not sure the resulting output format is quite what I'd expect for rvars given how we've tended to define other diagnostics and transformations on rvars, which tend to return rvars (the above returns the draws and the diagnostics of the smoothed rvar as a flattened list of arrays).
If you do something like this for pareto_smooth
:
pareto_smooth.rvar <- function(x, ...) {
draws_diags <- summarise_rvar_by_element_with_chains(x, pareto_smooth.default, ...)
dim(draws_diags) <- dim(draws_diags) %||% length(draws_diags)
margins <- seq_along(dim(draws_diags))
list(
x = rvar(apply(draws_diags, margins, function(x) x[[1]]$x), nchains = nchains(x)),
khat = apply(draws_diags, margins, function(x) x[[1]]$diagnostics$khat)
)
}
Then the smoothed sample is returned as an rvar with the same structure as the input, and the khat
diagnostic is returned as an array with the same dimensions as the input; e.g.:
set.seed(1234)
x = rvar_rng(rnorm, 4, 1:4)
dim(x) = c(2,2)
pareto_smooth(x)
# $x
# rvar<4000>[2,2] mean ± sd:
# [,1] [,2]
# [1,] 1 ± 1.00 3 ± 0.99
# [2,] 2 ± 0.99 4 ± 1.00
#
# $khat
# [,1] [,2]
# [1,] -0.12174198 -0.07504085
# [2,] -0.05649535 -0.01960201
I think something along those lines would make the most sense to me, unless I've misunderstood something about the intent here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this suggestion. I have modified the code accordingly, and also made it handle the extra diagnostics. The output is now like this for rvars:
pareto_smooth(x, extra_diags = TRUE)
$x
rvar<4000>[2,2] mean ± sd:
[,1] [,2]
[1,] 1 ± 1.00 3 ± 0.99
[2,] 2 ± 0.99 4 ± 1.00
$diagnostics
$diagnostics$khat
[,1] [,2]
[1,] -0.12174198 -0.07504085
[2,] -0.05649535 -0.01960201
$diagnostics$min_ss
[,1] [,2]
[1,] 10 10
[2,] 10 10
$diagnostics$khat_threshold
[,1] [,2]
[1,] 0.7223811 0.7223811
[2,] 0.7223811 0.7223811
$diagnostics$convergence_rage
[,1] [,2]
[1,] 1 1
[2,] 1 1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mjskay can you check if this is what you had in mind?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, looks good to me!
Thanks for the review and comments @paul-buerkner and @mjskay! I'll have an updated version in the coming days with these suggestions addressed. |
After thinking about this PR more, I am going to split it into two separate PRs:
As I still need to think a bit more about 2, I will remove the parts pertaining to it from this PR, so they do not hold it up. |
Note, a fix is needed for the extra_diags:
|
Please let me know, once the PR is ready to merge from your side. |
There may be a slight delay with finishing this PR as the repository of the fork seems to be corrupted on GitHub and is not able to accept new commits or even be cloned. I've filed a ticket with GitHub, hoping for a quick response. |
After discussion with @avehtari we decided to make some further adjustments:
|
This is how benchmark results would change (along with a 95% confidence interval in relative change) if c7101a2 is merged into master:
|
This is how benchmark results would change (along with a 95% confidence interval in relative change) if ada117a is merged into master:
|
This is how benchmark results would change (along with a 95% confidence interval in relative change) if a344f01 is merged into master:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Went through the doc parts. Found things to improve and some typos
R/pareto_smooth.R
Outdated
|
||
# automatically calculate relative efficiency | ||
if (is.null(r_eff)) { | ||
r_eff <- ess_basic(x) / S |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, now that I rehink this, ess_tail would be more appropriate when r_eff is used to determine tail_length
This is how benchmark results would change (along with a 95% confidence interval in relative change) if 4962965 is merged into master:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
docs ok now
Great! @avehtari can I merge the PR now? |
I approve merging |
Perfect, thank you both so much for adding this functionality to posterior! |
Indeed, thank you! I've been preoccupied with a bunch of other Stan R package issues, so I hadn't looked through this yet. Very cool! |
Summary
This PR adds functionality for Pareto smoothing, following on from work by @ozanstats and @avehtari to address #237. It adds two primary user-facing functions:
pareto_smooth
andpareto_khat
.pareto_smooth
performs smoothing on the tail(s) of the input and returns the smoothed draws and diagnostic values, whereaspareto_khat
callspareto_smooth
but only returns the diagnostic values.pareto_khat
is designed for use in similar ways to the convergence functionsrhat
andess_bulk
. It is made to work insummarise_draws
. On the other hand,pareto_smooth
is probably most useful if one wants to do Pareto smoothing but not use theloo
package for this (for example to implement moment-matching + psis in theiwmm
package).I believe the only functionality it does not implement that is mentioned in #237 is allowing
r_eff
to be a vector of different values (one for each variable). Ifr_eff
is specified, it must be just a single value, as I'm not sure how to allow differentr_eff
values for each variable when usingsummarise_draws
. However, when it is automatically calculated (default), then it works as intended and is calculated separately for each variable.Current status
The functions should be working as intended, and the documentation is more or less complete (ready for comments). There are no tests yet, so any suggestions for recommended tests would be helpful.
Example functionality
Copyright and Licensing
By submitting this pull request, the copyright holder is agreeing to
license the submitted work under the following licenses: