Make the API consistent with Statistics.jl/StatsBase.jl #9
@huangziwei this is marked as a draft because it is not yet complete, but before I continue, it would be helpful to have a chat (synchronous, or asynchronous on here) about the discussion points highlighted above.
@sethaxen cool! I will have a look later and get back to you here soon!
@sethaxen This PR is a good start for polishing this package and pushing it further; however, it wouldn't necessarily be a breaking version. Supporting The other thing I had planned is to improve the documentation and CI testing. Right now, the docs only explain the function signatures, without examples or plots, and the formulas used are mentioned only through references in code comments. It would be good to add the precise math in LaTeX to the docs, along with full references. This can promote the package among several other similar ones and provide learning materials for users too. It should also be straightforward to add GitHub workflows with code coverage.
I agree that methods taking
Yes, this would be nice. The implementations themselves also need some updates to avoid unnecessary allocations, type promotions, and type instability, but these updates are orthogonal to this PR, which strictly focuses on API updates.
Gosh, time flies so fast and it's been a year, but I still haven't had the time to go through this (and I still can't properly review it, as I still don't do Julia at all...). I'd suggest, @sethaxen, that you take over the Julia version (no need to involve me, for now) and that we only communicate at the API design level to keep things consistent, if necessary?
@huangziwei, sorry for the delay in replying. I opened this PR as part of my work at the @mlcolab and intended it as part of a larger effort we had discussed last year to improve consistency with both the Julia stats ecosystem and with other circstats packages (where applicable). However, priorities have since changed. I will soon be moving on from my position at the colab and no longer plan to continue the effort; thus it would not be ideal for me to make changes to the package if no one else is able to maintain those changes. If there's future interest in your group (or from someone else) in refreshing this package, then I think this PR is still a good starting point for that.
This PR makes a number of breaking changes to make the API of `circ_foo` more consistent with its corresponding function `foo` (if defined) in Statistics/StatsBase.

Main changes

Supporting `StatsBase.AbstractWeights`

Weighted scalar statistics are handled in StatsBase using `StatsBase.AbstractWeights` types, which can represent not only frequency weights but also other kinds of weights. Unlike the current API, weights in StatsBase are always vectors, so when they are used, the `dims` keyword is not supported; instead, an optional positional argument `dim::Int` is provided, specifying the single dimension that will be reduced to a singleton.

This PR adopts the same signature convention for `circ_foo`, which unfortunately does require some code duplication. Future refactors could reduce this code duplication.

Avoiding recomputing `circ_mean` and `circ_r`

When a function requires `r` or `mean`, it now accepts one or both of these as a keyword argument, allowing for some speed-ups. See how `circ_stats` does this, for example. This PR also adds `circ_mean_and_r`, analogous to `mean_and_var` and `mean_and_std` in StatsBase.

Return a single statistic

Now `circ_mean` returns just the mean, while `circ_std` and `circ_var` accept a `kind` keyword to specify the kind of statistic to return. It would be nice if we could do something similar for `circ_skewness` or `circ_kurtosis` as well. `circ_moment` could be similarly simplified to return the complex moment, but as this isn't consistent with the other moment functions, this doesn't seem ideal.

What's left

- `circ_skewness`, `circ_kurtosis`, and `circ_moment`
- `AbstractWeights` weight types.

Discussion

Before this PR, this package supported multidimensional arrays of weights. Is there a clear use case for this? JuliaStats/StatsBase.jl#776 discusses adding something similar to StatsBase, but if this happens, it wouldn't happen for some time.
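
To make the question above concrete, here is a minimal sketch contrasting the two weighting conventions for a weighted circular mean. Both functions (`wcircmean_dim` and `wcircmean_array`) are hypothetical helpers written only for this illustration; they are not this package's implementation, they use plain vectors and arrays rather than `AbstractWeights`, and they use only Base Julia.

```julia
# Hypothetical helper (illustration only): vector weights applied along a
# single dimension `dim`, in the spirit of the StatsBase convention.
function wcircmean_dim(x::AbstractArray{<:Real}, w::AbstractVector{<:Real}, dim::Int)
    length(w) == size(x, dim) ||
        throw(DimensionMismatch("length(w) must equal size(x, dim)"))
    # Reshape w so that it broadcasts along `dim` only.
    shape = ntuple(i -> i == dim ? length(w) : 1, ndims(x))
    wr = reshape(w, shape)
    # Weighted resultant vector of each slice; its angle is the mean direction.
    z = sum(wr .* cis.(x); dims=dim)
    return angle.(z)
end

# Hypothetical helper (illustration only): one weight per element of `x`,
# reduced over `dims`, which is roughly the multidimensional-weights style.
function wcircmean_array(x::AbstractArray{<:Real}, w::AbstractArray{<:Real}; dims=:)
    size(w) == size(x) ||
        throw(DimensionMismatch("w must have the same size as x"))
    z = sum(w .* cis.(x); dims=dims)
    return dims === Colon() ? angle(z) : angle.(z)
end

x = [0.0 0.0; π/2 π/2]            # angles in radians, one sample per column
w = [1.0, 3.0]                    # one weight per row
m_vec = wcircmean_dim(x, w, 1)    # 1×2 result: dimension 1 reduced to a singleton
m_arr = wcircmean_array(x, repeat(w, 1, 2); dims=1)  # same weights, stored per element
```

When the array of weights is just the weight vector repeated along the other dimensions, the two calls agree, so the array form only adds something when the weights genuinely differ between slices.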
cc @huangziwei, @Meteore
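
For reference, the keyword-reuse and `kind` conventions described in this PR can be sketched roughly as follows. Everything here is an assumption made for illustration: `sketch_mean_and_r` and `sketch_std` are hypothetical stand-ins rather than the package's functions, and the `:angular` and `:circular` symbols are hypothetical names for two standard definitions (the angular deviation `sqrt(2(1 - r))` and the circular standard deviation `sqrt(-2 log r)`), not necessarily the values the `kind` keyword accepts in this PR.

```julia
# Hypothetical stand-in: compute the mean direction and the mean resultant
# length together, since both come from the same resultant vector.
function sketch_mean_and_r(x::AbstractVector{<:Real})
    z = sum(cis, x) / length(x)
    return angle(z), abs(z)
end

# Hypothetical stand-in: accept a precomputed `r` to avoid recomputing it,
# and use `kind` to select which single statistic is returned.
function sketch_std(x::AbstractVector{<:Real}; r=nothing, kind::Symbol=:angular)
    r = r === nothing ? last(sketch_mean_and_r(x)) : r
    if kind === :angular
        return sqrt(2 * (1 - r))      # angular deviation
    elseif kind === :circular
        return sqrt(-2 * log(r))      # circular standard deviation
    else
        throw(ArgumentError("unknown kind: $kind"))
    end
end

x = [0.1, 0.2, 0.3]
μ, r = sketch_mean_and_r(x)                   # computed once
s_ang = sketch_std(x; r=r)                    # reuses r
s_circ = sketch_std(x; r=r, kind=:circular)   # reuses r, different statistic
```

Passing `r` explicitly means the resultant is computed once even when several statistics are requested, which is the speed-up the description refers to.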