deprecate histogram functionality #6842
This might be the way forward but I don't think we're going to do this in 0.3. |
Force-pushed from 6c7c7e3 to 1a4c02f.
Bump. Now is the time to do this. |
+1 |
Great idea. |
I support this too, but one concern is that the implementation in StatsBase looks like it will probably be painfully slow. Has anyone benchmarked it vs the one in Base? (esp. with respect to #8952) |
@timholy I think the implementation in StatsBase has yet to be optimized. What about moving your PR there? |
I'll give that a whirl (after a couple of other priorities). |
I have thought about this, and generally have mixed feelings about not being able to do histograms in base Julia. It is common enough that one may not want to install StatsBase and use it from a module. I am OK with renaming and better APIs, but perhaps we should move Histogram from StatsBase to Base rather than the other way around. |
Is it just me or does anyone else find the Matlab-style hist function really hard to use? If I do |
Does one use |
I tried to use |
We call that Coming from a statistics background, I'm not really excited about mixing |
Matlab seems to have deprecated |
On this issue, I agree with @johnmyleswhite: I consider a histogram a density estimate, and what @StefanKarpinski asks for a frequency count or countmap (I think the latter word is more common in machine learning). I also think of a histogram as the graphical representation and have been surprised by the https://groups.google.com/forum/#!topic/julia-stats/CO3Pgc89Y7A which was a use I was not aware of. Is it also used like this in CS? |
Counting the occurrences of discrete items and summarizing the distribution of continuous values using bins (i.e. constructing histograms) are different concepts and usually used in different contexts. Obviously, we should use different functions for them. |
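To make the distinction above concrete, here is a minimal sketch in plain Julia. The helper names `count_occurrences` and `bin_counts` are hypothetical and exist only for illustration; they are not the Base or StatsBase API, and the half-open bin convention is an assumption of this sketch.

```julia
# Illustrative sketch only: these helpers are hypothetical and are not
# the Base or StatsBase implementations.

# Count how often each distinct value occurs (the "countmap" sense).
function count_occurrences(xs)
    counts = Dict{eltype(xs),Int}()
    for x in xs
        counts[x] = get(counts, x, 0) + 1
    end
    return counts
end

# Count how many values fall into each bin [edges[i], edges[i+1])
# (the "histogram" sense). Values outside the edges are ignored.
function bin_counts(xs, edges)
    counts = zeros(Int, length(edges) - 1)
    for x in xs
        i = searchsortedlast(edges, x)   # index of the last edge <= x
        if 1 <= i <= length(counts)
            counts[i] += 1
        end
    end
    return counts
end

labels  = ["a", "b", "a", "c", "a"]
samples = [0.1, 0.4, 0.5, 0.9, 1.7]

occurrences = count_occurrences(labels)       # one entry per distinct label
binned      = bin_counts(samples, 0.0:0.5:2.0) # one count per bin
```

The first function needs no notion of ordering or bin edges, while the second only makes sense once edges are chosen, which is why the two are arguably better served by separate functions.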
We should remove this from Base in 0.4 in favour of the functionality in StatsBase. |
Bump. A couple of the tests rely on |
We can comment them out for now, and rewrite them, or move those tests to StatsBase. |
+1 for removing, and I also agree with the older comments that it invites confusion (I tried to use it in the count-uniques sense earlier today...). StatsBase is great and very widely used; most users will have it installed anyway. |
I'm fine with moving histogram functionality to StatsBase, but do have some issues with the implementation in StatsBase.
|
Having a function called |
I've updated the PR. @BobPortmann Those are good points: would you mind opening an issue on StatsBase? |
Bump. |
This was a 0.4.x milestone PR; still relevant? Should we remove |
Marking 0.5.0 for the triage team. |
Closed in favour of #16450 |
The new Histogram type is now in StatsBase.jl. This deprecates the histogram functions (hist, hist!, histrange, midpoints) from Base. See discussion in #6601.