-
-
Notifications
You must be signed in to change notification settings - Fork 19.2k
ENH Consistent apply output when grouping with freq #12362
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH Consistent apply output when grouping with freq #12362
Conversation
|
tests! |
|
I wanted to make sure it wouldn't be outright rejected first. :-) I'll write tests now. |
69dc772 to
5571ac7
Compare
|
Tests. On the master branch, |
pandas/core/groupby.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
create an actual Grouping object here rather than a mock
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, that looks a lot better. Fixed.
5571ac7 to
1ad4ebe
Compare
|
Can u have a look thru online groupby issues (there are a few!) and see if this has some similarity with any I recall one that had to do with the dtype changing in apply (and had to do with that code u removed) so maybe this would fix another issue |
|
A search through issues tagged "Groupby", or containing "TimeGrouper" and "apply", or "Grouper", "apply", and "freq" doesn't show anything that looks to me like it would be fixed by this. |
|
your PR might close #5839 |
|
i'll have to look at this more, but pls add a whatsnew |
apply output when grouping with freq1ad4ebe to
1a2bdc5
Compare
|
I added a What's New entry. Not sure the best way to describe / categorize this -- perhaps it belongs under bugs? It doesn't look like #5839 is affected. I checked to be sure, and the examples there are unchanged by this PR. This PR only affects I scanned through the docs and didn't see any examples or instructions that would be affected by this PR. |
pandas/tests/test_groupby.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you also tests that these produce the same output when doing a .resample(..).apply(..) (both functions)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
call this result instead of grp_out
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed the name of grp_out to expected, since that's the thing left unchanged by this PR.
|
I am not convinced this is doing the right thing. notice the return types dtypes. |
|
I'm not sure if it's doing the right thing, but this PR doesn't change the output of When you suggest testing the output of Corresponding outputs using a resampling groupby, this PR: Outputs on master branch: The test failure is in |
|
you can rebase on master; the |
c41eb84 to
046a0fb
Compare
|
Rebased. |
|
can you rebase/update (and move note to 0.18.1). and i'll take a look. |
046a0fb to
ad087fc
Compare
|
@jreback , done. |
doc/source/whatsnew/v0.18.1.txt
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't list the PR number, only the issue number. give some more context, like: .groupby(...).apply(...). I think a brief example might be waranted here (e.g. show old and new)
c5db074 to
c7b5b3e
Compare
The `BinGrouper.apply` and `BaseGrouper.apply` have different output types. To make them consistent, remove `BinGrouper.apply` and let it use the same method as the superclass `BaseGrouper`. This requires changing `BinGrouper.groupings` to return a list of `Grouping` objects (there will always only be one) instead of `None`.
c7b5b3e to
d73f332
Compare
|
@jreback , I added extra detail and examples to the What's New, and rebased. Sorry for the delay. |
doc/source/whatsnew/v0.18.1.txt
Outdated
|
|
||
| New Behavior: | ||
|
|
||
| .. code-block:: python |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make the new an ipython block (so the code runs)
|
thanks @stephen-hoover some doc-comments. ping when pushed and i can look. |
3d43dde to
3b9c4f7
Compare
3b9c4f7 to
8cf618f
Compare
|
@jreback , modified the docs. Let me know if this looks good. |
|
thanks @stephen-hoover |
The
BinGrouper.apply(used by theTimeGrouper) andBaseGrouper.apply(used by theGrouper) have different output types. To make them consistent, removeBinGrouper.applyand let it use the same method as the superclassBaseGrouper. This requires changingBinGrouper.groupingsto return an object which looks sufficiently like aGroupingobject forBaseGrouper.applyto be happy. Use a namedtuple to create an object with the needed attributes.I think this change will only affect the output of
applywith a custom function, and only when grouping by aTimeGrouper(although aGrouperwith afreqspecified turns into aTimeGrouper, so that counts).Closes #11742 .