Implement expanding.groupby.count in Series and Frame #991

Merged (1 commit) on Nov 6, 2019

HyukjinKwon
Member

@HyukjinKwon HyukjinKwon commented Nov 1, 2019

This PR implements expanding.groupby.count in Series and Frame.

>>> import databricks.koalas as ks
>>> kser = ks.Series([2, 3, float("nan"), 10])
>>> kser.groupby(kser).expanding().count()
0
3.0   1    1.0
2.0   0    1.0
10.0  3    1.0
Name: 0, dtype: float64
>>> df = kser.to_frame()
>>> df.groupby(df['0']).expanding().count()
          0
0
3.0  1  1.0
2.0  0  1.0
10.0 3  1.0

Relates to #977

@HyukjinKwon HyukjinKwon changed the title from "[WIP]" to "[WIP] Implement expanding.groupby.count in Series and Frame" Nov 1, 2019
@codecov-io

codecov-io commented Nov 1, 2019

Codecov Report

Merging #991 into master will decrease coverage by 1.44%.
The diff coverage is 97.95%.

@@            Coverage Diff             @@
##           master     #991      +/-   ##
==========================================
- Coverage   94.79%   93.34%   -1.45%     
==========================================
  Files          34       34              
  Lines        6568     6601      +33     
==========================================
- Hits         6226     6162      -64     
- Misses        342      439      +97
Impacted Files                                   Coverage Δ
databricks/koalas/missing/window.py              100% <ø> (ø) ⬆️
databricks/koalas/groupby.py                     91.39% <100%> (ø) ⬆️
databricks/koalas/window.py                      93.57% <97.87%> (+2.66%) ⬆️
databricks/koalas/usage_logging/__init__.py      24.54% <0%> (-72.73%) ⬇️
databricks/koalas/usage_logging/usage_logger.py  50% <0%> (-50%) ⬇️
databricks/koalas/__init__.py                    80.85% <0%> (-6.39%) ⬇️
databricks/conftest.py                           93.61% <0%> (-4.26%) ⬇️
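
The report says coverage drops by 1.44%, while the diff table shows -1.45%. The arithmetic below, using only the Hits and Lines figures from the table, shows one way both numbers can arise. It assumes the displayed percentages are truncated to two decimals rather than rounded, which the report itself does not confirm; the helper names are hypothetical.

```python
import math


def coverage_percent(hits, lines):
    """Coverage as a percentage of covered lines."""
    return 100.0 * hits / lines


def truncate2(value):
    """Truncate to two decimals (assumed display behavior, not confirmed)."""
    return math.floor(value * 100) / 100


before = coverage_percent(6226, 6568)  # master
after = coverage_percent(6162, 6601)   # this PR

# Difference of the unrounded percentages
print(round(before - after, 2))
# Difference of the percentages as displayed in the table
print(round(truncate2(before) - truncate2(after), 2))
```

The first figure matches the sentence at the top of the report and the second matches the diff table, so the two values are consistent under this assumption.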

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update f302c6d...3d4113d.

@HyukjinKwon HyukjinKwon changed the title from "[WIP] Implement expanding.groupby.count in Series and Frame" to "Implement expanding.groupby.count in Series and Frame" Nov 5, 2019
@HyukjinKwon
Member Author

This should be ready for a look.

cc @itholic too since you're working on rolling.

@softagram-bot

Softagram Impact Report for pull/991 (head commit: 3d4113d)

@HyukjinKwon HyukjinKwon requested a review from ueshin November 6, 2019 00:36
@HyukjinKwon
Member Author

I'm merging this to move forward. I'm still working on this file, so please let me know if you have any comments.

@HyukjinKwon HyukjinKwon merged commit 5fc4b61 into databricks:master Nov 6, 2019
@HyukjinKwon HyukjinKwon deleted the groupby-expanding branch November 6, 2019 02:20
@itholic
Contributor

itholic commented Nov 6, 2019

@HyukjinKwon Great! Okay, I'll start on Rolling.GroupBy now.


internal = _InternalFrame(sdf=sdf,
                          data_columns=[c._internal.data_columns[0] for c in applied],
                          index_map=new_index_map)
Collaborator

Don't we need to preserve column_index?

Member Author

Let me take a look and fix it.


5 participants