Fix stat functions with no numeric columns. #1967

Merged: 1 commit merged into databricks:master on Dec 16, 2020

Collaborator

@ueshin ueshin commented Dec 12, 2020

Some statistical functions fail when a DataFrame has no numeric columns.

>>> kdf = ks.DataFrame({"A": pd.date_range("2020-01-01", periods=3), "B": pd.date_range("2021-01-01", periods=3)})
>>> kdf.mean()
Traceback (most recent call last):
...
ValueError: Current DataFrame has more then the given limit 1 rows. Please set 'compute.max_rows' by using 'databricks.koalas.config.set_option' to retrieve to retrieve more than 1 rows. Note that, before changing the 'compute.max_rows', this operation is considerably expensive.

The functions which allow non-numeric columns by default are:

  • count
  • min
  • max
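
For reference, here is a minimal pandas-only sketch of the behavior the list above describes. It is an illustration under assumptions, not the code changed in this PR: it does not call Koalas at all, and it passes `numeric_only=True` explicitly because the default handling of datetime columns in `mean` differs between pandas versions.

```python
import pandas as pd

# Same shape of data as in the report: two datetime columns, no numeric ones.
pdf = pd.DataFrame(
    {
        "A": pd.date_range("2020-01-01", periods=3),
        "B": pd.date_range("2021-01-01", periods=3),
    }
)

# count, min and max accept non-numeric columns, so each column yields a value.
counts = pdf.count()
minimums = pdf.min()
maximums = pdf.max()

# Restricting to numeric columns leaves nothing to aggregate; pandas returns
# an empty Series here rather than raising.
numeric_mean = pdf.mean(numeric_only=True)

print(counts.to_dict())
print(minimums["A"], maximums["B"])
print(len(numeric_mean))
```

An empty result for the numeric-only case is the pandas behavior that a stat function on a frame with no numeric columns can be compared against, instead of the `ValueError` shown in the traceback.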

codecov-io commented Dec 12, 2020

Codecov Report

Merging #1967 (9d673d3) into master (b65891d) will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##           master    #1967   +/-   ##
=======================================
  Coverage   94.60%   94.60%           
=======================================
  Files          49       49           
  Lines       10890    10890           
=======================================
+ Hits        10302    10303    +1     
+ Misses        588      587    -1     
Impacted Files               Coverage Δ
databricks/koalas/frame.py   96.79% <100.00%> (+0.04%) ⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@ueshin ueshin requested a review from xinrong-meng December 14, 2020 19:13
@@ -27,35 +27,46 @@


 class StatsTest(ReusedSQLTestCase, SQLTestUtils):
-    def _test_stat_functions(self, pdf, kdf):
+    def _test_stat_functions(self, pdf_or_pser, kdf_or_kser):
Contributor

Great refactoring!

@xinrong-meng
Contributor

LGTM! Thank you!

Contributor

@itholic itholic left a comment

LGTM

@HyukjinKwon HyukjinKwon merged commit bd73c30 into databricks:master Dec 16, 2020
@ueshin ueshin deleted the stats_with_no_numeric_columns branch December 16, 2020 18:21