Remove implicit switch-ons of "compute.ops_on_diff_frames" #1953

Merged: 4 commits, Dec 7, 2020

xinrong-meng (Contributor) commented Dec 4, 2020

Proposal

  • Remove implicit switch-ons of "compute.ops_on_diff_frames".
  • Fix doctests
  • Fix unit tests

xinrong-meng (Contributor Author) commented

Shall we modify the docstrings of each touched function to reflect this change?

xinrong-meng changed the title from 'Remove implicit switch-on of "compute.ops_on_diff_frames"' to 'Remove implicit switch-ons of "compute.ops_on_diff_frames"' on Dec 4, 2020
xinrong-meng marked this pull request as ready for review on December 4, 2020 22:46

codecov-io commented Dec 4, 2020

Codecov Report

Merging #1953 (a049d87) into master (347ce57) will decrease coverage by 0.00%.
The diff coverage is 90.90%.

@@            Coverage Diff             @@
##           master    #1953      +/-   ##
==========================================
- Coverage   94.64%   94.64%   -0.01%     
==========================================
  Files          49       49              
  Lines       10822    10828       +6     
==========================================
+ Hits        10243    10248       +5     
- Misses        579      580       +1     

Impacted Files                    Coverage Δ
databricks/koalas/namespace.py    84.19% <87.50%> (-0.04%) ⬇️
databricks/koalas/series.py       96.96% <100.00%> (-0.08%) ⬇️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 347ce57...a049d87.

HyukjinKwon (Member) commented

I tend to agree with this change ... but to confirm WDYT @ueshin and @itholic?

itholic (Contributor) commented Dec 7, 2020

I agree, too!

> Shall we modify the docstrings of each touched function to reflect this change?

IMO, we don't necessarily need to modify the docstrings, since the operation already raises a clear error message: "Cannot combine the series or dataframe because it comes from a different dataframe. In order to allow this operation, enable 'compute.ops_on_diff_frames' option."

xinrong-meng requested a review from ueshin on December 7, 2020 19:15

ueshin (Collaborator) left a comment

LGTM.

combined = combine_frames(self.to_frame(), other.to_frame())
if not self.index.equals(other.index):
    raise ValueError("Can only compare identically-labeled Series objects")
combined = combine_frames(self.to_frame(), other.to_frame())

ueshin (Collaborator) commented Dec 7, 2020

Btw, we should consider the case where other is from the same anchor:

>>> s1 = ks.Series(["a", "b", "c", "d", "e"])
>>> s1.compare(s1)
Traceback (most recent call last):
...
AssertionError: We don't need to combine. `this` and `that` are same.

Of course, this fix should be done in a separate PR.
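
One possible shape for that follow-up, as a pure-Python sketch rather than the actual fix: check whether both operands share an anchor before calling the combine step, and pair their values directly when they do. `ToySeries`, its `anchor` field, and the toy `same_anchor`, `combine_frames`, and `compare` below are hypothetical stand-ins and do not reproduce the Koalas internals. Only the assertion message is taken from the traceback above.

```python
from dataclasses import dataclass


@dataclass
class ToySeries:
    anchor: int    # hypothetical id of the frame this series belongs to
    values: list   # the series values, in index order


def same_anchor(this, that):
    """Toy check: two series share an anchor if they have the same frame id."""
    return this.anchor == that.anchor


def combine_frames(this, that):
    """Toy combine step that, like the traceback, rejects same-anchor inputs."""
    assert not same_anchor(this, that), (
        "We don't need to combine. `this` and `that` are same."
    )
    return list(zip(this.values, that.values))


def compare(this, that):
    """Return (position, self_value, other_value) for every differing entry."""
    if same_anchor(this, that):
        # Same anchor: pair the values directly and skip the combine step.
        pairs = list(zip(this.values, that.values))
    else:
        pairs = combine_frames(this, that)
    return [(i, x, y) for i, (x, y) in enumerate(pairs) if x != y]
```

With this guard, comparing a series with itself yields an empty result instead of tripping the assertion, which matches what pandas returns for `s1.compare(s1)` (an empty DataFrame).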

xinrong-meng (Contributor Author) commented

Good idea! Let me file a new PR for this.

xinrong-meng merged commit 58993f8 into databricks:master on Dec 7, 2020