ENH: Add scatter plot for Frame #719

Merged: 9 commits merged into databricks:master on Aug 30, 2019

charlesdong1991
Contributor

@charlesdong1991 charlesdong1991 commented Aug 29, 2019

Screen Shot 2019-08-30 at 8 47 22 AM
Screen Shot 2019-08-30 at 8 47 40 AM
Screen Shot 2019-08-30 at 8 47 50 AM

@charlesdong1991
Contributor Author

charlesdong1991 commented Aug 29, 2019

Ideally, the docstring should be identical to the one in pandas. However, I ran into an annoying linting error with the docstring that I don't know how to fix, so for now I have kept the docstrings for s and c very simple. I will complete them in a future PR that updates the docstrings for all functions.

@codecov-io

codecov-io commented Aug 29, 2019

Codecov Report

Merging #719 into master will decrease coverage by 1.75%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master     #719      +/-   ##
==========================================
- Coverage   94.18%   92.43%   -1.76%     
==========================================
  Files          32       32              
  Lines        5559     5567       +8     
==========================================
- Hits         5236     5146      -90     
- Misses        323      421      +98

Impacted Files                                    Coverage Δ
databricks/koalas/plot.py                         94.91% <100%> (+0.11%) ⬆️
databricks/koalas/usage_logging/__init__.py       23.14% <0%> (-74.08%) ⬇️
databricks/koalas/usage_logging/usage_logger.py   50% <0%> (-50%) ⬇️
databricks/koalas/__init__.py                     77.5% <0%> (-7.5%) ⬇️
databricks/conftest.py                            93.02% <0%> (-4.66%) ⬇️
databricks/koalas/internal.py                     95.83% <0%> (-0.47%) ⬇️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 5537d71...cc87273.

@@ -509,6 +509,17 @@ def _make_plot(self):
        super(KoalasBarhPlot, self)._make_plot()


class KoalasScatterPlot(ScatterPlot, TopNPlot):
    max_rows = 1000
Member

I think we can remove this.

Contributor Author

Ah, yes, removed. I also added the pictures to the description; I thought I had done that yesterday. My bad.

@softagram-bot

Softagram Impact Report for pull/719 (head commit: cc87273)


@HyukjinKwon
Member

@charlesdong1991, before we merge, can you check whether the Koalas plot adds "showing top 1,000 elements only" to the output chart when there are more than 1,000 rows?
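
For readers unfamiliar with the behavior being checked here, the idea of capping the plotted rows and flagging the truncation can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the Koalas implementation: `take_top_n` and `caption` are hypothetical names, and only the 1,000-row limit and the caption wording come from this thread.

```python
from itertools import islice


def take_top_n(rows, max_rows=1000):
    """Return at most max_rows items plus a flag saying whether input was cut."""
    # Pull one extra item so truncation can be detected without
    # consuming the whole (possibly very large) input.
    head = list(islice(rows, max_rows + 1))
    truncated = len(head) > max_rows
    return head[:max_rows], truncated


def caption(truncated, max_rows=1000):
    # Text a chart could display when it shows only part of the data.
    return f"showing top {max_rows:,} elements only" if truncated else ""


points, truncated = take_top_n(iter(range(5000)))
print(len(points), caption(truncated))
```

Fetching `max_rows + 1` items is one simple way to tell "exactly 1,000 rows" apart from "more than 1,000 rows" without counting the full dataset.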

@charlesdong1991
Contributor Author

@HyukjinKwon Good call! I can't really read it because the font size is a bit too small for me 😅, but we can probably tell from the density of points that Koalas plots fewer of them.

Screen Shot 2019-08-30 at 9 40 53 AM

@HyukjinKwon HyukjinKwon merged commit a1efa61 into databricks:master Aug 30, 2019
@HyukjinKwon
Member

Thanks for testing it out, @charlesdong1991.

@Ankurneural

How can I use all the rows for the plot?
Setting ks.set_option('compute.max_rows', 50000) and
ks.set_option("display.max_rows", 50000) does not seem to work.

@HyukjinKwon
Member

plotting.max_rows should work for that. There's documentation here: https://koalas.readthedocs.io/en/latest/user_guide/options.html#available-options
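
As a rough sketch of how that option might be applied around a single plot: the helper below raises `plotting.max_rows`, draws the scatter plot, and then resets the option. `scatter_with_row_limit` is a hypothetical name, and the sketch assumes `ks` is the imported Koalas module (`import databricks.koalas as ks`) with its `set_option` and `reset_option` functions.

```python
def scatter_with_row_limit(ks, kdf, x, y, limit):
    """Draw a scatter plot with 'plotting.max_rows' temporarily raised."""
    # `ks` is assumed to be the Koalas module; it is passed in so this
    # helper has no import-time dependency of its own.
    ks.set_option("plotting.max_rows", limit)
    try:
        return kdf.plot.scatter(x=x, y=y)
    finally:
        # Reset to the option's default (not to any earlier custom value)
        # so later plots are not affected.
        ks.reset_option("plotting.max_rows")
```

A call might look like `scatter_with_row_limit(ks, d1, "col1", "col2", 50000)`. Raising the limit means more rows are brought in for plotting, so memory use likely grows with it.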

@Ankurneural

> plotting.max_rows should work for that. There's documentation here: https://koalas.readthedocs.io/en/latest/user_guide/options.html#available-options

Thanks! I somehow missed it. It is working as anticipated.

@Ankurneural

@HyukjinKwon I am trying to make subplots from a Koalas DataFrame, but the figure comes out blank.
d1 is a Koalas DataFrame.
Sample code:
fig, (ax1, ax2) = plt.subplots(1, 2)
ax1 = d1.plot.scatter(x="col1", y="col2")
ax2 = d1.plot.scatter(x="col1", y="col2")
display(fig)

Any suggestions or help on this?

@HyukjinKwon
Member

@Ankurneural can you create a new ticket with the full code and output?
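
One possible cause, offered as an assumption rather than a confirmed diagnosis: in the sample code, `ax1` and `ax2` are rebound to whatever each `scatter` call returns, so the axes created by `plt.subplots` are never drawn on and `fig` stays empty. In pandas, the usual approach is to pass the target axes through the `ax` keyword; the sketch below assumes Koalas forwards `ax` to the plotting backend in the same way. `scatter_on_axes` is a hypothetical helper name.

```python
def scatter_on_axes(df, axes, column_pairs):
    """Draw one scatter plot per (x, y) column pair onto pre-created axes."""
    drawn = []
    for ax, (x, y) in zip(axes, column_pairs):
        # Passing ax= targets the existing subplot instead of letting
        # the plotting call open a new figure.
        drawn.append(df.plot.scatter(x=x, y=y, ax=ax))
    return drawn
```

With matplotlib this might be used as `fig, axes = plt.subplots(1, 2)`, then `scatter_on_axes(d1, axes, [("col1", "col2"), ("col1", "col2")])`, then `display(fig)`.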

5 participants