-
-
Notifications
You must be signed in to change notification settings - Fork 19.3k
Fix scatter norm keyword #45966
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix scatter norm keyword #45966
Conversation
mroeschke
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will also need an entry in doc/source/whatsnew/v1.5.0.rst in the bug fix / visualization section
|
small comment & if you can merge master; ping on green |
| ax = df.plot.scatter(x="a", y="b", c="c") | ||
| color_min_max = (df.c.min(), df.c.max()) | ||
| default_norm = mpl.colors.Normalize(*color_min_max) | ||
| assert all(df.c.apply(lambda x: ax.collections[0].norm(x) == default_norm(x))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this be done without apply? It will be easier to debug if this assertion doesn't go through necessary code paths.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the better approach? Just a loop?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah the old check you had prior was good. The data doesn't have to be random as well if that helps shorten your loop and allows looping over less values
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I'll revert to the previous check. I think looping over the values in the dataframe makes sense; it's replicating what's going on in the actual plot, and there's no real benefit to including values that are out of bounds - that's matplotlib normalize functionality.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes made
jreback
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm over to you @mroeschke
|
Great PR @Andy-Grigg, thanks. |
| - Bug in :meth:`DataFrame.boxplot` that prevented passing in ``xlabel`` and ``ylabel`` (:issue:`45463`) | ||
| - Bug in :meth:`DataFrame.boxplot` that prevented specifying ``vert=False`` (:issue:`36918`) | ||
| - | ||
| - Bug in :meth:`DataFrame.scatter` that prevented specifying ``norm`` (:issue:`45809`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see this PR has now been merged, but I think this is supposed to be `DataFrame.plot.scatter`. Are the changelogs worth fixing?
|
sure can do another PR to update |
* TST: add test for norm scatter plot parameter (pandas-dev#45809) * BUG: don't duplicate norm parameter for scatter plots (pandas-dev#45809) * TST: Add github issue numbers (GH45809) * TST: remove dependence on private attributes (pandas-dev#45809) * DOC: add entry to visualization bug fixes (pandas-dev#45809) * TST: reduce number of norm comparisons (pandas-dev#45809) * TST: simplify test (pandas-dev#45809)
normkeyword fails with 'multiple values for keyword argument' TypeError #45809doc/source/whatsnew/vX.X.X.rstfile if fixing a bug or adding a new feature.This PR addresses a bug where the
normparameter is used for a scatter plot where theccolor is a non-categorical type (e.g. a numeric type). This combination of inputs would result in thenormparameter being specified both as an explicit keyword argument and inself.kwds.The behavior has changed such that
normis only ever specified once, and in the scenario described above the value specified by the user will be honored.I tried to test that the specific
normvalue ended up in the plot, but there doesn't seem to be a good way of comparingnorm._scaleobjects without comparing the individual attributes. I'm happy to add this in if required.