Fix from_pandas to handle the same index name as a column name. #1419

Merged 4 commits on Apr 13, 2020

Conversation

ueshin
Collaborator

@ueshin ueshin commented Apr 11, 2020

When the index of the input pandas DataFrame has the same name as one of its columns, ks.from_pandas() fails with the following error:

>>> pdf = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})
>>> pdf.index.name = "a"
>>> pdf
   a  b
a
0  1  4
1  2  5
2  3  6

>>> kdf = ks.from_pandas(pdf)
Traceback (most recent call last):
...
ValueError: cannot insert a, already exists

Resolves #1361, Closes #1375.
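
For context, the message above is the one pandas itself raises when an index named like an existing column is moved into the columns with reset_index(). The sketch below reproduces that collision with plain pandas and shows one possible way around it: giving the index a temporary name first. The name `__index_level_0__` is a hypothetical placeholder chosen for illustration, and this is not necessarily how the change in this PR is implemented.

```python
import pandas as pd

pdf = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})
pdf.index.name = "a"

# Moving the index into the columns collides with the existing column "a".
try:
    pdf.reset_index()
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# Hypothetical workaround (an assumption, not taken from this PR):
# rename the index to a temporary, non-conflicting name before resetting it.
INDEX_COLUMN = "__index_level_0__"  # placeholder name for illustration only
flattened = pdf.rename_axis(INDEX_COLUMN).reset_index()

print(error_message)
print(list(flattened.columns))
```

With the temporary name in place, the index values and the original column "a" live side by side, so the original index name can be recorded separately and restored later.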

@codecov-io

codecov-io commented Apr 11, 2020

Codecov Report

Merging #1419 into master will increase coverage by 0.09%.
The diff coverage is 62.50%.


@@            Coverage Diff             @@
##           master    #1419      +/-   ##
==========================================
+ Coverage   95.04%   95.13%   +0.09%     
==========================================
  Files          34       34              
  Lines        7970     7958      -12     
==========================================
- Hits         7575     7571       -4     
+ Misses        395      387       -8     
Impacted Files                   Coverage Δ
databricks/koalas/groupby.py     91.76% <0.00%> (+1.30%) ⬆️
databricks/koalas/frame.py       96.61% <100.00%> (ø)
databricks/koalas/internal.py    96.35% <100.00%> (+0.29%) ⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 7b48858...3ea9cfc.

@HyukjinKwon HyukjinKwon merged commit d63e747 into databricks:master Apr 13, 2020
@ueshin ueshin deleted the from_pandas branch April 13, 2020 18:27
Development

Successfully merging this pull request may close these issues.

Duplicated key when groupby -> apply is used