Skip to content

Conversation

@jbrockmendel
Copy link

xref pandas-dev/pandas#38134 pandas will require passing ndim to make_block, this updates to the new usage

cc @jorisvandenbossche

@github-actions
Copy link

Thanks for opening a pull request!

Could you open an issue for this pull request on JIRA?
https://issues.apache.org/jira/browse/ARROW

Then could you also rename pull request title in the following format?

ARROW-${JIRA_ID}: [${COMPONENT}] ${SUMMARY}

See also:

@jorisvandenbossche jorisvandenbossche changed the title CLN: pass ndim to pandas make_block ARROW-10768: [Python] pass ndim to pandas make_block Dec 21, 2020
Copy link
Member

@jorisvandenbossche jorisvandenbossche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for looking at this. The tests are failing, though.

For future reference, please reference the JIRA issue that you opened in the PR

elif 'timezone' in item:
dtype = make_datetimetz(item['timezone'])
cls = dtype.construct_array_type()
block_arr = cls._simple_new(block_arr, dtype=dtype)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this line seems to cause the failures on latest pandas. So if the previous way was working, I would keep it that way (in addition, although pyarrow is already using private APIs here, _simple_new is even more a private implementation detail, so I would prefer not using it)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In addition, the dtype.construct_array_type() line is failing on older pandas releases. So reverting this change might solve that as well.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC the motivation here was to stop passing dtype to make_block, since that kwarg isnt used within pandas and AFAICT this usage is the blocker to deprecating/removing it. could be considered separate from the matter at hand, but while im in the neighborhood

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I understand.
Now, next problem: also DatetimeArray is not yet available on the older pandas versions ..

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(now we could do something else depending on the pandas version)

@github-actions
Copy link

@jorisvandenbossche
Copy link
Member

@jbrockmendel Do you want to update this PR?

except ImportError:
# older pandas versions
from pandas import DatetimeIndex
block_arr = DatetimeIndex(block_arr, dtype=dtype)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is also failing on older pandas. I assume because DatetimeArray and DatetimeIndex might be interpreting datetime64 differently? (wall time vs unix time? didn't check, but I remember discussions about it)

@jbrockmendel
Copy link
Author

closing, as pandas can now deal with this internally.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants