[DataFrame] Fix bug with consistency between IndexMetadata and partitions #2088
pschafhalter wants to merge 1 commit into ray-project:master
Test PASSed.
a2ef058 to 9802010
python/ray/dataframe/utils.py
Outdated
Can this not be fixed in create_blocks_helper? It looks like that would be the source of the problem.
It could if we want to add another argument to create_blocks_helper to specify the number of rows in each partition.
I wasn't sure if we wanted to change the behavior of create_blocks_helper.
9802010 to 8dd2fb3
8dd2fb3 to 3f5a0d8
All tests passed on private travis for current commit.
Deprecated due to #2118.
Calling _correct_column_dtypes rebuilds partitions using create_blocks_helper. However, create_blocks_helper creates partitions of equal size, which results in an inconsistency between IndexMetadata and column partitions. This PR aims to fix the issue by ensuring that _correct_column_dtypes constructs partitions of the same size as before.

Code which demonstrates the bug:
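Independently of the PR's own reproduction, the mismatch described above can be sketched with plain Python lists standing in for DataFrame rows. This is only an illustration under stated assumptions: `split_equal` and `split_by_lengths` are hypothetical helpers, not Ray's `create_blocks_helper`, and `old_lengths` stands in for the per-partition row counts that index metadata would record.

```python
def split_equal(rows, npartitions):
    """Hypothetical stand-in: split rows into chunks of (nearly) equal size."""
    chunk = -(-len(rows) // npartitions)  # ceiling division
    return [rows[i:i + chunk] for i in range(0, len(rows), chunk)]


def split_by_lengths(rows, lengths):
    """Hypothetical stand-in: split rows into chunks with the given row counts."""
    parts, start = [], 0
    for n in lengths:
        parts.append(rows[start:start + n])
        start += n
    return parts


rows = list(range(10))

# Assume the existing partitions are uneven and the metadata records that.
old_lengths = [7, 3]

# Rebuilding with equal-size chunks changes the layout under the metadata.
equal_lengths = [len(p) for p in split_equal(rows, len(old_lengths))]

# Rebuilding with the recorded lengths keeps the layout the metadata expects.
kept_lengths = [len(p) for p in split_by_lengths(rows, old_lengths)]

print("metadata expects:", old_lengths)
print("equal-size rebuild:", equal_lengths)
print("length-preserving rebuild:", kept_lengths)
```

In this sketch, a lookup that trusts the recorded lengths would address the wrong chunk after the equal-size rebuild, which is the kind of inconsistency the PR describes. Passing the original row counts through to the rebuild avoids it.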