Commit 138e59a
[data] Handle nullable fields in schema across blocks for parquet files (#48478)
## Why are these changes needed?
When writing blocks to Parquet, some blocks may have fields that
differ ONLY in nullability. By default, such a write is rejected because
a block's schema may not match the schema the ParquetWriter was opened with.
However, the write can be allowed by adjusting the schema.
This PR goes through all blocks before writing them to Parquet and
merges schemas that differ only in the nullability of their fields.
It also casts each table to the newly merged schema so that the write
can proceed.
## Related issue number
Closes #48102
---------
Signed-off-by: rickyx <[email protected]>
1 parent: bcee207
2 files changed (+33, -5) under python/ray/data:
- _internal/datasource (+9, -5)
- tests (+24, -0)