Commit e61b8d8
[Data] Support serializing zero-length numpy arrays (#57858)
## Description
Ray Data can't serialize zero-length (zero-byte) NumPy arrays:
```python3
import numpy as np
import ray.data
array = np.empty((2, 0), dtype=np.int8)
ds = ray.data.from_items([{"array": array}])
for batch in ds.iter_batches(batch_size=1):
    print(batch)
```
What I expect to see:
```
{'array': array([], shape=(1, 2, 0), dtype=int8)}
```
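The expected batch is a valid but empty array: its shape is well defined, yet it holds no items. The traceback in this description points at an `np.arange` call that builds offsets, which is consistent with a step of zero items per element. The NumPy-only sketch below illustrates the shapes involved and one way to compute such offsets without a zero step. The helper `fixed_size_offsets` is hypothetical and is not taken from the Ray code base or from this commit.

```python3
import numpy as np

# The array from the reproduction: two rows, zero columns, so zero items.
array = np.empty((2, 0), dtype=np.int8)
batch = array[np.newaxis, ...]  # a batch of one element, shape (1, 2, 0)

# Items per batch element is the product of the element shape: 2 * 0 = 0.
items_per_element = int(np.prod(batch.shape[1:]))


def fixed_size_offsets(outer_len: int, items_per_element: int) -> np.ndarray:
    """Hypothetical helper: start offset of each element in a flat buffer.

    Scaling a unit-step range avoids passing a zero step to np.arange.
    """
    return np.arange(outer_len + 1, dtype=np.int64) * int(items_per_element)


offsets = fixed_size_offsets(len(batch), items_per_element)
print(batch.shape, batch.size, batch.nbytes)
print(offsets.tolist())
```

With zero items per element every offset is 0, so each element maps to an empty slice of the flat buffer. For a non-empty element shape the helper yields the usual evenly spaced offsets.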
What I see:
```
/Users/chris.ohara/Downloads/.venv/lib/python3.12/site-packages/ray/air/util/tensor_extensions/arrow.py:736: RuntimeWarning: invalid value encountered in scalar divide
  offsets = np.arange(
2025-10-17 17:18:09,499 WARNING arrow.py:189 -- Failed to convert column 'array' into pyarrow array due to: Error converting data to Arrow: column: 'array', shape: (1, 2, 0), dtype: int8, data: []; falling back to serialize as pickled python objects
Traceback (most recent call last):
  File "/Users/chris.ohara/Downloads/.venv/lib/python3.12/site-packages/ray/air/util/tensor_extensions/arrow.py", line 672, in from_numpy
    return cls._from_numpy(arr)
           ^^^^^^^^^^^^^^^^^^^^
  File "/Users/chris.ohara/Downloads/.venv/lib/python3.12/site-packages/ray/air/util/tensor_extensions/arrow.py", line 736, in _from_numpy
    offsets = np.arange(
              ^^^^^^^^^^
ValueError: arange: cannot compute length

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/chris.ohara/Downloads/.venv/lib/python3.12/site-packages/ray/air/util/tensor_extensions/arrow.py", line 141, in convert_to_pyarrow_array
    return ArrowTensorArray.from_numpy(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/chris.ohara/Downloads/.venv/lib/python3.12/site-packages/ray/air/util/tensor_extensions/arrow.py", line 678, in from_numpy
    raise ArrowConversionError(data_str) from e
ray.air.util.tensor_extensions.arrow.ArrowConversionError: Error converting data to Arrow: column: 'array', shape: (1, 2, 0), dtype: int8, data: []
2025-10-17 17:18:09,789 INFO logging.py:293 -- Registered dataset logger for dataset dataset_0_0
2025-10-17 17:18:09,815 WARNING resource_manager.py:134 --
```
1 parent adef7b5 commit e61b8d8
File tree

2 files changed, +25 -6 lines changed

- python/ray
  - air/util/tensor_extensions
  - data/tests

Changed line ranges (diff bodies not preserved), listed in file-tree order:

| File location | Original lines | New lines | Added | Removed |
|---|---|---|---|---|
| python/ray/air/util/tensor_extensions | 732-744 | 732-749 | 11 | 6 |
| python/ray/data/tests | 507-512 | 507-526 | 14 | 0 |