sunyakun added the labels bug (something that is supposed to be working, but isn't) and triage (needs triage, e.g. priority, bug/not-bug, and owning component) on Apr 16, 2024.
sunyakun changed the title from "[<Ray component: Data>] RandomAccessDataset.multiget return unexpected values for missing keys." to "[Data] RandomAccessDataset.multiget return unexpected values for missing keys." on Apr 16, 2024.
What happened + What you expected to happen
ray.data.RandomAccessDataset.multiget is expected to return None for missing records. Instead, I got an unexpected value for a missing key.
I found that PR #24825 updated _RandomAccessWorker.multiget to use np.searchsorted to speed up lookups. However, np.searchsorted returns the insertion point for a key that is not present, and the code uses that search result directly to fetch the row from the block without checking that col[i] == key, as in the code here:
ray/python/ray/data/random_access_dataset.py
Lines 266 to 269 in d8c7234
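The failure mode described above can be illustrated with plain NumPy. This is only a sketch, not the code from random_access_dataset.py: the function names lookup_unchecked and lookup_checked, the sample arrays, and the index clamping are all assumptions made for illustration. It shows that np.searchsorted yields an insertion point for an absent key, and that comparing col[i] with the key is enough to return None instead.

```python
import numpy as np


def lookup_unchecked(col, values, keys):
    """Hypothetical lookup that trusts the searchsorted result directly."""
    idx = np.searchsorted(col, keys)
    # Assumption for this sketch: clamp so keys past the end do not raise.
    idx = np.minimum(idx, len(col) - 1)
    return [str(values[i]) for i in idx]


def lookup_checked(col, values, keys):
    """Hypothetical lookup that verifies the key is really at the index."""
    idx = np.searchsorted(col, keys)
    out = []
    for i, key in zip(idx, keys):
        # An insertion point is only a hit if it is in bounds and matches.
        if i < len(col) and col[i] == key:
            out.append(str(values[i]))
        else:
            out.append(None)
    return out


# Sorted key column and the rows stored alongside it (made-up sample data).
col = np.array([0, 2, 4, 6, 8])
values = np.array(["a", "b", "c", "d", "e"])
keys = [2, 3, 9]  # 3 and 9 are not in col

unchecked = lookup_unchecked(col, values, keys)
checked = lookup_checked(col, values, keys)
print(unchecked)  # ['b', 'c', 'e']
print(checked)    # ['b', None, None]
```

With the unchecked variant, the missing keys 3 and 9 silently map to neighbouring rows, which matches the symptom reported here. The checked variant adds one comparison per key and keeps the binary search speedup.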
Versions / Dependencies
Ray: latest master
Python: 3.9.2
OS: linux
Reproduction script
Issue Severity
None