The zero-copy behaviour is not valid for np.recarray
#37573
Comments
Ray just builds read-only buffer objects that reference blocks of shared memory when it deserializes buffers. The upper-level framework decides whether to copy such a buffer. For example, if the upper-level framework needs a writable buffer but finds a read-only one during deserialization, it may copy the buffer or raise an exception.
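To illustrate the point about read-only buffers, here is a minimal sketch that uses only NumPy, with no Ray involved. The immutable `bytes` object merely stands in for a shared-memory block; that substitution is an assumption for illustration and is not how Ray stores objects.

```python
import numpy as np

# A bytes object is immutable, so it stands in for a read-only shared-memory block.
backing = np.arange(5, dtype=np.int64).tobytes()

# np.frombuffer wraps the existing buffer without copying; the result is read-only.
view = np.frombuffer(backing, dtype=np.int64)
print("writeable:", view.flags.writeable)

# Writing through the read-only view raises ValueError.
try:
    view[0] = 42
    write_failed = False
except ValueError:
    write_failed = True
print("write failed:", write_failed)

# A consumer that needs to write has to make its own copy.
writable = view.copy()
writable[0] = 42
```

Whether a library on top of such a buffer copies it or raises is that library's decision, which is the distinction drawn in the comment above.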
Thanks for your reply, @fyrestone. I can simplify my test case:
# %%
import numpy as np
import ray
from ray.util import inspect_serializability
print(np.__version__)
print(ray.__version__)
# Step 3: Initialize Ray
ray.init()
print("np.ndarray".center(80, '-'))
# Step 5: Create and store array in shared memory using Ray
array_size = 1000000
old_array = np.arange(array_size)
print(old_array.flags)
shared_array_id = ray.put(old_array)
shared_array = ray.get(shared_array_id)
print(shared_array.flags)
# Step 8: Check memory addresses
original_address = shared_array.__array_interface__['data'][0]
new_address = ray.get(shared_array_id).__array_interface__['data'][0]
print("Original Address:", original_address)
print("New Address:", new_address)
print("np.recarray".center(80, '-'))
# Step 5: Create and store recarray in shared memory using Ray
recarray_size = 1000000
recarray_dtype = np.dtype([('id', np.int64), ('value', np.float64)])
old_recarray = np.recarray(recarray_size, dtype=recarray_dtype)
print(old_recarray.flags)
shared_recarray_id = ray.put(old_recarray)
shared_recarray = ray.get(shared_recarray_id)
print(shared_recarray.flags)
# Step 8: Check memory addresses
original_address = shared_recarray.__array_interface__['data'][0]
new_address = ray.get(shared_recarray_id).__array_interface__['data'][0]
print("Original Address:", original_address)
print("New Address:", new_address)
assert original_address == new_address
Output is:
As you can see, I just create a recarray and then put and get it with Ray. Nothing here involves an upper-level framework, yet Ray already copies the buffer, as the output shows.
So I think this should be treated as either a bug or a feature request for Ray.
In this case, NumPy is the upper-level framework running on Ray.
You mean it's still a NumPy pickle issue, unrelated to Ray?
I think so. Ray only provides read-only buffers.
I really don't think so. As you can see from similar issues/PRs:
Oh, sorry. You're right, @fyrestone. Thanks! I will try to forward this issue to the NumPy community.
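The NumPy-side behaviour discussed in this thread can be inspected without Ray by pickling both array types with protocol 5 and counting the buffers handed out-of-band. This is a rough sketch assuming Python 3.8+; `out_of_band_buffers` is a hypothetical helper written for this illustration, not part of NumPy or Ray. Judging from this thread, the recarray is expected to be serialized in-band, but the counts depend on the installed NumPy version, so treat the printed numbers as the source of truth.

```python
import pickle

import numpy as np


def out_of_band_buffers(obj):
    """Pickle obj with protocol 5; return (payload, out-of-band buffers).

    Hypothetical helper for illustration only.
    """
    buffers = []
    # The callback returns None (falsy), so every offered buffer goes out-of-band.
    payload = pickle.dumps(obj, protocol=5, buffer_callback=buffers.append)
    return payload, buffers


plain = np.arange(1000)
rec_dtype = np.dtype([("id", np.int64), ("value", np.float64)])
rec = np.zeros(1000, dtype=rec_dtype).view(np.recarray)

for name, arr in [("np.ndarray", plain), ("np.recarray", rec)]:
    payload, buffers = out_of_band_buffers(arr)
    restored = pickle.loads(payload, buffers=buffers)
    print(name, "| out-of-band buffers:", len(buffers),
          "| in-band payload bytes:", len(payload),
          "| restored type:", type(restored).__name__)
```

An array whose data travels in-band is embedded in the pickle payload itself, so a deserializer cannot hand back a view on an externally supplied buffer for it.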
Seems like this is an external issue?
We will close the issue for now, but please reopen it if there's an action item on our end.
What happened + What you expected to happen
The zero-copy behaviour does not apply to np.recarray: after ray.put and ray.get, the recarray's buffer is copied.
The output is:
Versions / Dependencies
np=1.23.5, ray=2.3.1
Reproduction script
Issue Severity
Medium: It is a significant difficulty but I can work around it.
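One possible workaround, sketched here under the assumption that the zero-copy path applies to plain np.ndarray objects (as the np.ndarray half of the test case suggests): reinterpret the recarray as a plain ndarray before storing it, and reinterpret the result as a recarray after retrieving it. Both views are NumPy-only operations that do not copy data. The Ray calls are only indicated in comments, and whether the plain view is actually stored zero-copy should be verified on your own versions.

```python
import numpy as np

rec_dtype = np.dtype([("id", np.int64), ("value", np.float64)])
rec = np.zeros(1000, dtype=rec_dtype).view(np.recarray)

# Reinterpret as a plain ndarray (no copy). This is the object that
# would be passed to ray.put(...) instead of the recarray itself.
plain = rec.view(np.ndarray)

# After ray.get(...), reinterpret the result as a recarray again (no copy).
rec_again = plain.view(np.recarray)

print(type(plain).__name__, np.shares_memory(rec, plain))
print(type(rec_again).__name__, np.shares_memory(plain, rec_again))

# Attribute-style field access is available again on the recarray view.
first_id = rec_again.id[0]
```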