Add Serializable ABC for Python
#5139
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,62 @@ | ||
| # Copyright (c) 2020, NVIDIA CORPORATION. | ||
|
|
||
| import abc | ||
| import pickle | ||
| from abc import abstractmethod | ||
|
|
||
| import numpy | ||
|
|
||
| import rmm | ||
|
|
||
| import cudf | ||
|
|
||
|
|
||
| class Serializable(abc.ABC): | ||
| @abstractmethod | ||
| def serialize(self): | ||
| pass | ||
|
|
||
| @classmethod | ||
| @abstractmethod | ||
| def deserialize(cls, header, frames): | ||
| pass | ||
|
|
||
| def device_serialize(self): | ||
| header, frames = self.serialize() | ||
| assert all((type(f) is cudf.core.buffer.Buffer) for f in frames) | ||
| header["type-serialized"] = pickle.dumps(type(self)) | ||
| header["lengths"] = [f.nbytes for f in frames] | ||
| return header, frames | ||
|
|
||
| @classmethod | ||
| def device_deserialize(cls, header, frames): | ||
| for f in frames: | ||
|
Contributor
This isn't going to be true with pack/unpack serialization, which packs into a single host buffer that stores metadata and a single device buffer that stores data.
Member
Author
Yep, that makes sense. Happy to change this as we see fit. The bigger idea here is that when we make these changes we can now do them in one place. So hopefully that makes pack/unpack and other changes in the future easier :)
Contributor
Agreed. I think then we can leave this as is for now. :)
Member
Author
Added PR ( #5309 ), which should handle a mixture of host and device frames. |
||
| # some frames are empty -- meta/empty partitions/etc | ||
| if len(f) > 0: | ||
| assert hasattr(f, "__cuda_array_interface__") | ||
|
|
||
| typ = pickle.loads(header["type-serialized"]) | ||
| obj = typ.deserialize(header, frames) | ||
|
|
||
| return obj | ||
|
|
||
| def host_serialize(self): | ||
| header, frames = self.device_serialize() | ||
| frames = [f.to_host_array().view("u1").data for f in frames] | ||
| return header, frames | ||
|
|
||
| @classmethod | ||
| def host_deserialize(cls, header, frames): | ||
| frames = [ | ||
| rmm.DeviceBuffer.to_device(memoryview(f).cast("B")) for f in frames | ||
| ] | ||
| obj = cls.device_deserialize(header, frames) | ||
| return obj | ||
|
|
||
| def __reduce_ex__(self, protocol): | ||
| header, frames = self.host_serialize() | ||
| if protocol >= 5: | ||
| frames = [pickle.PickleBuffer(f) for f in frames] | ||
| else: | ||
| frames = [numpy.asarray(f) for f in frames] | ||
| return self.host_deserialize, (header, frames) | ||
+1 from me as well! I especially like moving away from this list to anything which implements Serializable
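
For readers who want to try the pattern without a GPU, here is a minimal host-only sketch of the same idea. It is not the cudf implementation: it drops the `cudf`/`rmm` device steps entirely, uses `bytes(f)` where the diff uses `numpy.asarray(f)`, and `ByteColumn` is a hypothetical subclass invented for illustration. It only shows how a `serialize`/`deserialize` pair plus `__reduce_ex__` lets `pickle` carry frames in-band or, with protocol 5, out-of-band.

```python
import abc
import pickle


class Serializable(abc.ABC):
    """Host-only analogue of the ABC in this diff (assumption: no device frames)."""

    @abc.abstractmethod
    def serialize(self):
        """Return (header dict, list of bytes-like frames)."""

    @classmethod
    @abc.abstractmethod
    def deserialize(cls, header, frames):
        """Rebuild an object from a header and its frames."""

    def host_serialize(self):
        header, frames = self.serialize()
        # Record the concrete type so deserialization can dispatch to it.
        header["type-serialized"] = pickle.dumps(type(self))
        header["lengths"] = [memoryview(f).nbytes for f in frames]
        return header, frames

    @classmethod
    def host_deserialize(cls, header, frames):
        typ = pickle.loads(header["type-serialized"])
        return typ.deserialize(header, frames)

    def __reduce_ex__(self, protocol):
        header, frames = self.host_serialize()
        if protocol >= 5:
            # Protocol 5 can hand these buffers to a buffer_callback (out-of-band).
            frames = [pickle.PickleBuffer(f) for f in frames]
        else:
            # Older protocols need ordinary picklable objects, so copy to bytes.
            frames = [bytes(f) for f in frames]
        return self.host_deserialize, (header, frames)


class ByteColumn(Serializable):
    """Hypothetical subclass: a single frame of raw bytes."""

    def __init__(self, data):
        self.data = bytearray(data)

    def serialize(self):
        return {"kind": "bytes"}, [self.data]

    @classmethod
    def deserialize(cls, header, frames):
        return cls(frames[0])


if __name__ == "__main__":
    col = ByteColumn(b"abc")

    # In-band round trip with an older protocol and with protocol 5.
    for proto in (4, 5):
        restored = pickle.loads(pickle.dumps(col, protocol=proto))
        print(proto, type(restored).__name__, bytes(restored.data))

    # Out-of-band round trip: frames travel separately from the pickle stream.
    buffers = []
    payload = pickle.dumps(col, protocol=5, buffer_callback=buffers.append)
    restored = pickle.loads(payload, buffers=buffers)
    print(len(buffers), bytes(restored.data))
```

The subclass only has to supply `serialize` and `deserialize`; everything pickle-related lives in the base class, which is the "change it in one place" benefit discussed in the review thread above. Note that pickling the type by reference, as the diff also does, requires the class to be importable where the data is loaded.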