-
Notifications
You must be signed in to change notification settings - Fork 43
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support native serialization format #106
Conversation
croaring/src/bitmap/serialization.rs
Outdated
|
||
use std::ffi::{c_char, c_void}; | ||
|
||
pub struct PortableSerializer {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Recommend using a trait here, something like
pub trait Serializer {
fn serialize_into<'a>(bitmap: &Bitmap, dst: &'a mut Vec<u8>) -> &'a [u8];
fn get_serialized_size_in_bytes(bitmap: &Bitmap) -> usize;
fn try_deserialize(buffer: &[u8]) -> Option<Bitmap>;
}
// enum rather than struct, make it impossible to construct a value
// and unsuffixed, expected to be used as e.g `serialization::Portable`
pub enum Portable {}
impl Serializer for Portable { ... }
Also, for @saulius, thoughts on using a breaking change here, and introduce something like
impl Bitmap {
pub fn serialize_into<S: Serialize>(&self, dst: &'a mut Vec<u8>) -> &'a [u8] { ... }
pub fn serialized_size<S: Serialize>(&self) -> usize { ... }
pub fn try_deserialize<S: Serialize>(buffer: &[u8]) -> Option<Self> { ... }
}
We also could bring in BitmapView
in a similar way
pub trait SerializeView {
unsafe fn deserialize_view(data: &[u8]) -> BitmapView<'_>;
}
impl SerializeView for Portable { ... }
impl SerializeView for Frozen { ... }
impl<'a> BitmapView<'a> {
pub unsafe fn new<S: SerializeView>(data: &'a [u8]) -> Self { ... }
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Dr-Emann, thanks for weighing in 🙏🏻
I've tried implementing your suggestion in 61e8433 -- is that what you had in mind? I decided to introduce three traits: Serializer
, Deserializer
and ViewDeserializer
. This is to capture that Frozen
only supports deserializing to a BitmapView
and not to a Bitmap
:
Format | Serializer | Deserializer | ViewDeserializer |
---|---|---|---|
Portable |
✅ | ✅ | ✅ |
Native |
✅ | ✅ | ❌ |
Frozen |
✅ | ❌ | ✅ |
Note that tests and documentation is still missing, but I was hoping to get the implementation right first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's what I was thinking, yeah. @saulius, I personally think this is worth the breaking change (e.g. a call to bitmap.get_serialized_size_in_bytes()
will no longer compile because it needs a Serializer type argument), but I'd like your opinion
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also prefer this way to the way serialization is done in treemap, but we should probably make them consistent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great, indeed. I also agree with you that this is worth a breaking change.
So far my line of thinking is this. Release current master as 0.9.0
- the last release under saulius
. Then we can can merge this, work on treemap rework and transfer over to RoaringBitmap
org (or the other way around). Newly refactored serialization interfaces can be a part of 1.0.0
released from under RoaringBitmap
. How does that sound? If there are any other breaking-worthy changes you both have in mind, that would also be a good time to incorporate.
I'll rebase on master when #107 is merged. |
61e8433
to
b65070d
Compare
Thanks for the update, @saulius! I've rebased on |
Alright, I've fixed tests and added rudimentary tests for the native format, so I think the changes are ready to be reviewed. |
|
||
/// The `Portable` format is meant to be compatible with other roaring bitmap libraries, such as Go or Java. | ||
/// It's defined here: https://github.com/RoaringBitmap/RoaringFormatSpec | ||
pub enum Portable {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if it might make sense to pull Portable
, Frozen
, and Native
out of crate::bitmap::serialize
into a crate::serialize
, since they're not bitmap specific (we use the same types for treemap serialization)
Just a thought, definitely not worth holding up the PR for.
ReadBitmapOp::GetSerializedSizeInBytes => { | ||
b.get_serialized_size_in_bytes(); | ||
ReadBitmapOp::GetPortableSerializedSizeInBytes => { | ||
b.get_serialized_size_in_bytes::<Portable>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to import Portable/Native/Frozen
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks -- fixed in a455945. I wonder why make fuzz
did not fail for me locally 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like make fuzz
just runs the fuzz_ops fuzzer
Thanks, @andreas, this looks great! |
This PR adds support for the native serialization format from CRoaring. It builds on #105 to get support for
roaring_bitmap_deserialize_safe
. I'm admittedly an absolute newcomer to Rust, so I'd appreciate feedback on how to do this best. I tried a few approaches and found this struck the best balance in terms of aligning with the serialization approach used forTreeMap
, not breaking backwards compatibility, and avoiding a proliferation of (de-)serialization functions onBitmap
. If this approach is acceptable, I'm happy to add tests.