The new IR format (key-value pair IR format) serializes data directly from a msgpack map. Currently, we support serializing msgpack strings into string values in our IR format. However, msgpack's type spec does not require string objects to be valid UTF-8, so non-UTF-8 byte sequences can be passed in and serialized successfully. The serialized IR can still be deserialized through `clp::ffi::ir_stream::Deserializer`, but errors occur when converting the deserialized results into other formats, such as a JSON string or a Python dictionary through the Python FFI.
Solutions:
- We should enforce UTF-8 checking for string types at some point.
- We should add additional types to support serializing raw byte sequences (similar to the BINARY type in msgpack).
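To illustrate the first solution, here is a minimal sketch of a UTF-8 well-formedness check that could run on a msgpack string before it is accepted as an IR string value. `is_valid_utf8` is a hypothetical helper written for this issue, not an existing CLP function, and where such a check would be hooked into the serializer is left open.

```cpp
#include <cstddef>
#include <cstdint>
#include <string_view>

// Hypothetical helper (not part of CLP): returns true if `bytes` is well-formed UTF-8.
// Rejects stray continuation bytes, truncated sequences, overlong encodings,
// UTF-16 surrogate code points, and code points above U+10FFFF.
[[nodiscard]] inline bool is_valid_utf8(std::string_view bytes) {
    std::size_t i = 0;
    std::size_t const n = bytes.size();
    while (i < n) {
        auto const lead = static_cast<std::uint8_t>(bytes[i]);
        if (lead < 0x80) {
            // Single-byte (ASCII) code point
            ++i;
            continue;
        }

        std::size_t num_continuation_bytes = 0;
        std::uint32_t code_point = 0;
        std::uint32_t min_code_point = 0;  // Smallest value allowed for this length
        if ((lead & 0xE0) == 0xC0) {
            num_continuation_bytes = 1;
            code_point = lead & 0x1F;
            min_code_point = 0x80;
        } else if ((lead & 0xF0) == 0xE0) {
            num_continuation_bytes = 2;
            code_point = lead & 0x0F;
            min_code_point = 0x800;
        } else if ((lead & 0xF8) == 0xF0) {
            num_continuation_bytes = 3;
            code_point = lead & 0x07;
            min_code_point = 0x10000;
        } else {
            // Stray continuation byte or an invalid lead byte (0xF8..0xFF)
            return false;
        }

        // The sequence must not run past the end of the input
        if (n - i <= num_continuation_bytes) {
            return false;
        }

        for (std::size_t k = 1; k <= num_continuation_bytes; ++k) {
            auto const b = static_cast<std::uint8_t>(bytes[i + k]);
            if ((b & 0xC0) != 0x80) {
                return false;
            }
            code_point = (code_point << 6) | (b & 0x3F);
        }

        if (code_point < min_code_point) {
            return false;  // Overlong encoding
        }
        if (code_point > 0x10FFFF) {
            return false;  // Outside the Unicode range
        }
        if (code_point >= 0xD800 && code_point <= 0xDFFF) {
            return false;  // Surrogates are not valid in UTF-8
        }

        i += num_continuation_bytes + 1;
    }
    return true;
}
```

With a check like this, the serializer could either reject a string for which `is_valid_utf8` returns false, or route it to a raw-bytes type if the second solution is adopted. Which behavior is preferable is a design decision this sketch does not settle.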
CLP version
0c00a94
Environment
Any
Reproduction steps
1. Serialize a msgpack map that contains a string value with non-UTF-8 bytes using `clp::ffi::ir_stream::Serializer`.
2. Deserialize the resulting IR stream using `clp::ffi::ir_stream::Deserializer`.
3. Call `clp::ffi::KeyValuePairLogEvent::serialize_to_json` on the deserialized log event, and it will trigger a JSON exception.