Canonicalization rfc8949 and rfc7049 implementation for structs #143

Open
wants to merge 7 commits into
base: main
27 changes: 23 additions & 4 deletions ciborium/README.md
@@ -11,10 +11,10 @@ Ciborium contains CBOR serialization and deserialization implementations for ser

## Quick Start

-You're probably looking for [`from_reader()`](crate::de::from_reader)
-and [`into_writer()`](crate::ser::into_writer), which are
-the main functions. Note that byte slices are also readers and writers and can be
-passed to these functions just as streams can.
+You're probably looking for [`from_reader()`](crate::de::from_reader),
+[`to_vec()`](crate::ser::to_vec), and [`into_writer()`](crate::ser::into_writer),
+which are the main functions. Note that byte slices are also readers and writers
+and can be passed to these functions just as streams can.

For dynamic CBOR value creation/inspection, see [`Value`](crate::value::Value).

@@ -89,4 +89,23 @@ be avoided because it can be fragile as it exposes invariants of your Rust
code to remote actors. We might consider adding this in the future. If you
are interested in this, please contact us.

### Canonical Encodings

The ciborium crate has support for various canonical encodings during
serialization.

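- [`NoCanonicalization`](crate::canonical::NoCanonicalization): the default.
Numbers are still encoded in their smallest form, but map keys are left
unsorted to maximize serialization speed.
- [`Rfc7049`](crate::canonical::Rfc7049): the canonicalization scheme from
RFC 7049, which sorts map keys length-first: shorter encoded keys come first,
and keys of equal length are compared bytewise. E.g. `["a", "b", "aa"]`, or
`[10, -1, 100, "z", "aa"]` for mixed key types.
- [`Rfc8949`](crate::canonical::Rfc8949): the canonicalization scheme from
RFC 8949, which sorts map keys in bytewise lexicographic order of their
encoded bytes. E.g. `[10, 100, -1, "z", "aa"]`. Because RFC 8949 compares
encoded keys, text-only keys (such as struct field names) sort as
`["a", "b", "aa"]` under either scheme: a string's length is part of its
encoded head.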
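
As a rough illustration of the two orderings (a sketch, not ciborium's implementation): both RFCs define the order on the encoded key bytes, so the comparators below operate on pre-encoded keys. The byte values are the example keys from RFC 8949; `cmp_length_first`, `cmp_bytewise`, `sorted_labels`, and `KEYS` are hypothetical names chosen for this example.

```rust
use std::cmp::Ordering;

/// RFC 7049 Section 3.9 ("length-first"): shorter encodings sort first,
/// and encodings of equal length are compared bytewise.
fn cmp_length_first(a: &[u8], b: &[u8]) -> Ordering {
    a.len().cmp(&b.len()).then_with(|| a.cmp(b))
}

/// RFC 8949 Section 4.2.1: plain bytewise lexicographic comparison.
fn cmp_bytewise(a: &[u8], b: &[u8]) -> Ordering {
    a.cmp(b)
}

/// Hypothetical helper: sort labelled, already-encoded keys and return the labels.
fn sorted_labels(
    keys: &[(&'static str, &'static [u8])],
    cmp: fn(&[u8], &[u8]) -> Ordering,
) -> Vec<&'static str> {
    let mut keys = keys.to_vec();
    keys.sort_by(|x, y| cmp(x.1, y.1));
    keys.into_iter().map(|(label, _)| label).collect()
}

/// Map keys paired with their CBOR encodings, deliberately out of order.
const KEYS: &[(&str, &[u8])] = &[
    ("\"aa\"", &[0x62, 0x61, 0x61]),
    ("-1", &[0x20]),
    ("100", &[0x18, 0x64]),
    ("\"z\"", &[0x61, 0x7a]),
    ("10", &[0x0a]),
];
```

Sorting `KEYS` with `cmp_length_first` yields `10, -1, 100, "z", "aa"`, while `cmp_bytewise` yields `10, 100, -1, "z", "aa"`. Keys that are all text strings come out in the same order under both comparators, since the string length is encoded in the leading bytes.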

To use canonicalization, you must enable the `std` feature. See the examples
in [`to_vec_canonical`](crate::ser::to_vec_canonical) and
[`into_writer_canonical`](crate::ser::into_writer_canonical) for more.
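
The "smallest form" mentioned for `NoCanonicalization` refers to the CBOR rule that an integer is written with the shortest head that can hold it. The sketch below shows that rule for unsigned integers only; `encode_uint` is a hypothetical name used for illustration and is not a ciborium API.

```rust
/// Encode an unsigned integer (CBOR major type 0) using the shortest head.
/// Hypothetical helper for illustration; not part of ciborium's public API.
fn encode_uint(value: u64) -> Vec<u8> {
    match value {
        // Values up to 23 fit directly in the initial byte.
        0..=23 => vec![value as u8],
        // 0x18: one-byte argument follows.
        24..=0xff => vec![0x18, value as u8],
        // 0x19: two-byte big-endian argument follows.
        0x100..=0xffff => {
            let mut out = vec![0x19];
            out.extend_from_slice(&(value as u16).to_be_bytes());
            out
        }
        // 0x1a: four-byte big-endian argument follows.
        0x1_0000..=0xffff_ffff => {
            let mut out = vec![0x1a];
            out.extend_from_slice(&(value as u32).to_be_bytes());
            out
        }
        // 0x1b: eight-byte big-endian argument follows.
        _ => {
            let mut out = vec![0x1b];
            out.extend_from_slice(&value.to_be_bytes());
            out
        }
    }
}
```

For instance, `encode_uint(42)` gives the bytes `18 2a` and `encode_uint(4200)` gives `19 10 68`, which are the value encodings that appear in the hex strings used by the deserialization doc examples in this PR.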

License: Apache-2.0
73 changes: 73 additions & 0 deletions ciborium/src/canonical.rs
@@ -0,0 +1,73 @@
//! Canonicalization support for CBOR serialization.
//!
//! Supports various canonicalization schemes for deterministic CBOR serialization. The default is
//! [NoCanonicalization] for the fastest serialization. Canonical serialization is around 2x slower.

/// Which canonicalization scheme to use for CBOR serialization.
///
/// Can only be initialized with the `std` feature enabled.
#[doc(hidden)]
#[derive(Debug, Copy, Clone, PartialEq, Eq)]
pub enum CanonicalizationScheme {
    /// Sort map keys in output according to [RFC 7049]'s deterministic encoding spec.
    ///
    /// Also aligns with [RFC 8949 4.2.3]'s backwards compatibility sort order.
    ///
    /// Uses length-first map key ordering. E.g. `["a", "b", "aa"]`.
    #[cfg(feature = "std")]
    Rfc7049,

    /// Sort map keys in output according to [RFC 8949]'s deterministic encoding spec.
    ///
    /// Uses bytewise lexicographic ordering of the encoded map keys. E.g.
    /// `[10, 100, -1, "z", "aa"]`; text-only keys still sort as `["a", "b", "aa"]`.
    #[cfg(feature = "std")]
    Rfc8949,
}

/// Don't sort map key output.
pub struct NoCanonicalization;

/// Sort map keys in output according to [RFC 7049]'s deterministic encoding spec.
///
/// Also aligns with [RFC 8949 4.2.3]'s backwards compatibility sort order.
///
/// Uses length-first map key ordering. E.g. `["a", "b", "aa"]`.
#[cfg(feature = "std")]
pub struct Rfc7049;

/// Sort map keys in output according to [RFC 8949]'s deterministic encoding spec.
///
/// Uses bytewise lexicographic ordering of the encoded map keys. E.g.
/// `[10, 100, -1, "z", "aa"]`; text-only keys still sort as `["a", "b", "aa"]`.
#[cfg(feature = "std")]
pub struct Rfc8949;

/// Trait for canonicalization schemes.
///
/// See implementors:
/// - [NoCanonicalization] for no canonicalization (fastest).
/// - [Rfc7049] for length-first map key sorting.
/// - [Rfc8949] for bytewise lexicographic map key sorting.
pub trait Canonicalization {
    /// True if keys should be cached and sorted.
    const IS_CANONICAL: bool;

    /// Determines which sorting implementation to use.
    const SCHEME: Option<CanonicalizationScheme>;
}

impl Canonicalization for NoCanonicalization {
    const IS_CANONICAL: bool = false;
    const SCHEME: Option<CanonicalizationScheme> = None;
}

#[cfg(feature = "std")]
impl Canonicalization for Rfc7049 {
    const IS_CANONICAL: bool = true;
    const SCHEME: Option<CanonicalizationScheme> = Some(CanonicalizationScheme::Rfc7049);
}

#[cfg(feature = "std")]
impl Canonicalization for Rfc8949 {
    const IS_CANONICAL: bool = true;
    const SCHEME: Option<CanonicalizationScheme> = Some(CanonicalizationScheme::Rfc8949);
}
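
One way the associated constants above could be consumed by a serializer is sketched below. This is an assumption about usage, not the PR's actual serializer code (which is not shown in this diff). The types are simplified stand-ins without `cfg` gating so the block compiles on its own, and `sort_encoded_keys` is a hypothetical helper.

```rust
// Simplified stand-ins for the items defined in this file (no `cfg` gating).
#[derive(Debug, Copy, Clone, PartialEq, Eq)]
pub enum CanonicalizationScheme {
    Rfc7049,
    Rfc8949,
}

pub trait Canonicalization {
    const IS_CANONICAL: bool;
    const SCHEME: Option<CanonicalizationScheme>;
}

pub struct NoCanonicalization;
pub struct Rfc7049;
pub struct Rfc8949;

impl Canonicalization for NoCanonicalization {
    const IS_CANONICAL: bool = false;
    const SCHEME: Option<CanonicalizationScheme> = None;
}

impl Canonicalization for Rfc7049 {
    const IS_CANONICAL: bool = true;
    const SCHEME: Option<CanonicalizationScheme> = Some(CanonicalizationScheme::Rfc7049);
}

impl Canonicalization for Rfc8949 {
    const IS_CANONICAL: bool = true;
    const SCHEME: Option<CanonicalizationScheme> = Some(CanonicalizationScheme::Rfc8949);
}

/// Hypothetical helper: sort already-encoded map keys according to the scheme
/// selected at compile time through the type parameter `C`.
pub fn sort_encoded_keys<C: Canonicalization>(keys: &mut [Vec<u8>]) {
    if !C::IS_CANONICAL {
        return; // fast path: keep insertion order
    }
    match C::SCHEME {
        // Length-first, then bytewise for keys of equal length.
        Some(CanonicalizationScheme::Rfc7049) => {
            keys.sort_by(|a, b| a.len().cmp(&b.len()).then_with(|| a.cmp(b)));
        }
        // Plain bytewise lexicographic order.
        Some(CanonicalizationScheme::Rfc8949) => keys.sort(),
        None => {}
    }
}
```

Because both constants are known at compile time, a call such as `sort_encoded_keys::<NoCanonicalization>(&mut keys)` can be optimized down to a no-op, which is presumably the motivation for using associated constants rather than a runtime flag.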
55 changes: 55 additions & 0 deletions ciborium/src/de/mod.rs
@@ -786,6 +786,24 @@ where
/// If you want to deserialize faster at the cost of more memory, consider using
/// [`from_reader_with_buffer`](from_reader_with_buffer) with a larger buffer,
/// for example 64KB.
///
/// # Example
/// ```rust
/// use ciborium::from_reader;
///
/// #[derive(Debug, serde::Deserialize, Eq, PartialEq)]
/// struct Example {
///     a: u64,
///     aa: u64,
///     b: u64,
/// }
///
/// let cbor = hex::decode("a36161182a61621910686261611901a4").unwrap();
/// let expected = Example { a: 42, aa: 420, b: 4200 };
///
/// let deserialized: Example = from_reader(cbor.as_slice()).unwrap();
/// assert_eq!(deserialized, expected);
/// ```
#[inline]
pub fn from_reader<T: de::DeserializeOwned, R: Read>(reader: R) -> Result<T, Error<R::Error>>
where
@@ -798,6 +816,25 @@ where
/// Deserializes as CBOR from a type with [`impl
/// ciborium_io::Read`](ciborium_io::Read), using a caller-supplied buffer as a
/// temporary scratch space.
///
/// # Example
/// ```rust
/// use ciborium::from_reader_with_buffer;
///
/// #[derive(Debug, serde::Deserialize, Eq, PartialEq)]
/// struct Example {
///     a: u64,
///     aa: u64,
///     b: u64,
/// }
///
/// let cbor = hex::decode("a36161182a61621910686261611901a4").unwrap();
/// let expected = Example { a: 42, aa: 420, b: 4200 };
///
/// let mut scratch = [0; 8192];
/// let deserialized: Example = from_reader_with_buffer(cbor.as_slice(), &mut scratch).unwrap();
/// assert_eq!(deserialized, expected);
/// ```
#[inline]
pub fn from_reader_with_buffer<T: de::DeserializeOwned, R: Read>(
    reader: R,
@@ -820,6 +857,24 @@ where
/// will result in [`Error::RecursionLimitExceeded`].
///
/// Set a high recursion limit at your own risk (of stack exhaustion)!
///
/// # Example
/// ```rust
/// use ciborium::de::from_reader_with_recursion_limit;
///
/// #[derive(Debug, serde::Deserialize, Eq, PartialEq)]
/// struct Example {
///     a: u64,
///     aa: u64,
///     b: u64,
/// }
///
/// let cbor = hex::decode("a36161182a61621910686261611901a4").unwrap();
/// let expected = Example { a: 42, aa: 420, b: 4200 };
///
/// let deserialized: Example = from_reader_with_recursion_limit(cbor.as_slice(), 1024).unwrap();
/// assert_eq!(deserialized, expected);
/// ```
#[inline]
pub fn from_reader_with_recursion_limit<T: de::DeserializeOwned, R: Read>(
    reader: R,
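
For readers who want to check the hex string used in the three doc examples above, here is a minimal hand-decoder. It is a sketch that only handles a map of short text keys to unsigned integers, uses the standard library instead of the `hex` crate, and is unrelated to ciborium's real decoder. `decode_hex`, `read_arg`, and `decode_text_to_uint_map` are hypothetical names.

```rust
/// Decode a hex string into bytes (std only; the doc examples use the `hex` crate).
fn decode_hex(s: &str) -> Vec<u8> {
    (0..s.len())
        .step_by(2)
        .map(|i| u8::from_str_radix(&s[i..i + 2], 16).expect("valid hex"))
        .collect()
}

/// Read the argument of a CBOR head whose low 5 bits are `info`.
/// Returns the argument and how many bytes followed the initial byte.
fn read_arg(info: u8, rest: &[u8]) -> (u64, usize) {
    match info {
        0..=23 => (info as u64, 0),
        24 => (rest[0] as u64, 1),
        25 => (u16::from_be_bytes([rest[0], rest[1]]) as u64, 2),
        26 => (u32::from_be_bytes([rest[0], rest[1], rest[2], rest[3]]) as u64, 4),
        _ => panic!("sketch does not handle this head"),
    }
}

/// Walk a CBOR map whose keys are text strings and whose values are unsigned ints.
fn decode_text_to_uint_map(bytes: &[u8]) -> Vec<(String, u64)> {
    assert_eq!(bytes[0] >> 5, 5, "expected major type 5 (map)");
    let (count, used) = read_arg(bytes[0] & 0x1f, &bytes[1..]);
    let mut pos = 1 + used;
    let mut entries = Vec::new();
    for _ in 0..count {
        // Key: major type 3 (text string); the argument is the byte length.
        assert_eq!(bytes[pos] >> 5, 3, "expected a text-string key");
        let (len, used) = read_arg(bytes[pos] & 0x1f, &bytes[pos + 1..]);
        pos += 1 + used;
        let key = String::from_utf8(bytes[pos..pos + len as usize].to_vec())
            .expect("valid UTF-8");
        pos += len as usize;
        // Value: major type 0 (unsigned integer); the argument is the value.
        assert_eq!(bytes[pos] >> 5, 0, "expected an unsigned-integer value");
        let (value, used) = read_arg(bytes[pos] & 0x1f, &bytes[pos + 1..]);
        pos += 1 + used;
        entries.push((key, value));
    }
    entries
}
```

Applied to `a36161182a61621910686261611901a4`, this yields the entries `a = 42`, `b = 4200`, `aa = 420` in that order, so the test vector has its keys in length-first order even though the `Example` struct declares its fields as `a`, `aa`, `b`.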
39 changes: 31 additions & 8 deletions ciborium/src/lib.rs
@@ -6,10 +6,10 @@
//!
//! # Quick Start
//!
-//! You're probably looking for [`from_reader()`](crate::de::from_reader)
-//! and [`into_writer()`](crate::ser::into_writer), which are
-//! the main functions. Note that byte slices are also readers and writers and can be
-//! passed to these functions just as streams can.
+//! You're probably looking for [`from_reader()`](crate::de::from_reader),
+//! [`to_vec()`](crate::ser::to_vec), and [`into_writer()`](crate::ser::into_writer),
+//! which are the main functions. Note that byte slices are also readers and writers
+//! and can be passed to these functions just as streams can.
//!
//! For dynamic CBOR value creation/inspection, see [`Value`](crate::value::Value).
//!
@@ -83,6 +83,25 @@
//! be avoided because it can be fragile as it exposes invariants of your Rust
//! code to remote actors. We might consider adding this in the future. If you
//! are interested in this, please contact us.
//!
//! ## Canonical Encodings
//!
//! The ciborium crate has support for various canonical encodings during
//! serialization.
//!
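//! - [`NoCanonicalization`](crate::canonical::NoCanonicalization): the default.
//! Numbers are still encoded in their smallest form, but map keys are left
//! unsorted to maximize serialization speed.
//! - [`Rfc7049`](crate::canonical::Rfc7049): the canonicalization scheme from
//! RFC 7049, which sorts map keys length-first: shorter encoded keys come first,
//! and keys of equal length are compared bytewise. E.g. `["a", "b", "aa"]`, or
//! `[10, -1, 100, "z", "aa"]` for mixed key types.
//! - [`Rfc8949`](crate::canonical::Rfc8949): the canonicalization scheme from
//! RFC 8949, which sorts map keys in bytewise lexicographic order of their
//! encoded bytes. E.g. `[10, 100, -1, "z", "aa"]`. Because RFC 8949 compares
//! encoded keys, text-only keys (such as struct field names) sort as
//! `["a", "b", "aa"]` under either scheme: a string's length is part of its
//! encoded head.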
//!
//! To use canonicalization, you must enable the `std` feature. See the examples
//! in [`to_vec_canonical`](crate::ser::to_vec_canonical) and
//! [`into_writer_canonical`](crate::ser::into_writer_canonical) for more.

#![cfg_attr(not(feature = "std"), no_std)]
#![deny(missing_docs)]
@@ -92,23 +111,27 @@

extern crate alloc;

pub mod canonical;
pub mod de;
pub mod ser;
pub mod tag;
pub mod value;

// Re-export the [items recommended by serde](https://serde.rs/conventions.html).
#[doc(inline)]
pub use crate::de::from_reader;
pub use crate::de::{from_reader, from_reader_with_buffer, Deserializer};

#[doc(inline)]
pub use crate::de::from_reader_with_buffer;
pub use crate::ser::{into_writer, Serializer};

#[doc(inline)]
pub use crate::ser::into_writer;
#[cfg(feature = "std")]
pub use crate::ser::{into_writer_canonical, to_vec, to_vec_canonical};

#[cfg(feature = "std")]
#[doc(inline)]
pub use crate::ser::into_vec;
#[deprecated(since = "0.3.0", note = "Please use `to_vec` instead")]
pub use crate::ser::to_vec as into_vec;

#[doc(inline)]
pub use crate::value::Value;