Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduces Baseline Header Interface #171

Closed
wants to merge 1 commit into from

Conversation

pulltab
Copy link

@pulltab pulltab commented Dec 29, 2017

It is useful to be able to inspect an Avro header. For
example, to assert certain naming conventions are met prior to ingestion.

The following changes have been made to expose header values to users:

  • Header now exposes an interface into its metadata. Said interface
    users to retrieve the schema's name as a string and retrieve copies of
    metadata value by key.

  • Deserializer now stores the parsed header. This new value is used by
    a new callback, name, which reaches into the parsed header and
    returns the name contained in the header's metadata.

It is useful to be able to inspect an Avro header.  For
example, to assert certain naming conventions are met prior to ingestion.

The following changes have been made to expose header values to users:

* Header now exposes an interface into its metadata.  Said interface
  users to retrieve the schema's name as a string and retrieve copies of
  metadata value by key.

* Deserializer now stores the parsed header.  This new value is used by
  a new callback, `name`, which reaches into the parsed header and
  returns the name contained in the header's metadata.
@pulltab
Copy link
Author

pulltab commented Jan 17, 2018

@dflemstr Bump.

Feedback?

Copy link
Owner

@dflemstr dflemstr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for slow review, busy at work!

@@ -65,20 +66,24 @@ impl<'a, R> Deserializer<'a, R>
where R: io::Read + read::Limit
{
pub fn new(input: R,
header: &'a header::Header,
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a breaking API change, just to highlight that!

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you opposed to having a new2 and a new_cow2?

I figure this is safest to ensure we don't break the API.

borrow::Cow::Borrowed(registry),
borrow::Cow::Borrowed(schema))
}

fn new_cow(input: R,
header: borrow::Cow<'a, header::Header>,
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same

@@ -116,12 +122,18 @@ impl<'a, R> Deserializer<'a, read::Blocks<R>>
.ok_or(Error::from(ErrorKind::NoRootType))?
.into_resolved(&registry);

let blocks = read::Blocks::new(input, codec, header.sync.to_vec());
let blocks = read::Blocks::new(input, codec, header.clone().sync.to_vec());
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to clone the entire header here?

@@ -1,9 +1,28 @@
use serde;
use std::collections;
use std::{collections};
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious change!

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is. I must have added something here and removed it later. I will get rid of the extra {}.

impl Header {

/// Returns a copy of the given header's name field as a String.
pub fn name(&self) -> String {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can return &str here and let the user copy if needed

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also return as an Option?


/// Returns a copy of the raw bytes stored at the given key
/// for the given header.
pub fn get(&self, key: &str) -> Option<serde::bytes::ByteBuf> {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we want to copy? Returning reference seems to be more ergonomic. Also the serde buffer type is internal, would prefer &[u8]

@pulltab
Copy link
Author

pulltab commented Jan 25, 2018

@dflemstr My turn to get buried in work. :)

I plan on addressing your feedback tomorrow.

@dflemstr
Copy link
Owner

dflemstr commented Oct 9, 2018

Going to close this due to inactivity now :)

@dflemstr dflemstr closed this Oct 9, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants