Skip to content
This repository has been archived by the owner on Sep 24, 2022. It is now read-only.

Support spans when deserializing serde structures #239

Merged
merged 7 commits into from
May 9, 2018
Merged

Support spans when deserializing serde structures #239

merged 7 commits into from
May 9, 2018

Conversation

udoprog
Copy link
Contributor

@udoprog udoprog commented May 7, 2018

This PR introduces Spanned based on the previous POC by @dtolnay.

Internally we deserialize spans as "intermediate structs" with special naming conventions, which communicates to the Deserializer that it should emit a Map with similarly special names that we can pick up.

This PR contains a couple of changes:

  • Introduces a Spanned<T> type that can be used in structs to encapsulate spans.
  • Unified the naming convention of constants for datetime and spanned.
  • Added eat_spanned and expect_spanned to Tokenizer to correctly parse tables and arrays.

I'm unsure it is complete, the infrastructure needed to support Datetime permeates more of the deserialize infrastructure, but I haven't been able to decipher why yet. Any feedback is appreciated.

Future work

Ironically this doesn't help me with my original problem, but it's a fair step down the road. I use toml::Value in a similar way that Cargo uses it to permit more flexible decoding than is afforded through databinding. Introducing spans into the existing toml::Value would not be a backwards compatible change, so the only alternative I see is to have separate types (e.g. toml::SpannedValue) which contains the necessary spans. Suggestions are welcome.

@udoprog
Copy link
Contributor Author

udoprog commented May 7, 2018

Note: I'm also pushing it through the build system to check for 1.15 compatibility. Initial builds are probably gonna break.

@dtolnay
Copy link
Contributor

dtolnay commented May 7, 2018

Would it be possible to support spans in a map key? BTreeMap<Spanned<String>, Spanned<String>>

@udoprog
Copy link
Contributor Author

udoprog commented May 7, 2018

@dtolnay I was fiddling with some tests for cases which currently does not work. Pushed them out in dffb952.

Cargo.toml Outdated
@@ -20,7 +20,7 @@ travis-ci = { repository = "alexcrichton/toml-rs" }

[dependencies]
serde = "1.0"
serde_derive = "1.0"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like you wrote out the Deserialize impl so this is no longer used.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, thanks!

src/spanned.rs Outdated
{
let start = visitor.next_key::<StartKey>()?;

if start.is_none() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would recommend cutting out the key_deserialize macro and using:

if visitor.next_key()? != Some(START) {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, thanks!

src/de.rs Outdated
K: de::DeserializeSeed<'de>,
{
if self.start.is_some() {
seed.deserialize(spanned::START.into_deserializer()).map(Some)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be:

use serde::de::value::BorrowedStrDeserializer;

seed.deserialize(BorrowedStrDeserializer::new(spanned::START)).map(Some)

in order for my other comment to work.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, thanks!

@udoprog
Copy link
Contributor Author

udoprog commented May 7, 2018

Regarding BTreeMap<Spanned<String>, Spanned<String>>, what's the identity of Spanned<T>? If it's transitive to T, two differently spanned T's would override each other as expected. On the other hand, two differently spanned Ts aren't equal.

Generally I address this by distinguishing between the key of a structure and its span, like: BTreeMap<String, (Span, T)>. But that isn't really an option here.

@dtolnay
Copy link
Contributor

dtolnay commented May 7, 2018

I would expect Map<Spanned<K>, V> to behave exactly like Map<K, V> just with more information captured during deserialization.

"foo = [0, 1, 2, 3, 4]",
"[0, 1, 2, 3, 4]"
);
// datetime
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried an actual Datetime but it seems to get the wrong span. Is this a tokenizer issue?

good::<Datetime>("foo = 1997-09-09T09:09:09Z", "1997-09-09T09:09:09Z");
---- test_spanned_field stdout ----
	thread 'test_spanned_field' panicked at 'assertion failed: `(left == right)`
  left: `26`,
 right: `19`', tests/spanned.rs:20:9

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! Probably a case of the deserializer not combining the spans. will check it out.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 2b40d54, thanks!

@udoprog
Copy link
Contributor Author

udoprog commented May 7, 2018

@dtolnay

I would expect Map<Spanned, V> to behave exactly like Map<K, V> just with more information captured during deserialization.

I half-agree with this, but it becomes awkward when trying to implement and use: https://play.rust-lang.org/?gist=5a444c2ed6fd2322594658920b9eef12&version=stable&mode=debug (possibly my impl-skills are not strong enough, so please take a stab at this).

But if I have a map which includes spans, I'd want to be able to:

  • Have a way to get/remove elements by a reference of the contained value in the key (&K in Map<Spanned<K>, V>).
  • Have a way to access the span of a given key, by a reference of the contained value in the key (get Span from &K in Map<Spanned<K>, V>).

Note: One approach that satisfies this is to use Map<K, (Span, V)>, where the first element in the tuple is the span. We could introduce a third type specialized for map-serialization to support this: Map<K, MapSpanned<V>>.

Copy link
Collaborator

@alexcrichton alexcrichton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, thanks so much!

src/lib.rs Outdated
@@ -169,3 +169,7 @@ mod tokens;

#[doc(hidden)]
pub mod macros;

pub mod spanned;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this module start out private and only export the Spanned type below?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do. Thanks.

src/spanned.rs Outdated
/// The end range (exclusive).
pub end: usize,
/// The spanned value.
pub value: T,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of public fields could these be accessed through methods perhaps? That way we could tweak the representation here later in the future and/or do other fancy things with struct layout

Copy link
Contributor Author

@udoprog udoprog May 9, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are the set of methods I'm fiddling with right now:

impl<T> Spanned<T> {
    fn start(&self) -> usize { .. }

    fn end(&self) -> usize { .. }

    fn span(&self) -> (usize, usize) { .. }

    fn borrow(&self) -> &T { .. }

    fn take(self) -> T { .. }
}

Got any opinions on this?

src/spanned.rs Outdated
}

/// Take the inner value.
pub fn take(self) -> T {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For consistency with libstd, mind naming this into_inner? I think we may want to go ahead and add methods like get_ref and get_mut here as well

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, thanks!

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants