Publicly expose parsing functions which don't consume the entire string. #81

mystor · 2017-01-24T20:23:24Z

Currently if I am trying to parse, say, a macro body with custom syntax like:

foo!(a as "a", b as "b")

So I have the input string 'a as "a", b as "b"' and I want to parse it. I call syn::parse_ident(s) to parse the first ident from the string and get an error, because characters follow that ident. It would be nice to also have access to methods (say, syn::partial_parse_ident(&str) -> Result<(Ident, &str), String>) which return the remaining string slice in addition to the parsed ident.
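To make the requested shape concrete, here is a minimal sketch of such a function. It does not use syn at all: `Ident` is a hypothetical stand-in for `syn::Ident`, and the identifier rules are simplified to ASCII letters, digits and underscores with no keyword check. Only the signature comes from the proposal above.

```rust
/// Hypothetical stand-in for `syn::Ident`; the real type is not used here.
#[derive(Debug, PartialEq)]
pub struct Ident(pub String);

/// Parses one identifier from the front of `input` and returns it together
/// with the unparsed remainder. Simplified: ASCII only, keywords not rejected.
pub fn partial_parse_ident(input: &str) -> Result<(Ident, &str), String> {
    let trimmed = input.trim_start();
    let mut end = 0;
    for (i, c) in trimmed.char_indices() {
        // The first character may not be a digit; later ones may.
        let ok = if i == 0 {
            c.is_ascii_alphabetic() || c == '_'
        } else {
            c.is_ascii_alphanumeric() || c == '_'
        };
        if !ok {
            break;
        }
        end = i + c.len_utf8();
    }
    if end == 0 {
        return Err(format!("expected identifier, found {:?}", trimmed));
    }
    Ok((Ident(trimmed[..end].to_string()), &trimmed[end..]))
}
```

A caller would feed the returned remainder into the next parsing step, which is the part the existing all-or-nothing syn::parse_ident does not allow.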

This could also be done by changing the signature of syn::parse_ident to something more like:

enum ParseError<'a, T> {
    Partial(&'a str, T),
    Failed(String)
}
impl<'a, T> Error for ParseError<'a, T> {..}

fn parse_ident<'a>(input: &'a str) -> Result<Ident, ParseError<'a, Ident>>

Which would allow me to extract the parsed value and remaining string data from the error value.
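A rough sketch of how that alternative could look from the caller's side follows. Everything here is illustrative: `Ident` is a hypothetical stand-in for `syn::Ident`, the `Display` text is made up, and the identifier scanning is simplified to ASCII. Only the enum shape and the `parse_ident` signature are taken from the proposal.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical stand-in for `syn::Ident`.
#[derive(Debug, PartialEq)]
pub struct Ident(pub String);

#[derive(Debug)]
pub enum ParseError<'a, T> {
    /// A value was parsed, but input remained after it.
    Partial(&'a str, T),
    /// Nothing could be parsed.
    Failed(String),
}

impl<'a, T> fmt::Display for ParseError<'a, T> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ParseError::Partial(rest, _) => write!(f, "unparsed input remains: {:?}", rest),
            ParseError::Failed(msg) => write!(f, "parse failed: {}", msg),
        }
    }
}

impl<'a, T: fmt::Debug> Error for ParseError<'a, T> {}

/// Simplified ASCII-only identifier parser with the proposed signature.
pub fn parse_ident<'a>(input: &'a str) -> Result<Ident, ParseError<'a, Ident>> {
    let end = input
        .find(|c: char| !(c.is_ascii_alphanumeric() || c == '_'))
        .unwrap_or(input.len());
    let first_ok = input
        .chars()
        .next()
        .map_or(false, |c| c.is_ascii_alphabetic() || c == '_');
    if end == 0 || !first_ok {
        return Err(ParseError::Failed(format!("expected identifier, found {:?}", input)));
    }
    let ident = Ident(input[..end].to_string());
    let rest = &input[end..];
    if rest.is_empty() {
        Ok(ident)
    } else {
        Err(ParseError::Partial(rest, ident))
    }
}
```

Callers that only want whole-string parsing keep using `?` or `unwrap`, while callers that want partial parsing match on `ParseError::Partial` to recover both the value and the remaining slice.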

Implementing this shouldn't be difficult, but I would like @dtolnay's input on the design before I do so.


dtolnay · 2017-01-24T20:38:33Z

I have been thinking about how to support function-like macros, and the best idea I have so far is to expose the underlying nom parsers for the various bits of syntax and let procedural macro authors use nom macros to parse their entire input. If this approach is flexible enough to parse all Rust syntax in syn, it should be flexible enough to parse practically any syntax a macro author dreams up. In your case:

use syn::parsing::*;

named!(pub foo -> Vec<(Ident, String)>,
    separated_nonempty_list!(punct!(","), do_parse!(
        x: ident >>
        keyword!("as") >>
        y: quoted_string >>
        (x, y)
    ))
);

This defines a function basically like the one you suggested. The IResult contains a &str of the remaining input and you can compose this function with other nom parsers as necessary.

fn foo(i: &str) -> IResult<&str, Vec<(Ident, String)>>
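For readers unfamiliar with nom, the following hand-written sketch shows roughly what such a composed parser does, without using syn or nom. The `IResult` enum with `Done` and `Error` variants is an assumption about the shape of the forked type, and the helper functions are hypothetical and simplified: identifiers are plain `String`s, "as" is matched as a bare prefix, and quoted strings have no escape handling.

```rust
/// Assumed shape of the result type: remaining input plus parsed value.
#[derive(Debug, PartialEq)]
pub enum IResult<I, O> {
    Done(I, O),
    Error,
}

/// Simplified identifier parser (ASCII, digits allowed anywhere).
fn ident(i: &str) -> IResult<&str, String> {
    let i = i.trim_start();
    let end = i
        .find(|c: char| !(c.is_ascii_alphanumeric() || c == '_'))
        .unwrap_or(i.len());
    if end == 0 {
        return IResult::Error;
    }
    IResult::Done(&i[end..], i[..end].to_string())
}

/// Matches a literal token after optional whitespace.
fn tag<'a>(i: &'a str, t: &str) -> IResult<&'a str, ()> {
    let i = i.trim_start();
    if i.starts_with(t) {
        IResult::Done(&i[t.len()..], ())
    } else {
        IResult::Error
    }
}

/// Double-quoted string without escape sequences.
fn quoted_string(i: &str) -> IResult<&str, String> {
    let i = i.trim_start();
    if !i.starts_with('"') {
        return IResult::Error;
    }
    match i[1..].find('"') {
        Some(end) => IResult::Done(&i[end + 2..], i[1..1 + end].to_string()),
        None => IResult::Error,
    }
}

/// One `x as "y"` item, threading the remaining input through each step.
fn pair(i: &str) -> IResult<&str, (String, String)> {
    let (i, x) = match ident(i) {
        IResult::Done(i, x) => (i, x),
        IResult::Error => return IResult::Error,
    };
    let i = match tag(i, "as") {
        IResult::Done(i, _) => i,
        IResult::Error => return IResult::Error,
    };
    let (i, y) = match quoted_string(i) {
        IResult::Done(i, y) => (i, y),
        IResult::Error => return IResult::Error,
    };
    IResult::Done(i, (x, y))
}

/// Non-empty, comma-separated list of pairs; stops at the first non-match.
pub fn foo(i: &str) -> IResult<&str, Vec<(String, String)>> {
    let (mut rest, first) = match pair(i) {
        IResult::Done(r, p) => (r, p),
        IResult::Error => return IResult::Error,
    };
    let mut items = vec![first];
    loop {
        let after_comma = match tag(rest, ",") {
            IResult::Done(r, _) => r,
            IResult::Error => break,
        };
        match pair(after_comma) {
            IResult::Done(r, p) => {
                items.push(p);
                rest = r;
            }
            IResult::Error => break,
        }
    }
    IResult::Done(rest, items)
}
```

Because every step returns the remaining `&str`, `foo` itself can be used as a building block inside a larger parser in exactly the same way `pair` is used here.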

What do you think?

mystor · 2017-01-24T20:46:23Z

I was looking into that but I ran into a simple problem: syn uses its own mini-nom implementation internally, and I am not sure we want to expose that.

We could have some sort of nom_parsing feature which causes syn to use the real nom instead of our mini-nom to do the parsing, but that would add a lot of #[cfg()]s, because the differences, while small-seeming, actually do have quite a bit of impact on the consumer code. I'm not completely certain how disruptive that would be.

We could also simply export our internal IResult type, and let the consumers wrap it into nom's IResult type, and use nom that way, which means that we avoid having any explicit dependencies on nom, and allow the consumers to configure it however they want (because accommodating every set of nom feature flags through syn could be painful).
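As an illustration of that last option, a consumer-side wrapper could be as small as a `From` impl. Both enums below are hypothetical stand-ins defined locally: `MiniResult` assumes the internal type has only `Done` and `Error`, and `NomLikeResult` is a simplified imitation of upstream nom's type, not its real definition.

```rust
/// Assumed shape of syn's internal result type (stand-in, not the real one).
#[derive(Debug, PartialEq)]
pub enum MiniResult<I, O> {
    Done(I, O),
    Error,
}

/// Simplified imitation of upstream nom's result type, including the
/// `Incomplete` case that the internal type does not have.
#[derive(Debug, PartialEq)]
pub enum NomLikeResult<I, O> {
    Done(I, O),
    Error(String),
    Incomplete,
}

impl<I, O> From<MiniResult<I, O>> for NomLikeResult<I, O> {
    fn from(r: MiniResult<I, O>) -> Self {
        match r {
            MiniResult::Done(rest, value) => NomLikeResult::Done(rest, value),
            // The internal type carries no error detail, so a fixed message is used.
            MiniResult::Error => NomLikeResult::Error("parse error".to_string()),
        }
    }
}
```

The conversion never produces `Incomplete`, since in this sketch the internal parsers always see the whole input.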

dtolnay · 2017-01-24T21:06:28Z

I think we just export the nom fork. The upstream IResult::Incomplete is really harmful for the procedural macro use case and it's very easy to end up with subtly broken parsers or weird failures unless you wrap everything in complete!(...) and use alt_complete!(...) everywhere.

Another option is to delete our fork and use real nom within syn. I would be happy with that as long as the test suite passes and compile time for the macros 1.1 feature set does not regress too much. This may require some new techniques so that working with Incomplete is less risky. Let's see whether any other nom users have come up with good approaches here.

mystor · 2017-01-24T21:26:25Z

I'm fine with just exporting the nom fork if we're confident enough in its stability, and if we export that, and people decide they want the full nom experience, it should be pretty easy to write a wrapper which converts between the IResult types.

The biggest reason why I would be wary about exporting the nom fork is that if we ever want to export macros from syn which aren't the nom ones, there is no way to piecemeal import only the ones you want, and you end up importing a bunch of macros which will conflict with the real nom if you happen to use that. It would be nicer if it were a separate crate so you can keep the macros separate.

dtolnay · 2017-01-24T21:51:48Z

It is stable. Let's move the macros to their own crate.

This was referenced Jan 24, 2017

Implement spans #41

Closed

Expose the nom parsers in a public parsing:: submodule #82

Merged

dtolnay mentioned this issue Jan 27, 2017

Question about parsing args passed to an attribute like macro #86

Closed

dtolnay added the enhancement label Jan 27, 2017

dtolnay closed this as completed in #82 Jan 27, 2017

Kestrer mentioned this issue May 28, 2020

Consider adding type for partial parsing #835

Closed
