Publicly expose parsing functions which don't consume the entire string. #81

Closed
mystor opened this issue Jan 24, 2017 · 5 comments

@mystor
Collaborator

mystor commented Jan 24, 2017

Suppose I am trying to parse a macro body with custom syntax, like:

foo!(a as "a", b as "b")

Given the input string 'a as "a", b as "b"', I call syn::parse_ident(s) to parse the first ident and get an error, because characters follow that ident. It would be nice to also have access to functions (say, syn::partial_parse_ident(&str) -> Result<(Ident, &str), String>) that return the remaining string slice in addition to the parsed ident.
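To make the proposed shape concrete, here is a minimal, dependency-free sketch. It is not syn's implementation: `partial_parse_ident` is the hypothetical name from the comment above, and a plain `String` stands in for `syn::Ident` so the example compiles on its own.

```rust
/// Hypothetical helper: parse one leading identifier and return it together
/// with the unparsed remainder. `String` stands in for `syn::Ident`.
pub fn partial_parse_ident(input: &str) -> Result<(String, &str), String> {
    let trimmed = input.trim_start();
    let mut end = 0;
    for (i, c) in trimmed.char_indices() {
        // First character: letter or underscore; later ones may also be digits.
        let ok = if i == 0 {
            c == '_' || c.is_alphabetic()
        } else {
            c == '_' || c.is_alphanumeric()
        };
        if !ok {
            break;
        }
        end = i + c.len_utf8();
    }
    if end == 0 {
        return Err(format!("expected identifier at {:?}", trimmed));
    }
    // The remainder is a sub-slice of the input, so no copying is needed.
    Ok((trimmed[..end].to_string(), &trimmed[end..]))
}
```

The caller can feed the returned remainder into the next parsing step, which is what a macro with custom syntax needs.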

This could also be done by changing the signature of syn::parse_ident to something more like:

enum ParseError<'a, T> {
    Partial(&'a str, T),
    Failed(String)
}
impl<'a, T> Error for ParseError<'a, T> {..}

fn parse_ident<'a>(input: &'a str) -> Result<Ident, ParseError<'a, Ident>>

Which would allow me to extract the parsed value and remaining string data from the error value.
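A self-contained sketch of this second design, under the same assumptions (a `String` stands in for `Ident`, and the identifier scanner is a simplified stand-in for the real one), shows how a caller would recover both the value and the leftover input from the error:

```rust
use std::fmt;

#[derive(Debug)]
pub enum ParseError<'a, T> {
    /// A value was parsed, but input remained after it.
    Partial(&'a str, T),
    /// Nothing could be parsed.
    Failed(String),
}

impl<'a, T> fmt::Display for ParseError<'a, T> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ParseError::Partial(rest, _) => write!(f, "unparsed trailing input: {:?}", rest),
            ParseError::Failed(msg) => write!(f, "parse failed: {}", msg),
        }
    }
}

impl<'a, T: fmt::Debug> std::error::Error for ParseError<'a, T> {}

/// Simplified stand-in for `syn::parse_ident`, returning `String` for the ident.
pub fn parse_ident<'a>(input: &'a str) -> Result<String, ParseError<'a, String>> {
    // Index of the first character that cannot be part of an identifier.
    let end = input
        .char_indices()
        .find(|&(_, c)| !(c == '_' || c.is_alphanumeric()))
        .map(|(i, _)| i)
        .unwrap_or(input.len());
    if end == 0 || input.starts_with(|c: char| c.is_numeric()) {
        return Err(ParseError::Failed(format!("expected identifier at {:?}", input)));
    }
    let ident = input[..end].to_string();
    if end == input.len() {
        Ok(ident)
    } else {
        Err(ParseError::Partial(&input[end..], ident))
    }
}
```

Existing callers that only care about full-string parsing keep working with `?` or `unwrap`, while callers that want partial parsing match on `ParseError::Partial`.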

Implementing this shouldn't be difficult, but I would like @dtolnay's input on the design before I do so.

@dtolnay
Owner

dtolnay commented Jan 24, 2017

I have been thinking about how to support function-like macros, and the best idea I have so far is to expose the underlying nom parsers for the various bits of syntax and let procedural macro authors use nom macros to parse their entire input. If this approach is flexible enough to parse all Rust syntax in syn, it should be flexible enough to parse practically any syntax a macro author dreams up. In your case:

use syn::parsing::*;

named!(pub foo -> Vec<(Ident, String)>,
    separated_nonempty_list!(punct!(","), do_parse!(
        x: ident >>
        keyword!("as") >>
        y: quoted_string >>
        (x, y)
    ))
);

This defines a function basically like the one you suggested. The IResult contains a &str of the remaining input and you can compose this function with other nom parsers as necessary.

fn foo(i: &str) -> IResult<&str, Vec<(Ident, String)>>
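For readers unfamiliar with the combinator style, the following hand-written sketch mimics what such a composed parser does, without nom or syn. Everything here is an assumption made for illustration: the two-variant `IResult`, the helper names, `String` in place of `Ident`, and a `quoted_string` that ignores escape sequences.

```rust
/// Toy result type: remaining input plus parsed value, or failure.
#[derive(Debug, PartialEq)]
pub enum IResult<I, O> {
    Done(I, O),
    Error,
}

fn skip_ws(i: &str) -> &str {
    i.trim_start()
}

pub fn ident(i: &str) -> IResult<&str, String> {
    let i = skip_ws(i);
    let end = i
        .char_indices()
        .find(|&(_, c)| !(c == '_' || c.is_alphanumeric()))
        .map(|(idx, _)| idx)
        .unwrap_or(i.len());
    if end == 0 || i.starts_with(|c: char| c.is_numeric()) {
        IResult::Error
    } else {
        IResult::Done(&i[end..], i[..end].to_string())
    }
}

pub fn tag<'a>(i: &'a str, t: &str) -> IResult<&'a str, ()> {
    let i = skip_ws(i);
    if i.starts_with(t) {
        IResult::Done(&i[t.len()..], ())
    } else {
        IResult::Error
    }
}

/// A keyword is an identifier that equals `kw`, so "asdf" does not match "as".
pub fn keyword<'a>(i: &'a str, kw: &str) -> IResult<&'a str, ()> {
    match ident(i) {
        IResult::Done(rest, word) if word == kw => IResult::Done(rest, ()),
        _ => IResult::Error,
    }
}

/// Simplified: no escape handling inside the quotes.
pub fn quoted_string(i: &str) -> IResult<&str, String> {
    let i = skip_ws(i);
    if !i.starts_with('"') {
        return IResult::Error;
    }
    match i[1..].find('"') {
        Some(end) => IResult::Done(&i[end + 2..], i[1..end + 1].to_string()),
        None => IResult::Error,
    }
}

/// One `name as "value"` item.
fn pair(i: &str) -> IResult<&str, (String, String)> {
    let (i, x) = match ident(i) {
        IResult::Done(r, x) => (r, x),
        IResult::Error => return IResult::Error,
    };
    let i = match keyword(i, "as") {
        IResult::Done(r, ()) => r,
        IResult::Error => return IResult::Error,
    };
    match quoted_string(i) {
        IResult::Done(r, y) => IResult::Done(r, (x, y)),
        IResult::Error => IResult::Error,
    }
}

/// Non-empty, comma-separated list of pairs; stops at the first separator
/// or item that fails and returns the input left at that point.
pub fn foo(i: &str) -> IResult<&str, Vec<(String, String)>> {
    let (mut rest, first) = match pair(i) {
        IResult::Done(r, p) => (r, p),
        IResult::Error => return IResult::Error,
    };
    let mut items = vec![first];
    loop {
        let after_sep = match tag(rest, ",") {
            IResult::Done(r, ()) => r,
            IResult::Error => break,
        };
        match pair(after_sep) {
            IResult::Done(r, p) => {
                items.push(p);
                rest = r;
            }
            IResult::Error => break,
        }
    }
    IResult::Done(rest, items)
}
```

Because every parser returns the remaining input, `foo` itself can be used as a building block inside a larger parser in the same way `pair` is used here.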

What do you think?

@mystor
Collaborator Author

mystor commented Jan 24, 2017

I was looking into that but I ran into a simple problem: syn uses its own mini-nom implementation internally, and I am not sure we want to expose that.

We could have some sort of nom_parsing feature which causes syn to use the real nom instead of our mini-nom to do the parsing, but that would add a lot of #[cfg()]s, because the differences, while small-seeming, actually do have quite a bit of impact on the consumer code. I'm not completely certain how disruptive that would be.

We could also simply export our internal IResult type, and let the consumers wrap it into nom's IResult type, and use nom that way, which means that we avoid having any explicit dependencies on nom, and allow the consumers to configure it however they want (because accommodating every set of nom feature flags through syn could be painful).

@dtolnay
Owner

dtolnay commented Jan 24, 2017

I think we just export the nom fork. The upstream IResult::Incomplete is really harmful for the procedural macro use case and it's very easy to end up with subtly broken parsers or weird failures unless you wrap everything in complete!(...) and use alt_complete!(...) everywhere.
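A toy model of the hazard, written without nom: the three-variant `IResult`, `tag_streaming`, `complete`, and `alt` below are simplified stand-ins for illustration and are not nom's actual definitions. The point is that a streaming-style parser cannot tell "input ended" from "more input may arrive", which is never the right reading for a macro body that is already fully available.

```rust
#[derive(Debug, PartialEq)]
pub enum IResult<'a, O> {
    Done(&'a str, O),
    Error,
    /// More input might make this parser succeed.
    Incomplete,
}

/// Streaming-style tag: if the input is a proper prefix of the tag,
/// report Incomplete instead of Error.
pub fn tag_streaming<'a>(i: &'a str, t: &str) -> IResult<'a, &'a str> {
    if i.starts_with(t) {
        IResult::Done(&i[t.len()..], &i[..t.len()])
    } else if t.starts_with(i) {
        IResult::Incomplete
    } else {
        IResult::Error
    }
}

/// Treat Incomplete as a hard failure, since no more input will arrive.
pub fn complete<'a, O>(r: IResult<'a, O>) -> IResult<'a, O> {
    match r {
        IResult::Incomplete => IResult::Error,
        other => other,
    }
}

/// Try the second alternative only if the first one returned Error.
pub fn alt<'a, O>(
    first: IResult<'a, O>,
    second: impl FnOnce() -> IResult<'a, O>,
) -> IResult<'a, O> {
    match first {
        IResult::Error => second(),
        other => other,
    }
}
```

With the input "a", trying the tag "as" first yields Incomplete, so `alt` never reaches the alternative "a" that would have matched. Wrapping the first branch in `complete` restores the expected fallback.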

Another option is to delete our fork and use real nom within syn. I would be happy with that as long as the test suite passes and compile time for the macros 1.1 feature set does not regress too much. This may require some new techniques so that working with Incomplete is less risky. Let's see whether any other nom users have come up with good approaches here.

@mystor
Collaborator Author

mystor commented Jan 24, 2017

I'm fine with just exporting the nom fork if we're confident enough in its stability, and if we export that, and people decide they want the full nom experience, it should be pretty easy to write a wrapper which converts between the IResult types.
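Such a wrapper could look roughly like the sketch below. Both enums are hypothetical stand-ins: `MiniResult` for the result type of syn's internal fork and `FullResult` for upstream nom's, whose real error and incomplete variants carry more detail than shown here.

```rust
/// Stand-in for the fork's result type: no Incomplete case.
#[derive(Debug, PartialEq)]
pub enum MiniResult<I, O> {
    Done(I, O),
    Error,
}

/// Stand-in for upstream nom's result type.
#[derive(Debug, PartialEq)]
pub enum FullResult<I, O> {
    Done(I, O),
    Error(String),
    Incomplete,
}

/// Fork -> upstream: lossless apart from the error message, which is invented here.
pub fn to_full<I, O>(r: MiniResult<I, O>) -> FullResult<I, O> {
    match r {
        MiniResult::Done(rest, value) => FullResult::Done(rest, value),
        MiniResult::Error => FullResult::Error("parse error".to_string()),
    }
}

/// Upstream -> fork: Incomplete collapses into Error, because the whole
/// macro input is already available.
pub fn to_mini<I, O>(r: FullResult<I, O>) -> MiniResult<I, O> {
    match r {
        FullResult::Done(rest, value) => MiniResult::Done(rest, value),
        FullResult::Error(_) | FullResult::Incomplete => MiniResult::Error,
    }
}
```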

The biggest reason I would be wary about exporting the nom fork is that if we ever want to export macros from syn which aren't the nom ones, there is no way to import piecemeal only the ones you want, so you end up importing a bunch of macros which will conflict with the real nom if you happen to use that. It would be nicer if it were a separate crate so you can keep the macros separate.

@dtolnay
Owner

dtolnay commented Jan 24, 2017

It is stable. Let's move the macros to their own crate.
