
Switch to graphql-parser #138

Open

tailhook opened this issue Feb 5, 2018 · 15 comments

Comments

@tailhook
Contributor

tailhook commented Feb 5, 2018

Hi,

Since there is no activity in #32 I've decided to write a separate crate for graphql-parser.

It's not based on the code here but built using the excellent combine crate (which, in my opinion, makes the grammar more readable). As a weekend project I was able to build a parser for the whole GraphQL grammar, which means it should be on par with this crate feature-wise (I haven't checked every feature, but we already have subscriptions and triple-quoted strings implemented).

What do you think about integrating with the graphql-parser crate instead of keeping the custom parser?
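
For illustration, consuming the crate looks roughly like this (a minimal sketch assuming the parse_query entry point and a Debug-printable Document; details may differ between versions):

use graphql_parser::parse_query;

fn main() {
    // Parse a query string into the crate's owned AST (Document),
    // or get back a ParseError carrying the position of the failure.
    let ast = parse_query("{ users(page: 2) { id username } }")
        .expect("query should parse");
    println!("{:#?}", ast);
}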

@Keats

Keats commented Feb 6, 2018

That looks much better than my half-baked Pest parser (although I still prefer Pest to combine, but that's not important)!

It would be really neat for the Rust ecosystem to standardise on a single parser, so every crate wanting to do something with GraphQL doesn't have to rewrite it, and the one parser gets many eyes on it.

@theduke
Member

theduke commented Feb 6, 2018

@tailhook I'll look over the code, but in principle, as long as the AST produced is decent and can easily be converted (without much cloning), I'd definitely be in favour.

This also concerns https://github.com/nrc/graphql and @nrc.
We've already talked about sharing fundamental infrastructure like the parser.

We'd also need the parser to handle parsing the GraphQL schema definition language.

@theduke
Member

theduke commented Feb 6, 2018

@tailhook also, it's important to have great error reporting for parsing errors and to be able to map each AST node to its location in the query.

@tailhook
Contributor Author

tailhook commented Feb 6, 2018

This also concerns https://github.com/nrc/graphql and @nrc.
We've already talked about sharing fundamental infrastructure like the parser.

Sure, here is the issue: nrc/graphql#36

it's important to have great error reporting for parsing errors

I think it's good enough. Of course, we have positions for all the errors, and messages like "unexpected A, expected B, C". I'm not sure what else to add. Admittedly, there aren't many tests for error reporting yet.

Currently we don't have a decent API for the errors beyond "format me as a string", but that should be easy to add; we just need to figure out what it might look like.
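
Something along these lines (a rough sketch; the exact error message format may differ):

use graphql_parser::parse_query;

fn main() {
    // Today the error API is essentially Display: the ParseError knows the
    // position and produces an "unexpected A, expected B, C" style message.
    match parse_query("{ users( }") {
        Ok(doc) => println!("{:#?}", doc),
        Err(e) => eprintln!("parse error: {}", e),
    }
}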

and to be able to map each ast node to the query location.

We have positions for all important elements. The approach is different from the one here, but we can discuss the details.

We'd also need the parser to handle parsing the GraphQL schema definition language.

Sure, it's on the todo list. It seems like another weekend-sized project. The good thing is that the tokenizer can be reused entirely.

@theduke
Member

theduke commented Feb 6, 2018

@tailhook all sounds great, I'll look over the code!

Would you be interested in moving the parser into the graphql-rust organization?

It would be cool to have all GraphQL-related stuff under one roof, and it would ensure that someone else can maintain the parser if you need a break or whatever.

That's not in any way a requirement for me, though.

PS: I've written a demo parser in combine too; I think it's a good choice given the current parser combinator ecosystem.

@tailhook
Contributor Author

tailhook commented Feb 6, 2018

Would you be interested in moving the parser into the graphql-rust organization?

I'm fine with that. Should it be done now?

@tailhook
Contributor Author

SDL is implemented and pushed to crates.io as v0.2.0
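
Roughly, for anyone who wants to try it (a minimal sketch assuming the parse_schema entry point mirrors parse_query):

use graphql_parser::parse_schema;

fn main() {
    // SDL support shipped in v0.2.0; the tokenizer is shared with the
    // query parser, so the API looks the same.
    let schema = parse_schema("type User { id: ID! name: String }")
        .expect("schema should parse");
    println!("{:#?}", schema);
}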

@Keats

Keats commented Feb 26, 2018

Is there anything blocking that?

@theduke
Member

theduke commented Feb 26, 2018

@Keats I haven't had time to look over the code.

We will probably want a custom-tailored AST for Juniper.
I'm perfectly fine with getting an AST from the parser and mapping over it to get the custom one.
It should be as efficient as possible to do so, though.

Especially w.r.t. tokio, where sharing the AST between a possibly large number of resolvers (futures) should be as cheap as possible.

@mhallin had some concerns, so we probably want his input here.

@mhallin
Member

mhallin commented Feb 26, 2018

Hello, and thanks for bringing this to the table!

While the parser part looks good, I have some reservations about giving up control over the AST data structures to a general-purpose library. The issue I see is that we won't be able to model the data in the way that best suits a server application.

For example, today's AST uses Cow<&str> references into the original query source everywhere to avoid allocating tons of very small strings. This works because the main execute function runs the entire query and returns once everything has been resolved. To support async code, where the AST needs to outlive the scope of the execute call, tomorrow's AST will probably use a special string type that works like a &str with a reference-counted parent string.

I believe graphql-parser could be switched over to use this code, or even better versions of all of this could be built, but my reluctance comes from the fact that right now we can do that without having to think about use cases other than ours.
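
A rough sketch of that "ref-counted &str" idea (names and layout are purely illustrative, not Juniper's actual implementation):

use std::ops::Range;
use std::sync::Arc;

// One Arc to the original query source plus a byte range into it, so AST
// nodes can outlive execute() without allocating a String per token.
#[derive(Clone, Debug)]
pub struct SharedStr {
    source: Arc<String>,
    range: Range<usize>,
}

impl SharedStr {
    pub fn new(source: Arc<String>, range: Range<usize>) -> SharedStr {
        SharedStr { source, range }
    }

    pub fn as_str(&self) -> &str {
        &self.source.as_str()[self.range.clone()]
    }
}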

@tailhook
Contributor Author

Well, the AST in graphql-parser is currently owned, so it works fine for async code right now.

Then we can try to benchmark and optimize things. Usually rust/jemalloc is very good, so allocations themselves are rarely a problem. Given that request processing does a lot of validation and probably involves a bunch of hashmaps, the overhead of the allocations may be amortized. Also, if serde_json can afford owned strings, I think a GraphQL library can too.

Another point of view: the size of Spanning<SharedStr> is 72 bytes. I would rather try some kind of "inline string" library instead, given that most strings like type names and field names are much smaller than that. (But I'm not sure there is an established "inline string" library in the community.)
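
For context, a quick way to compare candidate layouts (numbers are for a typical 64-bit target and are illustrative only; the real Spanning/SharedStr types may differ):

use std::mem::size_of;
use std::ops::Range;
use std::sync::Arc;

fn main() {
    // Owned string: pointer + length + capacity.
    println!("String:                      {} bytes", size_of::<String>()); // 24
    // Shared slice of the whole source: a fat pointer.
    println!("Arc<str>:                    {} bytes", size_of::<Arc<str>>()); // 16
    // Shared parent string plus a byte range into it.
    println!("(Arc<String>, Range<usize>): {} bytes",
             size_of::<(Arc<String>, Range<usize>)>()); // 24
}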

@theduke
Member

theduke commented Feb 26, 2018

IMO it's fine if we start from what's there and optimize later, especially if @tailhook is fine with moving the repo into the graphql-rust org.

I also agree that the AST is probably a minuscule part of the work that happens when executing a resolver.

But:
Let's consider this relatively straightforward query, executed with a traditional relational database under the hood and something similar to dataloader:

query {
    users(page: 2, pagesize: 100) {
        id
        username
        posts(max: 20) {
          name
          comments(limit: 5) {
                id
                title
          }
        }
        likes(max: 50) {
            post {
                id
                title
            }
            time
        }
    }
}

Let's assume that the posts, posts.comments, likes, and likes.post queries are run as futures.

That means:
users: 1 + 100 clones
posts: 1 + 100
posts.comments: 100 * 20
likes: 1 + 100
likes.post: 100 * 50
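
Summed up, that is roughly 101 + 101 + 2,000 + 101 + 5,000 ≈ 7,300 clones for a single query.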

Now let's think about 100 of those queries running concurrently...
That's a lot of cloning of at least the nested parts of the AST.
Execution of the futures can be batched quite easily, but the creation of the futures much less so, and those futures will need to know the query.

So I think this can become a significant overhead.

There are a lot of possible optimizations, like just unsafely passing around references to the full AST, using an Arc of the initial query string like mhallin has tried, using some kind of per-query interner, ...

But it's something that we'll need to think about.

@tailhook
Contributor Author

Execution of the futures can be batched quite easily, but the creation of the futures much less so, and those futures will need to know the query.

I expect this to work by wrapping an Arc around a part of the query. I think the current SelectionSet, which is basically a Vec<Selection>, will become more like an Arc<FnvHashSet<Field>> where the fragments are already populated. That way each resolver doesn't need to repeat the fragment logic, and subselections can be referred to on their own.

I.e. the AST after validation/optimization is a bit different from what we have right after parsing anyway.
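
In that spirit, a minimal sketch of handing a shared selection set to resolvers (the types are illustrative; the real version would use FnvHashSet<Field> and futures rather than threads):

use std::sync::Arc;

#[derive(Debug)]
struct Field {
    name: String,
    selection: SelectionSet,
}

// With the (sub)selection behind an Arc, giving it to each resolver is a
// refcount bump instead of a deep clone of the AST.
type SelectionSet = Arc<Vec<Field>>;

fn main() {
    let selection: SelectionSet = Arc::new(vec![Field {
        name: "users".into(),
        selection: Arc::new(Vec::new()),
    }]);

    // Each "resolver" gets its own cheap handle to the same selection set.
    let handles: Vec<_> = (0..3)
        .map(|_| {
            let shared = Arc::clone(&selection);
            std::thread::spawn(move || shared.len())
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }
}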

@theduke
Copy link
Member

theduke commented Apr 9, 2019

Some progress: graphql-rust/graphql-parser#19

@theduke changed the title from "Use graphql-parser for parsing query language" to "Switch to graphql-parser" on May 16, 2019
@tosti007

@tailhook, considering this issue was opened more than two years ago: would the progress on graphql-parser be sufficient to fully replace the current parser? I.e. what is the current status of this issue?
