Skip to content

Latest commit

 

History

History
104 lines (90 loc) · 18.9 KB

2019-08-01.md

File metadata and controls

104 lines (90 loc) · 18.9 KB

GraphQL Working Group Meeting 2019-08-01

Agenda

Action Items

  • Unclaimed Add maintainer list for relevant and breaking changes, send to reference library maintainers
  • Lee: Grant Erik collaboration access to libgraphqlparser
  • Lee: Contact GitHub and see what we can do about moving the GraphQL cats repo into the GraphQL org.

Notes

  • Lee:
    • Website planning a task for writing code page vision document is in progress but not shared yet.
    • Unclaimed task on important Q&A details is still open
    • Action on graphql.org thoughts will be taken soon.
  • Ivan: We’re clear on the acceptance criteria, always happy with new maintainers.
  • Lee: Andy is making significant progress on the DateTime RFC, will share progress soon

Parse ambiguity PR & greedy lexing PR (Lee Byron, 10m)

  • Lee: Let’s talk about parse ambiguity. There’s two PRs, one issue with syntactic grammar where the type system contains an edge case ambiguity where SDL is used in combination with queries. For example having both a type definition and a query in the same document creates an ambiguous situation. The goal should be to make the grammar fully unambiguous. The second PR tackles the same kind of issue, less of an ambiguity, something that has been unclear in the spec, explaining the algorithm for producing lexed tokens from an input character stream. We assumed people would know how to do this, we should be more unambiguous on what the general knowledge is. Mostly this isn’t a real bug in implementations, but it might create edge cases. I did some research on specs of other languages (C, Python, Swift). All of those contain a spec to detect the longest sequence of characters for a valid token. I’m totally nerding out by the way. I found academic literature around alternatives to this, essentially most languages are turning out more complicated by this, for example JavaScript’s opening slash for a RegEx that shouldn’t be used for division, clearly this poses problems. Going by an integer, the grammar would be 0 to 9. What about the numbers 1 and 2, does it mean twelve or is it two numbers? I added follow restrictions to the lexer rules. I think Ivan pointed out that JavaScript’s behaviour is different to what I was proposing. The other edge case ECMAScript poses is that a number can’t be followed by a letter. Why would this be the case? I found out that C has types. It turns out people would copy over the 3l (3 with l for long in C) and break their JavaScript code, which is why they added it. And now we have BigNumbers, followed by an n. Any opinions on this?
  • Michael: Is it even allowed to combine queries and type definitions?
  • Lee: Yes, the ambiguous documents are a rare edge case. Typically when people combine those two, they describe the schema and valid queries.
  • Michael: I’m just asking because there’s a validation rule to check that types aren’t allowed in a query doc.
  • Lee: That’s right, it’s why I’m saying it’s an extreme edge case.
  • Ivan: Just as a point of reference, GraphQL Java made a decision to allow empty bodies, I think the previous way is more natural.
  • Lee: You mean returning to an openining and closing brace? We could do that.
  • Ivan: For defining types it’s like a person opened an issue for extensions, it’s one thing when people try to add client-side directives which are more widely used. In a sense it actually makes more sense, in acceptance criteria we agreed to not break queries. If we make some changes we shouldn’t break existing queries.
  • Lee: I think it’s a worthwile discussion to have. I think it’s a meaningful change, we should discuss this in a separate PR!
  • Ivan: I agree! About parsing, what really bothers me is that in practical nobody writes numbers or strings without putting a space between them. Can we make a requirement to allow punctuators like commas and spaces between values? Could we describe this in formal language? I wrote a bunch of fuzzing tests and tried every permutation of tokens for graphql-js, for example try to parse it before and after. I found that it’s actually working to put a value between two non-punctuators.
  • Lee: Let me describe my other RFC first because I have some other opinions on it too
  • Ivan: Sure!

Number lexer follow restriction RFC (Lee Byron, 10m)

  • Lee: Let me describe some of this behaviour for folks that haven’t been paying as much attention as Ivan has. By saying that an integer cannot be followed by a digit, going back to the behaviour JavaScript has that uses a following restriction, so any character that would be a valid first identifier (letters and underscore)
  • Michael: Excluding e right?
  • Lee: No especially including e to avoid backtracking. The goal is to create lexing behaviour that is unambiguous and performant. Everytime you get one character, you should know how to move the state forward. Never should you have to go backwards. When you see a digit and the lexer creates a number token (number/float) and then see an e, what are you supposed to do? Most GraphQL implementations flip to a float in this case, even though the spec does not put any rules on it. So Michael if we used your example with 123efg, I want to make this illegal. If you read 123e, it could still be a valid float, but with the fg, we would need to go back. It’s a very complicated scenario. We would say that integers cannot be followed by any letter. Float is still fine. If you saw 123e0f, this could be free, but I want to add a rule that a float can’t be followed by a letter, this would fail as well. It would give us the control in kind we wanted to add new kinds of literals, for example if you were to type in a hex decimal, it would be a valid hexadecimal literal in most languages but GraphQL doesn’t have them, but regardless the lexer would just run through that and accept it although we don’t even support it. It would create multiple tokens and the syntax would just break. In the worst scenario, it would parse it just fine and create a validation error from it, just getting weirder and weirder. By adding the rule we could buy back the possibility to add this in the future but reserving the rule for now by making it illegal.
  • Michael: So this means adding any character that’s not a number after a number would make it fail?
  • Lee: Right! I’m actually a bit nervous if this is more complicated because we’d have to check tokens that can be followed, if we have an array we could have 1, 2) which should be totally legal. The logic of between next to after would require to lex the next thing and then look back, making it more difficult to express
  • Ivan: I remember the issue of double 0 which is still valid.
  • Lee: This would make double 0 invalid. Not this RFC but the previous RFC that lexing must be greedy would make double 0 invalid.
  • Michael: About the integer issue. In our implementation, if we detect digits, as soon as it hits an e, we flip the switch to parse as a float. And if another letter follows I’d throw a syntax exception?
  • Lee: Exactly. By that rule you’d know by sure that it’s not an Int anymore. Actually GraphQL.js and your implementation are all wrong because they’d parse it but I’d say we shouldn’t have it because it’s confusing. The whole conversation makes me think lexing is more complicated than it should be. I initially thought lexing was just a small step but now I’m finding academic literature and complicated and ambiguous specs.
  • Ivan: I would like to say another thing because it’s the second or third time we bring this up. People try to minimize queries, or try to fingerprint queries like Apollo Engine tries to fingerprint queries to apply statistics, so they’re trying to strip everything that’s nonessential. This is why we need a nonambiguous grammar, when I worked at minimizing (uglifying), I wrote a fuzzing test to try every permutation of tokens and found some appropriate rules for tokens but this is more complicated. Until it was reported as an issue, nobody knew about the ambiguity.
  • Lee: That’s a really interesting idea! Any other thoughts on both agenda items and RFC PRs? Any thoughts on the change of behaviours? It would require all implementations to change, the clarifications for the grammar greediness might still change things.
  • Ivan: We should try to implement it and see if people report problems, I don’t think anything should break by that.
  • Lee: I guess that’s why I’m asking here, is everybody fine with fasttracking it from stage 0?
  • Robert: Maybe we could slow down to create docs around it
  • Lee: Sure! Maybe we can move it to 1 and I could ask you to get appropriate reviewers for this
  • Ivan: Maybe we can create a list for implementers including deep topics, so nobody misses out on decisions if people aren’t present in the meeting. We might find people working on implementations.
  • Lee: I like this!
    • Action item: Add maintainer list for big and breaking changes
  • Lee: That’s all for me, any other thoughts on parsing stuff? Hopefully the academic talk was interesting!
  • Erik: Thanks! So libgraphqlparser is the C++ implementation of GraphQL parser, mostly for queries but with experimental schema support. It’s pretty basic in that it just parses documents but I think if I’m not mistaken, but the library might be originally developed by Facebook but has since been moved to the graphql organization. We’re using the library for some work but it seems like the library isn’t actively maintained. There’s a lot of open issues, partially filed by myself, and a lot of inactive PRs. I wanted to bring up what the future of this library might be, I think there’s three categories: Organization and maintenance issues like the license change from BSD to MIT isn’t fully completed in the library, there’s code committed still mentioning the GPL license which might put people off. The second category is bugs, tests current don’t run properly. There’s errors wiht lexing and parsing block strings and lastly there’s spec features that aren’t implemented yet. Ideally we’d like to contribute to this so the question is if something can be done here and what your opinions are
  • Evan: I used to spend some time on this, but for the reasons you listed we determined to write a pure Ruby parser that was only marginally slower, so we just left it.
  • Lee: I was actually under the impression that Ruby was still using it, it has some interesting history. It was initially created as an internal tool or hackathon project by a person that tried to keep it up to date but was also comfortable to let it live under the foundation so more people could maintain it since it wasn’t his primary priority. Good news is that the primary purpose of the GitHub org is to decouple it from Facebook’s open source tooling so if you guys are using it at IBM, we can absolutely make it happen! If there’s anybody else still using,
  • Michael: There’s some Python libraries using it but they presented it in Berlin (GraphQL Conf)
  • Lee: Great memory! We could do some callouts and maybe collect some maintainers by that, I guess there’s an action item to give you (Erik)
    • Action item: Give Erik collaboration access
  • Lee: I think it would be a worthwile thing to open an issue or GitHub milestone to describe the effort of reviving this issue.
  • Erik: We could absolutely create an issue to outline the complete effort
  • Lee: Yeah maybe we could find some folks for this.
  • Ivan: I think the graphiql team showed a good path to creating working groups, I added some links
  • Erik: Definitely, I think we should reach out and organize some kind of meeting!
  • Ivan: Good to know that you cover all use cases. With graphiql, some people just wanted to use it as components or the full solution
  • Erik: I think the long term issues include things like adding validation support so there’s a lot of questions to be answered
  • Lee: Sounds like a great plan, I think we should strive to make libgraphqlparser speccompliant and get it back to the point it was supposed to be
  • Erik: Great so Lee you would grant me access to manage PRs
  • Lee: Excellent. GraphQL Cats!
  • Michael: Yeah, we started to build a test generator for all the stuff half a year ago, but since then there hasn’t been much activity. Some of the rules are probably broken we found, if we want to have a project like GraphQL cats, we should add a rule to this framework once the spec changes. I don’t know if we want to follow this direction so storing it in yaml and generating specific tests might be good but if there’s some way to get this up to speed it would be great. We would love to help with this since we’re porting tests from graphql-js back to C# at the moment, where the workflow is really manual. In each specific language we have a generator to create the tests. There’s a lot of changes in graphql-js where we have to look up what actually changed and if we need to update the tests. Any other thoughts on GraphQL cats? Has anybody else than GraphQL Java put in some effort? Could we get this up and running again?
  • Lee: Great questions. I don’t have any satisfying answers at the moment. If you are eager on takign this on, it seems like that’s the missing piece. I think there’s some ambiguity why we can’t make it run on all platforms at once, why would you build this on top of your engine when there’s only some tests but then why would you contribute to graphql cats? I think in the beginning we just have to force this through so whoever has the time and passion to pull this off might steer it. People that are new to the ecosystem might question if the tool will work for them would appreciate graphql cats.
  • Michael: The only blocker is that I’m not sure who owns graphql cats right now. If we were to put it into a place where people had to write tests upon changes to the spec. There are definitely rules we’d have to fix but overall it’s a great idea.
  • Lee: Does anybody know who else might have access to GraphQL cats?
  • Ivan: I think I saw some issue over at GraphQL Sangria, but I’ll try to look it up. From my point of view about the tests, one problem with GraphQL Cats is that it tried to address entire execution pipeline, some of the things are pretty easy, so writing lexing and parsing tests is easy but its hard for execution especially asynchronous execution. It tried to address everything, what I’m trying to say is instead of going wide is to go deep so offering simple stuff like lexing and parsing would be great. Much of the complexity of the driver comes from directives I think. I think it’s worth to start from the beginning and support lexing and parsing. It might be a hundred lines of code to support this.
  • Michael: I would suggest also to include validation
  • Ivan: Last time we discussed GraphQL cats, if I remember GraphQL Python at some point didn’t support SDL so for them to fulfill validation they would have to create a schema so it was relatively controversial. This is why I’d start small to support all existing tools in the ecosystem
  • Michael: While I agree, most complicated issues are around validation so that’s why I’d like this
  • Ivan: Oleg put a lot of effort in it and it hat little adoption. If we really restrict it and build an MVP out of it, first issues from Lee were about changing lexing and parsing so I think it’s already valuable to have this. Right now if we create a working group for cats, people would just exchange ideas without contributing. But if we were to build a prototype and then ask people to contribute we might achieve wider coverage
  • Michael: I agree! Let’s start with the parser and lexer. I think the most important thing is to ask for parser and lexer changes to create tests.
  • Ivan: Maybe we could even think about fuzzing tests, for example block strings since they are hard to support. I’m interested in contributing in validation, we could try to clean up tests from graphql-js for this. We have a bunch of test cases
  • Michael: That means we’d have to get access to the repository right?
  • Ivan: I think we could just create a small experimental group. Get some adoption and make it official afterwards.
  • Michael: Should we maybe hook up after the meeting?
  • Lee: So here’s some action items:
    • Action item: Contact GitHub and see what we can do about moving the GraphQL cats repo into the GraphQL org. Might take a while
  • Lee: I strongly agree that we should add parsing tests first and keep it simple in the beginning. These YAML files can be quite complicated and they shouldn’t have to be. Once we have parser tests we could think about validation tests. The driver for that should be much simpler. To your point about validation rules in the spec is that nobody really went over it and checked that rules aligned perfectly. There have been cases where we found ambiguity but we also had a case where there wasn’t a mention of one case at all. I guess if we work on it we might find more cases. An editorial improvement would be to use clear naming to describe which rule is which, I believe it’s similar to the discussion about unique error codes, I don’t want to compare those two but it might be great to have some unique identifier. It’s awesome that you want to work on this though, I think there’s room to move significantly if we see that there’s room to build it simpler or easier
  • Michael: We’ll get in touch with Ivan and see how we’re going to solve this
  • Ivan: Sure! Maybe we can get in touch with other implementers. I’m against doing this too soon. Another thing, Lee what do you think about writing a first test, should we create an external GitHub org and repo or will you create a repo under the GraphQL organization?
  • Lee: Here’s what I’ll do, I’ll create a new repo as a space for you guys to be unblocked to start collaborating. Simultaneously I’ll take to GitHub and ask them to move it. I want to avoid confusion due to having multiple spaces. If we get ownership on both we’ll find a strategy on merging it.
  • Evan: I think the maintainer of GraphQL Ruby is a GitHub employee, so we might have a point of contact here.
  • Lee: I’ll keep that in mind!