schema/dmt: first pass at a parser #253

mvdan · 2021-09-15T19:38:09Z

(see commit message)

schema/dmt/parse_test.go

warpfork

My skim of the parser has been fairly fast, but it looks like a parser, and it looks like it has quite good error handling (line and offsets, etc) as a bonus... and if it parses the schema-schema, that is indeed quite a bunch of practical exercise and test already. Amazing!

warpfork · 2021-09-15T22:29:45Z

It looks like we don't test that the parser of the schema-schema DSL is functionally equivalent to the schema-schema JSON load, yet?

Is it time to test that? It might be as easy as feeding to the two of them to the printer.Sprint feature and then doing a big string diff.

mvdan · 2021-09-15T22:43:24Z

That sounds like a good idea. I'd probably say that's second PR territory, because it's likely going to find a few more bugs, and this PR is big enough as it is :)

warpfork · 2021-09-16T08:23:47Z

SGTM. My thumb is up!

mvdan · 2021-09-17T17:44:56Z

Waiting on ipld/ipld#137 (comment) to get a decision.

schema/dmt/parse.go

warpfork · 2021-09-20T06:47:03Z

I would probably move the whole Parse function family to another package. schema/dsl? It returns a schema/dmt.Schema, but most of the rest of its thinking is framed in terms of the DSL. It's also too easy to see schema/dmt.Parse and think "ah, that must take JSON or some other standard codec and parse to Data Model", which is of course not what this code does.

mvdan · 2021-09-20T06:58:00Z

I don't have a strong opinion on the names of schema vs DMT vs DSL, but I do think we should avoid adding more packages into the mix. go-ipld-prime already has dozens of packages, and I fear that adding more packages and placing the high-level APIs there will just further confuse the new users.

If you're worried about getting confused by the fully qualified name schemadmt.Parse, perhaps schemadmt.ParseDSL? It reads somewhat redundant to me, as practically all parsers read some form of language syntax specific to their purpose, but ultimately I can't argue against being explicit in APIs.

warpfork · 2021-09-20T07:49:22Z

I'm not so worried about more packages. Worried about undiscoverability if high-level APIs end up hidden in deeply nested packages: yes. But we already have that problem with schema/dmt; I think putting the DSL parser stuff in schema/dsl is precisely equally bad, and not worse. (And yet has bonuses of explicitness, and draws an accurate graph that the DSL depends on the DMT, and not vice-versa.)

The solution to discoverability issues for this stuff seems to me to usually shake out to making a top-level package that just focuses purely on the discoverability concern. To pursue that: one option would be to make an ipld.ParseSchemaDSL function all the way in the repo root. The other option: is to try to put discoverability routes in the schema package... but the trouble there is of course we kind of lost that battle already, because I didn't realize how much Fun this was going to be when I started the schema package ages ago. (I wonder if it would be a good idea to try to reclaim the ground, though, with copious aliasing, like I did for the repo root package. It's possible.)

schemadmt.ParseDSL would still also read better to me than schemadmt.Parse, and definitely be a step in the right direction. But I think I still at would least weakly favor packages.

mvdan · 2021-09-20T08:04:52Z

To pursue that: one option would be to make an ipld.ParseSchemaDSL function all the way in the repo root.

Oh, I like that. I had been thinking about what to name a "parse and compile schema" API, and where to put it. Now it seems clear it should go in the root package. If anyone wants something more specific, such as parsing the DSL but keeping it in DMT non-compiled form, then they could drop down to one of the sub-packages.

warpfork · 2021-09-20T08:13:21Z

Yep. And the root ipld package, fwiw, does already have a dependency on the schema package. So it's not at all unreasonable to add a ipld.ParseSchemaDSL there.

(A little more backstory: The ipld package already uses schema in tasks like feature-detecting typed nodes for purposes like the Marshal functions. To tie together some threads, this was made possible in #228, and usages of the schema package such as those Marshal functions appeared in #232 . I mention all this because it's the success of 228 that makes me float the idea we could pivot the schema package similarly, if we decided to do so. 😃 )

mvdan · 2021-09-20T08:36:19Z

I still don't have a strong opinion on whether to split DSL from DMT. You're right that they could be split, and they sit in different levels. On the other hand, the new DMT package will have a very small API, and I'm not a fan of splitting a package with a small API into two packages with even smaller APIs. They would both be relatively small amounts of code, too.

I think I lean slightly towards splitting the packages, if anything because schemadsl.Parse feels like a slightly better name than schemadmt.ParseDSL. In such a split, would we then keep "compile" as schemadmt.Compile?

To get to the tip of the bikeshedding: most end users will want to go from a schema file to a compiled Schema they can use with e.g. bindnode. So they really want a "Parse and Compile" API. But I really don't like ipld.ParseAndCompileSchema. Maybe ipld.LoadSchema or ipld.BuildSchema?

I mention all this because it's the success of 228 that makes me float the idea we could pivot the schema package similarly, if we decided to do so.

My intuition is that we should probably not pull that trick more than once in this module. The existing codebase has enough packages and dependency tree magic as it is :)

Plenty of TODOs scattered around, but it's complete enough that it can parse the schema-schema. We include ParseBytes for the upcoming go:embed usecases, and ParseFile for the common non-embed usecase. The parser is largely inspired by github.com/ipld/go-ipld-schema/parser, with some notable differences. First, it follows a top-down recursive descent design, with methods such as "consume token" and "peek token". This also enables us to provide consistently good errors. Second, it's not line-based, using bufio.Reader instead. This keeps the code simpler, as we aren't forced to tokenize by line, and we also avoid potential mistakes when slicing tokens by hand. Third, it parses straight into the new bindnode DMT types, which allows for Parse and Compile to be chained together. Note that the parser lives in schema/dsl, which sits on top of schema/dmt with the Go types and schema definition. This is a nice separation at a conceptual level, and also means one can use the DMT without pulling in the DSL parser. Finally, we bump the ipld submodule to get Eric's schema-schema changes that mean implicit scalar values are no longer quoted in the DSL. Updates #188.

mvdan · 2021-09-28T17:06:31Z

Moved the Parse API with its test to schema/dsl. Following schema/dmt being named schemadmt, the package is called schemadsl.

Note that Compile remains in schemadmt. That seems like the right place to me: it has nothing to do with the DSL.

warpfork

Looks even better than ever -- let's merge it!

mvdan requested a review from warpfork September 15, 2021 19:38

warpfork reviewed Sep 15, 2021

View reviewed changes

schema/dmt/parse_test.go Show resolved Hide resolved

warpfork approved these changes Sep 15, 2021

View reviewed changes

mvdan mentioned this pull request Sep 16, 2021

schema-schema: fix quoting typos around implicits in the DSL form. ipld/ipld#137

Merged

aschmahmann reviewed Sep 19, 2021

View reviewed changes

schema/dmt/parse.go Outdated Show resolved Hide resolved

warpfork approved these changes Sep 28, 2021

View reviewed changes

warpfork merged commit df8e909 into ipld:master Sep 28, 2021

mvdan deleted the dmt-parse branch October 15, 2021 10:39

aschmahmann mentioned this pull request Dec 1, 2021

Release v0.11 ipfs/kubo#8343

Closed

80 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

schema/dmt: first pass at a parser #253

schema/dmt: first pass at a parser #253

mvdan commented Sep 15, 2021

warpfork left a comment

warpfork commented Sep 15, 2021

mvdan commented Sep 15, 2021

warpfork commented Sep 16, 2021

mvdan commented Sep 17, 2021

warpfork commented Sep 20, 2021

mvdan commented Sep 20, 2021

warpfork commented Sep 20, 2021 •

edited

Loading

mvdan commented Sep 20, 2021

warpfork commented Sep 20, 2021

mvdan commented Sep 20, 2021

mvdan commented Sep 28, 2021

warpfork left a comment

schema/dmt: first pass at a parser #253

schema/dmt: first pass at a parser #253

Conversation

mvdan commented Sep 15, 2021

warpfork left a comment

Choose a reason for hiding this comment

warpfork commented Sep 15, 2021

mvdan commented Sep 15, 2021

warpfork commented Sep 16, 2021

mvdan commented Sep 17, 2021

warpfork commented Sep 20, 2021

mvdan commented Sep 20, 2021

warpfork commented Sep 20, 2021 • edited Loading

mvdan commented Sep 20, 2021

warpfork commented Sep 20, 2021

mvdan commented Sep 20, 2021

mvdan commented Sep 28, 2021

warpfork left a comment

Choose a reason for hiding this comment

warpfork commented Sep 20, 2021 •

edited

Loading