Upgrade to Odoc 2.0.0 #333

Julow · 2021-06-16T14:31:53Z

In Odoc 2.0.0 (which is not released yet), the parser will be packaged separately and the API will change slightly. The API might change a bit again, I'll update this PR accordingly.

An other change is the "metadata" field, allowing to attach informations on every code blocks:

{@ocaml labels... [ ... ]}

This PR updates the code to work with the new version and handle the metadata field.

An other change in this PR is that code blocks without header are no longer run. Existing code blocks that were running before need to be changed. Perhaps this change is out of scope and should be removed ?

Julow

The new API of the parser is still WIP: ocaml/odoc#685

Julow · 2021-06-16T14:33:01Z

dune-project

@@ -35,6 +35,6 @@
  result
  (ocaml-version
   (>= 2.3.0))
-  (odoc (>= 1.4.1))
+  (odoc-parser (>= 2.0.0))


This makes Mdx incompatible with Odoc 1.5. Odoc 2.0.0 should be fully retrocompatible. The only reason I can think of a user keeping the current version is the change in the output.

This is fixed now. Odoc 1.5 is co-installable with odoc-parser.

lib/mli_parser.ml

NathanReb · 2021-06-17T09:01:54Z

Thanks for taking care of this!

I haven't looked at the code but there are a couple of high level things I'd like to discuss.

My first concern is compatibility with older code. I'd like the behaviour of MDX not to change after this patch, as much as possible. Blocks with no headers should run exactly the same as before.

My understanding was that the content between {@ and [ would not be interpreted by odoc at all. I'm guessing that if you're considering forcing the first component to be a language header, that means you do intend to interpret it somehow in odoc. If so, what will odoc do with this header?
If odoc does want the first bit to be a language header, then maybe it's better to make the parser aware of that so that we don't have to deal with this here and can get a pair made of the language header and the rest (which is the part we care the most about here). It's also probably a good idea to have a clear separation between language header and custom headers, so that you can have one without the other. I don't see a good reason for forcing MDX users to add a language if they want to use MDX features.

The last point is more of an open quesiton but I'm wondering if we should not use language headers at all in mli files and rely on the kind label instead. I guess the answer here does depend on whether odoc plans on using language headers. If not, I don't think MDX should introduce them as they don't have an actual semantic value on their own. They were used to infer block types but this has proved not to be such a good idea, leading to bad error reporting and other similar issues.
We could keep supporting toplevel blocks with no labels and enforce any other blocks to at least have the kind label for MDX to interpret it. I think overall it would improve MDX behaviour and make MDX document more readable for people maintaining them.

Julow · 2021-06-17T09:29:41Z

I'd like the behaviour of MDX not to change after this patch, as much as possible. Blocks with no headers should run exactly the same as before.

I agree that this change was a bit out of scope. I'll revert it.
My concern was that some blocks might not be intended to run or might not be OCaml code. That would make Mdx not work by default if it were run in CI for example.

what will odoc do with this header?

The first word would be the language tag and it might be used for syntax highlighting in the html backend.

make the parser aware of that so that we don't have to deal with this here and can get a pair made of the language header and the rest

Yes, that's what I meant.

We could keep supporting toplevel blocks with no labels and enforce any other blocks to at least have the kind label for MDX to interpret it.

That would be great! Do you think we should change the syntax to have the language tag optional to avoid passing informations twice in this case ?

We don't have clear plans yet for the language tag, it looked interesting to add because Mdx might use it.

NathanReb · 2021-06-17T10:04:53Z

You're right. I think supporting an optional language tag in odoc, so that it can be used to generate the right, syntax highlighted HTML is a perfectly valid reason to allow adding language tag to code blocks.

I think it should be optional indeed, both for compatibility reasons and because sometimes there might be blocks we don't know how to highlight and for which there isn't really a point in adding a language tag but where you might still want to add headers. For example, an MDX file include block that's importing the content of a text file.
To clarify I feel like users should be able to write blocks with just a language tag, just custom headers, both or none. The syntax probably need to be changed for that to work well, e.g. have a symbol marking the beginning of the header. I don't have any particular opinion on what it should look like and I trust you to make the right choice here!

I think the kind label doesn't necessarily dispense one from adding a language tag as they might still want to have it properly highlighted in the generated documentation.
Remains the question of how to highlight toplevel blocks, which aren't strictly speaking valid ocaml syntax but we can work this one out I guess.

Regarding simple blocks and MDX, if I remember correctly, the current behaviour is that blocks are inspected: if they follow the toplevel blocks syntax, they are evaluated by MDX, otherwise they are simply ignored. I think we should preserve this with this patch. The good news is that now, users are able to add a skip label to blocks they don't want MDX to temper with. I feel like it's a win-win situation!

This all looks really promising, I'm fairly excited to get that out! Let me know if there's anything I can do to help with this on either sides (MDX or odoc)!

jonludlam · 2021-06-23T16:28:53Z

Thanks for looking this over @NathanReb! I'm curious about the problems you encountered with the language tag? I can't immediately see a useful difference between {@ocaml foo=bar[...]} and {@kind=ocaml foo=bar[...]}, but perhaps I'm misinterpreting what you're suggesting? It would be helpful to see some examples of the distinction you mean.

NathanReb · 2021-06-23T17:39:13Z

In Markdown, the language tag is used for markdown interpeters to generate the styling for the code blocks. MDX used this information to infer how to interpret the content of the block, along with the syntax of the block content itself but it has not strong semantic meaning. There are 4 different ways to interpret a block for MDX:

It's an ocaml toplevel block, made of toplevel phrases interleaved with their printed evaluation, similar to what one gets running an interactive toplevel
A raw OCaml block, made of any valid ocaml structure. MDX will make sure this block compile by running it as a toplevel phrase but won't print any output.
A file include block, it's made of a bit of arbitrary text that MDX imported from a file as specified by the labels
A shell block, which is similar to a toplevel block except that it's made of shell commands interleaved with their standard output and error.

For example, the first 3 blocks can contain OCaml code, or something you'd want to style as OCaml code. The language tag is too ambiguous on its own and MDX can't rely on that only to find out how to interpret the block. At some point it will have to make a guess, which might differ from what the user wanted. This guess will affect how MDX report errors and it can lead to a great deal of confusion when MDX and the user had different expectations for that block.

Another issue with this is that, atm, toplevel blocks have a different syntax than the raw OCaml one, and can't be styled the same way.

In short, the block type and the language tag are two different piece of information, used for different purposes and this is why I believe they should be handled differently.
I believe the block kind is clear and explicit as to what MDX is going to do with that block both for user and for MDX itself, which in return allows MDX to guide users when they make mistake such as forgetting to specify a mandatory label or use the wrong syntax for that specific kind of block.

For compat reasons, we'll maintain that behaviour in markdown files but we'd like to encourage our users to slowly migrate to the prefered block kind approach. If I can avoid doing the same mistakes in mli files from the start, I would much rather do that!

jonludlam · 2021-06-28T12:40:26Z

Does this mean that if a block is included it can't be treated as a block of ocaml? That is, I can't interpret it and add to the environment for a subsequent block to use?

jonludlam · 2021-07-07T09:13:42Z

I think this PR should be split - one PR to switch to odoc-parser (with no support of code-block metadata) and a follow up with the rest of the changes.

NathanReb · 2021-07-07T09:32:49Z

@jonludlam this definitely sounds sensible!

Julow · 2021-07-08T12:14:14Z

The parser is released, I updated this PR and removed the commit about the metadata field, it is ignored for now.

Julow · 2021-07-08T12:16:05Z

lib/mli_parser.ml

-   end of "[**", but doesn't add those three characters to the within-the-file column
-   number. Here, we make the adjustment.
-*)
-let account_for_docstring_open_token (location : Odoc_model.Location_.span) =


The parser now keeps the location :)

I talked too fast. This is still needed, an other force-push.

I debugged this a bit more and found that there is no bug into odoc-parser but rather that it expects the starting location to account for the comment opening. That's how odoc does it: https://github.com/ocaml/odoc/blob/master/src/loader/doc_attr.ml#L56
I pushed a fix.

Indeed, as described here: https://ocaml-doc.github.io/odoc-parser/Odoc_parser/index.html#val-parse_comment

NathanReb

Thanks! This looks great!

I have a minor comment but otherwise it all looks good to me.

Could you add a changelog entry? In theory it shouldn't matter as the behaviour is expected to remain the same but I think it's good to mention it, just in case we uncover bugs or changes in the future. I'll cut a release with this straight after we merge!

You mentioned a required bug fix on the odoc-parser side, has it been released? If so we can fix that module alias thing and merge!

lib/mli_parser.ml

NathanReb · 2021-07-12T09:35:27Z

Ouh also, could you please rebase on top of master so the 4.13 build succeeds?

jonludlam · 2021-07-12T11:02:03Z

I don't think there was a bug fix required for odoc-parser - it was just a difference in expectations about what the initial location meant.

Remove the extra step of correcting locations by padding it before.

Julow · 2021-07-12T14:38:24Z

Rebased, changelog update and alias removed. Good to merge for me.

NathanReb · 2021-07-13T09:44:48Z

It appears the opam file isn't up to date anymore with what dune generates, I'll push the update!

Signed-off-by: Nathan Rebours <[email protected]>

NathanReb · 2021-07-13T09:45:50Z

This is due to the master branch updating the dune lang version!

NathanReb · 2021-07-13T09:58:20Z

Okay, let's merge this!

Huge thanks to both of you @Julow @jonludlam for taking care of this!

@Julow

CHANGES: #### Added #### Changed - Use odoc-parser.0.9.0 (realworldocaml/mdx#333, @Julow) #### Deprecated - Add a deprecation warning for toplevel blocks that are not terminated with `;;` (realworldocaml/mdx#342, @Leonidas-from-XIV) #### Fixed - Fix accidental redirect of stderr to stdout (realworldocaml/mdx#343, @Leonidas-from-XIV) - Remove trailing whitespaces that were added to indent empty lines (realworldocaml/mdx#341, @gpetiot) #### Removed #### Security

Julow commented Jun 16, 2021

View reviewed changes

Julow force-pushed the odoc-200 branch from 0bc9e4d to 47287ce Compare June 17, 2021 09:36

emillon mentioned this pull request Jun 23, 2021

4.13 compatibility #330

Merged

NathanReb changed the base branch from master to main July 2, 2021 09:43

Julow force-pushed the odoc-200 branch from 47287ce to 4b1467a Compare July 8, 2021 12:12

Julow marked this pull request as ready for review July 8, 2021 12:14

Julow commented Jul 8, 2021

View reviewed changes

Julow force-pushed the odoc-200 branch from 4b1467a to fef8c75 Compare July 8, 2021 12:24

NathanReb reviewed Jul 12, 2021

View reviewed changes

lib/mli_parser.ml Outdated Show resolved Hide resolved

Julow added 3 commits July 12, 2021 16:36

Upgrade to Odoc 2.0.0

61a932f

Simplify location handling

c0012a1

Remove the extra step of correcting locations by padding it before.

Update changelog

699a6b4

Julow force-pushed the odoc-200 branch from 5a2e496 to 699a6b4 Compare July 12, 2021 14:36

Sync dune generated opam file

d8f0e6a

Signed-off-by: Nathan Rebours <[email protected]>

NathanReb merged commit 76fa353 into realworldocaml:main Jul 13, 2021

Julow mentioned this pull request Aug 3, 2021

Recognise the metadata field on Odoc code blocks #339

Closed

Leonidas-from-XIV mentioned this pull request Sep 8, 2021

[new release] mdx (1.11.0) ocaml/opam-repository#19423

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Upgrade to Odoc 2.0.0 #333

Upgrade to Odoc 2.0.0 #333

Julow commented Jun 16, 2021

Julow left a comment

Julow Jun 16, 2021

Julow Jul 8, 2021

NathanReb commented Jun 17, 2021

Julow commented Jun 17, 2021

NathanReb commented Jun 17, 2021

jonludlam commented Jun 23, 2021

NathanReb commented Jun 23, 2021

jonludlam commented Jun 28, 2021

jonludlam commented Jul 7, 2021

NathanReb commented Jul 7, 2021

Julow commented Jul 8, 2021

Julow Jul 8, 2021

Julow Jul 8, 2021

Julow Jul 8, 2021

jonludlam Jul 9, 2021

NathanReb left a comment

NathanReb commented Jul 12, 2021

jonludlam commented Jul 12, 2021

Julow commented Jul 12, 2021

NathanReb commented Jul 13, 2021

NathanReb commented Jul 13, 2021

NathanReb commented Jul 13, 2021

Upgrade to Odoc 2.0.0 #333

Upgrade to Odoc 2.0.0 #333

Conversation

Julow commented Jun 16, 2021

Julow left a comment

Choose a reason for hiding this comment

Julow Jun 16, 2021

Choose a reason for hiding this comment

Julow Jul 8, 2021

Choose a reason for hiding this comment

NathanReb commented Jun 17, 2021

Julow commented Jun 17, 2021

NathanReb commented Jun 17, 2021

jonludlam commented Jun 23, 2021

NathanReb commented Jun 23, 2021

jonludlam commented Jun 28, 2021

jonludlam commented Jul 7, 2021

NathanReb commented Jul 7, 2021

Julow commented Jul 8, 2021

Julow Jul 8, 2021

Choose a reason for hiding this comment

Julow Jul 8, 2021

Choose a reason for hiding this comment

Julow Jul 8, 2021

Choose a reason for hiding this comment

jonludlam Jul 9, 2021

Choose a reason for hiding this comment

NathanReb left a comment

Choose a reason for hiding this comment

NathanReb commented Jul 12, 2021

jonludlam commented Jul 12, 2021

Julow commented Jul 12, 2021

NathanReb commented Jul 13, 2021

NathanReb commented Jul 13, 2021

NathanReb commented Jul 13, 2021