Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upgrade to Odoc 2.0.0 #333

Merged
merged 4 commits into from
Jul 13, 2021
Merged

Upgrade to Odoc 2.0.0 #333

merged 4 commits into from
Jul 13, 2021

Conversation

Julow
Copy link
Collaborator

@Julow Julow commented Jun 16, 2021

In Odoc 2.0.0 (which is not released yet), the parser will be packaged separately and the API will change slightly. The API might change a bit again, I'll update this PR accordingly.

An other change is the "metadata" field, allowing to attach informations on every code blocks:

{@ocaml labels... [ ... ]}

This PR updates the code to work with the new version and handle the metadata field.

An other change in this PR is that code blocks without header are no longer run. Existing code blocks that were running before need to be changed. Perhaps this change is out of scope and should be removed ?

Copy link
Collaborator Author

@Julow Julow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new API of the parser is still WIP: ocaml/odoc#685

dune-project Outdated
@@ -35,6 +35,6 @@
result
(ocaml-version
(>= 2.3.0))
(odoc (>= 1.4.1))
(odoc-parser (>= 2.0.0))
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes Mdx incompatible with Odoc 1.5. Odoc 2.0.0 should be fully retrocompatible. The only reason I can think of a user keeping the current version is the change in the output.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is fixed now. Odoc 1.5 is co-installable with odoc-parser.

lib/mli_parser.ml Outdated Show resolved Hide resolved
@NathanReb
Copy link
Contributor

Thanks for taking care of this!

I haven't looked at the code but there are a couple of high level things I'd like to discuss.

My first concern is compatibility with older code. I'd like the behaviour of MDX not to change after this patch, as much as possible. Blocks with no headers should run exactly the same as before.

My understanding was that the content between {@ and [ would not be interpreted by odoc at all. I'm guessing that if you're considering forcing the first component to be a language header, that means you do intend to interpret it somehow in odoc. If so, what will odoc do with this header?
If odoc does want the first bit to be a language header, then maybe it's better to make the parser aware of that so that we don't have to deal with this here and can get a pair made of the language header and the rest (which is the part we care the most about here). It's also probably a good idea to have a clear separation between language header and custom headers, so that you can have one without the other. I don't see a good reason for forcing MDX users to add a language if they want to use MDX features.

The last point is more of an open quesiton but I'm wondering if we should not use language headers at all in mli files and rely on the kind label instead. I guess the answer here does depend on whether odoc plans on using language headers. If not, I don't think MDX should introduce them as they don't have an actual semantic value on their own. They were used to infer block types but this has proved not to be such a good idea, leading to bad error reporting and other similar issues.
We could keep supporting toplevel blocks with no labels and enforce any other blocks to at least have the kind label for MDX to interpret it. I think overall it would improve MDX behaviour and make MDX document more readable for people maintaining them.

@Julow
Copy link
Collaborator Author

Julow commented Jun 17, 2021

I'd like the behaviour of MDX not to change after this patch, as much as possible. Blocks with no headers should run exactly the same as before.

I agree that this change was a bit out of scope. I'll revert it.
My concern was that some blocks might not be intended to run or might not be OCaml code. That would make Mdx not work by default if it were run in CI for example.

what will odoc do with this header?

The first word would be the language tag and it might be used for syntax highlighting in the html backend.

make the parser aware of that so that we don't have to deal with this here and can get a pair made of the language header and the rest

Yes, that's what I meant.

We could keep supporting toplevel blocks with no labels and enforce any other blocks to at least have the kind label for MDX to interpret it.

That would be great! Do you think we should change the syntax to have the language tag optional to avoid passing informations twice in this case ?

We don't have clear plans yet for the language tag, it looked interesting to add because Mdx might use it.

@NathanReb
Copy link
Contributor

You're right. I think supporting an optional language tag in odoc, so that it can be used to generate the right, syntax highlighted HTML is a perfectly valid reason to allow adding language tag to code blocks.

I think it should be optional indeed, both for compatibility reasons and because sometimes there might be blocks we don't know how to highlight and for which there isn't really a point in adding a language tag but where you might still want to add headers. For example, an MDX file include block that's importing the content of a text file.
To clarify I feel like users should be able to write blocks with just a language tag, just custom headers, both or none. The syntax probably need to be changed for that to work well, e.g. have a symbol marking the beginning of the header. I don't have any particular opinion on what it should look like and I trust you to make the right choice here!

I think the kind label doesn't necessarily dispense one from adding a language tag as they might still want to have it properly highlighted in the generated documentation.
Remains the question of how to highlight toplevel blocks, which aren't strictly speaking valid ocaml syntax but we can work this one out I guess.

Regarding simple blocks and MDX, if I remember correctly, the current behaviour is that blocks are inspected: if they follow the toplevel blocks syntax, they are evaluated by MDX, otherwise they are simply ignored. I think we should preserve this with this patch. The good news is that now, users are able to add a skip label to blocks they don't want MDX to temper with. I feel like it's a win-win situation!

This all looks really promising, I'm fairly excited to get that out! Let me know if there's anything I can do to help with this on either sides (MDX or odoc)!

@emillon emillon mentioned this pull request Jun 23, 2021
@jonludlam
Copy link
Contributor

Thanks for looking this over @NathanReb! I'm curious about the problems you encountered with the language tag? I can't immediately see a useful difference between {@ocaml foo=bar[...]} and {@kind=ocaml foo=bar[...]}, but perhaps I'm misinterpreting what you're suggesting? It would be helpful to see some examples of the distinction you mean.

@NathanReb
Copy link
Contributor

In Markdown, the language tag is used for markdown interpeters to generate the styling for the code blocks. MDX used this information to infer how to interpret the content of the block, along with the syntax of the block content itself but it has not strong semantic meaning. There are 4 different ways to interpret a block for MDX:

  1. It's an ocaml toplevel block, made of toplevel phrases interleaved with their printed evaluation, similar to what one gets running an interactive toplevel
  2. A raw OCaml block, made of any valid ocaml structure. MDX will make sure this block compile by running it as a toplevel phrase but won't print any output.
  3. A file include block, it's made of a bit of arbitrary text that MDX imported from a file as specified by the labels
  4. A shell block, which is similar to a toplevel block except that it's made of shell commands interleaved with their standard output and error.

For example, the first 3 blocks can contain OCaml code, or something you'd want to style as OCaml code. The language tag is too ambiguous on its own and MDX can't rely on that only to find out how to interpret the block. At some point it will have to make a guess, which might differ from what the user wanted. This guess will affect how MDX report errors and it can lead to a great deal of confusion when MDX and the user had different expectations for that block.

Another issue with this is that, atm, toplevel blocks have a different syntax than the raw OCaml one, and can't be styled the same way.

In short, the block type and the language tag are two different piece of information, used for different purposes and this is why I believe they should be handled differently.
I believe the block kind is clear and explicit as to what MDX is going to do with that block both for user and for MDX itself, which in return allows MDX to guide users when they make mistake such as forgetting to specify a mandatory label or use the wrong syntax for that specific kind of block.

For compat reasons, we'll maintain that behaviour in markdown files but we'd like to encourage our users to slowly migrate to the prefered block kind approach. If I can avoid doing the same mistakes in mli files from the start, I would much rather do that!

@jonludlam
Copy link
Contributor

Does this mean that if a block is included it can't be treated as a block of ocaml? That is, I can't interpret it and add to the environment for a subsequent block to use?

@NathanReb NathanReb changed the base branch from master to main July 2, 2021 09:43
@jonludlam
Copy link
Contributor

I think this PR should be split - one PR to switch to odoc-parser (with no support of code-block metadata) and a follow up with the rest of the changes.

@NathanReb
Copy link
Contributor

@jonludlam this definitely sounds sensible!

@Julow
Copy link
Collaborator Author

Julow commented Jul 8, 2021

The parser is released, I updated this PR and removed the commit about the metadata field, it is ignored for now.

@Julow Julow marked this pull request as ready for review July 8, 2021 12:14
end of "[**", but doesn't add those three characters to the within-the-file column
number. Here, we make the adjustment.
*)
let account_for_docstring_open_token (location : Odoc_model.Location_.span) =
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The parser now keeps the location :)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I talked too fast. This is still needed, an other force-push.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I debugged this a bit more and found that there is no bug into odoc-parser but rather that it expects the starting location to account for the comment opening. That's how odoc does it: https://github.com/ocaml/odoc/blob/master/src/loader/doc_attr.ml#L56
I pushed a fix.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

@NathanReb NathanReb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! This looks great!

I have a minor comment but otherwise it all looks good to me.

Could you add a changelog entry? In theory it shouldn't matter as the behaviour is expected to remain the same but I think it's good to mention it, just in case we uncover bugs or changes in the future. I'll cut a release with this straight after we merge!

You mentioned a required bug fix on the odoc-parser side, has it been released? If so we can fix that module alias thing and merge!

lib/mli_parser.ml Outdated Show resolved Hide resolved
@NathanReb
Copy link
Contributor

Ouh also, could you please rebase on top of master so the 4.13 build succeeds?

@jonludlam
Copy link
Contributor

I don't think there was a bug fix required for odoc-parser - it was just a difference in expectations about what the initial location meant.

Remove the extra step of correcting locations by padding it before.
@Julow
Copy link
Collaborator Author

Julow commented Jul 12, 2021

Rebased, changelog update and alias removed. Good to merge for me.

@NathanReb
Copy link
Contributor

It appears the opam file isn't up to date anymore with what dune generates, I'll push the update!

Signed-off-by: Nathan Rebours <[email protected]>
@NathanReb
Copy link
Contributor

This is due to the master branch updating the dune lang version!

@NathanReb
Copy link
Contributor

Okay, let's merge this!

Huge thanks to both of you @Julow @jonludlam for taking care of this!

@NathanReb NathanReb merged commit 76fa353 into realworldocaml:main Jul 13, 2021
Leonidas-from-XIV added a commit to Leonidas-from-XIV/opam-repository that referenced this pull request Sep 8, 2021
CHANGES:

#### Added

#### Changed

- Use odoc-parser.0.9.0 (realworldocaml/mdx#333, @Julow)

#### Deprecated

- Add a deprecation warning for toplevel blocks that are not terminated with `;;` (realworldocaml/mdx#342, @Leonidas-from-XIV)

#### Fixed

- Fix accidental redirect of stderr to stdout (realworldocaml/mdx#343, @Leonidas-from-XIV)
- Remove trailing whitespaces that were added to indent empty lines (realworldocaml/mdx#341, @gpetiot)

#### Removed

#### Security
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants