Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement Markdown automatic links extension #20127

Closed
a-mr opened this issue Aug 1, 2022 · 0 comments
Closed

Implement Markdown automatic links extension #20127

a-mr opened this issue Aug 1, 2022 · 0 comments
Labels

Comments

@a-mr
Copy link
Contributor

a-mr commented Aug 1, 2022

Summary

Adopt the implicit_header_references extension for both normal section references and Nim smart doc links (which is related to RFC nim-lang/RFCs#125).
This is a part of work for converting from RST to Markdown markup.

Motivation

  • RST-style references contain an underscore in them which conflicts with Markdown's usage of underscore for emphasis. So we should not keep this RST extension enabled in Markdown mode.
  • But automatic links are useful because otherwise one need to write down the generated anchors when making links. Or know how anchors are formed (e.g. for headings from their text), and it's implementation-specific. And manual anchors are ugly since info is duplicated in them.

Description

Example:

Type relations
==============

Convertible relation
--------------------

Markdown's CommonMark spec allows only links with the explicit target:

Ref [Convertible relation](#type-relations-convertible-relation)

While in RST we can have just:

Ref `Convertible relation`_

Fortunately there is the implicit_header_references extension from Pandoc Markdown, which allows to skip target section:

Ref [Convertible relation] or [Convertible relation][]
    ^^^^^^^^^^^^^^^^^^^^^^    ^^^^^^^^^^^^^^^^^^^^^^^^

To summarize:

* Old RST ways of referencing:
  - manual `Convertible relation <#type-relations-convertible-relation>`_
  - automatic `Convertible relation`_ 
* New Markdown ways:
  - manual [Convertible relation](#type-relations-convertible-relation)
  - automatic, suggested here: [Convertible relation]
                               ^^^^^^^^^^^^^^^^^^^^^^

Also the whole point of Nim extension (which implement nim-lang/RFCs#125) — smart doc links is to avoid inputting a complex target anchor format when referencing Nim's symbol. Applying the current suggestion here:

* old RST  links:
  - manual `iterator walkdir (dir: string) <#walkDir.i,string>`_
  - auto `iterator walkdir (dir: string)`_
*  Markdown  links: 
   - manual [iterator walkDir (dir: string)](#walkDir.i,string)
   - suggested [iterator walkdir (dir: string)]
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Alternatives

  • Input only standard Markdown links.
  • Use Sphinx/RST-style links like `Convertible relation`:ref: — it's also acceptable option considering that :role: syntax does not conflict with Markdown

Additional Information

@a-mr a-mr added the Feature label Aug 1, 2022
a-mr added a commit to a-mr/Nim that referenced this issue Sep 4, 2022
This implements nim-lang#20127.
Besides reference to headings we also support doing references
to Nim symbols inside Nim modules.

Markdown:
```
Some heading
------------

Ref. [Some heading].
```

Nim:
```
proc someFunction*() ...

... ## Ref. [someFunction]
```

This is substitution for RST syntax like `` `target`_ ``.
All 3 syntax variants of extension from Pandoc Markdown are supported:
`[target]`, `[target][]`, `[description][target]`.

This PR also fixes clashes in existing files, particularly
conflicts with RST footnote feature, which does not work with
this PR (but there is a plan to adopt a popular [Markdown footnote
extension](https://pandoc.org/MANUAL.html#footnotes) to make footnotes work).

Also the PR fixes a bug that Markdown links did not work when `[...]`
section had a line break.

The implementation is straightforward since link resolution did not
change w.r.t. RST implementation, it's almost only about new syntax
addition. The only essential difference is a possibility to add a custom
link description: form `[description][target]` which does not have an
RST equivalent.
Varriount pushed a commit that referenced this issue Sep 4, 2022
* Implement Pandoc Markdown concise link extension

This implements #20127.
Besides reference to headings we also support doing references
to Nim symbols inside Nim modules.

Markdown:
```
Some heading
------------

Ref. [Some heading].
```

Nim:
```
proc someFunction*() ...

... ## Ref. [someFunction]
```

This is substitution for RST syntax like `` `target`_ ``.
All 3 syntax variants of extension from Pandoc Markdown are supported:
`[target]`, `[target][]`, `[description][target]`.

This PR also fixes clashes in existing files, particularly
conflicts with RST footnote feature, which does not work with
this PR (but there is a plan to adopt a popular [Markdown footnote
extension](https://pandoc.org/MANUAL.html#footnotes) to make footnotes work).

Also the PR fixes a bug that Markdown links did not work when `[...]`
section had a line break.

The implementation is straightforward since link resolution did not
change w.r.t. RST implementation, it's almost only about new syntax
addition. The only essential difference is a possibility to add a custom
link description: form `[description][target]` which does not have an
RST equivalent.

* fix nim 1.0 gotcha
@a-mr a-mr closed this as completed Sep 16, 2022
ringabout pushed a commit to nim-lang/db_connector that referenced this issue Dec 1, 2022
* Implement Pandoc Markdown concise link extension

This implements nim-lang/Nim#20127.
Besides reference to headings we also support doing references
to Nim symbols inside Nim modules.

Markdown:
```
Some heading
------------

Ref. [Some heading].
```

Nim:
```
proc someFunction*() ...

... ## Ref. [someFunction]
```

This is substitution for RST syntax like `` `target`_ ``.
All 3 syntax variants of extension from Pandoc Markdown are supported:
`[target]`, `[target][]`, `[description][target]`.

This PR also fixes clashes in existing files, particularly
conflicts with RST footnote feature, which does not work with
this PR (but there is a plan to adopt a popular [Markdown footnote
extension](https://pandoc.org/MANUAL.html#footnotes) to make footnotes work).

Also the PR fixes a bug that Markdown links did not work when `[...]`
section had a line break.

The implementation is straightforward since link resolution did not
change w.r.t. RST implementation, it's almost only about new syntax
addition. The only essential difference is a possibility to add a custom
link description: form `[description][target]` which does not have an
RST equivalent.

* fix nim 1.0 gotcha
ringabout pushed a commit to nim-lang/db_connector that referenced this issue Dec 1, 2022
* Implement Pandoc Markdown concise link extension

This implements nim-lang/Nim#20127.
Besides reference to headings we also support doing references
to Nim symbols inside Nim modules.

Markdown:
```
Some heading
------------

Ref. [Some heading].
```

Nim:
```
proc someFunction*() ...

... ## Ref. [someFunction]
```

This is substitution for RST syntax like `` `target`_ ``.
All 3 syntax variants of extension from Pandoc Markdown are supported:
`[target]`, `[target][]`, `[description][target]`.

This PR also fixes clashes in existing files, particularly
conflicts with RST footnote feature, which does not work with
this PR (but there is a plan to adopt a popular [Markdown footnote
extension](https://pandoc.org/MANUAL.html#footnotes) to make footnotes work).

Also the PR fixes a bug that Markdown links did not work when `[...]`
section had a line break.

The implementation is straightforward since link resolution did not
change w.r.t. RST implementation, it's almost only about new syntax
addition. The only essential difference is a possibility to add a custom
link description: form `[description][target]` which does not have an
RST equivalent.

* fix nim 1.0 gotcha
ringabout pushed a commit to nim-lang/db_connector that referenced this issue Dec 1, 2022
* Implement Pandoc Markdown concise link extension

This implements nim-lang/Nim#20127.
Besides reference to headings we also support doing references
to Nim symbols inside Nim modules.

Markdown:
```
Some heading
------------

Ref. [Some heading].
```

Nim:
```
proc someFunction*() ...

... ## Ref. [someFunction]
```

This is substitution for RST syntax like `` `target`_ ``.
All 3 syntax variants of extension from Pandoc Markdown are supported:
`[target]`, `[target][]`, `[description][target]`.

This PR also fixes clashes in existing files, particularly
conflicts with RST footnote feature, which does not work with
this PR (but there is a plan to adopt a popular [Markdown footnote
extension](https://pandoc.org/MANUAL.html#footnotes) to make footnotes work).

Also the PR fixes a bug that Markdown links did not work when `[...]`
section had a line break.

The implementation is straightforward since link resolution did not
change w.r.t. RST implementation, it's almost only about new syntax
addition. The only essential difference is a possibility to add a custom
link description: form `[description][target]` which does not have an
RST equivalent.

* fix nim 1.0 gotcha
ringabout pushed a commit that referenced this issue Dec 1, 2022
* Implement Pandoc Markdown concise link extension

This implements #20127.
Besides reference to headings we also support doing references
to Nim symbols inside Nim modules.

Markdown:
```
Some heading
------------

Ref. [Some heading].
```

Nim:
```
proc someFunction*() ...

... ## Ref. [someFunction]
```

This is substitution for RST syntax like `` `target`_ ``.
All 3 syntax variants of extension from Pandoc Markdown are supported:
`[target]`, `[target][]`, `[description][target]`.

This PR also fixes clashes in existing files, particularly
conflicts with RST footnote feature, which does not work with
this PR (but there is a plan to adopt a popular [Markdown footnote
extension](https://pandoc.org/MANUAL.html#footnotes) to make footnotes work).

Also the PR fixes a bug that Markdown links did not work when `[...]`
section had a line break.

The implementation is straightforward since link resolution did not
change w.r.t. RST implementation, it's almost only about new syntax
addition. The only essential difference is a possibility to add a custom
link description: form `[description][target]` which does not have an
RST equivalent.

* fix nim 1.0 gotcha
ringabout pushed a commit that referenced this issue Dec 1, 2022
* Implement Pandoc Markdown concise link extension

This implements #20127.
Besides reference to headings we also support doing references
to Nim symbols inside Nim modules.

Markdown:
```
Some heading
------------

Ref. [Some heading].
```

Nim:
```
proc someFunction*() ...

... ## Ref. [someFunction]
```

This is substitution for RST syntax like `` `target`_ ``.
All 3 syntax variants of extension from Pandoc Markdown are supported:
`[target]`, `[target][]`, `[description][target]`.

This PR also fixes clashes in existing files, particularly
conflicts with RST footnote feature, which does not work with
this PR (but there is a plan to adopt a popular [Markdown footnote
extension](https://pandoc.org/MANUAL.html#footnotes) to make footnotes work).

Also the PR fixes a bug that Markdown links did not work when `[...]`
section had a line break.

The implementation is straightforward since link resolution did not
change w.r.t. RST implementation, it's almost only about new syntax
addition. The only essential difference is a possibility to add a custom
link description: form `[description][target]` which does not have an
RST equivalent.

* fix nim 1.0 gotcha
ringabout pushed a commit that referenced this issue Dec 1, 2022
* Implement Pandoc Markdown concise link extension

This implements #20127.
Besides reference to headings we also support doing references
to Nim symbols inside Nim modules.

Markdown:
```
Some heading
------------

Ref. [Some heading].
```

Nim:
```
proc someFunction*() ...

... ## Ref. [someFunction]
```

This is substitution for RST syntax like `` `target`_ ``.
All 3 syntax variants of extension from Pandoc Markdown are supported:
`[target]`, `[target][]`, `[description][target]`.

This PR also fixes clashes in existing files, particularly
conflicts with RST footnote feature, which does not work with
this PR (but there is a plan to adopt a popular [Markdown footnote
extension](https://pandoc.org/MANUAL.html#footnotes) to make footnotes work).

Also the PR fixes a bug that Markdown links did not work when `[...]`
section had a line break.

The implementation is straightforward since link resolution did not
change w.r.t. RST implementation, it's almost only about new syntax
addition. The only essential difference is a possibility to add a custom
link description: form `[description][target]` which does not have an
RST equivalent.

* fix nim 1.0 gotcha
a-mr added a commit to a-mr/Nim that referenced this issue Dec 1, 2022
Fully implements nim-lang/RFCs#125
Follow-up of: nim-lang#18642 (for internal links)
and nim-lang#20127.

Overview
--------

Explicit import-like directive is required, called `.. importdoc::`.
(the syntax is % RST, Markdown will use it for a while).

Then one can reference any symbols/headings/anchors, as if they
were in the local file (but they will be prefixed with a module name
or markup document in link text).
It's possible to reference anything from anywhere (any direction
in `.nim`/`.md`/`.rst` files).

See `doc/docgen.md` for full description.

Working is based on `.idx` files, hence one needs to generate
all `.idx` beforehand. A dedicated option `--index:only` is introduced
(and a separate stage for `--index:only` is added to `kochdocs.nim`).

Performance note
----------------

Full run for `./koch docs` now takes 185% of the time before this PR.
(After: 315 s, before: 170 s on my PC).
All the time seems to be spent on `--index:only` run, which takes
almost as much (85%) of normal doc run -- it seems that most time
is spent on file parsing, turning off HTML generation phase has not
helped much.
(One could avoid it by specifying list of files that can be referenced
and pre-processing only them. But it can become error-prone and I assume
that these linke will be **everywhere** in the repository anyway,
especially considering nim-lang/RFCs#478.
So every `.nim`/`.md` file is processed for `.idx` first).

But that's all without significant part of repository converted to
cross-module auto links. To estimate impact I checked the time for
`doc`ing a few files (after all indexes have been generated), and
everywhere difference was **negligible**.
E.g. for `lib/std/private/osfiles.nim` that `importdoc`s large
`os.idx` and hence should have been a case with relatively large
performance impact, but:

* After: 0.59 s.
* Before: 0.59 s.

So Nim compiler works so slow that doc part basically does not matter :-)

Testing
-------

1) added `extlinks` test to `nimdoc/`
2) checked that `theindex.html` is still correct
2) fixed broken auto-links for modules that were derived from `os.nim`
   by adding appropriate ``importdoc``

Implementation note
-------------------

Parsing and formating of `.idx` entries is moved into a dedicated
`rstidx.nim` module from `rstgen.nim`.

`.idx` file format changed:

* fields are not escaped in most cases because we need original
  strings for referencing, not HTML ones
  (the exception is linkTitle for titles and headings).
  Escaping happens later -- on the stage of `rstgen` buildIndex, etc.
* all lines have fixed number of columns 6
* added discriminator tag as a first column,
  it always allows distinguish Nim/markup entries, titles/headings, etc.
  `rstgen` does not rely any more (in most cases) on ad-hoc logic
  to determine what type each entry is.
* there is now always a title entry added at the first line.
* add a line number as 6th column
* linkTitle (4th) column has a different format: before it was like
  `module: funcName()`, now it's `proc funcName()`.
  (This format is also propagated to `theindex.html` and search results,
  I kept it that way since I like it more though it's discussible.)
  This column is what used for Nim symbols resolution.
* also changed details on column format for headings and titles:
  "keyword" is original, "linkTitle" is HTML one
a-mr added a commit to a-mr/Nim that referenced this issue Dec 16, 2022
Fully implements nim-lang/RFCs#125
Follow-up of: nim-lang#18642 (for internal links)
and nim-lang#20127.

Overview
--------

Explicit import-like directive is required, called `.. importdoc::`.
(the syntax is % RST, Markdown will use it for a while).

Then one can reference any symbols/headings/anchors, as if they
were in the local file (but they will be prefixed with a module name
or markup document in link text).
It's possible to reference anything from anywhere (any direction
in `.nim`/`.md`/`.rst` files).

See `doc/docgen.md` for full description.

Working is based on `.idx` files, hence one needs to generate
all `.idx` beforehand. A dedicated option `--index:only` is introduced
(and a separate stage for `--index:only` is added to `kochdocs.nim`).

Performance note
----------------

Full run for `./koch docs` now takes 185% of the time before this PR.
(After: 315 s, before: 170 s on my PC).
All the time seems to be spent on `--index:only` run, which takes
almost as much (85%) of normal doc run -- it seems that most time
is spent on file parsing, turning off HTML generation phase has not
helped much.
(One could avoid it by specifying list of files that can be referenced
and pre-processing only them. But it can become error-prone and I assume
that these linke will be **everywhere** in the repository anyway,
especially considering nim-lang/RFCs#478.
So every `.nim`/`.md` file is processed for `.idx` first).

But that's all without significant part of repository converted to
cross-module auto links. To estimate impact I checked the time for
`doc`ing a few files (after all indexes have been generated), and
everywhere difference was **negligible**.
E.g. for `lib/std/private/osfiles.nim` that `importdoc`s large
`os.idx` and hence should have been a case with relatively large
performance impact, but:

* After: 0.59 s.
* Before: 0.59 s.

So Nim compiler works so slow that doc part basically does not matter :-)

Testing
-------

1) added `extlinks` test to `nimdoc/`
2) checked that `theindex.html` is still correct
2) fixed broken auto-links for modules that were derived from `os.nim`
   by adding appropriate ``importdoc``

Implementation note
-------------------

Parsing and formating of `.idx` entries is moved into a dedicated
`rstidx.nim` module from `rstgen.nim`.

`.idx` file format changed:

* fields are not escaped in most cases because we need original
  strings for referencing, not HTML ones
  (the exception is linkTitle for titles and headings).
  Escaping happens later -- on the stage of `rstgen` buildIndex, etc.
* all lines have fixed number of columns 6
* added discriminator tag as a first column,
  it always allows distinguish Nim/markup entries, titles/headings, etc.
  `rstgen` does not rely any more (in most cases) on ad-hoc logic
  to determine what type each entry is.
* there is now always a title entry added at the first line.
* add a line number as 6th column
* linkTitle (4th) column has a different format: before it was like
  `module: funcName()`, now it's `proc funcName()`.
  (This format is also propagated to `theindex.html` and search results,
  I kept it that way since I like it more though it's discussible.)
  This column is what used for Nim symbols resolution.
* also changed details on column format for headings and titles:
  "keyword" is original, "linkTitle" is HTML one
Varriount pushed a commit that referenced this issue Jan 4, 2023
* docgen: implement cross-document links

Fully implements nim-lang/RFCs#125
Follow-up of: #18642 (for internal links)
and #20127.

Overview
--------

Explicit import-like directive is required, called `.. importdoc::`.
(the syntax is % RST, Markdown will use it for a while).

Then one can reference any symbols/headings/anchors, as if they
were in the local file (but they will be prefixed with a module name
or markup document in link text).
It's possible to reference anything from anywhere (any direction
in `.nim`/`.md`/`.rst` files).

See `doc/docgen.md` for full description.

Working is based on `.idx` files, hence one needs to generate
all `.idx` beforehand. A dedicated option `--index:only` is introduced
(and a separate stage for `--index:only` is added to `kochdocs.nim`).

Performance note
----------------

Full run for `./koch docs` now takes 185% of the time before this PR.
(After: 315 s, before: 170 s on my PC).
All the time seems to be spent on `--index:only` run, which takes
almost as much (85%) of normal doc run -- it seems that most time
is spent on file parsing, turning off HTML generation phase has not
helped much.
(One could avoid it by specifying list of files that can be referenced
and pre-processing only them. But it can become error-prone and I assume
that these linke will be **everywhere** in the repository anyway,
especially considering nim-lang/RFCs#478.
So every `.nim`/`.md` file is processed for `.idx` first).

But that's all without significant part of repository converted to
cross-module auto links. To estimate impact I checked the time for
`doc`ing a few files (after all indexes have been generated), and
everywhere difference was **negligible**.
E.g. for `lib/std/private/osfiles.nim` that `importdoc`s large
`os.idx` and hence should have been a case with relatively large
performance impact, but:

* After: 0.59 s.
* Before: 0.59 s.

So Nim compiler works so slow that doc part basically does not matter :-)

Testing
-------

1) added `extlinks` test to `nimdoc/`
2) checked that `theindex.html` is still correct
2) fixed broken auto-links for modules that were derived from `os.nim`
   by adding appropriate ``importdoc``

Implementation note
-------------------

Parsing and formating of `.idx` entries is moved into a dedicated
`rstidx.nim` module from `rstgen.nim`.

`.idx` file format changed:

* fields are not escaped in most cases because we need original
  strings for referencing, not HTML ones
  (the exception is linkTitle for titles and headings).
  Escaping happens later -- on the stage of `rstgen` buildIndex, etc.
* all lines have fixed number of columns 6
* added discriminator tag as a first column,
  it always allows distinguish Nim/markup entries, titles/headings, etc.
  `rstgen` does not rely any more (in most cases) on ad-hoc logic
  to determine what type each entry is.
* there is now always a title entry added at the first line.
* add a line number as 6th column
* linkTitle (4th) column has a different format: before it was like
  `module: funcName()`, now it's `proc funcName()`.
  (This format is also propagated to `theindex.html` and search results,
  I kept it that way since I like it more though it's discussible.)
  This column is what used for Nim symbols resolution.
* also changed details on column format for headings and titles:
  "keyword" is original, "linkTitle" is HTML one

* fix paths on Windows + more clear code

* Update compiler/docgen.nim

Co-authored-by: Andreas Rumpf <[email protected]>

* Handle .md and .nim paths uniformly in findRefFile

* handle titles better + more comments

* don't allow markup overwrite index title for .nim files

Co-authored-by: Andreas Rumpf <[email protected]>
survivorm pushed a commit to survivorm/Nim that referenced this issue Feb 28, 2023
* docgen: implement cross-document links

Fully implements nim-lang/RFCs#125
Follow-up of: nim-lang#18642 (for internal links)
and nim-lang#20127.

Overview
--------

Explicit import-like directive is required, called `.. importdoc::`.
(the syntax is % RST, Markdown will use it for a while).

Then one can reference any symbols/headings/anchors, as if they
were in the local file (but they will be prefixed with a module name
or markup document in link text).
It's possible to reference anything from anywhere (any direction
in `.nim`/`.md`/`.rst` files).

See `doc/docgen.md` for full description.

Working is based on `.idx` files, hence one needs to generate
all `.idx` beforehand. A dedicated option `--index:only` is introduced
(and a separate stage for `--index:only` is added to `kochdocs.nim`).

Performance note
----------------

Full run for `./koch docs` now takes 185% of the time before this PR.
(After: 315 s, before: 170 s on my PC).
All the time seems to be spent on `--index:only` run, which takes
almost as much (85%) of normal doc run -- it seems that most time
is spent on file parsing, turning off HTML generation phase has not
helped much.
(One could avoid it by specifying list of files that can be referenced
and pre-processing only them. But it can become error-prone and I assume
that these linke will be **everywhere** in the repository anyway,
especially considering nim-lang/RFCs#478.
So every `.nim`/`.md` file is processed for `.idx` first).

But that's all without significant part of repository converted to
cross-module auto links. To estimate impact I checked the time for
`doc`ing a few files (after all indexes have been generated), and
everywhere difference was **negligible**.
E.g. for `lib/std/private/osfiles.nim` that `importdoc`s large
`os.idx` and hence should have been a case with relatively large
performance impact, but:

* After: 0.59 s.
* Before: 0.59 s.

So Nim compiler works so slow that doc part basically does not matter :-)

Testing
-------

1) added `extlinks` test to `nimdoc/`
2) checked that `theindex.html` is still correct
2) fixed broken auto-links for modules that were derived from `os.nim`
   by adding appropriate ``importdoc``

Implementation note
-------------------

Parsing and formating of `.idx` entries is moved into a dedicated
`rstidx.nim` module from `rstgen.nim`.

`.idx` file format changed:

* fields are not escaped in most cases because we need original
  strings for referencing, not HTML ones
  (the exception is linkTitle for titles and headings).
  Escaping happens later -- on the stage of `rstgen` buildIndex, etc.
* all lines have fixed number of columns 6
* added discriminator tag as a first column,
  it always allows distinguish Nim/markup entries, titles/headings, etc.
  `rstgen` does not rely any more (in most cases) on ad-hoc logic
  to determine what type each entry is.
* there is now always a title entry added at the first line.
* add a line number as 6th column
* linkTitle (4th) column has a different format: before it was like
  `module: funcName()`, now it's `proc funcName()`.
  (This format is also propagated to `theindex.html` and search results,
  I kept it that way since I like it more though it's discussible.)
  This column is what used for Nim symbols resolution.
* also changed details on column format for headings and titles:
  "keyword" is original, "linkTitle" is HTML one

* fix paths on Windows + more clear code

* Update compiler/docgen.nim

Co-authored-by: Andreas Rumpf <[email protected]>

* Handle .md and .nim paths uniformly in findRefFile

* handle titles better + more comments

* don't allow markup overwrite index title for .nim files

Co-authored-by: Andreas Rumpf <[email protected]>
capocasa pushed a commit to capocasa/Nim that referenced this issue Mar 31, 2023
* Implement Pandoc Markdown concise link extension

This implements nim-lang#20127.
Besides reference to headings we also support doing references
to Nim symbols inside Nim modules.

Markdown:
```
Some heading
------------

Ref. [Some heading].
```

Nim:
```
proc someFunction*() ...

... ## Ref. [someFunction]
```

This is substitution for RST syntax like `` `target`_ ``.
All 3 syntax variants of extension from Pandoc Markdown are supported:
`[target]`, `[target][]`, `[description][target]`.

This PR also fixes clashes in existing files, particularly
conflicts with RST footnote feature, which does not work with
this PR (but there is a plan to adopt a popular [Markdown footnote
extension](https://pandoc.org/MANUAL.html#footnotes) to make footnotes work).

Also the PR fixes a bug that Markdown links did not work when `[...]`
section had a line break.

The implementation is straightforward since link resolution did not
change w.r.t. RST implementation, it's almost only about new syntax
addition. The only essential difference is a possibility to add a custom
link description: form `[description][target]` which does not have an
RST equivalent.

* fix nim 1.0 gotcha
capocasa pushed a commit to capocasa/Nim that referenced this issue Mar 31, 2023
* docgen: implement cross-document links

Fully implements nim-lang/RFCs#125
Follow-up of: nim-lang#18642 (for internal links)
and nim-lang#20127.

Overview
--------

Explicit import-like directive is required, called `.. importdoc::`.
(the syntax is % RST, Markdown will use it for a while).

Then one can reference any symbols/headings/anchors, as if they
were in the local file (but they will be prefixed with a module name
or markup document in link text).
It's possible to reference anything from anywhere (any direction
in `.nim`/`.md`/`.rst` files).

See `doc/docgen.md` for full description.

Working is based on `.idx` files, hence one needs to generate
all `.idx` beforehand. A dedicated option `--index:only` is introduced
(and a separate stage for `--index:only` is added to `kochdocs.nim`).

Performance note
----------------

Full run for `./koch docs` now takes 185% of the time before this PR.
(After: 315 s, before: 170 s on my PC).
All the time seems to be spent on `--index:only` run, which takes
almost as much (85%) of normal doc run -- it seems that most time
is spent on file parsing, turning off HTML generation phase has not
helped much.
(One could avoid it by specifying list of files that can be referenced
and pre-processing only them. But it can become error-prone and I assume
that these linke will be **everywhere** in the repository anyway,
especially considering nim-lang/RFCs#478.
So every `.nim`/`.md` file is processed for `.idx` first).

But that's all without significant part of repository converted to
cross-module auto links. To estimate impact I checked the time for
`doc`ing a few files (after all indexes have been generated), and
everywhere difference was **negligible**.
E.g. for `lib/std/private/osfiles.nim` that `importdoc`s large
`os.idx` and hence should have been a case with relatively large
performance impact, but:

* After: 0.59 s.
* Before: 0.59 s.

So Nim compiler works so slow that doc part basically does not matter :-)

Testing
-------

1) added `extlinks` test to `nimdoc/`
2) checked that `theindex.html` is still correct
2) fixed broken auto-links for modules that were derived from `os.nim`
   by adding appropriate ``importdoc``

Implementation note
-------------------

Parsing and formating of `.idx` entries is moved into a dedicated
`rstidx.nim` module from `rstgen.nim`.

`.idx` file format changed:

* fields are not escaped in most cases because we need original
  strings for referencing, not HTML ones
  (the exception is linkTitle for titles and headings).
  Escaping happens later -- on the stage of `rstgen` buildIndex, etc.
* all lines have fixed number of columns 6
* added discriminator tag as a first column,
  it always allows distinguish Nim/markup entries, titles/headings, etc.
  `rstgen` does not rely any more (in most cases) on ad-hoc logic
  to determine what type each entry is.
* there is now always a title entry added at the first line.
* add a line number as 6th column
* linkTitle (4th) column has a different format: before it was like
  `module: funcName()`, now it's `proc funcName()`.
  (This format is also propagated to `theindex.html` and search results,
  I kept it that way since I like it more though it's discussible.)
  This column is what used for Nim symbols resolution.
* also changed details on column format for headings and titles:
  "keyword" is original, "linkTitle" is HTML one

* fix paths on Windows + more clear code

* Update compiler/docgen.nim

Co-authored-by: Andreas Rumpf <[email protected]>

* Handle .md and .nim paths uniformly in findRefFile

* handle titles better + more comments

* don't allow markup overwrite index title for .nim files

Co-authored-by: Andreas Rumpf <[email protected]>
bung87 pushed a commit to bung87/Nim that referenced this issue Jul 29, 2023
* docgen: implement cross-document links

Fully implements nim-lang/RFCs#125
Follow-up of: nim-lang#18642 (for internal links)
and nim-lang#20127.

Overview
--------

Explicit import-like directive is required, called `.. importdoc::`.
(the syntax is % RST, Markdown will use it for a while).

Then one can reference any symbols/headings/anchors, as if they
were in the local file (but they will be prefixed with a module name
or markup document in link text).
It's possible to reference anything from anywhere (any direction
in `.nim`/`.md`/`.rst` files).

See `doc/docgen.md` for full description.

Working is based on `.idx` files, hence one needs to generate
all `.idx` beforehand. A dedicated option `--index:only` is introduced
(and a separate stage for `--index:only` is added to `kochdocs.nim`).

Performance note
----------------

Full run for `./koch docs` now takes 185% of the time before this PR.
(After: 315 s, before: 170 s on my PC).
All the time seems to be spent on `--index:only` run, which takes
almost as much (85%) of normal doc run -- it seems that most time
is spent on file parsing, turning off HTML generation phase has not
helped much.
(One could avoid it by specifying list of files that can be referenced
and pre-processing only them. But it can become error-prone and I assume
that these linke will be **everywhere** in the repository anyway,
especially considering nim-lang/RFCs#478.
So every `.nim`/`.md` file is processed for `.idx` first).

But that's all without significant part of repository converted to
cross-module auto links. To estimate impact I checked the time for
`doc`ing a few files (after all indexes have been generated), and
everywhere difference was **negligible**.
E.g. for `lib/std/private/osfiles.nim` that `importdoc`s large
`os.idx` and hence should have been a case with relatively large
performance impact, but:

* After: 0.59 s.
* Before: 0.59 s.

So Nim compiler works so slow that doc part basically does not matter :-)

Testing
-------

1) added `extlinks` test to `nimdoc/`
2) checked that `theindex.html` is still correct
2) fixed broken auto-links for modules that were derived from `os.nim`
   by adding appropriate ``importdoc``

Implementation note
-------------------

Parsing and formating of `.idx` entries is moved into a dedicated
`rstidx.nim` module from `rstgen.nim`.

`.idx` file format changed:

* fields are not escaped in most cases because we need original
  strings for referencing, not HTML ones
  (the exception is linkTitle for titles and headings).
  Escaping happens later -- on the stage of `rstgen` buildIndex, etc.
* all lines have fixed number of columns 6
* added discriminator tag as a first column,
  it always allows distinguish Nim/markup entries, titles/headings, etc.
  `rstgen` does not rely any more (in most cases) on ad-hoc logic
  to determine what type each entry is.
* there is now always a title entry added at the first line.
* add a line number as 6th column
* linkTitle (4th) column has a different format: before it was like
  `module: funcName()`, now it's `proc funcName()`.
  (This format is also propagated to `theindex.html` and search results,
  I kept it that way since I like it more though it's discussible.)
  This column is what used for Nim symbols resolution.
* also changed details on column format for headings and titles:
  "keyword" is original, "linkTitle" is HTML one

* fix paths on Windows + more clear code

* Update compiler/docgen.nim

Co-authored-by: Andreas Rumpf <[email protected]>

* Handle .md and .nim paths uniformly in findRefFile

* handle titles better + more comments

* don't allow markup overwrite index title for .nim files

Co-authored-by: Andreas Rumpf <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

1 participant