Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Doc hard, and with a vengeance #11943

Merged
merged 10 commits into from
Aug 2, 2015
Merged

Doc hard, and with a vengeance #11943

merged 10 commits into from
Aug 2, 2015

Conversation

MikeInnes
Copy link
Member

Alright, so for some reason I've come back for round two. This time I hope that what's going on is a lot clearer, and it's more automated so that it'll be easier to wait for input from whomever wants to provide it.

Remaining issues:

  • Quote escaping
  • Module clashes – the helpdb.jl production treats eps as separate from Dates.eps, but they're actually the same function and so one clobbers the other.
  • Multiple methods – these are all joined together for now. Technically we could fix this in a hacky way, but it would be fiddly – I think it's best to wait for a real solution with the new manual.
  • Shared docstrings, e.g. and with in.

I think there are few enough of these cases (aside from the methods thing) that we can fix them by hand later.

From a quick check, a relatively small number (222) of the docstrings use weird :...: syntaxes which means the rest (1036) should be easy to convert to markdown automatically.

@shashi
Copy link
Contributor

shashi commented Jun 30, 2015

This looks lossless so far! 👍

How do you want to handle RST specific features? Interpolating Julia code, maybe? e.g. :func:f could be $(func_link(:function_name)). And similar functions for other things?

@jiahao
Copy link
Member

jiahao commented Jun 30, 2015

Thank you for trying again. I'll give this more careful scrutiny tomorrow, but can you explain what the idea is in this current iteration? For example, are the lines following .. function:: test(pkgs...) meant to be the new docstring for Pkg.test()? 66a5b0f#diff-bf56c1110568368fd8d56907a8df71d9R220

.. function:: test(pkgs...)

   ``test()``

   Run the tests for all installed packages ensuring that each package's test dependencies are installed for the duration of the test. A package is tested by running its ``test/runtests.jl`` file and test dependencies are specified in ``test/REQUIRE``.

   ``test(pkgs...)``

    Run the tests for each package in ``pkgs`` ensuring that each package's test dependencies are installed for the duration of the test. A package is tested by running its ``test/runtests.jl`` file and test dependencies are specified in ``test/REQUIRE``.

@jiahao
Copy link
Member

jiahao commented Jun 30, 2015

A question for later design, maybe: I was discussing further with @jakebolewski the idea we floated offline about a second pass to detect cross-referrable entities within the docs. @jakebolewski reminded me that the help database currently isn't reified until the first help query is run. Does this mean that cross-references can only be generated after help is called once? Or will there need to be a second pass to generate rendered docstrings with cross-references statically? If so, how will runtime extensibility from loading new modules work?

@StefanKarpinski
Copy link
Member

@one-more-minute, I'm glad you're taking another crack at this even after I got you into hot water the last time around.

@MikeInnes
Copy link
Member Author

@StefanKarpinski No problem! Looking forward to this being done and dusted though.

@jiahao So to try and make it clear what's going on here: In the previous system, the *.rst files were the primary source for docstrings, and they were loaded into Julia as a second step. This PR flips that around – docs/helpdb.jl is now the primary source for docstrings like Pkg.test(), and those docs are spliced into the manual as a second step.

That means that once this is in, the old help system will be entirely redundant – its mechanics aren't really relevant to us here.

Things like cross references are worth discussing – I think we can make them a lot nicer than they were before, and it can all be done in a dynamic way. However, since this flip-around step is (hopefully) the most painful part of the process, I think it makes sense to make it as simple as possible, focus on moving without information loss, and build on it from there – deleting the help system, translating to markdown, implementing cross-references etc. etc., will be much easier if we can build them on top of the reorganised system in future PRs.

It's a little unfortunate that the docs have to get worse before they get better, and obviously I'd love to do this stuff all in one go, but I just don't see a route that involves a reasonable level of effort. I think the very short-term pain will be well worth it long run.

(Although arguably, this actually improves the REPL experience, since we can view more than one method of a function now.)

@jiahao
Copy link
Member

jiahao commented Jun 30, 2015

Thanks Mike, that explanation was very helpful.

I've examined the troublesome cases from the previous incarnation of this PR and they all look much better than before. Separating the issues of migrating from manual to docstring, from ReST conversion, is much clearer and avoids all the formatting weirdness like changing backquoting to double quotes.

There is an issue with docstrings shared across multiple functions, e.g. != and . If I understand correctly, the new helpdb.jl only associates the existing docstring to Base.(:(!=)) but not Base.(:(≠)). The easiest thing to do might be to duplicate the existing docstrings for the handful of such cases, but maybe it's worth 5 minutes of thought on how to handle shared docstrings.

There's also a minor formatting problem in the docstring itself, where only the first method is double-backquoted but other methods are not. That can be fixed easily though.

@MikeInnes
Copy link
Member Author

Ok great, thanks for the feedback. Would it be sufficient to indent all signatures, i.e. changing that to

```rst
           !=(x, y)
           ≠(x,y)

... blah ...

for them all to be properly formatted as code?

We can handle shared docstrings reasonably nicely already by doing:

@doc (@doc foo) bar

It might be useful, and would be reasonably easy to implement, to have proper aliases for metadata. It might make sense to handle these cases as part of this translation – I'll play around with it.

@ScottPJones
Copy link
Contributor

Since I think a lot of cases of Unicode operators have (or should have) an ASCII equalivalent operator or function name, it does seem like it would be good to make that sharing handled.
👍 for idea of having proper aliases for the metadata.

@jiahao
Copy link
Member

jiahao commented Jun 30, 2015

Indentation would suffice in ReST when prefaced with a line containing only "::"

@MikeInnes
Copy link
Member Author

Actually, unicode aliases like are just pointers to the original function anyway, as opposed to functions in their own right. So we don't need to do anything more to handle that properly.

@ScottPJones
Copy link
Contributor

I'm pretty sure I've seen cases where there was just a Unicode character, and no ASCII alias

@jiahao
Copy link
Member

jiahao commented Jun 30, 2015

The most complicated example of shared docstrings is in, , , : 66a5b0f#diff-badd7d54b8cfdba53f7e1064089334adR160

Not all of these are exact synonyms. Can this case be handled? (If not, one serious alternative is to document only in, and re-document the rest.)

@IainNZ IainNZ added the docsystem The documentation building system label Jun 30, 2015
@MikeInnes MikeInnes force-pushed the omm/doc-transition-2 branch 4 times, most recently from e5798c3 to c742260 Compare July 7, 2015 22:20
::
eye(A)

Constructs an identity matrix of the same dimensions and type as ``A``.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

eye(A) is repeated 3 times.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is the methods issue mentioned in the PR description – three methods are listed in the manual, the methods are combined for the whole function, each method in the manual gets the docs for the whole function.

In cases like this we can probably just delete the second two method listings.

@MikeInnes
Copy link
Member Author

@jiahao Absolutely, we can handle those cases easily enough. Although if they're not actually synonyms it probably isn't a bad idea to give them separate docstrings with reference to each other.

@MikeInnes
Copy link
Member Author

How does this look now? I've edited the head of the PR to look at the issues that aren't addressed in the code itself, but let me know if I've missed any feedback.

@@ -788,13 +907,11 @@ Mathematical Functions

.. function:: trunc([T,] x, [digits, [base]])

``trunc(x)`` returns the nearest integral value of the same type as ``x`` whose absolute
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

two methods for trunc getting clobbered by the trunc method in Dates ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that looks like it's the same issue as with eps.

@JeffBezanson JeffBezanson added this to the 0.4.0 milestone Jul 15, 2015
@JeffBezanson
Copy link
Member

What's the status of this? Anything I or anyone else can do to help? Let's get an RC out!

@MikeInnes
Copy link
Member Author

As long as there are no more major info losses that we can deal with systematically, I think this should basically be good to go (modulo a rebase). Once it's in I can work on what can be converted automatically, and we can crowdsource moving things around and fixing the remaining issues.

I didn't get round to doing @MichaelHatherly's idea of splitting up the huge helpdb.jl file, can have a go at that if it's wanted.

@JeffBezanson
Copy link
Member

+1 for getting this merged and crowdsourcing the needed doc touch-ups.

@johnmyleswhite
Copy link
Member

+1 to that

@@ -200,6 +204,7 @@ Macro reference
would generate::

if i_1 > size(A, 1)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Odd. Bug maybe?

@StefanKarpinski
Copy link
Member

Ok, we're doing this and going to deal with various issues that need to be fixed as they come up.

StefanKarpinski added a commit that referenced this pull request Aug 2, 2015
@StefanKarpinski StefanKarpinski merged commit bc17cec into master Aug 2, 2015
@StefanKarpinski StefanKarpinski deleted the omm/doc-transition-2 branch August 2, 2015 18:38
@aviks
Copy link
Member

aviks commented Aug 2, 2015

Woo hoo! Thanks @one-more-minute, @MichaelHatherly et al...

@tkelman
Copy link
Contributor

tkelman commented Aug 2, 2015

Yeah, probably the right thing to do. Would still be nice to get answers to the honest questions I had above. Hard to fix known issues if no one else knows how to go about fixing them. And we'll have to be very careful about open PR's that touch parts of the docs that are now autogenerated from docstrings.

@StefanKarpinski
Copy link
Member

I agree, but we needed to move on this and get a 0.4 RC out the door.

@tkelman
Copy link
Contributor

tkelman commented Aug 2, 2015

Sure, my 2 small "before merging" bullet items were taken care of. Now we all need to clean this up and let everyone who has an open PR or will file any future PRs touching docs know what they need to do. I still don't get where or what needs to be done to restore the data that's currently in helpdb but not in the rst docs, do you?

As far as @ViralBShah's idea above goes, I was actually thinking the other way around. This PR created the docstrings already so now we should hopefully be free to move them around however we want, but conversion from rst to markdown and moving away from sphinx seems way premature and not something we should try to do while we're worrying about release candidates.

@tkelman
Copy link
Contributor

tkelman commented Aug 2, 2015

All open PR's that touch docs (likely an overestimate):

#12432
#12429
#12423
#12408
#12385
#12383
#12353
#12350
#12333
#12325
#12146
#12083
#12042
#12033
#12025
#11998
#11977
#11842
#11754
#11631
#11242
#11077
#11012
#10940
#10810
#10800
#10799
#10776
#10760
#10612
#10517
#10369
#10277
#10108
#9798
#9596
#9340
#8775
#8685
#8668
#8641
#8636
#8258
#7327
#7186
#6842
#6669
#6455

dumb code I used to get that list, feel free to make fancier

for i in `seq 7`; do
  curl "https://api.github.com/repos/JuliaLang/julia/pulls?state=open&page=$i" | jq "map(.number)" | tr -d '[],' >> openprs.txt;
done
for i in `cat openprs.txt`; do
  if [ -n "$(curl -sL https://github.com/JuliaLang/julia/pull/$i.patch | grep '/doc/')" ]; then
    echo "#$i";
  fi;
done

@KristofferC
Copy link
Member

So what's the next step now? I read the comments but I'm still not sure. Is it to transfer the stuff from helpdb.jl to the source files?

@ScottPJones
Copy link
Contributor

👍 💯 for doing this! Now to iron out any glitches before 0.4.0 release.
I just got a build with new master, and found that the ? is not giving any information for some functions that it used to (I was hoping that my document strings in Base would be picked up now).

help?> convert(UTF32String, Char['a','b','c'])

help?> convert(UTF8String, Char['a','b','c'])

help?> convert(UTF8String, UInt8[64,65,66])

help?> convert(Int, 3.0)

Old:

help?> convert(Int, 3.0)
convert(::Type{Int64}, x::Float64) at int.jl:205

julia> convert(UTF8String, UInt8[64, 65, 66])
"@AB"

help?> convert(UTF8String, UInt8[64, 65, 66])
convert(::Type{UTF8String}, dat::Array{UInt8,1}) at unicode/utf8.jl:231

@tkelman
Copy link
Contributor

tkelman commented Aug 2, 2015

That looks like a regression that is probably worth a separate issue.

@amitmurthy
Copy link
Contributor

Echoing @KristofferC , what is the next step for now? If changes are not expected to be made to doc/stdlib/*.rst and the source files do not yet carry documentation, do we just hold off on stdlib doc changes for now?

@MikeInnes
Copy link
Member Author

Docstring changes are fine, but they should be made to helpdb.jl as opposed to *.rst.

The next step is to convert stuff to markdown, #12435, and once that's in we can all start to move the docstrings to appropriate places.

@tkelman
Copy link
Contributor

tkelman commented Aug 3, 2015

Don't know if you missed this or are ignoring me, but again, would be great to get answers for the following

moving the docstrings for the handful of small clashes / lost-rst-data to the methods in question will automatically populate type info from the method signature, then even staying in rst that data could be brought back? Is that type-information population from methods currently working when the docstrings are in the right place, or does the whole system for doing that need to be written?

@MikeInnes
Copy link
Member Author

Sure, have just had a lot on so things might've slipped by. Just moving the docstrings above the correct methods will give them the correct type info, yes – all that stuff's already in place. Then we just need a syntax for type info in the RST docs and for genstdlib.jl to recognise that syntax, e.g. we'd change

.. function:: length(a)

to

.. function:: length(a::AbstractArray)

so that the specific method doc is spliced in instead of the whole function.

@tkelman
Copy link
Contributor

tkelman commented Aug 3, 2015

Awesome. I'll experiment with moving around the functions getting clobbered by Dates module docs, and utf32 which currently aren't making it into RST correctly and see if it helps bring those back. (Though maybe you'll want to get #12435 done first, so I'll do it in a PR.)

Some of the RST docs have already been using declaration-like type syntax, but not all. genstdlib.jl tweaks will probably need your help.

@MikeInnes
Copy link
Member Author

Ok, great – it also occurs to me that I can actually just run the #12435 conversion over all of Base if necessary, so it's not really a big deal if we start moving docstrings before then.

@tkelman
Copy link
Contributor

tkelman commented Aug 4, 2015

Well, tried https://gist.github.com/b5c7657d53d5c37b4e85 but so far that doesn't appear to be helping much. And at the repl this might just be #12437?

help?> eps
search: eps RepString @elapsed indexpids expanduser escape_string peakflops unescape_string

  Returns Millisecond(1) for DateTime values and Day(1) for Date values.

help?> Base.eps
  Returns Millisecond(1) for DateTime values and Day(1) for Date values.

help?> eps(Date(1))

help?> eps()

help?> eps(1.0)
  Returns Millisecond(1) for DateTime values and Day(1) for Date values.

I think the top priority needs to be getting the repl help usable and fixing the manual back up, more than worrying about what format it's in.

stevengj added a commit to JuliaPy/PyCall.jl that referenced this pull request Aug 4, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
docsystem The documentation building system
Projects
None yet
Development

Successfully merging this pull request may close these issues.