Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Is SHARE_SPHINX_DOCTREE enabled? #6966

Closed
mgeier opened this issue Apr 26, 2020 · 6 comments · Fixed by #6991
Closed

Is SHARE_SPHINX_DOCTREE enabled? #6966

mgeier opened this issue Apr 26, 2020 · 6 comments · Fixed by #6991
Labels
Accepted Accepted issue on our roadmap Needed: documentation Documentation is required

Comments

@mgeier
Copy link
Contributor

mgeier commented Apr 26, 2020

This feature was added in #5407 and removed from the documentation in #5734.

It doesn't look like it's enabled by default, e.g. in https://readthedocs.org/projects/nbsphinx/builds/10912447/:

python /home/docs/checkouts/readthedocs.org/user_builds/nbsphinx/envs/latest/bin/sphinx-build -T -b readthedocs -d _build/doctrees-readthedocs -D language=en . _build/html
python /home/docs/checkouts/readthedocs.org/user_builds/nbsphinx/envs/latest/bin/sphinx-build -T -b readthedocssinglehtmllocalmedia -d _build/doctrees-readthedocssinglehtmllocalmedia -D language=en . _build/localmedia
python /home/docs/checkouts/readthedocs.org/user_builds/nbsphinx/envs/latest/bin/sphinx-build -b latex -D language=en -d _build/doctrees . _build/latex
python /home/docs/checkouts/readthedocs.org/user_builds/nbsphinx/envs/latest/bin/sphinx-build -T -b epub -d _build/doctrees-epub -D language=en . _build/epub

Isn't this just a horrible waste of resources?

Regarding this comment by @stsewd (#5734 (comment)):

This isn't a setting that will change the build result in any way, or at least it shouldn't. It will save resources, maybe, but that's not something the user will care about so much, that's more our concern. It's like having a cache for pip, that's useful for us, not something the user cares about so much.

I think this can and will affect users if they are waiting for their build to be finished. But probably that's not an important point.

However, using separate "doctree" directories can make a build go over the given resource limits totally unnecessarily. So you might end up losing users who think their docs cannot be built on RTD, even when it wouldn't be a problem when sharing the "doctree".

@humitos
Copy link
Member

humitos commented Apr 26, 2020

This feature was added in #5407 and removed from the documentation in #5734.

The feature flag is still there in the code. Although, it was never "officially" added to the documentation 😄 --only an attempt in that PR

Honestly, I don't really know why this was introduced the first time. I found it was at a5d9d45 5 years ago. How builds were built in that time was super different and I'd assume there was a concurrency problem and the easiest solution was to not share anything. Although, I haven't find anything that mentioned the reasons of this.

Another possible problematic was that builders had a cache for 24hs of the whole environment. So, maybe sharing and old doctree was a problem.

Now, our builders are single-process and they build only one project at a time from a clean environment. So, if it was a concurrency/caching problem, it's not there anymore; but we will need to take care again about this when adding a new caching layer we were thinking to add (but hasn't give us good results yet).

I think this can and will affect users if they are waiting for their build to be finished

I'd be happy to test this more, document the flag (because I think it's important for users if it affects considerable to reduce their build times) and eventually make it the default if we can trust on this setting. Would you like me to enable this feature flag in your projects and give us feedback about the improvements on time plus issues in case they appear?

Edit: this seems to be the PR where this was introduced the first time: #1540

@mgeier
Copy link
Contributor Author

mgeier commented Apr 27, 2020

Would you like me to enable this feature flag in your projects and give us feedback about the improvements on time plus issues in case they appear?

Yes, please!

These two would be the most interesting:

https://nbsphinx.readthedocs.io/
https://splines.readthedocs.io/

This one currently has memory problems due to conda, but I might switch away from conda, and then this also becomes an interesting candidate:

https://sfs-python.readthedocs.io/


I guess the large majority of projects on RTD uses only HTML builds, right?
So most users shouldn't be affected by this either way.

But a smaller group of users (including me) cares about PDF output, where this could potentially save some time and space. Same for EPUB (which I'm not that much interested in).

And an even smaller group of users use Sphinx extensions like nbsphinx, which spend a long time on the "input parsing" step (because Jupyter notebooks are executed at this point). This time-consuming step is completely unnecessarily repeated for each output format.

@humitos
Copy link
Member

humitos commented Apr 28, 2020

I enabled it the flag in all of them (3). Let me know about how it goes.

This one currently has memory problems due to conda, but I might switch away from conda, and then this also becomes an interesting candidate:

We migrated our infrastructure to bigger servers (a blog post is coming about this) and you should be able to build it conda without problems now.

<personal opinion>However, I usually recommend to use conda on Read the Docs only if it's strictly needed --like compiled packages that are not possible to install via pip</personal opinion>

I guess the large majority of projects on RTD uses only HTML builds, right?

I'd say you are right.

@mgeier
Copy link
Contributor Author

mgeier commented Apr 28, 2020

Thanks!

I just re-triggered the builds and looked at the build durations compared to the preceding build:

https://readthedocs.org/projects/splines/builds/10925647/

total: 395 -> 239 seconds
HTML: 169 -> 156s
LaTeX: 152 -> 2s
LaTeX -> PDF: 6 -> 6s

https://readthedocs.org/projects/nbsphinx/builds/10925639/

total: 321 -> 197 seconds
HTML: 31 -> 34s
htmlzip: 36 -> 20s
LaTeX: 59 -> 24s
LaTeX -> PDF: 4 -> 6s

And indeed the conda build works now, the previous successful build is quite a while ago:

https://readthedocs.org/projects/sfs-python/builds/10925651/

total: 680 -> 429 seconds
HTML: 119 -> 76s
htmlzip: 65 -> 3s
LaTeX: 92 -> 3s
LaTeX -> PDF: 5 -> 5s

Of course the timing values have a large variance, but in the projects that do some significant processing in the notebooks, the savings are obvious, aren't they?


I agree in not using conda if it isn't absolutely necessary. I had to switch conda (but this was years ago) to get a sufficiently recent Matplotlib version. In another case, I had to use it to get a more recent Pandoc. I think both are fine nowadays, so for me there is no reason to use conda anymore.

@humitos
Copy link
Member

humitos commented Apr 28, 2020

Happy to read those numbers and happy that conda now works for you! Please, keep us posted in case you find some inconsistency when updating your docs anything that you could find related to this.

I think the only missing things from this issue is to document this feature flag so users can discover it. In the future, we can make this the default if we feel comfortable enough :)

@humitos humitos added Accepted Accepted issue on our roadmap Needed: documentation Documentation is required labels Apr 28, 2020
@mgeier
Copy link
Contributor Author

mgeier commented Apr 28, 2020

Cool, thanks, I'll let you know when something comes up!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Accepted Accepted issue on our roadmap Needed: documentation Documentation is required
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants