
Build time out #1767

Closed
chebee7i opened this issue Oct 18, 2015 · 24 comments
@chebee7i
Contributor

I'm trying to build, but Sphinx takes more than 900 seconds to finish, so the job is timing out.

https://readthedocs.org/projects/chebee7i-networkx/builds/3408967/

Is there a way to increase this limit?

@martenson

For two weeks or so we also have not been able to build the Galaxy docs on readthedocs.org. Every build fails with a timeout (the limit is 900s), and the suspected offender is the step where pip installs our requirements.txt, which takes ~500s (see the third command on this page: https://readthedocs.org/projects/galaxy/builds/3407847/). Because that step takes so long, there is not enough time left to build the docs before the build times out.

In the last successful build this step took 11 seconds instead of 500 because the cache was used (see https://readthedocs.org/projects/galaxy/builds/3358410/).

How can we solve this? Did something change on the RTD side?

@agjohnson
Contributor

We've recently updated our build backend processes. Among other security-related changes, builds now time out if they take too long. Consider ways to shorten your build if possible -- pruning requirements and mocking modules that aren't needed for documentation generation will speed up install and compilation times.

In the near future, we'd like to grant longer build times and more memory to projects that have donated or have a gold subscription. We're also working on https://github.com/rtfd/sphinx-autoapi to get around the whole issue of Python needing to execute code just to obtain docstrings.
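
For reference, the mocking suggested above is usually configured in Sphinx's conf.py; a minimal sketch, where the module names are placeholders rather than anything specific to the projects in this thread:

    # conf.py -- hedged sketch of mocking heavy imports so autodoc
    # can read docstrings without installing the real packages.
    extensions = ["sphinx.ext.autodoc"]

    # Modules listed here are replaced with mock objects at build time;
    # substitute your project's heavy dependencies.
    autodoc_mock_imports = ["numpy", "scipy", "pandas"]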

@agjohnson added the Support label on Oct 21, 2015
@chebee7i
Contributor Author

How about a more reasonable timeout? I think this is just going to push users/projects away from RTD.

@agjohnson
Contributor

Sure, a more reasonable timeout would be fine, with more reasonable compensation :)

Really though, we're not necessarily against raising the limit here -- what would you consider to be a fair timeout limit?

In the end, we're a free service with a minuscule budget. We need to be able to put restrictions on builds to maintain fair queueing; we can't continue to offer free, unlimited builds without some support -- our budget can't sustain it. We hope that users who find they are heavy users will be willing to support us.

@martenson

@agjohnson Firstly, thank you for your response. Getting support for a free service feels heavenly. We much appreciate the services you are providing, and we give you credit on many occasions, including in our release notes. Thank you for the work you are doing.

I do not consider the Galaxy Project a 'heavy user' -- we do doc builds roughly every two weeks. It is just a 'heavy project' for your process because it has many dependencies. Moreover, if the cache were used, as in the pre-change builds, the build would take ~500s less.

We will explore possible solutions to this situation on our side and update this thread if we come up with something.

@agjohnson
Contributor

@martenson Agreed, I think build frequency would be good to gauge here as well. A project that is built every couple of days is not the same as a project that sees frequent commits and long build times.

Maybe that means keeping a low base timeout limit, but weighting infrequent builds with additional build time.

@martenson

I tried to rebuild our docs with the "Give the virtual environment access to the global site-packages dir" option set to True and got a speedup in the dependency installation. The build still failed, but with a different error and before the 900s timeout (the build took 845 seconds). Is this still the same issue -- the build being killed from outside?

writing... Killed
Command time: 142s Return: 137

link to build: https://readthedocs.org/projects/galaxy/builds/3422092/

@agjohnson
Contributor

@martenson Yeah, that looks like an OOM kill on the build VM, due to limits set up with Docker. We should catch that as a failure and report it; we use metadata from Docker's API to detect an OOM kill, so this might just be an uncaught kill.

Currently, our build memory limit is 1GB; it might be worth figuring out whether that is normal usage for your build. We've done some complex builds that track large API reference sets and still haven't hit that sort of memory limit. I'll see if I can dig up any more info on our end.

Another byproduct of Docker containerization is that we aren't sharing the pip cache anymore; restoring a local cache might speed up the builds in your case. I'll open a ticket about localizing the pip cache for the containers.

@martenson

We touched on this topic at our internal meeting, and the outcome is that:

If the feature were available, we would be happy to pay to be able to build and host on readthedocs.

But as far as I know that option is not possible yet, and with these restrictions on builds we are in a bad spot. :/

@chebee7i
Contributor Author

Sorry I haven't had time to come back to this. I think many projects do not need to build the docs on each commit. Once a day at a pre-specified time would be more than enough...not sure if it is possible to set something like that up though.

@agjohnson
Contributor

I haven't had time to get back to this. I have some thoughts on making the build timeout a bit fairer for users beyond those that have donated. Unfortunately, I haven't had time to work on this much in the last week. I'll see about bumping up the limits for now so that your projects can build in the short term.

@mscuthbert

Would it be appropriate to allow projects that don't need it to opt out of the single-page build (no one will ever read it) or the JSON build? I just moved a project to readthedocs -- wonderful! So thankful for this VOLUNTEER, FREE service -- and I'm running into the timeouts (music21.readthedocs.org), but only when it gets to the second and third builds (JSON and single-page). I'd gladly opt out of those if I could. Thanks!

I'd also be fine with limits on total time per day (though, from experience with other projects, increasing the limit during the "setup" period can help ease frustration; I've probably built 25 times today trying to get the transition set up, and after this I'll probably need a build less than once a week).

@martenson

Galaxy has had a very similar experience to @mscuthbert's: the additional builds are not needed, and one build per week would probably be enough. To be frank, I personally do not understand the direction readthedocs is moving in. This is documentation, not a production fix that needs to be built and deployed ASAP.

In the meantime we have been forced to host our own documentation; a docs build from scratch takes us about 4 minutes on an old machine.

martenson added a commit to martenson/galaxy that referenced this issue Dec 3, 2015
@martenson

Just letting you know that, due to these issues, we had to leave readthedocs and start hosting our own docs... sadly.

https://docs.galaxyproject.org/en/master/index.html

@agjohnson
Contributor

Unfortunate, yes. However, enforcing build timeouts for projects like this has addressed a number of issues that required our constant attention: rampant builds, daily build queue congestion, and resource contention on the build servers. This greatly improves the service for 97% of users, though at the expense of the other 3%. If we had the funds and the time, we could take on the operational costs that unchecked builds incur -- but we have neither.

As I mentioned above, there's room for improvement on fairer build queue timing, but this work currently has a lower priority than the work that is keeping everything moving. Users can already support a project with a gold subscription, and we plan to add a longer timeout for those blessed projects; this will likely happen this month or next. I think the most correct answer for the future is a longer timeout dependent on build queue depth, but that's more work than we can immediately muster.

tl;dr -- a free service can't simultaneously be fair to all of its users and to those donating their time to keep the service running.

@agjohnson
Contributor

For the project mentioned here, I've increased the allowed build timeout. We've recently added per-project overrides for some of the container-level settings, and an additional builder has helped keep build queue congestion down slightly.

Again, this will eventually be a gold subscription feature; if you find it useful, consider donating. We still need to add the ability for gold-subscription-blessed projects to alter their build settings, which is the next piece here.

@agjohnson
Contributor

Closing this here, as we have the ability to increase limits for specific projects now.

@arsenovic

I have also run into this problem.

What is the recommended solution?

@perone

perone commented Apr 18, 2018

I'm also facing the same problem: "Command killed due to excessive memory consumption" when pip is installing dependencies. What can be done? @agjohnson, how can the limit be increased?

@davidfischer
Contributor

@perone, can you post more details, preferably in a separate issue? Which project is it for? How long does the build take to run locally? How much memory does it use locally?

Even large projects do not generally take more than a few hundred MB of memory unless something is wrong.
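
One way to answer the memory question is to measure a local build's peak usage; a quick sketch, assuming GNU time is available and docs/ is the Sphinx source directory:

    # Report peak memory of a local docs build; the -v flag of GNU time
    # prints "Maximum resident set size" in kilobytes.
    /usr/bin/time -v sphinx-build -b html docs docs/_build/html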

@perone

perone commented Apr 18, 2018

@davidfischer Hi David, I'm pretty sure the problem is the PyTorch pip installation. The wheel is very large:

torch-0.3.1-cp27-cp27mu-manylinux1_x86_64.whl (496.9MB)

However, this shouldn't trigger an OOM kill, right?

PS: PyTorch is a pretty common framework, so I wonder how other people have solved this.

@humitos
Member

humitos commented Apr 19, 2018

However, this shouldn't trigger an OOM kill, right?

It seems there are more people with the same problem:

pytorch/pytorch#1022

I also think that pip has some problems with big files. It's not the first time I have read something similar.
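
A workaround often suggested for out-of-memory errors while pip installs very large wheels is to skip pip's cache; a sketch, not an RTD-specific fix, using the package from this thread as the example:

    # Install without writing the ~500 MB wheel into pip's cache,
    # which reduces disk and memory pressure during the install.
    pip install --no-cache-dir torch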

@humitos
Member

humitos commented Apr 19, 2018

Also, @perone, do you need PyTorch to build your docs? Maybe this is a good reason to have a separate requirements.txt file for RTD.

(It would also save a lot of bandwidth :) )
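
A minimal sketch of such a split, assuming the file lives at docs/requirements.txt and that anything omitted is mocked in conf.py as sketched earlier; the RTD project settings (or a readthedocs.yml requirements_file entry) would then point at this file instead of the full requirements.txt:

    # docs/requirements.txt -- only what the documentation build itself needs;
    # heavy runtime dependencies such as torch are left out and mocked instead.
    sphinx
    sphinx_rtd_theme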

@perone

perone commented Apr 19, 2018

Thanks @humitos, I already did that (a separate requirements file for RTD), but the problem is that I then need to mock PyTorch, and mocking creates another issue for the documentation of inherited classes, so it's another problem to solve =(
