Build time out #1767
For two weeks or so we have also been unable to build the Galaxy docs on readthedocs.org. Every build fails with a timeout (the limit is 900s), and the suspected offender is the command where pip installs our requirements.txt (it takes ~500s; see the 3rd command on this page https://readthedocs.org/projects/galaxy/builds/3407847/ ). Because this step takes so long, we do not have enough time to build our docs before the build times out. In the last successful build this step took 11 seconds instead of 500 because the cache was used (see https://readthedocs.org/projects/galaxy/builds/3358410/ ). How do we solve this? Did something change on the RTD side?
We've recently updated our build backend processes. Among other changes for security, builds now also time out after taking too long. Consider ways to shorten your build if possible -- pruning requirements and mocking unnecessary modules for documentation generation will speed up install and compilation times. In the near future, we'd like to allow projects that have donated, or that have a gold subscription, longer build times and more memory. We're also working on https://github.com/rtfd/sphinx-autoapi to get around the whole issue of Python needing to execute code just to obtain docstrings.
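The "mocking unnecessary modules" suggestion above is commonly done in a Sphinx `conf.py`. A minimal sketch, assuming the package names below are just illustrative stand-ins for a project's heavy dependencies (they are not taken from this thread):

```python
# conf.py fragment (sketch): stub out heavy dependencies so Sphinx's
# autodoc can import the package for docstrings without those packages
# being installed on the docs builder.
import sys
from unittest import mock

# Hypothetical list of heavy packages to stub out at doc-build time.
MOCK_MODULES = ["numpy", "scipy", "pandas"]
for mod_name in MOCK_MODULES:
    sys.modules[mod_name] = mock.MagicMock()
```

Newer Sphinx releases also provide the `autodoc_mock_imports` configuration option, which handles this without manual `sys.modules` patching.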
How about a more reasonable timeout? I think this is just going to push users/projects away from rtd.
Sure, a more reasonable timeout would be fine, with more reasonable compensation :) Really though, we're not necessarily against raising the limit here -- what would you consider to be a fair timeout limit? In the end, we're a free service with a minuscule budget. We need to be able to put restrictions on builds to maintain fair queueing. We can't continue to offer free, unbound, unlimited builds without some support -- our budget can't support it. We hope more users are willing to support us if they find they are heavy users.
@agjohnson Firstly, thank you for your response. Getting support for a free service feels heavenly. We much appreciate the services you guys are providing and we give you credit on many occasions, including our release notes. Thank you for the work you are doing. I do not consider the Galaxy Project to be a 'heavy user' - we do doc builds roughly every two weeks. It is just a 'heavy project' for your process because it has many dependencies. Moreover, if the cache were used, as in pre-change builds, the build would take ~500s less. We will explore possible solutions to this situation from our side and update this thread if we come up with something.
@martenson Agreed, I think build frequency would be a good thing to gauge here as well. A project that is built every couple of days is not the same as a project that sees frequent commits and long build times. Maybe that means keeping a low base timeout limit, but weighting infrequent builds with additional build time.
I tried to rebuild our docs with
link to build: https://readthedocs.org/projects/galaxy/builds/3422092/
@martenson Yeah, that looks like an OOM kill on the build VM, due to limits set up with Docker. It should catch that as a failure and report it; though we use metadata from Docker's API to determine an OOM kill, this might just be an uncaught kill. Currently, our build memory limit is 1G; it might be worth figuring out whether that is normal usage. We've done some complex builds that track large API reference sets and still haven't hit that sort of memory limit. I'll see if I can't dig up any more info on our end. Another byproduct of Docker containerization is that we aren't sharing the pip cache anymore -- restoring that might speed up the builds in your case. I'll open a ticket about localizing the pip cache for the containers.
We touched on this topic at our internal meeting and the outcome is that if the feature were available, we would be happy to pay to be able to build and host on readthedocs. But afaik that option is not possible yet, and with these restrictions on builds we are in a bad spot. :/
Sorry I haven't had time to come back to this. I think many projects do not need to build the docs on each commit. Once a day at a pre-specified time would be more than enough; not sure if it is possible to set something like that up, though.
I haven't had time to get back to this. I have some thoughts on making the build timeout a bit more fair for users, particularly users that have donated. Unfortunately, I haven't had time to work on this much in the last week. I'll see about just bumping up the limits for now to allow your projects to build in the short term.
Would it be appropriate to let projects that don't need them opt out of the single-page build (no one will ever read it) or the JSON build? I just moved a project to readthedocs -- wonderful! So thankful for this VOLUNTEER, FREE service -- and I'm running into the timeouts (music21.readthedocs.org), but only when it gets to the second and third builds (JSON and single-page). I'd gladly opt out of those if I could. Thanks! I'd also be fine with limits on total time per day (though, from experience with other projects, increasing this during the "setup" period can help ease frustration; I've probably built 25 times today trying to get the transition set up; after this, I'll probably need a build less than once a week).
Galaxy has had a very similar experience to @mscuthbert - the additional builds are not needed, and one build per week would probably be enough. To be frank, I personally do not understand the direction readthedocs is moving in. This is documentation, not a production bug fix that needs to be built and deployed asap. In the meantime we were forced to host our own documentation. A docs build from scratch takes us 4 minutes on an old machine.
it will always fail until readthedocs/readthedocs.org#1767 is resolved
we cannot use it until readthedocs/readthedocs.org#1767 is resolved
Just letting you know that due to these issues we had to leave readthedocs and start hosting our own docs... sadly.
Unfortunate, yes; however, enforcing build timeouts for projects like this has addressed a number of issues that required our constant attention: rampant builds, daily build queue congestion, and resource contention on the build servers. This greatly improves the service for 97% of users, though at the expense of 3% of users. If we had the funds and the time, we could take on the operational costs that unchecked builds incur -- we have neither, though. As I mentioned above, there's room for improvement on fairer build queue timing; however, this work currently has a lower priority than the work that keeps everything moving. Users can already support a project using gold subscriptions, and we plan to add a longer timeout for these blessed projects. This will likely happen this month or next. I think the most correct answer for the future is a longer timeout dependent on build queue depth, but that's more work than we can immediately muster. tl;dr - free services can't be fair both to all of their users and to those donating their time to keep the service running.
For the project mentioned here, I've increased the allowed build timeout. We've recently added per-project settings for some of the container-level settings. An additional builder has helped keep build queue congestion down slightly. Again, this will eventually be a gold subscription feature; if you find it useful, consider donating. We still need to add the ability for gold-subscription-blessed projects to alter the build settings, which is the next piece here.
Closing this here, as we have the ability to increase limits for specific projects now. |
I have also run into this problem. What is the recommended solution?
I'm also facing the same problem: "Command killed due to excessive memory consumption" when pip is installing dependencies. What can be done? @agjohnson, how can the limit be increased?
@perone, can you post more details, preferably in a separate issue? Which project is it for? How long does the build take to run locally? How much memory does it use locally? Even large projects do not generally take more than a few hundred MB of memory unless something is wrong.
@davidfischer Hi David, I'm pretty sure that the problem is the PyTorch pip installation. It has a large file size:
However, this shouldn't trigger the OOM killer, right? PS: PyTorch is a pretty common framework, so I wonder how other people have solved this.
It seems there are more people with the same problem. I also think that pip has some problems with big files; it's not the first time I've read something similar.
Also, @perone, do you need PyTorch to build your docs? Maybe it's a good reason to have a separate requirements.txt file for RTD (and also save a lot of bandwidth :) ).
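For reference, the split suggested here usually looks something like the fragment below. The file name and package list are hypothetical, not taken from this thread; the idea is that the RTD build installs only what Sphinx itself needs, while heavy runtime packages are left out and mocked in `conf.py`:

```
# docs/requirements.txt (sketch) -- minimal dependency set for the
# docs build only; heavy runtime packages (e.g. torch) are omitted
# and stubbed out at doc-build time instead.
sphinx
sphinx-rtd-theme
```

The project's admin settings on RTD can then point the pip install step at this file instead of the full requirements.txt.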
Thanks @humitos, I already did that (a separate requirements file for RTD), but the problem is that then I need to mock, and mocking creates another issue with documenting inherited classes, so it's another problem to solve =(
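One workaround sketch for the inherited-classes problem (not an official RTD recipe; the `torch` module names are just the example from this thread) is to make the stubbed attributes real classes, so that code like `class Net(torch.nn.Module)` can still be imported by autodoc:

```python
# Sketch: stub modules whose attributes are genuine classes, so that
# subclassing a mocked base (e.g. torch.nn.Module) still works.
import sys
import types


class _StubBase:
    """Placeholder returned for any missing attribute; safe to subclass."""


def make_stub_module(name):
    # PEP 562 module-level __getattr__ (Python 3.7+): any attribute
    # lookup that misses the module dict returns the subclassable stub.
    mod = types.ModuleType(name)
    mod.__getattr__ = lambda attr: _StubBase
    return mod


torch_stub = make_stub_module("torch")
nn_stub = make_stub_module("torch.nn")
torch_stub.nn = nn_stub  # wire the submodule attribute explicitly
sys.modules["torch"] = torch_stub
sys.modules["torch.nn"] = nn_stub
```

With this in `conf.py`, autodoc can import a class that inherits from `torch.nn.Module` without the real package installed; docstrings defined in your own code are preserved, though members inherited from the real base class obviously are not.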
I'm trying to build, but sphinx takes more than 900 seconds to finish and so the job is timing out.
https://readthedocs.org/projects/chebee7i-networkx/builds/3408967/
Is there a way to increase this limit?