CI timing out during puppeteer browser download #1843
Comments
Instead of restructuring the GH Action workflow, could setup-node's cache help reduce the time?
+1 for not restructuring the GH Action workflow yet (it is parallelized to reduce overall CI run time), but I'm not sure that adding cache will fix the issue which seems more like a timeout than just slowness. We should first investigate the root cause. These are some questions that came to mind. The following query and some internet searching provided some insight.
gh run list --workflow ci --status failure
Is this specific to node=16? The two examples linked in the issue description both failed on node=16.
No. 10427440548 failed on node=18, 10605318313 failed on node=20, and 10517503671 failed on node=22.
When did this start happening?
The first occurrence was 23 days ago with 10273500575.
Can we figure out why
I hadn't seen this in a while, but just noticed it again in https://github.com/nextstrain/auspice/actions/runs/11469100955/job/31915629485
Seems more reproducible today so I'll start investigating.
Trying to catch and dissect the elusive fish that is npm ci's stochastic timeout.¹ ¹ #1843
Here's the log:
While that was running, another
This looks similar to #1709 (comment) where puppeteer browser download was timing out on Heroku. It seems like we aren't actively using puppeteer: usage was removed in 2047696. I think properly getting rid of all references to it, including the dev dependency, will solve this issue. That shouldn't discourage us from bringing it back in the future though. We're downloading a deprecated version of the package so my wishful thinking is that newer versions won't come with this issue.
I misspoke above. We are using jest-puppeteer for tests such as: auspice/test/smoke-test/urls.test.js Line 32 in def2ffb
I'll look into either replacing that or upgrading puppeteer.
npm ci calls in CI to reduce stochastic failures
A typical CI run calls npm ci 18 times. While this mostly just works, npm ci stochastically takes a looong time (couple of examples from the past week where it hit the 6 hour time limit for an action), which means stochastic CI failures. Reducing the number of npm ci calls is an obvious way to make this less likely to affect the action.
Instead of running independent jobs for build, unit-test, smoke-test, type-check, bundlesize, check-lockfile we could collect them into a single job. Across a matrix of 4 node versions this would reduce the npm ci calls from 18 to 4.
This has the downside of not showing each step in the LHS sidebar's "Jobs summary"; we'd just see 4 jobs of "test (node version)" or similar. Using if: always() for steps should allow them to run even if a preceding step failed whilst still giving them the red check mark; continue-on-error: true for the job may do the same.
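For illustration, the single-job layout described above might look roughly like the following. This is a sketch under stated assumptions, not the repository's actual workflow: the node versions are the ones mentioned in the comments, and the npm script names, the lockfile check command, and the action versions are hypothetical placeholders.

```yaml
# Sketch only. Script names, the lockfile check, and action versions
# are assumptions for illustration, not the project's real workflow.
name: ci
on: [push, pull_request]

jobs:
  test:
    name: test (node=${{ matrix.node }})
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false        # a failure on one node version does not cancel the others
      matrix:
        node: [16, 18, 20, 22]
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: ${{ matrix.node }}
      - run: npm ci           # runs once per node version: 4 calls in total
      - name: build
        run: npm run build            # hypothetical script name
      - name: unit-test
        if: always()                  # run even if an earlier step failed
        run: npm test                 # hypothetical script name
      - name: smoke-test
        if: always()
        run: npm run smoke-test       # hypothetical script name
      - name: type-check
        if: always()
        run: npm run type-check       # hypothetical script name
      - name: bundlesize
        if: always()
        run: npm run bundlesize       # hypothetical script name
      - name: check-lockfile
        if: always()
        run: git diff --exit-code package-lock.json   # hypothetical check
```

One caveat with this sketch: because of if: always(), the later steps also run (and fail) when npm ci itself fails. Giving the install step an id and using a condition such as if: always() && steps.install.outcome == 'success' would be one way to avoid that noise.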