-
Notifications
You must be signed in to change notification settings - Fork 1.2k
cross platform: Change default timeout to 120 secs #1413
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
CI / Debug build tests fail due to timeout. Array/array_slice.js test actually takes 14 seconds to complete on an empty macbook-pro with Quad 2.9 Ghz i7 cores on it. Previous 60 seconds timeout may not cover some tests on CI VM / Debug build
|
FYI @ianwjhalliday Does it make sense to change the default timeout, or would it be better to just selectively increase the timeout for the problematic tests? Since we see this for some VMs internally on Windows builds, if we're going to increase the default timeout, we could increase it across all OSs. But I'm a bit worried about extending the timeout and then missing tests that do become problematic. The real solution is to either make the tests shorter to accomplish the same thing, or if the longer runtime is important to the test, then increase the timeout selectively for those tests and run them in the slow/nightly configurations only. |
@dilijev While sharing the same CI / physical machine resources with many other projects, I can't see how test timings are important. (from 60 to 120 seconds) |
|
@obastemur It's an indication that these tests may be badly written or belong in slow tests. Unless there's a reason to have them run for a long time, unit tests should basically finish instantly. It's just an issue of codebase quality (which includes tests). |
In that case we shouldn't even have 60 seconds timeout.. The problem here is pretty straight-forward. As it is written on PR's description; That particular test takes 14 secs on a very decent machine with no additional task. CI can be pretty busy hence it may need more than 60.. IMO there is no practical difference (on a heavily shared CI) between 60 and 120. |
|
@obastemur You may be right about there being no practical difference between 60 and 120. There's definitely a real reason to have some timeout because some tests may not complete at all for one reason or another and block up a test machine, which is part of why we added the timeouts in the first place. For the set of tests we see failing regularly, it seems 120s is not enough on some VMs anyway, so I've submitted #1421 to increase the timeouts to 5 minutes to solve the immediate problem. Few enough tests run beyond even 10s that it seems really overkill to expand the default to a place where it might take twice as long to notice a test is starting to take a long time. We don't have this problem very often and we're seeing it now because of new test infra configurations. If we take the tests that fail regularly and fix up the timeouts, the problem gets resolved without the sledgehammer approach. |
|
In an ideal world where CI machines had enough power that they didn't starve tests of CPU time long enough that they timeout I would say we should not up the default timeout. But then in that ideal world I would have to admit that my principles here say that we should shrink the default timeout to no more than something small like 5 seconds. This is squishy. I feel on the one hand the 60 second timeout is nice to catch longer running tests that starve easily, and get our attention on them to decide whether they are worthwhile or have room for improvement (or should be relegated to the slower tests bucket). Extending it to 120 or higher would still catch infinite loops, but it would allow slower tests to creep in unnoticed. I think I prefer leaving the default as is and dealing with individual offending tests on a per case basis. If this continues to be a problem only on CI then we could up the default timeout, or maybe we need to push for better CI resources if there is a large disparity between them and our dev machines. Alternatively, as Doug mentioned elsewhere, we could introduce a warning threshold and then increase the timeout significantly. In which case I would suggest finding a way to have a 5 second warning threshold without it becoming desensitizing noise in CI. |
|
@dilijev @ianwjhalliday the whole claim sounds like comparing apples to oranges. We don't have JIT etc. enabled on xplat and there is no Debug vs Release test timeout approach on both RL and runtests.py. This PR was targeting There is a need for Release/Debug/JIT/no JIT timeout approach in case (which I don't agree) we want to keep 60 secs around. IMO, timeout we have serves one purpose; kill a test/process somehow failed to succeed in a given max time. We share the same CI with many other projects and it is hard (if it is possible) to measure anything making sense from the current CI. It just helps us to see if we break anything with a particular PR. In case a PR affects the performance, this CI is definitely not the place to catch that. Reminder; closed this PR already. Since we have another PR merged to master no need for merging this one IMHO. |
|
That's fair. Note that we do run the test suite in We did an exercise almost a couple years ago now where we looked at the time each test took on a sample dev machine and examined all the tests that took longer than 5 seconds, and either moved them to the Slow bucket or modified them to not be slow. Perhaps we should do this exercise again, on a lower end machine. We should have the data from the CI. I'll work with @dilijev to go through it and see what we find. |
CI / Debug build tests fail due to timeout.
Array/array_slice.js test actually takes 14 seconds to complete
on an empty macbook-pro with Quad 2.9 Ghz i7 cores on it.
Previous 60 seconds timeout may not cover some tests on CI VM / Debug build.
Fixes #1406