Test Timeout: test/cmd/observe.sh:22: executing 'oc observe services --once --all-namespaces' #12930
@stevekuznetsov would the
No, I don't think so -- that exits with a successful result code, and we want the test to fail if it times out.
since updating --exit-after to timeout with a non-zero exit code would break scripts / change the behavior of the flag unexpectedly, I can open a PR that adds a second flag --timeout-after which would exit with 1 after the specified duration. If a second timeout flag is non-ideal, we could introduce a new option --signal which would override the 0 exit code of --exit-after.

I just don't know that this is part of the observe use case. It runs forever, which is the whole point. The test is timing out, and we just lack the test infrastructure in test/cmd to properly detect test timeouts and clean up the child process.
@juanvallejo this is a particularly nasty test failure ... not sure I understand why the prio downgrade happened. @smarterclayton I suggested to @juanvallejo to just change this to os::cmd ... "timeout oc observe ..." -- I understand what you mean about the failure cause here, but we cannot have this run for five hours. Give it a ten-minute window and then give up.

There is literally no scenario under which it's reasonable for this to hang unless we have a very serious bug. Adding specific changes to each test is the wrong solution. A top-level timeout is the right solution, coupled with the fix. Understanding why this is hanging is p0.
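The wrapper suggested above could be sketched roughly like this (a sketch only: it assumes coreutils `timeout` is available on the test hosts, and the helper name `run_with_deadline` is hypothetical; the ten-minute window comes from the comment, the rest is illustrative):

```shell
# Hypothetical helper: run a command with a deadline and treat expiry as a
# hard failure. coreutils `timeout` exits with status 124 when it kills a
# command for exceeding its deadline, so the test fails instead of hanging.
run_with_deadline() {
    deadline="$1"; shift
    timeout "$deadline" "$@"
    status=$?
    if [ "$status" -eq 124 ]; then
        echo "command timed out after ${deadline}s" >&2
    fi
    return "$status"
}

# In test/cmd/observe.sh this would wrap the observe call, roughly:
#   run_with_deadline 600 oc observe services --once --all-namespaces
# Stand-in demonstration with a command that outlives its deadline:
run_with_deadline 1 sleep 5 || echo "deadline enforced"
```

Since `timeout` reserves 124 for expiry, a genuine non-zero failure from the wrapped command still propagates unchanged, so the test harness can distinguish "hung" from "failed".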
Right, but 6h-long merge queue execution intervals don't bode well for the merge queue leading up to feature complete. If you think we should live with the pain while we figure out the flake, sure. I'm not in agreement, but at least if it's painful it will be higher priority to look at. I also saw this in: https://ci.openshift.redhat.com/jenkins/job/test_job/3/console

Put a timeout on "make check"
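A top-level cap of the kind suggested above might look like the following (the cap and grace-period values are illustrative assumptions, not numbers agreed on in this thread; it again assumes coreutils `timeout`):

```shell
# Hypothetical top-level guard: cap the whole suite instead of each test.
# Real usage would be something like:
#   timeout --kill-after=30s 2h make check
# Demonstrated with a stand-in command so the sketch is self-contained.
cap_suite() {
    # $1 = cap duration, $2 = command string; --kill-after sends KILL if
    # the command ignores the initial TERM.
    timeout --kill-after=5s "$1" sh -c "$2"
    status=$?
    if [ "$status" -eq 124 ]; then
        echo "suite exceeded its ${1} cap" >&2
    fi
    return "$status"
}

cap_suite 2s 'sleep 10' || echo "cap enforced"
cap_suite 5s 'true' && echo "suite passed within cap"
```

This trades granularity for simplicity, which matches the objection raised later in the thread: a single outer cap stops the 6h hangs but, on its own, tells you nothing about which test hung.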
Then we get corrupted test output and no XML. Do we want that?
Timeout won't tell you what the actual problem is - if that's the point here, add something to panic child processes. Whatever issue this is isn't going to be found by granular timeouts.
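One way to "panic child processes" as suggested here, assuming the stuck child is a Go binary such as oc: the Go runtime responds to SIGQUIT by printing a full goroutine dump before exiting, so having the timeout deliver QUIT instead of the default TERM would leave a stack trace showing where the process was hanging. A sketch with coreutils `timeout` (the durations are illustrative):

```shell
# Sketch (assumptions: the child is a Go binary, coreutils `timeout` is
# available). Go's runtime dumps all goroutine stacks on SIGQUIT, so a
# timed-out child leaves evidence instead of vanishing silently.
# Real usage would look roughly like:
#   timeout --signal=QUIT --kill-after=10s 600s oc observe services --once --all-namespaces
# Stand-in demonstration showing the signal path on expiry:
timeout --signal=QUIT --kill-after=2s 1s sleep 10
status=$?
if [ "$status" -eq 124 ]; then
    echo "child was sent SIGQUIT after the deadline" >&2
fi
```

`--kill-after` still guarantees cleanup with SIGKILL if the dump itself wedges, and `timeout` reports 124 either way, so the harness sees the same failure signal as with a plain timeout.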
closing via #12980 |
Looks like this:
We should add a client timeout