MINOR: Print out stdout output on non-zero exit code in ssh_capture() #234

stanislavkozlovski · 2020-06-26T08:13:42Z

This patch changes the ssh_capture command to additionally print out stdout output on non-zero exit codes. While it's expected for the command to print out its error in stderr, not all do - Kafka's tools (e.g kafka-topics) do not, for example.

ghost · 2020-06-26T08:13:44Z

@confluentinc It looks like @stanislavkozlovski just signed our Contributor License Agreement. 👍

Always at your service,

clabot

stanislavkozlovski · 2020-06-26T08:16:32Z

It's difficult to catch the stdout output in tests since the exception gets raised in the generator. Here's an example test I had:

command_output_lines = node.account.ssh_capture(cmd, allow_fail=False)
        command_output = ''.join([line for line in command_output_lines])
        self.logger.debug(command_output)

        return command_output

but it would throw an exception inside the generator, on the last iteration. The output would have been:

RemoteCommandError: ducker@ducker03: Command '/opt/kafka-dev/bin/command.sh --args --bootstrap-server ducker03:9092,ducker04:9092,ducker05:9092 --command-config /mnt/command/command.properties' returned non-zero exit status 1. Remote error message:

Making it hard to debug what the issue was.
After the changes, it's much clearer:

  File "/usr/local/lib/python2.7/dist-packages/ducktape-0.7.7-py2.7.egg/ducktape/cluster/remoteaccount.py", line 331, in output_generator
    raise RemoteCommandError(self, cmd, exit_status, full_output)
RemoteCommandError: ducker@ducker03: Command '/opt/kafka-dev/bin/command.sh --args --bootstrap-server ducker03:9092,ducker04:9092,ducker05:9092 --command-config /mnt/command/command.properties' returned non-zero exit status 1. Remote error message: stderr:
stdout: SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/kafka-dev/core/build/dependant-libs-2.13.2/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/kafka-dev/ce-aegis/build/dependant-libs/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/kafka-dev/ce-metrics/build/dependant-libs/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/kafka-dev/ce-sbk/build/dependant-libs-2.13.2/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/kafka-dev/tools/build/dependant-libs-2.13.2/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/kafka-dev/ce-broker-plugins/build/dependant-libs/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/kafka-dev/ce-auth-providers/build/dependant-libs/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/kafka-dev/ce-audit/build/dependant-libs/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/kafka-dev/connect/api/build/dependant-libs/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/kafka-dev/connect/transforms/build/dependant-libs/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/kafka-dev/connect/runtime/build/dependant-libs/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/kafka-dev/connect/file/build/dependant-libs/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/kafka-dev/connect/mirror/build/dependant-libs/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/kafka-dev/connect/mirror-client/build/dependant-libs/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/kafka-dev/connect/json/build/dependant-libs/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/kafka-dev/connect/basic-auth-extension/build/dependant-libs/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Initiating request...
Error while issuing request
Error while executing ____ org.apache.kafka.common.errors.Exception: blah blah.
[2020-06-26 08:07:17,530] ERROR java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.Exception: ...
	at org.apache.kafka.common.internals.KafkaFutureImpl.wrapAndThrow(KafkaFutureImpl.java:45)
	at org.apache.kafka.common.internals.KafkaFutureImpl.access$000(KafkaFutureImpl.java:32)
	at org.apache.kafka.common.internals.KafkaFutureImpl$SingleWaiter.await(KafkaFutureImpl.java:89)
	at org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:260)
....
Caused by: org.apache.kafka.common.errors.Exception: blah blah

steverod · 2020-06-29T16:21:26Z

ducktape/cluster/remoteaccount.py

                if callback is None:
                    yield line
                else:
                    yield callback(line)
            try:
                exit_status = stdout.channel.recv_exit_status()
                if exit_status != 0:
+                    full_output = "stderr: {}\nstdout: {}".format(stderr.read(), ''.join(stdout_lines))


My only nit would be if we want to have some sort of (both easily parseable and visually compelling) delineator between these two output streams. E.g.

===== stderr === ... ==== stdout === ...

ewencp

The problem with doing this universally though is that it can drastically drive up memory usage when ssh_capture is used for long processes. For example, verifiable_producer and consumer in Kafka actually stream the output of those clients and process them during the test. The tracked state may be much smaller than the raw text output, so tracking all of this could easily more than double the costs (which is why we put SSHOutputIter as the output type originally).

So this should probably be behind a flag (defaulting off at least for now since the new behavior isn't expected by existing uses) or an alternative method.

imcdo · 2022-07-11T22:56:48Z

hi @stanislavkozlovski I still feel this would be a great improvment for ducktape as it is one of the greater barriers to debugging right now. Ewen's point still stands. I think something we generally want is the ability to grab stderr from any failed command that fails (maybe by caching stderr into a file on the remote machine and then printing that file out if the command fails otherwise delete the file on completion could be a compromise here)

stan-is-hate · 2022-07-11T23:30:06Z

I would suggest getting together and brainstorming this - it's a good feature for sure, but also no rush on it either. As with any ssh communication changes, I would also want this to be a change on the master branch only (ie no backporting), to have older versions to roll back to

stanislavkozlovski · 2022-07-18T16:53:18Z

I am also happy to hand this over to anybody willing to take this home

stan-is-hate · 2022-07-18T22:31:34Z

@stanislavkozlovski yeah, our team should be owning this in all honesty. Thanks for pushing it forward! I hope either me or Ian can get some bandwidth to work on this in the near future.

cla-assistant · 2023-08-16T05:00:08Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
_{You have signed the CLA already but the status is still pending? Let us recheck it.}

MINOR: Print out stdout output on non-zero exit code in ssh_capture()

a706b84

stanislavkozlovski requested review from ewencp and a team June 26, 2020 08:17

scott-hendricks approved these changes Jun 29, 2020

View reviewed changes

steverod reviewed Jun 29, 2020

View reviewed changes

ewencp reviewed Jun 29, 2020

View reviewed changes

imcdo mentioned this pull request Nov 21, 2022

Show stdout and stderr with failure in remote account #369

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

MINOR: Print out stdout output on non-zero exit code in ssh_capture() #234

MINOR: Print out stdout output on non-zero exit code in ssh_capture() #234

stanislavkozlovski commented Jun 26, 2020

ghost commented Jun 26, 2020

stanislavkozlovski commented Jun 26, 2020 •

edited

Loading

steverod Jun 29, 2020

ewencp left a comment

imcdo commented Jul 11, 2022

stan-is-hate commented Jul 11, 2022

stanislavkozlovski commented Jul 18, 2022

stan-is-hate commented Jul 18, 2022

cla-assistant bot commented Aug 16, 2023

MINOR: Print out stdout output on non-zero exit code in ssh_capture() #234

Are you sure you want to change the base?

MINOR: Print out stdout output on non-zero exit code in ssh_capture() #234

Conversation

stanislavkozlovski commented Jun 26, 2020

ghost commented Jun 26, 2020

stanislavkozlovski commented Jun 26, 2020 • edited Loading

steverod Jun 29, 2020

Choose a reason for hiding this comment

ewencp left a comment

Choose a reason for hiding this comment

imcdo commented Jul 11, 2022

stan-is-hate commented Jul 11, 2022

stanislavkozlovski commented Jul 18, 2022

stan-is-hate commented Jul 18, 2022

cla-assistant bot commented Aug 16, 2023

stanislavkozlovski commented Jun 26, 2020 •

edited

Loading