diff --git a/docs/source/reference-io.rst b/docs/source/reference-io.rst
index e395f2c6fb..9435b3684d 100644
--- a/docs/source/reference-io.rst
+++ b/docs/source/reference-io.rst
@@ -652,9 +652,22 @@ Spawning subprocesses
 
 Trio provides support for spawning other programs as subprocesses,
 communicating with them via pipes, sending them signals, and waiting
-for them to exit. Currently this interface consists of the
-:class:`trio.Process` class, which is modelled after :class:`subprocess.Popen`
-in the standard library.
+for them to exit. The interface for doing so consists of two layers:
+
+* :func:`trio.run_process` runs a process from start to
+  finish and returns a :class:`~subprocess.CompletedProcess` object describing
+  its outputs and return value. This is what you should reach for if you
+  want to run a process to completion before continuing, while possibly
+  sending it some input or capturing its output. It is modelled after
+  the standard :func:`subprocess.run` with some additional features
+  and safer defaults.
+
+* :class:`trio.Process` starts a process in the background and optionally
+  provides Trio streams for interacting with it (sending input,
+  receiving output and errors). Using it requires a bit more code
+  than :func:`~trio.run_process`, but exposes additional capabilities:
+  back-and-forth communication, processing output as soon as it is generated,
+  and so forth. It is modelled after the standard :class:`subprocess.Popen`.
 
 
 .. _subprocess-options:
@@ -662,98 +675,75 @@ in the standard library.
 Options for starting subprocesses
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-The standard :mod:`subprocess` module supports a dizzying array
-of `options `__
-for controlling the environment in which a process starts and the
-mechanisms used for communicating with it. (If you find that list
-overwhelming, you're not alone; you might prefer to start with
-just the `frequently used ones
-`__.)
-
-Trio makes use of the :mod:`subprocess` module's logic for spawning
-processes, so almost all of these options can be used with their same
-semantics when starting subprocesses under Trio; pass them wherever you see
-``**options`` in the API documentation below. (You may need to
-``import subprocess`` in order to access constants such as ``PIPE`` or
-``DEVNULL``.) The exceptions are ``encoding``, ``errors``,
-``universal_newlines`` (and its 3.7+ alias ``text``), and ``bufsize``;
-Trio always uses unbuffered byte streams for communicating with a
-process, so these options don't make sense. Text I/O should use a
-layer on top of the raw byte streams, just as it does with sockets.
-[This layer does not yet exist, but is in the works.]
+All of Trio's subprocess APIs accept the numerous keyword arguments used
+by the standard :mod:`subprocess` module to control the environment in
+which a process starts and the mechanisms used for communicating with
+it. These may be passed wherever you see ``**options`` in the
+documentation below. See the `full list
+`__
+or just the `frequently used ones
+`__
+in the :mod:`subprocess` documentation. (You may need to ``import
+subprocess`` in order to access constants such as ``PIPE`` or
+``DEVNULL``.)
+
+Currently, Trio always uses unbuffered byte streams for communicating
+with a process, so it does not support the ``encoding``, ``errors``,
+``universal_newlines`` (alias ``text`` in 3.7+), and ``bufsize``
+options.
 
 
 Running a process and waiting for it to finish
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-We're `working on `__
-figuring out the best API for common higher-level subprocess operations.
-In the meantime, you can implement something like the standard library
-:func:`subprocess.run` in terms of :class:`trio.Process`
-as follows::
-
-    async def run(
-        command, *, input=None, capture_output=False, **options
-    ):
-        if input is not None:
-            options['stdin'] = subprocess.PIPE
-        if capture_output:
-            options['stdout'] = options['stderr'] = subprocess.PIPE
-
-        stdout_chunks = []
-        stderr_chunks = []
-
-        async with trio.Process(command, **options) as proc:
-
-            async def feed_input():
-                async with proc.stdin:
-                    if input:
-                        try:
-                            await proc.stdin.send_all(input)
-                        except trio.BrokenResourceError:
-                            pass
-
-            async def read_output(stream, chunks):
-                async with stream:
-                    while True:
-                        chunk = await stream.receive_some(32768)
-                        if not chunk:
-                            break
-                        chunks.append(chunk)
-
-            async with trio.open_nursery() as nursery:
-                if proc.stdin is not None:
-                    nursery.start_soon(feed_input)
-                if proc.stdout is not None:
-                    nursery.start_soon(read_output, proc.stdout, stdout_chunks)
-                if proc.stderr is not None:
-                    nursery.start_soon(read_output, proc.stderr, stderr_chunks)
-                await proc.wait()
-
-        stdout = b"".join(stdout_chunks) if proc.stdout is not None else None
-        stderr = b"".join(stderr_chunks) if proc.stderr is not None else None
-
-        if proc.returncode:
-            raise subprocess.CalledProcessError(
-                proc.returncode, proc.args, output=stdout, stderr=stderr
-            )
-        else:
-            return subprocess.CompletedProcess(
-                proc.args, proc.returncode, stdout, stderr
-            )
+The basic interface for running a subprocess start-to-finish is
+:func:`trio.run_process`. It always waits for the subprocess to exit
+before returning, so there's no need to worry about leaving a process
+running by mistake after you've gone on to do other things.
+:func:`~trio.run_process` is similar to the standard library
+:func:`subprocess.run` function, but tries to have safer defaults:
+with no options, the subprocess's input is empty rather than coming
+from the user's terminal, and a failure in the subprocess will be
+propagated as a :exc:`subprocess.CalledProcessError` exception. Of
+course, these defaults can be changed where necessary.
+
+.. autofunction:: trio.run_process
 
 
 Interacting with a process as it runs
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-You can spawn a subprocess by creating an instance of
+If you want more control than :func:`~trio.run_process` affords,
+you can spawn a subprocess by creating an instance of
 :class:`trio.Process` and then interact with it using its
 :attr:`~trio.Process.stdin`,
 :attr:`~trio.Process.stdout`, and/or
 :attr:`~trio.Process.stderr` streams.
 
 .. autoclass:: trio.Process
-   :members:
+
+   .. autoattribute:: returncode
+
+   .. automethod:: aclose
+
+   .. automethod:: wait
+
+   .. automethod:: poll
+
+   .. automethod:: kill
+
+   .. automethod:: terminate
+
+   .. automethod:: send_signal
+
+   .. note:: :meth:`~subprocess.Popen.communicate` is not provided as a
+      method on :class:`~trio.Process` objects; use :func:`~trio.run_process`
+      instead, or write the loop yourself if you have unusual
+      needs. :meth:`~subprocess.Popen.communicate` has quite unusual
+      cancellation behavior in the standard library (on some platforms it
+      spawns a background thread which continues to read from the child
+      process even after the timeout has expired) and we wanted to
+      provide an interface with fewer surprises.
 
 
 .. _subprocess-quoting:
@@ -850,52 +840,6 @@ Further reading:
 * https://stackoverflow.com/questions/4094699/how-does-the-windows-command-interpreter-cmd-exe-parse-scripts
 
 
-Differences from :class:`subprocess.Popen`
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-* All arguments to the constructor of
-  :class:`~trio.Process`, except the command to run, must be
-  passed using keywords.
-
-* :meth:`~subprocess.Popen.communicate` is not provided as a method on
-  :class:`~trio.Process` objects; use a higher-level
-  function instead, or write the loop yourself if
-  you have unusual needs. :meth:`~subprocess.Popen.communicate` has
-  quite unusual cancellation behavior in the standard library (on some
-  platforms it spawns a background thread which continues to read from
-  the child process even after the timeout has expired) and we wanted
-  to provide an interface with fewer surprises.
-
-* :meth:`~trio.Process.wait` is an async function that does
-  not take a ``timeout`` argument; combine it with
-  :func:`~trio.fail_after` if you want a timeout.
-
-* Text I/O is not supported: you may not use the
-  :class:`~trio.Process` constructor arguments
-  ``universal_newlines`` (or its 3.7+ alias ``text``), ``encoding``,
-  or ``errors``.
-
-* :attr:`~trio.Process.stdin` is a :class:`~trio.abc.SendStream` and
-  :attr:`~trio.Process.stdout` and :attr:`~trio.Process.stderr`
-  are :class:`~trio.abc.ReceiveStream`\s, rather than file objects. The
-  :class:`~trio.Process` constructor argument ``bufsize`` is
-  not supported since there would be no file object to pass it to.
-
-* :meth:`~trio.Process.aclose` (and thus also
-  ``__aexit__``) behave like the standard :class:`~subprocess.Popen`
-  context manager exit (close pipes to the process, then wait for it
-  to exit), but add additional behavior if cancelled: kill the process
-  and wait for it to finish terminating. This is useful for scoping
-  the lifetime of a simple subprocess that doesn't spawn any children
-  of its own. (For subprocesses that do in turn spawn their own
-  subprocesses, there is not currently any way to clean up the whole
-  tree; moreover, using the :class:`Process` context manager in such
-  cases is likely to be counterproductive as killing the top-level
-  subprocess leaves it no chance to do any cleanup of its children
-  that might be desired.
You'll probably want to write your own
-  supervision logic in that case.)
-
-
 Signals
 -------
 
diff --git a/newsfragments/822.feature.rst b/newsfragments/822.feature.rst
new file mode 100644
index 0000000000..1578a2273d
--- /dev/null
+++ b/newsfragments/822.feature.rst
@@ -0,0 +1,2 @@
+Add :func:`trio.run_process` as a high-level helper for running a process
+and waiting for it to finish, like the standard :func:`subprocess.run` does.
diff --git a/trio/__init__.py b/trio/__init__.py
index 652253e2fe..9c16214308 100644
--- a/trio/__init__.py
+++ b/trio/__init__.py
@@ -48,7 +48,7 @@
 
 from ._path import Path
 
-from ._subprocess import Process
+from ._subprocess import Process, run_process
 
 from ._ssl import SSLStream, SSLListener, NeedHandshakeError
 
diff --git a/trio/_subprocess.py b/trio/_subprocess.py
index 9cd069cfd3..958e1e4a72 100644
--- a/trio/_subprocess.py
+++ b/trio/_subprocess.py
@@ -9,11 +9,8 @@
     wait_child_exiting, create_pipe_to_child_stdin, create_pipe_from_child_output
 )
 
-
 import trio
 
-__all__ = ["Process"]
-
 
 class Process(AsyncResource):
     r"""Execute a child program in a new process.
@@ -177,14 +174,29 @@ def __init__(
         self.args = self._proc.args
         self.pid = self._proc.pid
 
+    def __repr__(self):
+        if self.returncode is None:
+            status = "running with PID {}".format(self.pid)
+        else:
+            if self.returncode < 0:
+                status = "exited with signal {}".format(-self.returncode)
+            else:
+                status = "exited with status {}".format(self.returncode)
+        return "<trio.Process {!r}: {}>".format(self.args, status)
+
     @property
     def returncode(self):
-        """The exit status of the process (an integer), or ``None`` if it has
-        not exited.
+        """The exit status of the process (an integer), or ``None`` if it is
+        not yet known to have exited.
 
-        Negative values indicate termination due to a signal (on UNIX only).
-        Like :attr:`subprocess.Popen.returncode`, this is not updated outside
-        of a call to :meth:`wait` or :meth:`poll`.
+        By convention, a return code of zero indicates success.
On
+        UNIX, negative values indicate termination due to a signal,
+        e.g., -11 if terminated by signal 11 (``SIGSEGV``). On
+        Windows, a process that exits due to a call to
+        :meth:`Process.terminate` will have an exit status of 1.
+
+        Accessing this attribute does not check for termination;
+        use :meth:`poll` or :meth:`wait` for that.
         """
         return self._proc.returncode
 
@@ -214,10 +226,7 @@ async def wait(self):
         """Block until the process exits.
 
         Returns:
-          The exit status of the process (a nonnegative integer, with
-          zero usually indicating success). On UNIX systems, a process
-          that exits due to a signal will have its exit status reported
-          as the negative of that signal number, e.g., -11 for ``SIGSEGV``.
+          The exit status of the process; see :attr:`returncode`.
         """
         if self.poll() is None:
             async with self._wait_lock:
@@ -229,17 +238,226 @@ async def wait(self):
         return self.returncode
 
     def poll(self):
-        """Forwards to :meth:`subprocess.Popen.poll`."""
+        """Check if the process has exited yet.
+
+        Returns:
+          The exit status of the process, or ``None`` if it is still
+          running; see :attr:`returncode`.
+        """
         return self._proc.poll()
 
     def send_signal(self, sig):
-        """Forwards to :meth:`subprocess.Popen.send_signal`."""
+        """Send signal ``sig`` to the process.
+
+        On UNIX, ``sig`` may be any signal defined in the
+        :mod:`signal` module, such as ``signal.SIGINT`` or
+        ``signal.SIGTERM``. On Windows, it may be anything accepted by
+        the standard library :meth:`subprocess.Popen.send_signal`.
+        """
         self._proc.send_signal(sig)
 
     def terminate(self):
-        """Forwards to :meth:`subprocess.Popen.terminate`."""
+        """Terminate the process, politely if possible.
+
+        On UNIX, this is equivalent to
+        ``send_signal(signal.SIGTERM)``; by convention this requests
+        graceful termination, but a misbehaving or buggy process might
+        ignore it. On Windows, :meth:`terminate` forcibly terminates the
+        process in the same manner as :meth:`kill`.
+        """
         self._proc.terminate()
 
     def kill(self):
-        """Forwards to :meth:`subprocess.Popen.kill`."""
+        """Immediately terminate the process.
+
+        On UNIX, this is equivalent to
+        ``send_signal(signal.SIGKILL)``. On Windows, it calls
+        ``TerminateProcess``. In both cases, the process cannot
+        prevent itself from being killed, but the termination will be
+        delivered asynchronously; use :meth:`wait` if you want to
+        ensure the process is actually dead before proceeding.
+        """
         self._proc.kill()
+
+
+async def run_process(
+    command,
+    *,
+    stdin=b"",
+    capture_stdout=False,
+    capture_stderr=False,
+    check=True,
+    **options
+):
+    """Run ``command`` in a subprocess, wait for it to complete, and
+    return a :class:`subprocess.CompletedProcess` instance describing
+    the results.
+
+    If cancelled, :func:`run_process` terminates the subprocess and
+    waits for it to exit before propagating the cancellation, like
+    :meth:`Process.aclose`.
+
+    **Input:** The subprocess's standard input stream is set up to
+    receive the bytes provided as ``stdin``. Once the given input has
+    been fully delivered, or if none is provided, the subprocess will
+    receive end-of-file when reading from its standard input.
+    Alternatively, if you want the subprocess to read its
+    standard input from the same place as the parent Trio process, you
+    can pass ``stdin=None``.
+
+    **Output:** By default, any output produced by the subprocess is
+    passed through to the standard output and error streams of the
+    parent Trio process. If you would like to capture this output and
+    do something with it, you can pass ``capture_stdout=True`` to
+    capture the subprocess's standard output, and/or
+    ``capture_stderr=True`` to capture its standard error. Captured
+    data is provided as the
+    :attr:`~subprocess.CompletedProcess.stdout` and/or
+    :attr:`~subprocess.CompletedProcess.stderr` attributes of the
+    returned :class:`~subprocess.CompletedProcess` object. The value
+    for any stream that was not captured will be ``None``.
+
+    If you want to capture both stdout and stderr while keeping them
+    separate, pass ``capture_stdout=True, capture_stderr=True``.
+
+    If you want to capture both stdout and stderr but mixed together
+    in the order they were printed, use: ``capture_stdout=True, stderr=subprocess.STDOUT``.
+    This directs the child's stderr into its stdout, so the combined
+    output will be available in the :attr:`~subprocess.CompletedProcess.stdout`
+    attribute.
+
+    **Error checking:** If the subprocess exits with a nonzero status
+    code, indicating failure, :func:`run_process` raises a
+    :exc:`subprocess.CalledProcessError` exception rather than
+    returning normally. The captured outputs are still available as
+    the :attr:`~subprocess.CalledProcessError.stdout` and
+    :attr:`~subprocess.CalledProcessError.stderr` attributes of that
+    exception. To disable this behavior, so that :func:`run_process`
+    returns normally even if the subprocess exits abnormally, pass
+    ``check=False``.
+
+    Args:
+      command (list or str): The command to run. Typically this is a
+          sequence of strings such as ``['ls', '-l', 'directory with spaces']``,
+          where the first element names the executable to invoke and the other
+          elements specify its arguments. With ``shell=True`` in the
+          ``**options``, or on Windows, ``command`` may alternatively
+          be a string, which will be parsed following platform-dependent
+          :ref:`quoting rules <subprocess-quoting>`.
+      stdin (:obj:`bytes`, file descriptor, or None): The bytes to provide to
+          the subprocess on its standard input stream, or ``None`` if the
+          subprocess's standard input should come from the same place as
+          the parent Trio process's standard input. As is the case with
+          the :mod:`subprocess` module, you can also pass a
+          file descriptor or an object with a ``fileno()`` method,
+          in which case the subprocess's standard input will come from
+          that file.
+      capture_stdout (bool): If true, capture the bytes that the subprocess
+          writes to its standard output stream and return them in the
+          :attr:`~subprocess.CompletedProcess.stdout` attribute
+          of the returned :class:`~subprocess.CompletedProcess` object.
+      capture_stderr (bool): If true, capture the bytes that the subprocess
+          writes to its standard error stream and return them in the
+          :attr:`~subprocess.CompletedProcess.stderr` attribute
+          of the returned :class:`~subprocess.CompletedProcess` object.
+      check (bool): If false, don't validate that the subprocess exits
+          successfully. You should be sure to check the
+          ``returncode`` attribute of the returned object if you pass
+          ``check=False``, so that errors don't pass silently.
+      **options: :func:`run_process` also accepts any :ref:`general subprocess
+          options <subprocess-options>` and passes them on to the
+          :class:`~trio.Process` constructor. This includes the
+          ``stdout`` and ``stderr`` options, which provide additional
+          redirection possibilities such as ``stderr=subprocess.STDOUT``,
+          ``stdout=subprocess.DEVNULL``, or file descriptors.
+
+    Returns:
+      A :class:`subprocess.CompletedProcess` instance describing the
+      return code and outputs.
+
+    Raises:
+      UnicodeError: if ``stdin`` is specified as a Unicode string, rather
+          than bytes
+      ValueError: if multiple redirections are specified for the same
+          stream, e.g., both ``capture_stdout=True`` and
+          ``stdout=subprocess.DEVNULL``
+      subprocess.CalledProcessError: if ``check=False`` is not passed
+          and the process exits with a nonzero exit status
+      OSError: if an error is encountered starting or communicating with
+          the process
+
+    .. note:: The child process runs in the same process group as the parent
+       Trio process, so a Ctrl+C will be delivered simultaneously to both
+       parent and child. If you don't want this behavior, consult your
+       platform's documentation for starting child processes in a different
+       process group.
+
+    """
+
+    if isinstance(stdin, str):
+        raise UnicodeError("process stdin must be bytes, not str")
+    if stdin == subprocess.PIPE:
+        raise ValueError(
+            "stdin=subprocess.PIPE doesn't make sense since the pipe "
+            "is internal to run_process(); pass the actual data you "
+            "want to send over that pipe instead"
+        )
+    if isinstance(stdin, (bytes, bytearray, memoryview)):
+        input = stdin
+        options["stdin"] = subprocess.PIPE
+    else:
+        # stdin should be something acceptable to Process
+        # (None, DEVNULL, a file descriptor, etc) and Process
+        # will raise if it's not
+        input = None
+        options["stdin"] = stdin
+
+    if capture_stdout:
+        if "stdout" in options:
+            raise ValueError("can't specify both stdout and capture_stdout")
+        options["stdout"] = subprocess.PIPE
+    if capture_stderr:
+        if "stderr" in options:
+            raise ValueError("can't specify both stderr and capture_stderr")
+        options["stderr"] = subprocess.PIPE
+
+    stdout_chunks = []
+    stderr_chunks = []
+
+    async with Process(command, **options) as proc:
+
+        async def feed_input():
+            async with proc.stdin:
+                try:
+                    await proc.stdin.send_all(input)
+                except trio.BrokenResourceError:
+                    pass
+
+        async def read_output(stream, chunks):
+            async with stream:
+                while True:
+                    chunk = await stream.receive_some(32768)
+                    if not chunk:
+                        break
+                    chunks.append(chunk)
+
+        async with trio.open_nursery() as nursery:
+            if proc.stdin is not None:
+                nursery.start_soon(feed_input)
+            if proc.stdout is not None:
+                nursery.start_soon(read_output, proc.stdout, stdout_chunks)
+            if proc.stderr is not None:
+                nursery.start_soon(read_output, proc.stderr, stderr_chunks)
+            await proc.wait()
+
+    stdout = b"".join(stdout_chunks) if proc.stdout is not None else None
+    stderr = b"".join(stderr_chunks) if proc.stderr is not None else None
+
+    if proc.returncode and check:
+        raise subprocess.CalledProcessError(
+            proc.returncode, proc.args, output=stdout, stderr=stderr
+        )
+    else:
+        return subprocess.CompletedProcess(
+            proc.args, proc.returncode, stdout,
stderr
+        )
diff --git a/trio/tests/test_subprocess.py b/trio/tests/test_subprocess.py
index 52b08359d3..4edc4c4e65 100644
--- a/trio/tests/test_subprocess.py
+++ b/trio/tests/test_subprocess.py
@@ -3,9 +3,11 @@
 import subprocess
 import sys
 import pytest
+import random
 
 from .. import (
-    _core, move_on_after, fail_after, sleep, sleep_forever, Process
+    _core, move_on_after, fail_after, sleep, sleep_forever, Process,
+    run_process
 )
 from .._core.tests.tutil import slow
 from ..testing import wait_all_tasks_blocked
@@ -38,9 +40,21 @@ def got_signal(proc, sig):
 
 
 async def test_basic():
+    repr_template = "<trio.Process {!r}: {{}}>".format(EXIT_TRUE)
     async with Process(EXIT_TRUE) as proc:
         assert proc.returncode is None
+        assert repr(proc) == repr_template.format(
+            "running with PID {}".format(proc.pid)
+        )
     assert proc.returncode == 0
+    assert repr(proc) == repr_template.format("exited with status 0")
+
+    async with Process(EXIT_FALSE) as proc:
+        pass
+    assert proc.returncode == 1
+    assert repr(proc) == "<trio.Process {!r}: {}>".format(
+        EXIT_FALSE, "exited with status 1"
+    )
 
 
 async def test_multi_wait():
@@ -71,6 +85,9 @@ async def test_kill_when_context_cancelled():
             await sleep_forever()
     assert scope.cancelled_caught
     assert got_signal(proc, SIGKILL)
+    assert repr(proc) == "<trio.Process {!r}: {}>".format(
+        SLEEP(10), "exited with signal 9" if posix else "exited with status 1"
+    )
 
 
 COPY_STDIN_TO_STDOUT_AND_BACKWARD_TO_STDERR = python(
@@ -183,6 +200,72 @@ async def drain_one(stream, count, digit):
     assert proc.returncode == 0
 
 
+async def test_run():
+    data = bytes(random.randint(0, 255) for _ in range(2**18))
+
+    result = await run_process(
+        CAT, stdin=data, capture_stdout=True, capture_stderr=True
+    )
+    assert result.args == CAT
+    assert result.returncode == 0
+    assert result.stdout == data
+    assert result.stderr == b""
+
+    result = await run_process(CAT, capture_stdout=True)
+    assert result.args == CAT
+    assert result.returncode == 0
+    assert result.stdout == b""
+    assert result.stderr is None
+
+    result = await run_process(
+        COPY_STDIN_TO_STDOUT_AND_BACKWARD_TO_STDERR,
+        stdin=data,
+        capture_stdout=True,
+        capture_stderr=True,
+    )
+    assert result.args == COPY_STDIN_TO_STDOUT_AND_BACKWARD_TO_STDERR
+    assert result.returncode == 0
+    assert result.stdout == data
+    assert result.stderr == data[::-1]
+
+    # invalid combinations
+    with pytest.raises(UnicodeError):
+        await run_process(CAT, stdin="oh no, it's text")
+    with pytest.raises(ValueError):
+        await run_process(CAT, stdin=subprocess.PIPE)
+    with pytest.raises(ValueError):
+        await run_process(CAT, capture_stdout=True, stdout=subprocess.DEVNULL)
+    with pytest.raises(ValueError):
+        await run_process(CAT, capture_stderr=True, stderr=None)
+
+
+async def test_run_check():
+    cmd = python("sys.stderr.buffer.write(b'test\\n'); sys.exit(1)")
+    with pytest.raises(subprocess.CalledProcessError) as excinfo:
+        await run_process(cmd, stdin=subprocess.DEVNULL, capture_stderr=True)
+    assert excinfo.value.cmd == cmd
+    assert excinfo.value.returncode == 1
+    assert excinfo.value.stderr == b"test\n"
+    assert excinfo.value.stdout is None
+
+    result = await run_process(
+        cmd, capture_stdout=True, capture_stderr=True, check=False
+    )
+    assert result.args == cmd
+    assert result.stdout == b""
+    assert result.stderr == b"test\n"
+    assert result.returncode == 1
+
+
+async def test_run_with_broken_pipe():
+    result = await run_process(
+        [sys.executable, "-c", "import sys; sys.stdin.close()"],
+        stdin=b"x" * 131072,
+    )
+    assert result.returncode == 0
+    assert result.stdout is result.stderr is None
+
+
 async def test_stderr_stdout():
     async with Process(
         COPY_STDIN_TO_STDOUT_AND_BACKWARD_TO_STDERR,
@@ -204,6 +287,17 @@ async def test_stderr_stdout():
         assert b"".join(output) == b"12344321"
     assert proc.returncode == 0
 
+    # equivalent test with run_process()
+    result = await run_process(
+        COPY_STDIN_TO_STDOUT_AND_BACKWARD_TO_STDERR,
+        stdin=b"1234",
+        capture_stdout=True,
+        stderr=subprocess.STDOUT,
+    )
+    assert result.returncode == 0
+    assert result.stdout ==
b"12344321"
+    assert result.stderr is None
+
     # this one hits the branch where stderr=STDOUT but stdout
     # is not redirected
     async with Process(