Execa helps users pass data to/from processes

We implement multiple features with the following basic idea:
Spawn the process
As soon as it is spawned, pipe some value to process.stdin, or read from process.stdout/process.stderr
The features that use this idea include: the input option, the inputFile option, and the stdio/stdin/stdout/stderr options when their value is a file URL, a file path, a web stream (not a Node.js stream) or an iterable.
All those features revolve around the same idea: make it easy for users to write/read any type of value to/from a process: raw strings, files, streams, iterables.
This simplifies working with processes a lot. The alternative is for users to pipe values manually, which has many pitfalls to avoid (such as proper error handling). Streams and processes are two topics many of us find confusing.
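For comparison, the manual alternative looks roughly like this. This is a minimal sketch using node:child_process only; the runWithInput helper is illustrative, not part of any library, and the error wiring is exactly the part users tend to get wrong:

```javascript
import {spawn} from 'node:child_process';

// Manually piping a string to a process: the user must wire up
// stdin, stdout, exit codes, and error handling themselves.
const runWithInput = (command, args, input) =>
	new Promise((resolve, reject) => {
		const child = spawn(command, args);
		let stdout = '';
		child.stdout.on('data', chunk => {
			stdout += chunk;
		});
		// Without these handlers, stream errors (e.g. EPIPE) can crash the parent.
		child.stdin.on('error', reject);
		child.on('error', reject);
		child.on('close', exitCode => {
			exitCode === 0 ? resolve(stdout) : reject(new Error(`Exited with ${exitCode}`));
		});
		// Note: the piping only starts after the process has already been spawned.
		child.stdin.end(input);
	});

const result = await runWithInput('cat', [], 'hello');
console.log(result); // 'hello'
```

With Execa, all of the above collapses into a single call with the input option.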
Why do we implement those features like we do?
I cannot think of another way to implement those features. This is due to child_process.spawn() being quite restrictive with the stdio option:
It only allows pipe/overlapped/null/undefined (which we use as described above) and inherit/ipc/ignore (which cannot be used for the above purpose).
It also allows passing a Node.js stream. One might think this would be a better way to implement the above features. However, Node.js requires this stream to have an underlying file descriptor, which is a major restriction in our case.
Side note: we can't even implement the inputFile option by calling child_process.spawn(..., { stdio: [createReadStream(filePath), ...] }) because createReadStream() does not have an underlying file descriptor until its open event has been emitted. We cannot await that open event because Execa must return the child process synchronously.
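That file-descriptor restriction can be observed directly. A minimal sketch (the temporary file path is illustrative):

```javascript
import {createReadStream, writeFileSync} from 'node:fs';
import {once} from 'node:events';
import {tmpdir} from 'node:os';
import {join} from 'node:path';

// Illustrative temporary file, so the stream has something to open
const filePath = join(tmpdir(), 'fd-demo.txt');
writeFileSync(filePath, 'hello');

const stream = createReadStream(filePath);
// Synchronously, right after creation, there is no file descriptor yet,
// so child_process.spawn() cannot accept this stream in `stdio`.
const fdBefore = stream.fd; // null

// The descriptor only exists after the asynchronous 'open' event,
// which Execa cannot wait for since it must return synchronously.
await once(stream, 'open');
const fdAfter = stream.fd;
console.log(fdBefore, typeof fdAfter); // null 'number'
stream.close();
```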
So, for the above features, I feel we are stuck with the strategy of: spawn the process, then pipe to/from it.
Problem
However, this strategy creates a race condition: even if we pipe as soon as the process is spawned, the process might already have run for many milliseconds.
This problem is often softened by the following:
Processes that read stdin should not assume the data is available on process start, but should wait for it instead.
The values written to stdout/stderr are buffered at the OS level.
Although processes might hit the stdout/stderr buffer's maximum size, they should then apply backpressure, i.e. wait for the buffer to drain before writing more.
Processes should not exit until their writes to stdout/stderr have been consumed.
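The first expectation above can be illustrated with a small sketch, assuming a POSIX environment where cat is available. cat is a well-behaved reader: it waits for stdin rather than assuming data is present at process start, so piping "late" still works:

```javascript
import {spawn} from 'node:child_process';
import {setTimeout as delay} from 'node:timers/promises';

const child = spawn('cat');
let stdout = '';
child.stdout.on('data', chunk => {
	stdout += chunk;
});

// Pipe "late", long after the process has started running.
// A well-behaved reader like `cat` simply blocks until stdin has data.
await delay(100);
child.stdin.end('late data');

await new Promise(resolve => {
	child.on('close', resolve);
});
console.log(stdout); // 'late data'
```

A process that instead read stdin once at startup and exited would miss the data, which is exactly the race condition described above.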
I.e. everything should work fine with well-behaved processes (or underlying runtimes). However, this is a potential gotcha that I felt was worth writing down in an issue, in case anyone is experiencing this problem or has any feedback.