
stdout/stderr buffering considerations #6379

Closed
ChALkeR opened this issue Apr 25, 2016 · 62 comments
Labels
libuv Issues and PRs related to the libuv dependency or the uv binding. net Issues and PRs related to the net subsystem. process Issues and PRs related to the process subsystem. stdio Issues and PRs related to stdio.

Comments

@ChALkeR
Member

ChALkeR commented Apr 25, 2016

I tried to discuss this some time ago on IRC, but postponed it for quite a long time. I also started discussing this in #1741, but I would like to extract the more specific discussion into a separate issue.

I may be missing some details, but I will try to give a quick overview here.

Several issues here:

  1. Many calls to console.log (e.g. calling it in a loop) could chew up all the memory and die — Why does node/io.js hang when printing to the console inside a loop? #1741, Silly program or memory leak? #2970, Strange memory leaks #3171, memory leaks of console.log #18013.
  2. console.log has different behavior while printing to a terminal and being redirected to a file. — Why does node/io.js hang when printing to the console inside a loop? #1741 (comment).
  3. Output is sometimes truncated — A libuv upgrade in Node v6 breaks the popular david package #6297, there were other ones as far as I remember.
  4. The behaviour seems to differ across platforms.

As I understand it, the output has an implicit write buffer of unlimited size (because writes are non-blocking).

One approach to fixing this would be to:

  1. Introduce an explicit cyclic write buffer.
  2. Make writes to that cyclic buffer blocking.
  3. Make writes from the buffer to the actual output non blocking.
  4. When the cyclic buffer reaches its maximum size (e.g. 10 MiB), block further writes to the buffer until a corresponding part of it is freed.
  5. On (normal) exit, make sure the buffer is flushed.

For almost all cases, except for the ones that are currently broken, this would behave as a non-blocking buffer (because writes to the buffer are considerably faster than writes from the buffer to file/terminal).

For cases when the data is being piped to the output too quickly and when the output file/terminal does not manage to output it at the same rate — the write would turn into a blocking operation. It would also be blocking at the exit until all the data is written.

Another approach would be to monitor (and limit) the size of data that is contained in the implicit buffer coming from the async queue, and make the operations block when that limit is reached.

@ChALkeR ChALkeR added the discuss Issues opened for discussions and feedbacks. label Apr 25, 2016
@ChALkeR
Member Author

ChALkeR commented Apr 25, 2016

process.stdout._handle.setBlocking(true); as proposed in #1741 (comment) and #6297 (comment) would considerably hurt the performance. Exposing that isn't a good solution to any of this.

@ChALkeR ChALkeR changed the title stdio buffering considerations stdout/stderr buffering considerations Apr 25, 2016
@Fishrock123 Fishrock123 added net Issues and PRs related to the net subsystem. process Issues and PRs related to the process subsystem. libuv Issues and PRs related to the libuv dependency or the uv binding. labels Apr 25, 2016
@kzc

kzc commented Apr 25, 2016

process.stdout._handle.setBlocking(true);
would considerably hurt the performance

Developers are a smart bunch. With the proposed setBlocking API they can decide for themselves if they want stdout/stderr to block on writes. The default could still remain non-blocking. There's no harm in giving people the option. Having CLI applications block on writes for stdout/stderr is quite desirable. This is not unlike all the Sync calls in the fs module. In a high-performance server scenario you don't have to use them, but they are there if you need them.

@ChALkeR
Member Author

ChALkeR commented Apr 26, 2016

There's no harm in giving people the option.

There is: every option comes at a cost on both sides. That option would need to be supported, and it would merely toggle the behaviour between two states that are far from ideal in many cases.

As for the option here, the said cyclic buffer's maximum size could perhaps be configured using a command-line argument, with a buffer size of 0 turning it into a fully blocking API.

@kzc

kzc commented Apr 26, 2016

toggle the behaviour between two states that are far from being ideal in many cases

The proposed API process.stdout._handle.setBlocking(Boolean) could be documented to only be guaranteed to work before any output is sent to the stream.

On UNIX it is a trivial system call on a fd. It's pretty much the same thing on Windows as far as I recall.

Edit: See #6297 (comment)

@mscdex
Contributor

mscdex commented Apr 26, 2016

The proposed API process.stdout._handle.setBlocking(Boolean) could be documented to only be guaranteed to work before any output is sent to the stream.

We should not be documenting underscore-prefixed properties. This would need to be a separate, true public API.

@sindresorhus

sindresorhus commented Apr 27, 2016

In CLI tools, I don't want stdout/stderr to be blocking; rather, I want to be able to explicitly exit the process and have stdout/stderr flushed before exiting. Has there been any discussion about introducing a way to explicitly exit the process gracefully, meaning flushing stdout/stderr and then exiting? Maybe something like a process.gracefulExit() method? In ES2015 modules, we can no longer return in the top scope, so short-circuiting will then be effectively impossible without a nesting mess.
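For illustration, such a graceful exit can be approximated today with write callbacks; `flushAndExit` is a hypothetical helper name, not an existing or proposed core API:

```javascript
// Hypothetical helper (name and shape are illustrative): wait until
// everything already queued on the given streams has been written,
// then exit with the given code.
function flushAndExit(code = 0, streams = [process.stdout, process.stderr]) {
  let pending = streams.length;
  for (const stream of streams) {
    // The callback of an empty write runs only after all chunks queued
    // before it on this stream have been flushed.
    stream.write('', () => {
      if (--pending === 0) process.exit(code);
    });
  }
}
```

This only helps when the process actually reaches the call; it does not cover a bare process.exit() elsewhere or abnormal termination.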

@kzc

kzc commented Apr 27, 2016

The exit module is a drop-in replacement for process.exit() that effectively waits for stdout/stderr to be drained. Other streams can optionally be drained as well. Something like it really ought to be rolled into node core under a different name.

Edit: upon further analysis, the exit module is no longer recommended.

@Qix-

Qix- commented Apr 27, 2016

I think there's a slight discrepancy in interpretation here.

Are we talking about console blocking, or console buffering? Although similar, these are two different considerations.

Further, I'm not entirely sure what the problem is you're trying to fix, @ChALkeR. Could you elaborate on a problem that is currently being faced? I'm not groking what it is that's broken here.


One thing to note is that on most systems (usually UNIX systems) stdout is buffered to begin with, whereas stderr is usually not (or sometimes just line buffered).

However, an explicit fflush(stdout) or fflush(stderr) is not necessary when a process closes, because these buffers are handled by the C runtime and are flushed when main() returns. If a buffer grows too large during execution, the runtime automatically flushes it.

This is why the following program still outputs while the loop is running, and on most terminal emulators the same reason why it looks like it's 'jittery'.

#include <stdio.h>

int main(void) {
    for (int i = 10000000; i > 0; i--) {
        printf("This is a relatively not-short line of text. :) %d\n", i);
    }
    return 0;
}

Blocking, on the other hand, doesn't make much sense. What do you want to block? Are you suggesting we fflush(stdout) every time we console.log()? Because if so, that is a terrible idea.

Developers are a smart bunch. With the proposed setBlocking API they can decide for themselves if they want stdout/stderr to block on writes.

I respectfully disagree. There is absolutely no point in blocking on stdout/stderr. It's very rarely done in the native world and when it is it's usually for concurrency problems (two FILE* handles being written to by two threads that cannot otherwise be locked; i.e. TTY descriptors). Let the system handle it.

The above program without fflush(stdout) on every iteration produced the following average:

real 6.32s
sys  0.54s
405 involuntary context switches

and with fflush(stdout) on each iteration:

real 7.48s
sys  0.56s
1607  involuntary context switches

That's an 18.3% time increase on this microbenchmark alone. I'm in the 'time is relative and isn't a great indicator of actual performance' camp, so I included the involuntary context switch count (voluntary context switches would be the same between the two runs, obviously).

The results show that fflush(stdout) upon every message encouraged the system to switch out of the process 1202 more times - that's 296% higher than ye olde system buffering.

Further, the feature requested by @sindresorhus is part of how executables already behave: at least on UNIX systems, all buffered file descriptors are flushed prior to/during close()-ing or fclose()-ing, so flushing before a process exits is a needless step, and exposing a new function for it would be completely unnecessary.


I'm still lost as to what is trying to be fixed here... I wouldn't mind seeing a process.stdout.flush() method, but it'd be abused in upstream dependencies and I'd find myself writing some pretty annoyed issues in that case. I've never run into a problem with Node's use of standard I/O...

Also, just to reiterate - please don't even start with a .setBlocking() method. That's asking for trouble. The second a dependency makes standard I/O blocking then everything else is affected and then I have to answer to my technical lead about why my simple little program is switching out of context a bazillion times when it didn't before.

@ChALkeR
Member Author

ChALkeR commented Apr 27, 2016

@Qix-

Further, I'm not entirely sure what the problem is you're trying to fix, @ChALkeR. Could you elaborate on a problem that is currently being faced? I'm not groking what it is that's broken here.

See the issues I linked above.

Blocking, on the other hand, doesn't make much sense. What do you want to block? Are you suggesting we fflush(stdout) every time we console.log()? Because if so, that is a terrible idea.

No, of course I don't. console.log is not the same as printf in your example.
It's an async operation. I don't propose fflush, I propose to somehow make sure that we don't fill up the async queue with console.logs.

@Qix-

Qix- commented Apr 27, 2016

@ChALkeR are we actually seeing this happen?

var str = new Array(8000).join('.');

while (true) {
    console.log(str);
}

works just fine. In what realistic scenario are we seeing the async queue fill up? The three linked issues are toy programs that are meant to break Node, and would break in most languages, in just about all situations, on all systems (resources are finite, obviously).

The way I personally see it, there are tradeoffs to having coroutine-like execution paradigms and having slightly higher memory usage is a perfectly acceptable tradeoff for the asynchronicity Node provides. It doesn't handle extreme cases because those cases are well outside any normal or even above normal scenario.

The first case, which mentions a break in execution to flush the async buffer, is perfectly reasonable and expected.


Side note: thanks for clarifying :) that makes more sense. I was hoping it wasn't seriously proposed we fflush() all the time. Just had to make sure to kill that idea before anyone ran with it.

@ChALkeR
Member Author

ChALkeR commented Apr 27, 2016

@Qix-

  1. What's your platform?
  2. Are you sure that you are not redirecting the output anywhere?

@Qix-

Qix- commented Apr 27, 2016

@ChALkeR

  1. OS X and linux (tested on both)
  2. Positive.

@ChALkeR
Member Author

ChALkeR commented Apr 27, 2016

@Qix- Are you saying that memory usage eventually stabilizes when you run this? How much memory does it consume for you?

@Qix-

Qix- commented Apr 27, 2016

@ChALkeR ah no, memory usage goes off the charts. Again, though, is this happening in practical environments outside of stress tests?

@ChALkeR
Member Author

ChALkeR commented Apr 27, 2016

@Qix- Yes, it does. Users have reported it several times already, see links above.

@Qix-

Qix- commented Apr 27, 2016

@ChALkeR The three links you mentioned (#1741, #2970, #3171) are not practical environments outside of stress tests.

Truncated output (#6297 et al) is unrelated to the ones mentioned above.

@ChALkeR
Member Author

ChALkeR commented Apr 27, 2016

@Qix- #3171, for example, doesn't look like just a stress test to me.

@Qix-

Qix- commented Apr 27, 2016

@ChALkeR

for (var i = 0; i < 1000000000; i++) {
    value = nextPrime(value);
    console.log(value);
}

This is definitely a stress test. This isn't a practical or realistic use of Node.js - an async scripting wrapper for IO-heavy applications.


For instance,

function loggit() {
    console.log('hello');
    setImmediate(loggit);
}

loggit()

tops out at 15M memory consumption. I've tweaked #3171's code to reflect the changes here, which now hovers around 45M (instead of inflating to over a gig in about a minute).

Of course a stupidly huge while/for loop with a ton of non-blocking IO (not to mention a computationally expensive nextPrime function) is going to cause problems. Don't do it. It's a simple misunderstanding of how Node works, and if it were up to me I'd brush it off as an error on the developer's part.

The second we start baby-sitting bad design like all three of the examples you mentioned and start ad-hoc adding methods like .setBlocking(true) is when we start stepping on the toes of dependents down-stream, which begins to break everything.

@kzc

kzc commented Apr 27, 2016

Are we talking about console blocking, or console buffering? Albeit similar, these are two different considerations.

Either blocking tty writes or graceful drain of stdout/stderr upon process.exit() would fix the problem at hand with truncated tty output.

I think both APIs should be part of node core. Devs should have the option to have stdout/stderr block if they should so choose, or have stdout/stderr gracefully flushed if they find that more appropriate.

Logging to the console 10000000 times in a tight loop is not real life behavior. In reality performance oriented servers rarely log to the console - and when they do it's important that the information is output. But if you did want to log to the console 1e7 times (and run out of memory due to another known node issue) then you would not enable blocking on stdout - so it's a red herring.

@Qix-

Qix- commented Apr 27, 2016

stdout/stderr should flush on program exit anyway. It's the fact node uses the exit() call that messes quite a few things up. That's actually an easy enough fix with Node specifically.

My problem with .setBlocking() is that it affects the process globally, not just specific pieces of code or files. That means a dependency could mess up how my code functions without really understanding what .setBlocking() does.

@kzc

kzc commented Apr 27, 2016

Comment just posted in other thread: #6297 (comment)

process.stdout.flush() and process.stderr.flush()

It's not like C where the data is buffered in libc and can be flushed. The unflushed data is in some Javascript data structure in V8 and the event loop has been halted.

@Qix-

Qix- commented Apr 27, 2016

@kzc yes, that is also a problem. exit() in and of itself is the cause of your specific problem, as has been mentioned quite a few times.

@eljefedelrodeodeljefe
Contributor

@ChALkeR I dragged the contents of this issue over to #6456 and will be closing this, since I think it's highly related. Please feel free to reopen if you see the need. Thx.

@kurahaupo

The idea that this is "expected" for non blocking streams is fatally flawed: nobody expects logging or stdio to be non blocking, and if you've changed the file descriptor flags, that's an implementation detail that should be hidden from the outside world.

Moreover, buffering constitutes an implicit promise that the data will eventually be written, so at the very least all buffers should be fully written out at exit.

Likewise OOM should trigger memory recovery, including writing out buffers.

Delays are less important than broken promises.

@silverwind
Contributor

Just posting a workaround for this issue for future reference:

for (const stream of [process.stdout, process.stderr]) {
  stream?._handle?.setBlocking?.(true);
}

Before:

$ node -p 'process.stdout.write("x".repeat(5e6)); process.exit()' | wc -c
65536

After:

$ node -p 'for (const stream of [process.stdout, process.stderr]) stream?._handle?.setBlocking?.(true); process.stdout.write("x".repeat(5e6)); process.exit()' | wc -c
5000000

@kurahaupo

It seems that the phrase "non blocking" has been used in this discussion to mean something different from the POSIX definition, and that may account for some of the confusion.

In POSIX terms, buffering is what happens within a process, while non blocking and asynchronous are two different ways of interacting with the kernel.

A non blocking (O_NONBLOCK or O_NDELAY) kernel call will fail rather than block; this is certainly not what is envisioned by most participants in this discussion. (Moreover, it will only do so for streams that in principle can block for an arbitrary period, such as terminals, pipes, and network sockets; it does not affect writing to most ordinary files and block devices. )

In contrast, an asynchronous kernel call will neither succeed nor fail, but rather it returns to the running process immediately and subsequently uses a signal to notify the process of the success or failure of the requested operation. That's also not what is being discussed here, but could be useful as an implementation detail for buffering in userspace.

What is wanted here is the equivalent of the C language's setvbuf function: the ability to change the buffer size from "infinite" down to something smaller.

I would strongly urge avoiding the term "blocking" when defining a new public method to control this, as it unnecessarily muddles the meaning of related terms, cf. O_NONBLOCK.

@gireeshpunathil
Member

@kurahaupo - Irrespective of whether the usage of the term blocking is consistent with the POSIX definition (I don't know whether that consistency is relevant here), its usage in this project's documentation is consistent: a call that effectively blocks the Node.js event loop's forward progress (by virtue of the current thread yielding to the kernel to actively process I/O). And I guess our users have not had much issue with that definition.

@guest271314

If I understand this issue correctly, this is the place to post my feedback. I wrote a Node.js Native Messaging host that streams output from child.stdout.on('data', (data) => {}) to the browser with process.stdout.write(data).

I observed that RSS constantly increases during usage. Perhaps someone in the Node.js realm can explain why this occurs and how to fix this memory leak https://thankful-butter-handball.glitch.me/.

@bnoordhuis
Member

This issue has been open for nearly 7 years. In the meantime, mitigating work has been done and lots of discussion has taken place in other issues. I'm going to put this one out to pasture.

@midzelis

For people looking for workarounds, many are given here: https://unix.stackexchange.com/questions/25372/turn-off-buffering-in-pipe.

The one I chose was https://unix.stackexchange.com/questions/25372/turn-off-buffering-in-pipe/25378#25378, which is to wrap the command you want to execute with the 'script' command, which has a way to unbuffer stdout.

script -q /dev/null long_running_command | print_progress      # (FreeBSD, Mac OS X)
script -q -c "long_running_command" /dev/null | print_progress # (Linux)

All the other solutions in that linked thread require installing a new program, but the script command is available on most POSIX-ish systems, albeit with different command-line argument styles. This works for me - hopefully it can work for you.
