
console: use synchronous write when the process is piped #25638

Closed

Conversation

@mcollina (Member)

Fixes: #24992

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • tests and/or benchmarks are included
  • commit message follows commit guidelines

@nodejs-github-bot added the console (Issues and PRs related to the console subsystem) and fs (Issues and PRs related to the fs subsystem / file system) labels on Jan 22, 2019.
@mcollina (Member Author)

cc @gireeshpunathil @addaleax

@gireeshpunathil (Member)

I guess that apart from fixing #24992, this might address old issues such as #784, #6379, #6456 and #19218!

I am just thinking aloud about possible side effects of this, such as causing unwanted blocking in long-running processes where the child uses console.log to talk to the parent.

@addaleax (Member)

I am just thinking aloud about possible side effects of this, such as causing unwanted blocking in long-running processes where the child uses console.log to talk to the parent.

I’m worried about this too. It’s the reason why piped stdio is async in the first place.

globalConsole[kBindStreamsLazy]({
  get stdout() { return makeSync(process.stdout); },
  get stderr() { return makeSync(process.stderr); }
});
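For context, a minimal sketch of what a makeSync()-style wrapper could do (hypothetical; the real implementation lives in the PR diff, and the fd handling here is an assumption):

const { writeSync } = require('fs');

// Route every write through fs.writeSync() on the stream's underlying
// file descriptor, bypassing the async write path entirely.
function makeSync(stream) {
  const fd = stream.fd; // assumes the stream exposes its fd
  stream._writev = null;
  stream._write = (chunk, encoding, callback) => {
    // A single writeSync() may write only part of the chunk on a full
    // pipe; the PR's retry logic (discussed below) deals with that.
    writeSync(fd, chunk, 0, chunk.length);
    callback();
  };
  return stream;
}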
Review comment (Member):

Shouldn't all console instances behave the same? I would move this into the kBindStreamsEager and kBindStreamsLazy functions in the console constructor file.

@mcollina (Member Author) replied:

Likely. We might also move this whole sync logic into the console, and add the capability there to manage the fd directly.

I’ll wait and see if we can settle on this approach before making any changes.

try {
  const n = writeSync(this.fd, chunk, 0, chunk.length);
  if (n !== chunk.length) {
    chunk = chunk.slice(0, n);
Review comment (Member):

Are the last bytes written first? AFAIK this should be chunk = chunk.slice(n)? Maybe just verify that in the test as well?

Review comment (Member):

@mcollina PTAL. This seems wrong to me.
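For illustration, the corrected retry loop the reviewers are suggesting would look roughly like this (a sketch, not the PR's actual code; writeFully and the EAGAIN handling are assumptions):

const { writeSync } = require('fs');

// Write a whole Buffer synchronously. After a partial write of n bytes,
// keep the *unwritten* tail via chunk.slice(n), not the head.
function writeFully(fd, chunk) {
  while (chunk.length > 0) {
    try {
      const n = writeSync(fd, chunk, 0, chunk.length);
      chunk = chunk.slice(n); // drop the n bytes that were written
    } catch (err) {
      // On a non-blocking pipe, a full buffer surfaces as EAGAIN;
      // retrying here is the busy-poll concern raised later on.
      if (err.code !== 'EAGAIN')
        throw err;
    }
  }
}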

@addaleax (Member)

Thinking about it, I think starting to have indefinitely blocking console.log() calls is not really what we want.

I think I have time this week to look more closely into a solution based on libuv/libuv#390, if that’s okay.

@mcollina (Member Author)

@addaleax

Thinking about it, I think starting to have indefinitely blocking console.log() calls is not really what we want.

I don't understand. The assumption of console.log is that it's synchronous and safe to use. Thus, we need to slow down the producer, which is the JS thread.

I think I have time this week to look more closely into a solution based on libuv/libuv#390, if that’s okay.

Definitely, this has been an open issue for so long! BTW, I have this solution implemented inside pino, if you would like to tinker with it in the ecosystem.


@gireeshpunathil

I am just thinking aloud about possible side effects of this, such as causing unwanted blocking in long-running processes where the child uses console.log to talk to the parent.

The whole goal is to slow down the producer of that data, i.e. the JS VM. That has the same effect as any CPU-bound activity on the event loop.

@addaleax (Member)

@mcollina Is the ultimate goal here really to slow the producer down, or to resolve the stdio issues (interleaved stdio + bad interaction with process.exit())? i.e. is it actually the assumption that console.log() is synchronous?

@mcollina (Member Author)

My goal is to solve the memory issues with the unbounded (no backpressure) producer that is console.log, and all the data loss that comes with that. I do not see any other way than making console.log synchronous, and I think that is the expectation of our users as well.

Note that this PR also introduces a bunch of fixes for SyncWriteStream that we should land anyway.

Moreover, this would enable us to remove the synchronous access to stdout and stderr, significantly speeding up the processing of that output in a streaming fashion (though that might be something far more breaking than this one).

@addaleax (Member)

Note that this PR also introduces a bunch of fixes for SyncWriteStream that we should land anyway.

@mcollina As far as I can tell, that fix is adding a 100% CPU usage loop that busy-polls writes into a pipe, so it shouldn't be needed for FDs referring to regular files (which is the only usage of SyncWriteStream so far, as far as I know)?

@mcollina (Member Author)

As far as I can tell, that fix is adding a 100% CPU usage loop that busy-polls writes into a pipe, so it shouldn't be needed for FDs referring to regular files (which is the only usage of SyncWriteStream so far, as far as I know)?

I think EAGAIN could potentially also happen on FDs that refer to regular files (but it is far less likely).

@Fishrock123 (Member) left a comment:

Hmm. This backtracks on a lot of what we'd settled on. Guess I need to dig up stuff...

FWIW: I already tried to do this, some years ago. There's some context in the later comments within that PR: #1771.

Notably, this bit from @bnoordhuis:

A bit of background: some years ago, I think it was in v0.7, it was decided to make stdout and stderr blocking. Turns out it doesn't work so well for pipes; ttys and files are usually very fast (local ones anyway) but pipes tend to fill up rapidly.

A number of people complained about it so we made stdio-to-pipe non-blocking again (except on Windows, where it's not supported.) I forgot the exact bug reports but the theme was that stdio was too slow; on OS X, the kernel pipe buffer is only about 4 kB, so it's easy to max out.

As already mentioned here, but I'm not sure it has been considered deeply enough given that this isn't currently tagged as semver-major: doing this can cause a blocking dependency upstream of a child.

Related: I've been planning to make a PR removing the "blocking to pipes" behavior on Windows. It's unnecessary (it was literally only done to match Unix years ago, though Unix has since changed) and now inconsistent.

(Example patch of removing said behavior.)
diff --git a/lib/net.js b/lib/net.js
index 9eb7448c59..6c70cae264 100644
--- a/lib/net.js
+++ b/lib/net.js
@@ -283,23 +283,6 @@ function Socket(options) {
       throw errnoException(err, 'open');
 
     this[async_id_symbol] = this._handle.getAsyncId();
-
-    if ((fd === 1 || fd === 2) &&
-        (this._handle instanceof Pipe) &&
-        process.platform === 'win32') {
-      // Make stdout and stderr blocking on Windows
-      err = this._handle.setBlocking(true);
-      if (err)
-        throw errnoException(err, 'setBlocking');
-
-      this._writev = null;
-      this._write = makeSyncWrite(fd);
-      // makeSyncWrite adjusts this value like the original handle would, so
-      // we need to let it do that by turning it into a writable, own property.
-      Object.defineProperty(this._handle, 'bytesWritten', {
-        value: 0, writable: true
-      });
-    }
   }
 
   // shut down the socket when we're finished with it.

Also 🙈, I was a bit clueless back in #1771, but the context and points still stand.

@Fishrock123 (Member)

Ah, also managed to (again) dig up the original issue about this class of issues: nodejs/node-v0.x-archive#3584

@mcollina (Member Author)

There are a significant number of issues that point out that this is worth discussing again. The behavior of process.stdout is so different in different scenarios that it is very hard to rely on.

Note that I’m referring to console and stdout as two different things, while a lot of those original discussions talked about stdout and stderr. IMHO we have different requirements for the two, and they should be treated differently.

@gireeshpunathil (Member)

Given that this change affects console.log in a child process alone, and that the benefit of making this change outweighs any drawback identified so far, I am in favor of it, with a semver-major label and an update to the API docs to that effect.

@Fishrock123 (Member) commented on Jan 24, 2019

So, to be clear, post this PR the state of things would look as follows:

  • stdio
    • To TTYs
      • Windows: Async¹
      • Unix: Sync
    • To files
      • All: Sync
    • To pipes
      • Windows: Sync²
      • Unix: Async, unless using console.*() (this PR's change)

  1. Cannot be changed.
  2. Makes no sense.

This would make one case vary depending on the implicit details of an in-application API... idk how I feel about that, but I guess I'll think about it.

@mcollina (Member Author)

@Fishrock123 that is correct.

@sam-github (Contributor)

@Fishrock123 great overview. To be clear, the difference from before this PR is: unless using console.*() for stdio, to pipes, on Unix.

Shouldn't it also be sync, to ttys, on Windows? That looks like the last async place for console.log.

@Fishrock123 (Member)

Shouldn't it also be sync, to ttys, on Windows? That looks like the last async place for console.log.

As far as I recall, this is literally not possible. Windows TTYs don't have any way of telling you if a message was actually "received" or "processed". You send it something and hope for the best.

@ChALkeR (Member) commented on Feb 8, 2019

I'm conflicted about this.

It fixes the above-mentioned issues, but I suppose this could introduce unwanted sync delays for servers that do not do much logging in some cases.

Back in #6379, I was thinking of using a buffer instead, i.e. making console calls sync but introducing a limited-size buffer (a rough sketch follows the list), so that:

  1. While the buffer is below a certain size, console.log data goes in there synchronously.
  2. The buffer is drained to the pipe asynchronously.
  3. If the buffer grows beyond a certain size, console.log starts (synchronously) waiting until it shrinks.
  4. On exit, the buffer is flushed to the pipe synchronously.
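A rough sketch of that bounded-buffer idea (the names, the limit, and the inline flush are illustrative assumptions; a real implementation would drain on the event loop):

const { writeSync } = require('fs');

const kLimit = 64 * 1024;      // assumed soft cap on buffered bytes
let pending = Buffer.alloc(0); // data waiting to reach the pipe

function bufferedConsoleWrite(fd, chunk) {
  // 1. Below the limit: stash the data synchronously and return.
  pending = Buffer.concat([pending, chunk]);
  // 2. (In reality the buffer would be drained asynchronously here.)
  // 3. Above the limit: block the producer until the buffer shrinks.
  if (pending.length > kLimit)
    flushSync(fd);
}

function flushSync(fd) {
  while (pending.length > 0) {
    const n = writeSync(fd, pending, 0, pending.length);
    pending = pending.slice(n); // keep only the unwritten tail
  }
}

// 4. On exit, whatever is left is written out synchronously.
process.on('exit', () => flushSync(1));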

@Fishrock123 thoughts?

@BridgeAR (Member) commented on Mar 6, 2019

What should we do here? @mcollina @Fishrock123

@gireeshpunathil (Member)

@ChALkeR - correct me if I am wrong, but the problematic scenario (large data coming in for writing and blocking the app) can still happen in your step 3?

In other words, if the data is sufficiently large (larger than the proposed internal buffer can hold), we are probably looking at the same thing?

The main difference seems to be that with a buffer in between we have a trade-off to make: the larger the buffer, the better we manage the writing, but at the expense of added memory overhead?

@mcollina (Member Author) commented on Mar 6, 2019

@ChALkeR this PR would also solve the problem of lost output in case of a crash. The buffer approach you are describing is likely to lose output in case of crashes.

If we add this, we should provide an API to create stdout/stderr streams that are async in nature (or maybe make stdout/stderr async by default).

@mcollina (Member Author)

@Fishrock123 @nodejs/tsc wdyt? Should we do this for v12, v13? Or should I just close this?

@gireeshpunathil (Member)

ping @Fishrock123 and @nodejs/tsc again.

@addaleax (Member) left a comment:

Sorry, but yes, I think this should be closed.

@mcollina (Member Author)

We are not in agreement. How do you plan to fix #24992? Should we leave that as a wont-fix?

@addaleax (Member) commented on May 1, 2019

@mcollina I think you are not in favor of the solutions that I could come up with (as I read #24992 (comment)) … I’d prefer wontfix over something that will break use cases, yes, and I still think we should try to move away from sync stdio in the long term (still a ton of work).

My personal preference would be setting a limit to the total length of all data passed to a single _writev() call. If that means that data queues up in memory and the process fails with OOM, then so be it – at least that’s an explicit error, and one that hints at what the problem is.
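As a rough illustration of that idea (hypothetical; the cap size and the takeBatch name are assumptions), the stream's queued chunks would be handed to _writev() in bounded batches, with the remainder staying queued in memory:

const kBatchLimit = 16 * 1024 * 1024; // assumed cap per _writev() call

// Take at most kBatchLimit bytes' worth of chunks off the write queue;
// whatever does not fit stays queued (and, in the worst case, grows
// until the process fails with an explicit OOM rather than truncating).
// A single chunk larger than the cap would need splitting, omitted here.
function takeBatch(queue) {
  const batch = [];
  let total = 0;
  while (queue.length > 0 && total + queue[0].chunk.length <= kBatchLimit) {
    const entry = queue.shift();
    total += entry.chunk.length;
    batch.push(entry);
  }
  return batch;
}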

@mcollina (Member Author) commented on May 2, 2019

I’m lost on why this is breaking use cases; what are those? It seems this fixes a lot of old bugs, at least on Unix systems.

@gireeshpunathil (Member)

I did some testing with this PR's patch to see the extent of the latency for different data chunk sizes in console.log:

cat foo.js

const h = require('http')
const child_process = require('child_process')

if (process.argv[2] === 'child') {
  const x = 'x'.repeat(+process.argv[3])
  console.log(`response: ${x}`)
  process.exit(0)
} else {
  const cp = child_process.spawn(process.execPath, [__filename, 'child', process.argv[2]])
  let start = process.hrtime();
  let count = 0
  cp.stdout.on('data', (d) => {
    count += d.length
  })
  cp.stdout.on('end', () => {
    const diff = process.hrtime(start);
    console.error((diff[0] * 1e9 + diff[1]) / (1000 * 10000))
    process.exit(0)
  })
}

I ran this on a Mac because it has the smallest pipe buffers:

#./node foo 1
7.6472037
#./node foo 10
6.8697352
#./node foo 100
6.6587272
#./node foo 1000
6.6576567
#./node foo 10000
6.9509733
#./node foo 100000
7.6435326
#./node foo 1000000
7.2952235
#./node foo 10000000
9.7720658
#./node foo 100000000
32.8319722

Observation: for reasonably large values there is no visible impact. Writing 10 MB or more (where the slowness starts) is probably not a meaningful scenario?

So I am not seeing blocking stdout being problematic for pipes.

@addaleax (Member) commented on May 3, 2019

@mcollina @gireeshpunathil The issue isn’t latency, it’s that this blocks the process when the output pipe is full. @Fishrock123’s request-changes comment explains the issue and provides some historical context. (We already tried this, it didn’t work out well.)

@mcollina (Member Author) commented on May 3, 2019

Ok, then we should document this and mark all related issues as won’t fix.

@BridgeAR (Member) commented on May 3, 2019

I have read so many issues and PRs concerning this problem now, but I still can't get a clear picture of it; often it's somewhat diffuse to me.

@addaleax if I'm not mistaken, your concern is that this might block the event loop for a significant time. What about allowing the write to block for up to a specific time (sketched below)? That way we would likely overcome most truncation while making sure the process is not blocked for too long. We could say the process may only be blocked for, e.g., up to one second.
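A rough sketch of that time-bounded write (hypothetical; writeWithDeadline and the fallback contract are assumptions, not a proposed API):

const { writeSync } = require('fs');

// Block for at most maxMs trying to write chunk; return the unwritten
// remainder (for the caller to queue asynchronously) or null when done.
function writeWithDeadline(fd, chunk, maxMs = 1000) {
  const deadline = Date.now() + maxMs;
  while (chunk.length > 0) {
    try {
      const n = writeSync(fd, chunk, 0, chunk.length);
      chunk = chunk.slice(n);
    } catch (err) {
      if (err.code !== 'EAGAIN')
        throw err;
      if (Date.now() >= deadline)
        return chunk; // give up blocking; fall back to the async path
    }
  }
  return null;
}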

@gireeshpunathil (Member)

@addaleax - I briefly looked at some strace data (of the child). To my surprise, the writes were not blocking, but were re-attempted as many times as needed until the buffer drained. I don't know how this could be! I will come up with a more concrete report tomorrow.

@addaleax (Member) commented on May 3, 2019

@BridgeAR How would that be implemented without significant extra complexity? It likely also wouldn't solve the problem that this PR is originally seeking to solve, namely writing a very large amount of data (> 1 GB) to stdio.

@mcollina If you are still set on not introducing an upper limit for _writev() or some similar solution, then yes, I think #24992 is a wontfix.

@mcollina (Member Author) commented on May 5, 2019

@mcollina If you are still set on not introducing an upper limit for _writev() or some similar solution, then yes, I think #24992 is a wontfix.

I think I'm fine with an upper limit to _writev if it throws, or if it loses data and the Node process prints a warning on stderr. Do you think that would be feasible?

@mcollina (Member Author)

Closing as there is no consensus on this change.

@mcollina closed this on Aug 26, 2019.
Linked issue: regression: data truncation with stream.writev in IPC channel (#24992)