Cluster IPC ERR_IPC_DISCONNECTED EPIPE #32106

Closed
pool683 opened this issue Mar 5, 2020 · 4 comments

Comments

pool683 commented Mar 5, 2020

This is not production code.

'use strict';

const cluster = require('cluster');
const os = require('os');

if (cluster.isMaster) {
  const workers = [];
  const numCPUs = 4; //os.cpus().length;
  var wait_online = numCPUs;
  for (var i = 0; i < numCPUs; i++) {
    const worker = cluster.fork();
    workers[i] = worker;
    worker.on('online', function() {
      if (--wait_online == 0) {
        for (const worker of workers) {
          if (worker.isConnected()) {
            worker.send('die');
          }
        }
      }
    });
    worker.on('error', function(err) {
      console.log(err);
    });
    worker.on('disconnect', function(code, signal) {
      for (const worker of workers) {
        if (worker.isConnected()) {
          worker.send('die');
        }
      }
    });
  }
} else {
  process.on('uncaughtException', (err, origin) => {
    console.log(err);
  });
  process.on('message', function (msg) {
    if (msg == 'die') {
      if (cluster.worker.isConnected()) {
        cluster.worker.disconnect();
      }
    }
  });
}

Running this intermittently produces the following errors:

Error: write EPIPE
    at ChildProcess.target._send (internal/child_process.js:806:20)
    at ChildProcess.target.send (internal/child_process.js:676:19)
    at sendHelper (internal/cluster/utils.js:22:15)
    at send (internal/cluster/master.js:351:10)
    at exitedAfterDisconnect (internal/cluster/master.js:263:3)
    at Worker.onmessage (internal/cluster/master.js:250:5)
    at ChildProcess.onInternalMessage (internal/cluster/utils.js:43:8)
    at ChildProcess.emit (events.js:228:7)
    at emit (internal/child_process.js:876:12)
    at processTicksAndRejections (internal/process/task_queues.js:82:21) {
  errno: 'EPIPE',
  code: 'EPIPE',
  syscall: 'write'
}
Error: write EPIPE
    at ChildProcess.target._send (internal/child_process.js:806:20)
    at ChildProcess.target.send (internal/child_process.js:676:19)
    at Worker.send (internal/cluster/worker.js:47:28)
    at Worker.<anonymous> (/run/media/user/Disk/test/worker.js:28:18)
    at Worker.emit (events.js:223:5)
    at ChildProcess.<anonymous> (internal/cluster/master.js:209:12)
    at Object.onceWrapper (events.js:312:28)
    at ChildProcess.emit (events.js:223:5)
    at finish (internal/child_process.js:861:14)
    at processTicksAndRejections (internal/process/task_queues.js:76:11) {
  errno: 'EPIPE',
  code: 'EPIPE',
  syscall: 'write'
}
Error [ERR_IPC_DISCONNECTED]: IPC channel is already disconnected
    at process.target.disconnect (internal/child_process.js:832:26)
    at Worker.<anonymous> (internal/cluster/child.js:208:62)
    at process.onInternalMessage (internal/cluster/utils.js:43:8)
    at process.emit (events.js:228:7)
    at emit (internal/child_process.js:876:12)
    at processTicksAndRejections (internal/process/task_queues.js:82:21) {
  code: 'ERR_IPC_DISCONNECTED'
}

node v12.15.0
Linux 5.5.6-201.fc31.x86_64

Member

himself65 commented Mar 6, 2020

This still occurs on the latest version. It happens intermittently, and only with more than 2 child processes.

@himself65
Member

Error stack:

anonymous(), foo.js:21
emit(), events.js:316
anonymous(), worker.js:29
emit(), events.js:316
anonymous(), child_process.js:815
processTicksAndRejections(), task_queues.js:79
Async call from TickObject
init(), inspector_async_hook.js:25
emitInitNative(), async_hooks.js:144
emitInitScript(), async_hooks.js:346
nextTick(), task_queues.js:135
target._send(), child_process.js:815
target.send(), child_process.js:682
Worker.send(), worker.js:45
anonymous(), foo.js:26
emit(), events.js:316
anonymous(), master.js:214
onceWrapper(), events.js:422
emit(), events.js:316
finish(), child_process.js:866
processTicksAndRejections(), task_queues.js:79
Async call from TickObject
init(), inspector_async_hook.js:25
emitInitNative(), async_hooks.js:144
emitInitScript(), async_hooks.js:346
nextTick(), task_queues.js:135
target._disconnect(), child_process.js:877
target.disconnect(), child_process.js:848
channel.onread(), child_process.js:588
Async call from PIPEWRAP
init(), inspector_async_hook.js:25
emitInitNative(), async_hooks.js:144
anonymous(), child_process.js:963
getValidStdio(), child_process.js:927
ChildProcess.spawn(), child_process.js:341
spawn(), child_process.js:548
fork(), child_process.js:116
createWorkerProcess(), master.js:134
cluster.fork(), master.js:169
anonymous(), foo.js:9
Module._compile(), loader.js:1147
Module._extensions..js(), loader.js:1167
Module.load(), loader.js:996
Module._load(), loader.js:896
executeUserEntryPoint(), run_main.js:71
anonymous(), run_main_module.js:17

Member

himself65 commented Mar 9, 2020

After reading the source code, I think the problem is a timing race between processes: a child process can run faster or slower than the main process handles its IPC messages.

Tested with Node v13.10.1 on Windows:

'use strict'

const cluster = require('cluster')

if (cluster.isMaster) {
  const workers = []
  let workerNum = 2
  for (let i = 0; i < 2; i++) {
    const worker = cluster.fork()
    workers.push(worker)
    worker.on('online', function () {
      if (--workerNum === 0) {
        for (const worker of workers) {
          if (worker.isConnected()) {
            console.log('send message die:', worker.id)
            worker.send('die')
          }
        }
      }
    })
    worker.on('disconnect', function () {
      for (const worker of workers) {
        if (worker.isConnected()) {
          console.log('still connected', worker.id)
          worker.send('die2')
        }
      }
    })
  }
} else {
  process.on('message', function (msg) {
    if (msg === 'die') {
      console.log('get message die:', cluster.worker.id)
      if (cluster.worker.isConnected()) {
        cluster.worker.disconnect()
      }
    } else if (msg === 'die2') {
      console.log('get message die2:', cluster.worker.id)
    }
  })
}

Output:

C:\Users\Himself65\Desktop\github\test>node example1.js
send message die: 1
send message die: 2
get message die: 1
still connected 2
get message die: 2
get message die2: 2

We can see that the second worker receives its message later than the first. I believe this is because `worker.send('xxx')` in the main process is synchronous code (I haven't read the C++ IPC source yet), so the workers may not respond right away. When you start many workers, the bug appears because `isConnected()` returns true at the moment of the check, but the worker can disconnect before the message is actually written.
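
One way to narrow the window described above, as a minimal sketch rather than a fix: have the primary send `die` to each worker at most once, instead of re-broadcasting it from every `disconnect` handler. The helper name `createOnceSender` is hypothetical; it relies only on `worker.id`, `worker.isConnected()` and `worker.send()` as used in the snippets in this thread. It does not remove the race between `isConnected()` and the actual write, which is why the send still takes a callback.

```javascript
'use strict';

// Hypothetical helper (not part of the cluster API): returns a function that
// sends `msg` to a given worker at most once, keyed by worker.id.
function createOnceSender(msg, onError) {
  const alreadySent = new Set();
  return function sendOnce(worker) {
    // Skip workers that were already told, or that report a closed channel.
    if (alreadySent.has(worker.id) || !worker.isConnected()) {
      return false;
    }
    alreadySent.add(worker.id);
    // The callback receives an Error if the write fails (for example EPIPE),
    // or null on success. Without an onError handler the error is dropped.
    worker.send(msg, (err) => {
      if (err && typeof onError === 'function') onError(err, worker);
    });
    return true;
  };
}
```

In the reproduction, both the `online` and `disconnect` handlers could then call `sendDie(worker)` for each entry in `workers`, with `const sendDie = createOnceSender('die', console.log)` created once in the primary.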

santigimeno added a commit to santigimeno/node that referenced this issue Apr 12, 2020
Avoid sending multiple `exitedAfterDisconnect` messages when
concurrently calling `disconnect()` and/or `destroy()` from the worker
so `ERR_IPC_DISCONNECTED` errors are not generated.

Fixes: nodejs#32106
targos pushed a commit that referenced this issue May 4, 2020
Avoid sending multiple `exitedAfterDisconnect` messages when
concurrently calling `disconnect()` and/or `destroy()` from the worker
so `ERR_IPC_DISCONNECTED` errors are not generated.

Fixes: #32106

PR-URL: #32793
Reviewed-By: Zeyu Yang
Reviewed-By: Anna Henningsen
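
For Node.js versions that predate this change, a worker can also avoid calling `disconnect()` more than once on its own side. The following is a minimal sketch under that assumption, not the patch from the PR referenced above; `createDisconnectOnce` is a hypothetical name and the flag is plain user-land state.

```javascript
'use strict';

// Hypothetical user-land guard, not the change made in Node.js core:
// returns a function that calls worker.disconnect() at most once.
function createDisconnectOnce(worker) {
  let requested = false;
  return function disconnectOnce() {
    if (requested || !worker.isConnected()) {
      return false;
    }
    requested = true;
    worker.disconnect();
    return true;
  };
}

// Intended use inside a worker process:
//   const cluster = require('cluster');
//   const disconnectOnce = createDisconnectOnce(cluster.worker);
//   process.on('message', (msg) => {
//     if (msg === 'die') disconnectOnce();
//   });
```

The guard only covers repeated `die` messages reaching the same worker; it does not change what the primary does when it writes to a channel that has already closed.
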
targos pushed a commit that referenced this issue May 7, 2020
targos pushed a commit that referenced this issue May 13, 2020

ceciliachoi commented Apr 21, 2021

Still able to recreate this issue with Node.js v14.16.1

node-v14.16.1-darwin-x64/bin/node test
Error: write EPIPE
    at ChildProcess.target._send (internal/child_process.js:832:20)
    at ChildProcess.target.send (internal/child_process.js:703:19)
    at Worker.send (internal/cluster/worker.js:46:10)
    at Worker.<anonymous> (/Users/cecilia/Documents/myNode/test.js:28:18)
    at Worker.emit (events.js:315:20)
    at ChildProcess.<anonymous> (internal/cluster/master.js:220:12)
    at Object.onceWrapper (events.js:421:28)
    at ChildProcess.emit (events.js:315:20)
    at finish (internal/child_process.js:888:14)
    at processTicksAndRejections (internal/process/task_queues.js:75:11) {
  errno: -32,
  code: 'EPIPE',
  syscall: 'write'
}

Mitigation: to keep the error from surfacing as an uncaught exception that aborts the service, pass a callback to `send()` to capture it:

worker.send('msg', (err) => { if (err) console.log(err); });
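
A slightly fuller version of the mitigation above, as a sketch: wrap `send()` so that errors expected while a worker is shutting down are ignored and anything else is reported. `safeSend` and `report` are hypothetical names. Treating `EPIPE` and `ERR_IPC_CHANNEL_CLOSED` as benign is an assumption that suits shutdown paths like the reproduction; it may not suit messages that must be delivered.

```javascript
'use strict';

// Error codes assumed to be harmless when the target worker is already
// going away. Adjust this set for your own application.
const IGNORED_CODES = new Set(['EPIPE', 'ERR_IPC_CHANNEL_CLOSED']);

// Hypothetical wrapper around worker.send(). Passing a callback routes a
// failed send to the callback, where it can be filtered or reported.
function safeSend(worker, msg, report = console.error) {
  worker.send(msg, (err) => {
    if (err && !IGNORED_CODES.has(err.code)) {
      report(err);
    }
  });
}
```

With this in place, the primary would call `safeSend(worker, 'die')` wherever the reproduction calls `worker.send('die')`.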
