Skip to content
This repository has been archived by the owner on Apr 22, 2023. It is now read-only.

Segfault node v0.11.14 related to TLS #8780

Closed
matejkramny opened this issue Nov 25, 2014 · 14 comments
Closed

Segfault node v0.11.14 related to TLS #8780

matejkramny opened this issue Nov 25, 2014 · 14 comments
Labels

Comments

@matejkramny
Copy link

Running https://github.com/CastawayLabs/dproxy.js.. Occasionally, it quits with segmentation fault..

It is caught using segfault-handler.

Stack trace:

PID 20 received SIGSEGV for address: 0x11
/srv/app/node_modules/segfault-handler/build/Release/segfault-handler.node(+0x1185)[0x7fc0d1619185]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x10340)[0x7fc0d1dfe340]
/usr/local/bin/node(uv_write+0x1c)[0xb8a84c]
/usr/local/bin/node(_ZN4node12TLSCallbacks6EncOutEv+0x23a)[0xb621ea]
/usr/local/bin/node(_ZN4node12TLSCallbacks21OnClientHelloParseEndEPv+0x30)[0xb64340]
/usr/local/bin/node(_ZN4node6crypto7SSLWrapINS_12TLSCallbacksEE9EndParserERKN2v820FunctionCallbackInfoINS4_5ValueEEE+0x5d)[0xb49b1d]
/usr/local/bin/node(_ZN2v88internal25FunctionCallbackArguments4CallEPFvRKNS_20FunctionCallbackInfoINS_5ValueEEEE+0x92)[0x781002]
/usr/local/bin/node[0x79e5f8]
[0x1fb117c0634e]

If i'm reading this correctly, a client isn't sending proper TLS request and causes an internal node failure.

I'm determined to find the root cause of this. If this is genuinely not my fault (dproxy project), then other projects using TLS and SNI may be endangered.

Happened for v0.11.13 and v.0.11.14

/cc @tjfontaine as it seems related to TLS

@migounette
Copy link

@matejkramny please provide more details, for instance distro (Debian, ...), and how to produce the issue. A simple test case help for faster debugging and for get the time to look at it.
eg. npm start
My 2 cents

@misterdjules
Copy link

@matejkramny as @migounette suggested, could you please provide us with more information on how to reproduce this issue somewhat reliably? Ideally, we would come up with a test case that doesn't have any external dependency (a Node.js program that doesn't use any external module). However, anything that documents how to reproduce the issue with clear steps would already help.

@matejkramny
Copy link
Author

I'll find time for this by end of the week.

It is really difficult to reproduce however, and seems to happen * randomly *.

The server seemed to crash every time i used the iOS simulator. Then quitting the simulator and relaunching it fixed the server. Next time i'll try to capture the network :p

@misterdjules
Copy link

One way to gather more information regarding what's going on would be to:

  • Try to reproduce the problem either with a debug build or with a 32 bits binary. The problem with optimized x64 builds (like the one you're using) is that it can be difficult to examine a core dump due to the calling conventions used.
  • Let the operating system generate a core dump when the program receives SIGSEGV. You'll have to make sure that the core file limits are set to unlimited with ulimit -c unlimited.
  • Examine this core dump to understand what input causes this segfault. You can use gdb to do that.

If you need help doing this, please ping me (jgi) on #libuv.

Thanks!

@matejkramny
Copy link
Author

@misterdjules Thanks. is that on freenode?

@misterdjules
Copy link

Yes, on Freenode.

@matejkramny
Copy link
Author

Right. I compiled node.js on the system as 32 bit (i think). I couldn't get the server to crash (no matter what.). It crashed this morning for some reason.

Here's the log.. doesn't look very different from the 64 bit one. ulimit is set to unlimited

/home/proxy/node_modules/segfault-handler/build/Debug/segfault-handler.node(+0x2ca9)[0x7fa00e2e5ca9]
/lib/x86_64-linux-gnu/libpthread.so.0(+0xf030)[0x7fa00ea99030]
node(uv_write2+0x27)[0xbb83b7]
node(_ZN4node12TLSCallbacks6EncOutEv+0x239)[0xb92c49]
node(_ZN4node12TLSCallbacks21OnClientHelloParseEndEPv+0x48)[0xb93a28]
node(_ZN4node6crypto7SSLWrapINS_12TLSCallbacksEE9EndParserERKN2v820FunctionCallbackInfoINS4_5ValueEEE+0x5d)[0xb7c12d]
node(_ZN2v88internal25FunctionCallbackArguments4CallEPFvRKNS_20FunctionCallbackInfoINS_5ValueEEEE+0x9b)[0x7a675b]
node[0x7c577d]
[0x2047f850634e]

Is it of any use? I thought of having gdb set up, but i've never done that with node.js before (apart from using node-inspector which i suspect uses that).

@migounette
Copy link

@misterdjules Well, I found a problem a week ago but I didn't succeed to reproduce, it was the source of a crash in intensive TLS traffic.

The enc_out_ may be null... my 2 cents

void TLSCallbacks::EncOutCb(uv_write_t* req, int status) {
TLSCallbacks* callbacks = static_cast<TLSCallbacks*>(req->data);

// Handle error
if (status) {
// Ignore errors after shutdown
if (callbacks->shutdown_)
return;

// Notify about error
callbacks->InvokeQueued(status);
return;

}

// Commit

// FIX <<------- START
if (callbacks->enc_out_) {
NodeBIO::FromBIO(callbacks->enc_out_)->Read(NULL, callbacks->write_size_);
} // FIX <<--- END
// Try writing more data
callbacks->write_size_ = 0;
callbacks->EncOut();
}

@matejkramny
Copy link
Author

@migounette can you make a PR?

indutny added a commit to indutny/io.js that referenced this issue May 14, 2015
When loading session, OCSP response, SNI, always check that the
`self._handle` is present. If it is not - the socket was closed - and we
should emit the error instead of throwing an uncaught exception.

Fix: nodejs/node-v0.x-archive#8780
Fix: nodejs#1696
@indutny
Copy link
Member

indutny commented May 14, 2015

May I ask you to give a try to nodejs/node#1702 ?

@matejkramny
Copy link
Author

Ok will do.

Thanks @indutny!

indutny added a commit to indutny/io.js that referenced this issue May 18, 2015
* Destroy `SSL*` and friends on a next tick to make sure that we are not
  doing it in one of the OpenSSL callbacks
* Add more checks to the C++ methods that might be invoked during
  destructor's pending queue cleanup

Fix: nodejs/node-v0.x-archive#8780
Fix: nodejs#1696
indutny added a commit to indutny/io.js that referenced this issue May 29, 2015
When loading session, OCSP response, SNI, always check that the
`self._handle` is present. If it is not - the socket was closed - and we
should emit the error instead of throwing an uncaught exception.

Fix: nodejs/node-v0.x-archive#8780
Fix: nodejs#1696
indutny added a commit to indutny/io.js that referenced this issue May 29, 2015
* Destroy `SSL*` and friends on a next tick to make sure that we are not
  doing it in one of the OpenSSL callbacks
* Add more checks to the C++ methods that might be invoked during
  destructor's pending queue cleanup

Fix: nodejs/node-v0.x-archive#8780
Fix: nodejs#1696
indutny added a commit to indutny/io.js that referenced this issue Jun 1, 2015
When loading session, OCSP response, SNI, always check that the
`self._handle` is present. If it is not - the socket was closed - and we
should emit the error instead of throwing an uncaught exception.

Fix: nodejs/node-v0.x-archive#8780
Fix: nodejs#1696
indutny added a commit to indutny/io.js that referenced this issue Jun 1, 2015
* Destroy `SSL*` and friends on a next tick to make sure that we are not
  doing it in one of the OpenSSL callbacks
* Add more checks to the C++ methods that might be invoked during
  destructor's pending queue cleanup

Fix: nodejs/node-v0.x-archive#8780
Fix: nodejs#1696
@jasnell
Copy link
Member

jasnell commented Jun 4, 2015

@matejkramny ... any updates on this?
@indutny ... was this fixed in io.js?

@migounette
Copy link

@indutny @misterdjules @jasnell @matejkramny
Guys, the problem found on my side is (and was) not easy to reproduce. My environment in which I have this depends on multiple servers in order to get loads on a node with a WebRTC application.

At this stage, I don't have any valid core dump due to a stupid problem of ulimit on our testing system (fixed now), but so far so good, I can't see this issue since we moved to 0.12
I didn't get the time to revert the node to a previous version.

So, I can help for investigating this issue if we can succeed to have a light test case.

indutny added a commit to nodejs/node that referenced this issue Jun 4, 2015
When loading session, OCSP response, SNI, always check that the
`self._handle` is present. If it is not - the socket was closed - and we
should emit the error instead of throwing an uncaught exception.

Fix: nodejs/node-v0.x-archive#8780
Fix: #1696
PR-URL: #1702
Reviewed-By: Trevor Norris <[email protected]>
indutny added a commit to nodejs/node that referenced this issue Jun 4, 2015
* Destroy `SSL*` and friends on a next tick to make sure that we are not
  doing it in one of the OpenSSL callbacks
* Add more checks to the C++ methods that might be invoked during
  destructor's pending queue cleanup

Fix: nodejs/node-v0.x-archive#8780
Fix: #1696
PR-URL: #1702
Reviewed-By: Trevor Norris <[email protected]>
@indutny
Copy link
Member

indutny commented Jun 6, 2015

@migounette what do you think about nodejs/node#1910 ?

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants