Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix #1512 #1563

Closed
wants to merge 3 commits into from
Closed

Fix #1512 #1563

wants to merge 3 commits into from

Conversation

indutny
Copy link
Member

@indutny indutny commented Apr 29, 2015

n/t

See #1558

Dispatch requests in the implementation of the stream, not in the code
creating these requests. The requests might be piled up and invoked
internally in the implementation, so it should know better when it is
the time to dispatch them.

In fact, TLS was doing exactly this thing which led us to...

Fix: nodejs#1512
Make sure that no WriteItem's callback will be invoked synchronously.
Doing so may lead to the use of uninitialized `req` object, or even
worse use-after-free in the caller code.

Fix: nodejs#1512
@indutny
Copy link
Member Author

indutny commented Apr 29, 2015

@silverwind
Copy link
Contributor

Rebuilt again with this PR, no crashes 👍

@indutny
Copy link
Member Author

indutny commented Apr 29, 2015

LGTY anyone? cc @shigeki @bnoordhuis @iojs/collaborators

@shigeki
Copy link
Contributor

shigeki commented Apr 30, 2015

@indutny I reviewed the patch and made several tests on Win and Linux. The issue was resolved with this. But I still could not figure out the reason why series of InvokeQueued were called in sync when ECONNRESET error only on Windows not on Linux. Do you have any information about this?

@indutny
Copy link
Member Author

indutny commented Apr 30, 2015

Perhaps the roots are in libuv, and the difference in the way it reports error on windows and on unix. /cc @piscisaureus @bnoordhuis @saghul

@indutny
Copy link
Member Author

indutny commented Apr 30, 2015

@shigeki and thank you for taking a look

@saghul
Copy link
Member

saghul commented Apr 30, 2015

@indutny can you brief me on the core issue?

@indutny
Copy link
Member Author

indutny commented Apr 30, 2015

Oh right. Basically, uv_write() returns error synchronously more often on windows than it does on unix. The code wasn't well written and didn't handle the synchronous error in a right manner (it should emulate asynchronous behavior in some cases), but this didn't show up anywhere except windows.

@indutny
Copy link
Member Author

indutny commented Apr 30, 2015

I suspect that it could be that windows always invokes the syscall from the main thread, while unix does it off-thread when the queue is not empty. But I can't believe that the queue was full all the time.

Maybe it is about the way unix reports errors like ECONNRESET?

@saghul
Copy link
Member

saghul commented Apr 30, 2015

@indutny Yeah. Even when Unix does the write on the spot (because the write queue is empty) the error is reported on the next loop iteration: https://github.com/libuv/libuv/blob/v1.x/src/unix/stream.c#L1381

On Windows, however we don't queue the writes, the kernel does. If the call to WSASend fails, however, we return the error immediately. Can you try this (untested) patch? https://gist.github.com/saghul/e05b99b1b16a7c9c15e9

At any rate, I'd say we should handle this in libuv, uv_write should only return the following errors:

  • ENOMEM: we couldn't copy the buffer structures
  • EBADF: the stream was closed or not bound, and same goes for the handle we're sneding, if any
  • ENOBUFS: if we're trying to send nothing

@silverwind
Copy link
Contributor

@saghul unfortunately, it fails to compile:

  tcp.c
src\win\tcp.c(872): error C2039: 'u' : is not a member of 'uv_write_s' [C:\git\iojs\deps\uv\libuv.vcxproj]
          C:\git\iojs\deps\uv\include\uv.h(468) : see declaration of 'uv_write_s'
src\win\tcp.c(874): error C2039: 'stream' : is not a member of 'uv_tcp_s' [C:\git\iojs\deps\uv\libuv.vcxproj]
          C:\git\iojs\deps\uv\include\uv.h(490) : see declaration of 'uv_tcp_s'

@saghul
Copy link
Member

saghul commented Apr 30, 2015

Ah, that patch was done against latest libuv, I'll make one against 1.4.2 for you to test.

@saghul
Copy link
Member

saghul commented Apr 30, 2015

@silverwind here is a patch which should compile with libuv 1.4.2: https://gist.github.com/saghul/52677dba652ab8e78051

@silverwind
Copy link
Contributor

@saghul thanks, the patch works fine and looks to fix the crashes too.

@saghul
Copy link
Member

saghul commented Apr 30, 2015

@silverwind Fantastic! I'll open an issue over at libuv so we can land this properly.

@saghul
Copy link
Member

saghul commented Apr 30, 2015

libuv/libuv#339

@indutny
Copy link
Member Author

indutny commented Apr 30, 2015

I suggest landing this PR anyway. @silverwind, @shigeki LGTY?

@silverwind
Copy link
Contributor

This is a critical bug, so I'm fine with landing a temporary fix for it, to be removed in the next libuv release. Can't really review the code, sorry.

@indutny
Copy link
Member Author

indutny commented Apr 30, 2015

This is not a temporary fix. The error wasn't properly handed anyway, it is the bad libuv behavior the exposed the problem, but it was here for a long time anyway.

@saghul
Copy link
Member

saghul commented Apr 30, 2015

@indutny The fix looks kinda obvious, but a LGTM from @piscisaureus would be nice.

@piscisaureus
Copy link
Contributor

@indutny
Copy link
Member Author

indutny commented Apr 30, 2015

@piscisaureus and this one? ^

@saghul
Copy link
Member

saghul commented Apr 30, 2015

@indutny it's the same fix. We changed the internal structure to not use an anonymous union.

@indutny
Copy link
Member Author

indutny commented Apr 30, 2015

@saghul I meant reviewing this PR...

@piscisaureus
Copy link
Contributor

@indutny 5992187 lgtm although I don't understand what you mean by "In fact, TLS was doing exactly this thing which led us to...".

@indutny
Copy link
Member Author

indutny commented Apr 30, 2015

@piscisaureus the commit log is borked, I'll rewrite it. Thanks for the review!

@shigeki
Copy link
Contributor

shigeki commented Apr 30, 2015

@saghul Thanks for the patch. I really understand this behavior on Windows. It is very helpful for me.

@indutny I have one more question. Do sync/async checks need per WriteItem? Because most of InvokeQueued are called in async, if InvokeQueued is called in sync only when an error returns from uv_write(), can we also invoke ``InvokeQueuedin async viauv_check`? Like

diff --git a/src/tls_wrap.cc b/src/tls_wrap.cc
index c774a84..1bd1f05 100644
--- a/src/tls_wrap.cc
+++ b/src/tls_wrap.cc
@@ -117,6 +117,23 @@ bool TLSWrap::InvokeQueued(int status) {
 }


+void TLSWrap::CheckInvokeQueued::CheckCb(uv_check_t* check) {
+  CheckInvokeQueued* c = reinterpret_cast<CheckInvokeQueued*>(check->data);
+  c->w_->InvokeQueued(c->status_);
+  uv_close(reinterpret_cast<uv_handle_t*>(check), nullptr);
+  delete c;
+}
+
+
+int TLSWrap::InvokeQueuedAsync(int status) {
+  CheckInvokeQueued* check = new CheckInvokeQueued(this, status);
+  uv_check_init(env()->event_loop(), &check->check_);
+  check->check_.data = check;
+  int err = uv_check_start(&check->check_, CheckInvokeQueued::CheckCb);
+  return err;
+}
+
+
 void TLSWrap::NewSessionDoneCb() {
   Cycle();
 }
@@ -311,7 +328,7 @@ void TLSWrap::EncOut() {
   // Ignore errors, this should be already handled in js
   if (err) {
     write_req->Dispose();
-    InvokeQueued(err);
+    InvokeQueuedAsync(err);
   } else {
     NODE_COUNT_NET_BYTES_SENT(write_size_);
   }
diff --git a/src/tls_wrap.h b/src/tls_wrap.h
index 9f09535..de7140f 100644
--- a/src/tls_wrap.h
+++ b/src/tls_wrap.h
@@ -62,6 +62,22 @@ class TLSWrap : public crypto::SSLWrap<TLSWrap>,
   // Maximum number of buffers passed to uv_write()
   static const int kSimultaneousBufferCount = 10;

+  class CheckInvokeQueued {
+  public:
+  CheckInvokeQueued(TLSWrap* w, int status) : w_(w), status_(status)  {
+    }
+
+    ~CheckInvokeQueued() {
+      w_ = nullptr;
+    }
+
+    static void CheckCb(uv_check_t* check);
+
+    TLSWrap* w_;
+    int status_;
+    uv_check_t check_;
+  };
+
   // Write callback queue's item
   class WriteItem {
    public:
@@ -88,6 +104,7 @@ class TLSWrap : public crypto::SSLWrap<TLSWrap>,
   void ClearOut();
   void MakePending();
   bool InvokeQueued(int status);
+  int  InvokeQueuedAsync(int status);

   inline void Cycle() {
     // Prevent recursion

I've just borrowed your code. If it is right, I think it can reduce the costs to check that if every WriteItem are in async or sync.

@saghul
Copy link
Member

saghul commented Apr 30, 2015

I might be completely off here... but is this really needed if libuv does The Right Thing (TM)? If uv_write only fails with the errors I mentioned here: #1563 (comment) then, from iojs' prespective, those are programming errors and you could abort(). In all other cases the callback would be called in the future, as expected.

@shigeki
Copy link
Contributor

shigeki commented Apr 30, 2015

@saghul Yes, I agree it. After fixing libuv, only ENOMEM, EBADF or ENOBUFS returns from uv_write() so that we need not to save in the sync case.

@indutny
Copy link
Member Author

indutny commented May 1, 2015

Hm... if you insist on it.... Let's land the first commit then.

@indutny
Copy link
Member Author

indutny commented May 1, 2015

Any objections?

@shigeki
Copy link
Contributor

shigeki commented May 1, 2015

No, objection. LGTM.

indutny added a commit that referenced this pull request May 1, 2015
Dispatch requests in the implementation of the stream, not in the code
creating these requests. The requests might be piled up and invoked
internally in the implementation, so it should know better when it is
the time to dispatch them.

In fact, TLS was doing exactly this thing which led us to...

Fix: #1512
PR-URL: #1563
Reviewed-By: Shigeki Ohtsu <[email protected]>
@indutny
Copy link
Member Author

indutny commented May 1, 2015

Landed in 30b7349, thank you!

@indutny indutny closed this May 1, 2015
@indutny indutny deleted the fix/gh-1512 branch May 1, 2015 11:53
@rvagg rvagg mentioned this pull request May 2, 2015
rvagg added a commit that referenced this pull request May 4, 2015
PR-URL: #1532

Notable Changes:

* crypto: significantly reduced memory usage for TLS (Fedor Indutny & Сковорода
  Никита Андреевич) #1529
* net: socket.connect() now accepts a 'lookup' option for a custom DNS
  resolution mechanism, defaults to dns.lookup() (Evan Lucas) #1505
* npm: Upgrade npm to 2.9.0. See the v2.8.4 and v2.9.0 release notes for
  details. Notable items:
  - Add support for default author field to make npm init -y work without
    user-input (@othiym23) npm/npm/d8eee6cf9d
  - Include local modules in npm outdated and npm update (@ArnaudRinquin)
    npm/npm#7426
  - The prefix used before the version number on npm version is now configurable
    via tag-version-prefix (@kkragenbrink) npm/npm#8014
* os: os.tmpdir() is now cross-platform consistent and will no longer returns a
  path with a trailling slash on any platform (Christian Tellnes) #747
* process:
  - process.nextTick() performance has been improved by between 2-42% across the
    benchmark suite, notable because this is heavily used across core (Brian White) #1548
  - New process.geteuid(), process.seteuid(id), process.getegid() and
    process.setegid(id) methods allow you to get and set effective UID and GID
    of the process (Evan Lucas) #1536
* repl:
  - REPL history can be persisted across sessions if the NODE_REPL_HISTORY_FILE
    environment variable is set to a user accessible file,
    NODE_REPL_HISTORY_SIZE can set the maximum history size and defaults to 1000
    (Chris Dickinson) #1513
  - The REPL can be placed in to one of three modes using the NODE_REPL_MODE
    environment variable: sloppy, strict or magic (default); the new magic mode
    will automatically run "strict mode only" statements in strict mode
    (Chris Dickinson) #1513
* smalloc: the 'smalloc' module has been deprecated due to changes coming in V8
  4.4 that will render it unusable
* util: add Promise, Map and Set inspection support (Christopher Monsanto) #1471
* V8: upgrade to 4.2.77.18, see the ChangeLog for full details. Notable items:
  - Classes have moved out of staging; the class keyword is now usable in strict
    mode without flags
  - Object literal enhancements have moved out of staging; shorthand method and
    property syntax is now usable ({ method() { }, property })
  - Rest parameters (function(...args) {}) are implemented in staging behind the
    --harmony-rest-parameters flag
  - Computed property names ({['foo'+'bar']:'bam'}) are implemented in staging
    behind the --harmony-computed-property-names flag
  - Unicode escapes ('\u{xxxx}') are implemented in staging behind the
    --harmony_unicode flag and the --harmony_unicode_regexps flag for use in
    regular expressions
* Windows:
  - Random process termination on Windows fixed (Fedor Indutny) #1512 / #1563
  - The delay-load hook introduced to fix issues with process naming (iojs.exe /
    node.exe) has been made opt-out for native add-ons. Native add-ons should
    include 'win_delay_load_hook': 'false' in their binding.gyp to disable this
    feature if they experience problems . (Bert Belder) #1433
* Governance:
  - Rod Vagg (@rvagg) was added to the Technical Committee (TC)
  - Jeremiah Senkpiel (@Fishrock123) was added to the Technical Committee (TC)
@indutny indutny restored the fix/gh-1512 branch May 5, 2015 14:23
@indutny indutny reopened this May 5, 2015
@indutny
Copy link
Member Author

indutny commented May 5, 2015

I suggest landing the second part of this changeset too, at least until it'll be fixed in libuv.

@indutny
Copy link
Member Author

indutny commented May 5, 2015

cc @saghul @shigeki @Fishrock123

@indutny
Copy link
Member Author

indutny commented May 5, 2015

PTAL

@saghul
Copy link
Member

saghul commented May 5, 2015

The patch will land as is plus a test on libuv, so I guess you could float
it.
On May 5, 2015 16:24, "Fedor Indutny" [email protected] wrote:

PTAL


Reply to this email directly or view it on GitHub
#1563 (comment).

@indutny indutny closed this May 5, 2015
@indutny indutny deleted the fix/gh-1512 branch May 5, 2015 20:42
@silverwind
Copy link
Contributor

@indutny should we revert 30b7349 now that the libuv update is here in #1646?

@indutny
Copy link
Member Author

indutny commented May 6, 2015

@silverwind nope, it is good to have anyway.

@silverwind
Copy link
Contributor

Alright!

Fishrock123 pushed a commit to Fishrock123/node that referenced this pull request May 19, 2015
Dispatch requests in the implementation of the stream, not in the code
creating these requests. The requests might be piled up and invoked
internally in the implementation, so it should know better when it is
the time to dispatch them.

In fact, TLS was doing exactly this thing which led us to...

Fix: nodejs#1512
PR-URL: nodejs#1563
Reviewed-By: Shigeki Ohtsu <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants