inspector: fix process._debugEnd() for inspector #12777

Merged on May 22, 2017

Contributor

@eugeneo eugeneo commented May 1, 2017

This change ensures that the WebSocket server can be stopped
(and restarted if needed) by calling process._debugEnd.

Fixes: #12559

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • tests and/or benchmarks are included
  • commit message follows commit guidelines
Affected core subsystem(s)

inspector: fixes around WS server stopping and (re)starting.

@nodejs-github-bot nodejs-github-bot added the c++, dont-land-on-v4.x, and inspector labels May 1, 2017
@TimothyGu
Member

This needs a rebase, and what is the nature of process._debugEnd()? Are users allowed to call it themselves or is it a private API?

@823639792

823639792 commented May 2, 2017

I need it

@eugeneo
Contributor Author

eugeneo commented May 2, 2017

@TimothyGu this is not a new API - it was working with the old debugger. I did not know about this API until #12559 - so now I am fixing the issues.

There are some genuine bugs that this API uncovered - those really need to be fixed.

@eugeneo
Contributor Author

eugeneo commented May 2, 2017

(I am looking into CI failures, looks like some new bugs were introduced)

@TimothyGu
Member

While I agree that we should at a minimum make the function work again under Inspector, I think we should consider making it fully public (document it and rename it to process.debugEnd) if it is deemed to be useful, especially now that it seems some people are using it.

@eugeneo
Contributor Author

eugeneo commented May 2, 2017

As this PR is not introducing/altering existing API, I would like to proceed with it in its current form (well, after I figure out how to fix intermittent issues :) )

Feel free to open a separate issue about the public API. I expect a lot of discussions there...

@eugeneo
Contributor Author

eugeneo commented May 2, 2017

All issues seem to be resolved. CI run: https://ci.nodejs.org/job/node-test-pull-request/7810/

@eugeneo eugeneo requested review from refack and removed request for refack May 2, 2017 21:57
static_cast<Agent*>(handle->data)->StartIoThread(false);
}

void StartIoCallback(Isolate* isolate, void* agent) {
Contributor

Why is there an Isolate* isolate arg?

Contributor Author

-    }).on('close', () => assert(this.expectClose_, 'Socket closed prematurely'));
+    }).on('close', () => {
+      assert(this.expectClose_, 'Socket closed prematurely');
+      this.closeCallback_ && this.closeCallback_();
Contributor

I'm just trying to grok the code...
Who will set this.closeCallback_?

Contributor Author

This test "framework" is really messy. I am trying to clean it up, but have not yet reached the stage where I can create a pull request...

Currently the callback is set in sendCommandsAndExpectClose, e.g. when there's a need to handle connection closing and not just assert.

refack previously requested changes May 2, 2017
Contributor

@refack refack left a comment

I need another run through the code, it's been 3 years since I was last in there...
Could you summarize the change in 3-4 sentences?

@@ -89,13 +90,18 @@ void HandleSyncCloseCb(uv_handle_t* handle) {
int CloseAsyncAndLoop(uv_async_t* async) {
  bool is_closed = false;
  async->data = &is_closed;
  uv_close(reinterpret_cast<uv_handle_t*>(async), HandleSyncCloseCb);
Contributor

Help me here.
What happened to HandleSyncCloseCb?

Contributor Author

Great catch! Thank you. I removed it by mistake together with debug statements when debugging the intermittent test failures :/

@refack
Contributor

refack commented May 2, 2017

If I get the gist, we need to stop the InspectorAgent better, as it wasn't fully implemented during the transition?

]).expectShutDown(0);
}

const script = 'process._debugEnd();' +
Contributor

Is this a new functionality? detaching and reattaching?

Contributor Author

It is not a new functionality. Both APIs were already there (_debugProcess is meant to attach to other processes - but looks like it can be made to work on self just fine).

_debugProcess sends a signal on Posix systems. It is the function used to attach to processes that were not started with --inspect.
_debugEnd stops the agent. It looks like it has been there for ages; personally I am not sure why it is needed :)

Contributor

personally I am not sure why it is needed :)

Which one? _debugProcess, I assume?

Or am I missing something?

Contributor Author

_debugProcess is useful, frontends use it to attach to running instances.

_debugEnd, on the other hand, can only be called on self, so the application needs to explicitly "want" to make itself undebuggable. I am not sure why...

Contributor

Why the complex script?
Run once with just _debugEnd since this is the change.
We want to know if just this is failing, now it could fail on any of the two.
You can do a second run with this script.

Contributor

But, but, but, that is what you're fixing.... 😕

Contributor

You're just doing it for @823639792?

Contributor Author

  1. Complex script - I noticed that starting the inspector after it had been stopped crashed Node, so some test coverage definitely helps.
  2. _debugEnd - the goal here is to ensure that inspector can cleanly stop. There were some genuine bugs and race conditions that need to be addressed.

Contributor

Sure, so do two tests...

@eugeneo
Contributor Author

eugeneo commented May 2, 2017

@refack Right. The initial inspector prototype was a fork of debug_agent.cc, so it compiled, but I did not pay much attention to Stop (the assumption was that it is only called during shutdown).

@refack
Contributor

refack commented May 2, 2017

@eugeneo can you rename this PR to fix process._debugEnd() or transition process._debugEnd() to inspector

@eugeneo eugeneo changed the title inspector: implement process._debugEnd() inspector: fix process._debugEnd() for inspector May 3, 2017
@eugeneo
Contributor Author

eugeneo commented May 3, 2017

@refack I updated commit message and PR title. I also added a test case that only stops the inspector.

harness.expectShutDown(42);
}

helper.startNodeForInspectorTest(testStop, '--inspect',
Contributor

shouldn't it be '--inspect-brk'

Contributor Author

No. That test does not wait for the session. The tests cover shutting down the inspector with or without the session.

Contributor

Just wondering because of the failures....

@refack
Contributor

refack commented May 3, 2017

Do you have any idea why it fails only on AIX & FreeBSD?


const script = 'process._debugEnd();' +
'process._debugProcess(process.pid);' +
'process._debugEnd();' +
Contributor

Is this line new?

Contributor Author

I am testing whether this fixes the failures. The server starts (and prints its message) on a background thread, so I suspect the test fails because the process shuts down too fast. _debugEnd synchronizes the threads.

Contributor

That was my guess...

@eugeneo
Contributor Author

eugeneo commented May 3, 2017

@refack looks like process exits before the stdio buffers are flushed - adding console.log fixes the failure (this is BSD stress test - https://ci.nodejs.org/job/node-stress-single-test/1192/nodes=freebsd10-64/)

I uploaded updated test.

@eugeneo
Contributor Author

eugeneo commented May 3, 2017

Still failing on AIX: https://ci.nodejs.org/job/node-stress-single-test/1193/nodes=aix61-ppc64/console, investigating...

@eugeneo
Contributor Author

eugeneo commented May 5, 2017

Looks like the only way to fix the test once and for all would be to add the delay - which I won't do as it increases duration of the test run for little gain. This is AIX stress test with arbitrary delay added: https://ci.nodejs.org/job/node-stress-single-test/nodes=aix61-ppc64/1198/

What I uploaded here stops checking for the inspector startup message. If the inspector actually fails to start there will be a crash, so checking the process exit code should be enough.
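
A minimal sketch of the exit-code-only check described above, assuming a child process that exits with a distinctive code. `runAndGetExitCode` is a hypothetical helper, not part of the inspector test harness; it uses only `child_process.spawnSync` and ignores everything the child prints.

```javascript
'use strict';
const { spawnSync } = require('child_process');

// Run a snippet in a child Node.js process and report only its exit code.
// stdout and stderr are deliberately ignored, so the check cannot race with
// messages printed from a background thread.
function runAndGetExitCode(script, nodeArgs = []) {
  const result = spawnSync(process.execPath, [...nodeArgs, '-e', script], {
    stdio: 'ignore',
  });
  if (result.error) throw result.error;
  return result.status; // null if the child was killed by a signal
}

// The child exits with a distinctive code, so a crash (a different status,
// or null after a signal) is distinguishable from a clean run.
const status = runAndGetExitCode('process.exit(42);');
```

In the scenario discussed in this PR, the child would presumably be started with `--inspect` and would call `process._debugEnd()` before exiting; that variant is not run here because it depends on the inspector being able to bind a port.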

@refack
Contributor

refack commented May 5, 2017

What I uploaded here stops checking for the inspector startup message. If the inspector actually fails to start there will be a crash, so checking the process exit code should be enough.

Sounds reasonable. There's an explicit test for that in #11207

@eugeneo
Contributor Author

eugeneo commented May 9, 2017

@eugeneo
Contributor Author

eugeneo commented May 11, 2017

@refack, @sam-github please let me know if I have not addressed any of your comments - I'm a bit confused with the GitHub UI...

CI is green: https://ci.nodejs.org/job/node-test-pull-request/7974/

@sam-github
Contributor

@eugeneo you didn't respond to #12777 (comment), and yes, the new review UI has some problems.

@eugeneo
Contributor Author

eugeneo commented May 11, 2017

@sam-github

I'm specifically looking for ways to start and stop the debugger programmatically, this seems to only have stop (called "end", I assume that is historical, because it's not an obvious name)

True, these are the legacy APIs. The closest way to (re)start the debugger is process._debugProcess(process.pid). It uses IPC (signals on *nix and the debug API on Windows) but seems to work well if called on self.

It's not clear whether start/stop can be done with #12263, or whether these _ prefixed APIs should be documented, or perhaps renamed or even moved to the inspector module.

API design is out of scope for this PR. #12263 also deals with the JS bindings for the session management - but it will be trivial to add some nicer start/stop APIs to inspector module, if need be.
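
For illustration, the stop-then-restart sequence mentioned in this comment could be wrapped as below. This is a sketch under assumptions: `restartDebugger` is a hypothetical helper, the underscore-prefixed hooks are undocumented and may not behave the same across Node.js versions, and the `proc` parameter is injectable so the call order can be exercised against a stub instead of a live inspector.

```javascript
'use strict';

// Stop the inspector agent and ask the process to start it again, using the
// legacy hooks named in this thread. `proc` defaults to the real process but
// can be replaced by a stub (as done below).
function restartDebugger(proc = process) {
  if (typeof proc._debugEnd !== 'function' ||
      typeof proc._debugProcess !== 'function') {
    throw new Error('legacy debug hooks are not available');
  }
  proc._debugEnd();              // stop the agent and its WebSocket server
  proc._debugProcess(proc.pid);  // ask this same process to start it again
}

// Dry run against a stub that only records what was called.
const calls = [];
const stub = {
  pid: 1234,
  _debugEnd() { calls.push('end'); },
  _debugProcess(pid) { calls.push(`start:${pid}`); },
};
restartDebugger(stub);
```

Calling `restartDebugger()` with no argument would act on the current process, which is the "called on self" case described above.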

Contributor

@sam-github sam-github left a comment

A couple small nits, but basically LGTM

io_->Stop();
io_.reset();
Contributor

I have trouble understanding Node core C++ conventions, does it make sense that Stop() is upper case, and reset() is lower case @addaleax ?

Member

Node core is usually more UpperCamelCase() for methods, but reset() doesn’t come from Node core. I think this is okay.

Contributor Author

Stop is InspectorIo::Stop which is Node.
reset is std::unique_ptr::reset which is STL...

@@ -96,6 +97,12 @@ int CloseAsyncAndLoop(uv_async_t* async) {
return uv_loop_close(async->loop);
}

void ReleasePairOnAsyncClose(uv_handle_t* async) {
AsyncAndAgent* pair = node::ContainerOf(&AsyncAndAgent::first,
reinterpret_cast<uv_async_t*>(async));
Contributor

I think this line is supposed to be indented to line up after the parenthesis (having to redo this after renaming the type on the line before is why I don't like this style, but it's the Node style)

Contributor Author

Done, thanks!

    DispatchMessages();
  }
  return true;
}

void InspectorIo::Stop() {
  CHECK(state_ == State::kAccepting || state_ == State::kConnected);
Contributor

+1

@eugeneo
Contributor Author

eugeneo commented May 11, 2017

@sam-github
Contributor

LGTM, @refack PTAL

@eugeneo
Contributor Author

eugeneo commented May 17, 2017

@refack - gentle ping, please take another look :)

@sam-github
Contributor

@refack PTAL

@refack refack dismissed their stale review May 19, 2017 19:55

What I've seen was good

@refack
Contributor

refack commented May 19, 2017

Sorry, for some reason your pings got filtered out from my email list.

@sam-github
Contributor

@eugeneo noticed a typo in the commit message s/buy/by

Contributor

@refack refack left a comment

LGTM, pending CI and commit message nits.

@refack
Contributor

refack commented May 19, 2017

@sam-github
Contributor

Failures displayed in the PR may not be accurate, check @refack 's build on ci (I started a build 30 seconds after, and cancelled it, so I think the cancelled status is what is showing up in the github view).

@refack
Contributor

refack commented May 19, 2017

PS fedora22 is backlogged with something like 6hr work, if it's the only one left, I say land.
https://ci.nodejs.org/job/node-test-commit-linux/nodes=fedora22/

@sam-github
Contributor

Thanks @refack. I'll let @eugeneo land it so he can do any fixups he wants on the way in.

@eugeneo
Contributor Author

eugeneo commented May 22, 2017

@eugeneo eugeneo closed this May 22, 2017
@eugeneo eugeneo deleted the fix__debugEnd branch May 22, 2017 17:36
@eugeneo eugeneo merged commit 9cd991d into nodejs:master May 22, 2017
@eugeneo
Contributor Author

eugeneo commented May 22, 2017

I forgot to update the commit message to include PR-URL and Reviewed-By. Does this justify a revert?

@refack
Contributor

refack commented May 22, 2017

Force push, and apologize on IRC (that's the procedure)

@eugeneo
Contributor Author

eugeneo commented May 22, 2017

Done, thanks! Will push the proper commit.

@eugeneo
Contributor Author

eugeneo commented May 22, 2017

Landed as 5c26378

@sam-github
Contributor

@eugeneo

*** CID 169616:  Resource leaks  (CTOR_DTOR_LEAK)
/src/inspector_io.cc: 179 in node::inspector::InspectorIo::InspectorIo(node::Environment *, v8::Platform *, const std::basic_string<char, std::char_traits<char>, std::allocator<char>>&, const node::DebugOptions &, bool)()
173                              : options_(options), thread_(), delegate_(nullptr),
174                                state_(State::kNew), parent_env_(env),
175                                io_thread_req_(), platform_(platform),
176                                dispatching_messages_(false), session_id_(0),
177                                script_name_(path),
178                                wait_for_connect_(wait_for_connect) {
>>>     CID 169616:  Resource leaks  (CTOR_DTOR_LEAK)
>>>     The constructor allocates field "main_thread_req_" of "node::inspector::InspectorIo" but the destructor and whatever functions it calls do not free it.
179       main_thread_req_ = new AsyncAndAgent({uv_async_t(), env->inspector_agent()});
180       CHECK_EQ(0, uv_async_init(env->event_loop(), &main_thread_req_->first,
181                                 InspectorIo::MainThreadAsyncCb));
182       uv_unref(reinterpret_cast<uv_handle_t*>(&main_thread_req_->first));
183       CHECK_EQ(0, uv_sem_init(&start_sem_, 0));
184     }

jasnell pushed a commit that referenced this pull request May 23, 2017
This change ensures that the WebSocket server can be stopped
(and restarted if needed) buy calling process._debugEnd.

PR-URL: #12777
Fixes: #12559
Reviewed-By: Sam Roberts <[email protected]>
Reviewed-By: Refael Ackermann <[email protected]>
@eugeneo
Contributor Author

eugeneo commented May 23, 2017

@sam-github it actually gets freed in ReleasePairOnAsyncClose, which is called asynchronously after the request is closed from InspectorIo::~InspectorIo

@jasnell jasnell mentioned this pull request May 28, 2017
Labels
c++, inspector
Development

Successfully merging this pull request may close these issues.

Stop inspect the main thread Blocked.
8 participants