[Maybe controversial?] Add diagnostics to tasks by tomaka · Pull Request #4752 · paritytech/substrate

tomaka · 2020-01-28T17:11:12Z

This adds names to tasks and integrates the futures-diagnose tool into Substrate.
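To illustrate what "adding a name to a task" can mean, here is a minimal sketch using only the standard library. It is not the code from this PR nor the futures-diagnose API: `Named`, `PollLog`, `block_on` and the task name used in the usage note are hypothetical names chosen for the example.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};
use std::time::{Duration, Instant};

/// One record per call to `poll`: the task name and how long that poll took.
pub type PollLog = Arc<Mutex<Vec<(String, Duration)>>>;

/// Hypothetical wrapper that gives a future a name and times each poll.
pub struct Named<F> {
    name: String,
    inner: Pin<Box<F>>, // boxed so that no unsafe pin projection is needed
    log: PollLog,
}

impl<F: Future> Named<F> {
    pub fn new(name: &str, inner: F, log: PollLog) -> Self {
        Named { name: name.to_string(), inner: Box::pin(inner), log }
    }
}

impl<F: Future> Future for Named<F> {
    type Output = F::Output;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<F::Output> {
        // `Named<F>` is `Unpin` because all of its fields are `Unpin`.
        let this = self.get_mut();
        let start = Instant::now();
        let result = this.inner.as_mut().poll(cx);
        this.log.lock().unwrap().push((this.name.clone(), start.elapsed()));
        result
    }
}

/// Waker that does nothing; sufficient for the busy-polling executor below.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Tiny demonstration executor: polls in a loop until the future is ready.
pub fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
        std::thread::yield_now();
    }
}
```

Running `block_on(Named::new("import-queue", async { 21 * 2 }, log.clone()))` would leave one entry in `log` per poll of that task, which is the kind of data a trace viewer needs.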

Usage

Start your node with the environment variable PROFILE_DIR=profiles set, and futures-diagnose will create a directory named profiles containing a trace of all the future tasks being executed in Substrate (at least, all the tasks that got wrapped with futures-diagnose; I hope I didn't forget any).
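As a rough sketch of how a runtime switch based on PROFILE_DIR could be read, assuming (this is not confirmed by the PR) that an unset or empty variable means "tracing disabled". The function names are hypothetical, and futures-diagnose may handle the variable differently.

```rust
use std::ffi::OsString;
use std::path::PathBuf;

/// Interprets the raw value of the `PROFILE_DIR` environment variable.
/// Returns `None` when the variable is unset or empty
/// (assumption: an empty value means "disabled").
pub fn profile_dir_from(raw: Option<OsString>) -> Option<PathBuf> {
    raw.filter(|value| !value.is_empty()).map(PathBuf::from)
}

/// Reads the setting from the process environment.
pub fn profile_dir() -> Option<PathBuf> {
    profile_dir_from(std::env::var_os("PROFILE_DIR"))
}
```

Splitting the parsing from the environment lookup keeps the decision testable without mutating the process environment.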

You can then open the traces by starting Chrome and browsing to chrome://tracing. There's a Load button at the top.

Example output:

The X axis is the time, and the Y axis is the thread number.
Each block represents a task being polled. Here we can see that the import queue monopolizes an entire thread (unsurprisingly, this is while syncing). The little green lines are networking and telemetry sockets being polled. They are normally rectangles, but they are too thin in this screenshot.
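For readers unfamiliar with chrome://tracing: it loads JSON in the Trace Event Format, where a "complete" event (`"ph":"X"`) carries a start timestamp `ts` and a duration `dur` in microseconds, plus a thread id `tid`. The sketch below shows how one poll could map to one such event. The helper names are hypothetical, and the exact fields futures-diagnose writes are an assumption here.

```rust
/// Escapes backslashes and double quotes so `s` can sit inside a JSON string.
/// (Control characters are not handled in this sketch.)
fn escape(s: &str) -> String {
    s.replace('\\', "\\\\").replace('"', "\\\"")
}

/// One poll of one task as a Trace Event Format "complete" event.
/// `ts_us` is the start time and `dur_us` the duration, both in microseconds.
pub fn complete_event(name: &str, tid: u64, ts_us: u64, dur_us: u64) -> String {
    format!(
        r#"{{"name":"{}","ph":"X","pid":0,"tid":{},"ts":{},"dur":{}}}"#,
        escape(name),
        tid,
        ts_us,
        dur_us
    )
}

/// Wraps events in the JSON array form that the trace viewer can load.
pub fn trace_file(events: &[String]) -> String {
    format!("[{}]", events.join(","))
}
```

In the viewer, `ts` and `dur` position the block along the X axis, and `tid` selects the row on the Y axis.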

As an example of usefulness, this would have easily diagnosed the performance issue of last week, where everything started running in a single thread.

Why "[Maybe controversial?]"

To me it seems like half of the planet is integrating their own profiling solution inside Substrate, so I'm not sure whether this one is appropriate. Another option is to add names to tasks (what this PR does), but leave out the futures-diagnose tool. It can then easily be restored by tweaking the source code in case there's a performance issue.

It also uses an environment variable, which isn't great compared to a CLI option. Ideally, we should use a single runtime for everything (including the import queue), wrap this runtime around the diagnose tool, and customize it there.

I also have no idea where to document this, and this seems like a hidden undiscoverable feature.

mxinden

Overall I think this is a great idea to get more visibility into what is happening. In addition I don't think this change is very intrusive.

Have you been able to benchmark the performance impact futures-diagnose has here? On the one hand we only do this for root futures; on the other hand futures-diagnose seems to serialize all calls through a single Mutex (correct me if I am wrong).

Regarding configuration through an environment variable: what do you think of making this configurable only at compile time? The compiler could then remove all the "if enabled" checks, so this change would have zero impact on performance when disabled. The downside of a compile-time option is that we can't tell users to just set something at runtime to diagnose an issue; we would need to provide them with another binary instead.
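A minimal sketch of the compile-time variant, assuming a hypothetical Cargo feature named `task-diagnostics` (no such feature is defined in this PR). `cfg!` expands to a constant `true` or `false`, so the optimizer can drop the branch that is never taken.

```rust
use std::time::{Duration, Instant};

/// Runs `f`, timing it only when the hypothetical `task-diagnostics`
/// Cargo feature is enabled at compile time.
#[allow(unexpected_cfgs)]
pub fn maybe_timed<T>(f: impl FnOnce() -> T) -> (T, Option<Duration>) {
    if cfg!(feature = "task-diagnostics") {
        // Diagnostics compiled in: measure how long the call takes.
        let start = Instant::now();
        let out = f();
        (out, Some(start.elapsed()))
    } else {
        // Diagnostics compiled out: just run the closure.
        (f(), None)
    }
}
```

The trade-off matches the one described above: a binary built without the feature cannot be switched into diagnostic mode at runtime.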

tomusdrw

If it doesn't affect performance when profiling data is not being collected, I'm all in, but it would be really nice to get a document describing how to collect and analyze the outputs.

tomaka · 2020-01-29T10:31:56Z

> On the one hand we only do this for root futures, on the other hand futures-diagnose seems to serialize all calls through a single Mutex (correct me if I am wrong).

I didn't notice any performance degradation with this tool.

It's true that this Mutex mostly comes out of laziness, as figuring out how to properly do log rotation without any locking isn't trivial.
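To make the concern concrete, here is a small sketch of the pattern being discussed: several threads recording events into one sink guarded by a single Mutex. It is an illustration only, not the futures-diagnose implementation, and `record_from_threads` is a hypothetical name.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Spawns `threads` threads that each record `events_per_thread` events
/// into one shared, mutex-protected sink, then returns everything recorded.
pub fn record_from_threads(threads: usize, events_per_thread: usize) -> Vec<String> {
    let sink = Arc::new(Mutex::new(Vec::new()));

    let handles: Vec<_> = (0..threads)
        .map(|tid| {
            let sink = Arc::clone(&sink);
            thread::spawn(move || {
                for i in 0..events_per_thread {
                    // Critical section: every thread contends for this one lock.
                    sink.lock().unwrap().push(format!("thread {tid} poll {i}"));
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    // All clones were dropped when the threads finished, so this succeeds.
    Arc::try_unwrap(sink)
        .expect("all worker threads have been joined")
        .into_inner()
        .unwrap()
}
```

The lock is held only for the duration of a push, so contention stays low as long as recording is cheap relative to the work done in each poll; whether that holds for a real node is what a benchmark would have to show.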

Add diagnostics to tasks

2961478

tomaka added the A0-please_review label (Pull request needs code review.) Jan 28, 2020

tomaka requested a review from tomusdrw as a code owner January 28, 2020 17:11

mxinden self-requested a review January 29, 2020 08:57

mxinden mentioned this pull request Jan 29, 2020

src/fut_with_diag: Reduce overhead when logging disabled tomaka/futures-diagnose#1

Merged

mxinden reviewed Jan 29, 2020

tomusdrw approved these changes Jan 29, 2020

gavofyork approved these changes Jan 29, 2020

gavofyork merged commit 38c5ed0 into paritytech:master Jan 29, 2020

tomaka deleted the diagnose2 branch January 29, 2020 10:48

tomaka mentioned this pull request Jan 29, 2020

Companion PR to Substrate#4752 paritytech/polkadot#806

Closed
