Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Flamegraphs is not working properly in v16+ #293

Closed
RafaelGSS opened this issue Oct 27, 2021 · 7 comments
Closed

Flamegraphs is not working properly in v16+ #293

RafaelGSS opened this issue Oct 27, 2021 · 7 comments
Assignees

Comments

@RafaelGSS
Copy link
Contributor

Current Behavior

@mcollina has an insight about the flame graphs is not working properly in the latest versions of Node.js.
I'll be looking at it and update in this issue.

@RafaelGSS RafaelGSS self-assigned this Oct 27, 2021
@RafaelGSS
Copy link
Contributor Author

RafaelGSS commented Oct 29, 2021

I was checking all the tools before going to the flamegraphs and I've found a bug inside bubbleprof:

  • By running the following command:
./bin.js bubbleprof --open=false --autocannon [ / --method GET ] -- node server.js

On node v14 it works perfectly, however, at v16 the spawned process never exits for two main reasons:

  • When the application just doesn't finish(because of the above behavior) and you send a SIGINT signal the spawned process does not close as well. Looks like the SIGINT behavior at v16 has changed. On SIGINT event the app tries to close the spawned process as well (here). At v14 it works well, but at v16 it doesn't seem to close the app itself, but if you remove the signal to just proc.kill() (default SIGTERM) it will work even in v16.

So I would say that we are dealing with 2 different issues in my point of view. I'm still debugging and trying to create a reproducible example. If you have any tips, let me know!

What I have tried so far:

  • I've been using some eBPF stuff to monitor the file descriptor, however, I haven't found any insights at low-level.
  • Use async_hooks + stream.Transport to create a reproducible example in the same way, but wasn't reproducible.

UPDATE: The SIGINT issue was related to my OS, just fixed and working fine

@mcollina
Copy link
Contributor

Keep going! I won't have time to dig deep into the problem anytime soon unfortunately :(

@RafaelGSS
Copy link
Contributor Author

Well, I was debugging the:

The stream never emit close event properly. This is never called (at v16) for some unknown reason.

And I've discovered that on v15.8.0 it's working, but in v15.9.0 it starts to fail. Well, after git bisect looks like nodejs/node#37300 has added this behavior.

I also did:

git revert 521c08dee15ce8ada35571c94a4b6883ae00d41b over v15.9.0 and it's working fine. However, since it only affects the auto close application, we've decided to no go through it and keep the focus on the flamegraph differences between those versions.

@simoneb
Copy link
Contributor

simoneb commented Nov 1, 2021

Interesting finding @RafaelGSS, though somewhat surprising as that change (at least in my eyes) doesn't seem to carry any functional differences that would explain the change in behavior we're seeing. Do you have any clue as to why this is happening? How likely do you think it may be that the actual reason is something related to that commit, but not specifically that commit?

@RafaelGSS
Copy link
Contributor Author

@simoneb I'm not sure, but looks like was specifically this commit since if I revert only that one, it works. So, I would say that the write stream was affected, I mean, it's hard to affirm it without an easily reproducible code.

@RafaelGSS
Copy link
Contributor Author

I did a comparison between flame on v14.15 and v16.6 and they look pretty similar, the biggest difference looks like in the context filters.

Look at the following screenshot:

image

Node v14 and v16 at respectively left and right sides. If you check the filter context we have the following:

  • v14 consider internal/* as node context

image

  • v16 consider internal/* as V8 context

image


So, I feel that there are no issues so far, only filters adjustment. By the way, I'm assuming that internal/* should be placed as node context instead of v8, right?

Additional considerations

I've also created the profiling through linux_perf just to make sure if there are missed translations through node versions, but they look pretty similar.

@mcollina
Copy link
Contributor

mcollina commented Nov 1, 2021

So, I feel that there are no issues so far, only filters adjustment. By the way, I'm assuming that internal/* should be placed as node context instead of v8, right?

Yes exactly!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants