show os thread id's in --dump #57

Closed
nlevitt opened this issue Nov 13, 2018 · 7 comments
Labels
enhancement New feature or request

Comments


nlevitt commented Nov 13, 2018

It would be nice to see the OS thread id's in --dump, that is, the id that gettid() returns. With that information you can correlate against ps -L or top (after pressing H), on linux at least.
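For reference, a minimal sketch of what the two kinds of id look like from inside a Python process. It assumes Python 3.8 or later, where `threading.get_native_id()` exposes the kernel-assigned thread id (the value `gettid()` returns on Linux). The helper name `collect_thread_ids` is made up for this illustration and is not part of py-spy.

```python
import os
import threading


def collect_thread_ids(n=3):
    """Start n threads and return a list of (python_ident, os_thread_id) pairs."""
    pairs = []
    lock = threading.Lock()
    barrier = threading.Barrier(n)

    def worker():
        # get_ident() is the value Python shows as "started <ident>";
        # get_native_id() (Python 3.8+) is the kernel-assigned thread id.
        pair = (threading.get_ident(), threading.get_native_id())
        with lock:
            pairs.append(pair)
        # Wait until every worker has recorded its ids, so all threads are
        # alive at the same time and the kernel cannot reuse an id.
        barrier.wait()

    threads = [threading.Thread(target=worker) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return pairs


if __name__ == "__main__":
    print("pid", os.getpid())
    for ident, native_id in collect_thread_ids():
        print(f"python ident {ident}  os thread id {native_id}")
```

On Linux, the second column is the number that should line up with the thread ids listed by `ps -L` or by `top` after pressing H, while the threads are still running.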

Owner

benfred commented Nov 16, 2018

I think this is a great idea! I also need to get the OS thread id's for some other things I want to add (profiling native extensions, and doing a better job of detecting when a thread is idle).

I took a quick look at doing this on both Linux and OSX, and I think I can grab the native thread id relatively easily on OSX anyway (the thread_info call, when asked for THREAD_IDENTIFIER_INFO, returns basically the python thread id in the thread_handle field of the returned struct). I'm still sorting out how best to get this on Linux.

@benfred benfred added the enhancement New feature or request label Nov 16, 2018
Author

nlevitt commented Mar 19, 2019

I looked at this again and discovered that it's now been implemented. Awesome! Thanks!

Caveats:

  • the system thread ids are only printed when you pass --native
  • they are printed in hex, not the most convenient format for cross-referencing with ps or top

I'm a little puzzled by the choice of hex, even for python thread ids. Perhaps there's some context where hex is conventional? But I feel like I mostly see them in decimal. For example:

>>> import threading
>>> threading.current_thread().ident
140520831489792
>>> threading.current_thread()
<_MainThread(MainThread, started 140520831489792)>

So I would advocate for printing all the thread ids in decimal. (Or it could print only the OS thread id in decimal, or print both thread ids in both decimal and hex, etc.)
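In the meantime, a hex id can be converted by hand before cross-referencing it with ps or top. A small sketch, where `to_decimal` is a hypothetical helper name and the input is assumed to be a plain hex string with or without a `0x` prefix (the exact py-spy output format is not assumed here):

```python
def to_decimal(thread_id_text):
    """Parse a thread id printed in hex (with or without a 0x prefix) as an int."""
    return int(thread_id_text.strip(), 16)


def to_hex(thread_id):
    """Format a decimal thread id as hex, e.g. to search for it in a dump."""
    return hex(thread_id)


if __name__ == "__main__":
    ident = 140520831489792  # the ident from the interpreter session above
    as_hex = to_hex(ident)
    print(as_hex, "->", to_decimal(as_hex))
```

Going the other way with `to_hex` is handy for searching a dump for an id copied from ps or from `threading.current_thread()`.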

Owner

benfred commented Mar 20, 2019

So - it's not quite done yet, which is why I haven't updated this issue =).

I still need to decouple the code that matches the OS thread id to the python thread id so that it can work even if you don't get the native trace. This will also let us have much better estimates of whether the thread is idle (#92).

Matching the OS thread to the python thread id is relatively easy with OSX and Windows - my problem is getting this going on Linux. The current code is a bit of a hack and involves grabbing the python thread id from the RBX register of the top level frame of the native stack:

// On unix based systems w/ pthreads - the python thread id
// is contained in the RBX register of the last frame (aside from main frame)
// This is sort of a massive hack, but seems to work
#[cfg(unix)]
{
    let next_bx = cursor.bx();
    if next_bx != 0 && threadids.contains(&next_bx) {
        python_thread_id = next_bx;
    }
}

The problem with this code is that it involves unwinding the native stack for the thread, which we obviously have to do with the --native option but is unnecessary otherwise. Also, unwinding native stacks still doesn't work all that reliably here yet (see #2).

One option I was toying with for this is calling the PyThread_get_thread_ident function for each native thread on linux systems using ptrace (as shown here https://github.com/eklitzke/ptrace-call-userspace) instead of reading the top level RBX register. I'm not sure if this is a better idea than just trying to get better native stack unwinding going - and continuing on using the RBX hack though =(.
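To show what PyThread_get_thread_ident returns, here is an in-process sketch using ctypes. This is not the ptrace approach described above, which would have to call the function inside another process. It only demonstrates that the C function yields the same value as `threading.get_ident()` for the calling thread. It assumes a CPython build where `ctypes.pythonapi` can resolve the symbol, and `idents_match` is a name invented for the example.

```python
import ctypes
import threading

# PyThread_get_thread_ident is a CPython C function that takes no arguments
# and returns an unsigned long identifying the calling thread.
get_ident_c = ctypes.pythonapi.PyThread_get_thread_ident
get_ident_c.argtypes = []
get_ident_c.restype = ctypes.c_ulong


def idents_match():
    """Return True if the C-level ident equals threading.get_ident()."""
    return get_ident_c() == threading.get_ident()


if __name__ == "__main__":
    print("C API ident:   ", get_ident_c())
    print("threading ident:", threading.get_ident())
    print("match:", idents_match())
```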

I agree about reporting thread ids in decimal - will make that change in the next dev release.

Author

nlevitt commented Mar 20, 2019

Wow, thanks for all your work on this!

P.S. I've always found it annoying that python hides the OS thread id.

Owner

benfred commented Jul 7, 2019

This PR refactors so that we will usually have the OS thread id in --dump: #123.

Owner

benfred commented Jul 7, 2019

Can install with pip install py-spy==0.2.0.dev2

@benfred benfred closed this as completed Jul 7, 2019
Author

nlevitt commented Jul 8, 2019

Thanks!
