Native profiling (cython) -- merge error #210
Thanks for the bug report! I really need to see both the native and python stack traces at a single moment to figure this out. Can you run your program and then pause it with Control-Z (so that it's stopped running but still active)? Once it's stopped, you can first get the python stack trace with
Thanks, the Python:
GDB:
I forgot to mention that this is using Python 3.6.6 (I guess you could tell that from the trace), but similar issues occurred with Python 2.7.5. Only one Python thread (BLAS may be using others). An earlier version that didn't use gevent/greenlet used multiple threads, with the same py-spy errors. Thanks again!
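For anyone else following the same workflow: besides gdb's py-bt, the standard-library `faulthandler` module can grab the Python side of the stack at a chosen moment. This is only a stdlib cross-check sketch, not part of py-spy:

```python
# Stdlib-only sketch for cross-checking the Python stack at a chosen
# moment, without py-spy or gdb. faulthandler writes straight to a
# file descriptor, so it needs a real file, not a StringIO.
import faulthandler
import tempfile


def dump_python_stack() -> str:
    """Return the interpreter's current stack for all threads."""
    with tempfile.TemporaryFile(mode="w+") as f:
        faulthandler.dump_traceback(file=f, all_threads=True)
        f.seek(0)
        return f.read()


if __name__ == "__main__":
    print(dump_python_stack())
```

On Unix you can also call `faulthandler.register(signal.SIGUSR1)` at startup, so a plain `kill -USR1 <pid>` dumps the stack on demand, which pairs well with the Control-Z approach above.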
To add a little more info, even though I did see this issue before I started using greenlets, there is definitely something funny going on with the reported Python stack traces after we're in a gevent loop. Here is a toy example:
I ran it like this to extract the py-spy and gdb stack traces (I didn't need to suspend it because it's sleep()ing most of the time and is very deterministic):
The py-spy dumps:
Clearly after we spawn and are in a gevent loop, the reported Python stack from py-spy is missing all the parent frames of the currently running greenlet. I don't know if gevent or greenlet are actually manipulating the Python stack to look like that or if they are just doing something unexpected and py-spy is parsing the Python stack incorrectly. The oldest frames in all three gdb stacks are exactly the same, in the "after spawn" case there are, as expected, many more frames. This is the "before spawn" and "after join" gdb trace (exactly the same):
This is the "after spawn" gdb trace (note that from frame #57 down, it is the same as the above trace from #3 down):
Hope that is helpful, and hope that gevent isn't opening up a huge can of worms. (Again, I got py-spy merging errors even without gevent, but I'm going to be using gevent going forward, so I'd like that to work)
Just another update, I have been able to temporarily resolve this by compiling my own version of greenlet where it copies over the current stored stack when creating a greenlet (otherwise every greenlet looks like it was conjured out of thin air):
If creating a greenlet using gevent, it stores the "spawning stack" in its Greenlet wrapper, so conceivably there is a way for py-spy to find it, but it is probably complicated. Furthermore, both options may still run into problems with the Python/greenlet stacks and native stacks being out of sync if a greenlet spawns another greenlet. On the other hand, maybe it would work out fine since the greenlet saves the current native stack and Python stacks at the same time? Don't know if there is a legitimate reason for greenlets to keep their ancestor stacks hidden. Fun times!
Interesting! Thanks for tracking this down. It does look like the issue is gevent hiding the ancestor stack trace, causing us to think we're not merging correctly. I think the correct thing for py-spy to do here is to detect this case and just merge up to the top of the python stack.
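To make the proposed behaviour concrete, here is a rough pure-Python sketch of that idea. py-spy's real merge logic lives in its Rust code, and every name below is made up for illustration:

```python
# Toy illustration only: walk the native stack newest-first, substitute
# a python frame at each eval-loop frame, and instead of erroring when
# the python frames run out early (a greenlet with hidden ancestors),
# keep going with the bare native frame.
EVAL_FRAME = "PyEval_EvalFrameDefault"  # the interpreter's eval-loop symbol


def merge_stacks(native_frames, python_frames):
    """Merge two stacks given oldest-first; returns newest-first."""
    merged = []
    py = list(python_frames)  # oldest -> newest
    for frame in reversed(native_frames):  # walk newest -> oldest
        if frame == EVAL_FRAME and py:
            merged.append(py.pop())  # substitute the python frame
        elif frame == EVAL_FRAME:
            # Old behaviour: bail out with "Failed to merge native and
            # python frames". Proposed behaviour: the python stack just
            # ends here, so keep the native frame and continue.
            merged.append(frame)
        else:
            merged.append(frame)
    return merged


if __name__ == "__main__":
    native = ["main", EVAL_FRAME, "gevent_switch", EVAL_FRAME]
    python = ["sleepy"]  # only the greenlet's own frame is visible
    print(merge_stacks(native, python))
    # -> ['sleepy', 'gevent_switch', 'PyEval_EvalFrameDefault', 'main']
```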
I have the same issue on Windows 10 in a conda environment with Python 3.10 installed. When profiling my cython extension, it just prints the error above. The detailed environment:
I'm getting the same thing on Windows 11 in a conda environment:
My Python:
I have this problem when debugging something that uses Turbodbc:
If I remove the usage of
Just commenting with my own woes in case anyone has figured out solutions for theirs. Using

```
py-spy record -f speedscope -o benchmark/baseline1_lineno.json --native -- python benchmark/benchmark.py 1
```

I get:

```
Error: Failed to merge native and python frames (Have 5 native and 9 python)
```
I had the same issue.
I have the same issue. I do not use Cython, but have a C extension that I am trying to profile with
I have the same issue with py-spy 0.3.14 and python 3.11. The output file is still produced, but I assume I can't trust the results given all the errors? Also, all the results point to
I also observe the same issue for the processes that use
Same error message, but without any cython extension that I'm aware of.
Profiling this script using python 3.12 (conda environment):

```python
import numpy as np
import pandas as pd
from scipy.stats import linregress


def calc_slope(window: np.ndarray) -> float:
    slope, intercept, r_value, p_value, std_err = linregress(
        np.arange(len(window)), window
    )
    return float(slope)


if __name__ == "__main__":
    for i in range(50):
        ts = pd.Series(data=np.random.random(672) * 100.0)
        slope_ts = ts.rolling(window=24).apply(calc_slope, raw=True)
```
Getting the same error on Ubuntu 24.04 LTS profiling a Python application with a Rust backend (using
Update: I recompiled, and native profiling seems to "work" now, albeit with a lot of errors.
I think this tool is pretty incredible. Unfortunately, it has worked only sporadically in my current project, and I'm not sure why it works/doesn't work. My issues resemble some of the errors reported in #2, especially the "Failed to merge native and python frames" message. I am using py-spy version 0.3.1 installed via pip, on CentOS 7.5 x86_64. I am compiling several Cython extensions (some of which call each other), and the command line for the compile stage indicates debugging is enabled:
I then run the base python script through py-spy:
And the output is:
The successful samples are at the very beginning, before it hits any Cython extensions, and the failed ones all report the following errors (using RUST_LOG=info):
Note that the difference between the native and python stack lengths is always 3. I have compiled using gcc 4.8.5 and 8.3.1 with the same result. I was not successful building straight from the repo using cargo, but it does not seem that loading unwind_ptrace is the issue anyway. I would love to figure out what's going on, and thought a first step might be to figure out what the actual stack frames are before merging and have them printed out -- would there be an easy way to do this?
Also, I did mention there were a few blissful moments when it worked perfectly, but I have no idea what I did to make it work, and then it suddenly stopped working again with no obvious change on my part (aside from recompilation of the Cython extensions; no software upgrades). Any help appreciated!
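Not a way to see py-spy's internal pre-merge view, but one low-tech cross-check is to print the Python-side frames from inside the program at the point of interest and compare them by hand against the native stack from gdb. A stdlib-only sketch (the function names here are made up):

```python
# Sketch: log the current Python stack from inside the program so it
# can be compared against the native stack gdb reports at that moment.
import traceback


def log_python_stack(tag: str) -> list:
    """Print and return the current Python stack, oldest frame first."""
    raw = traceback.extract_stack()
    frames = [f"{fs.filename}:{fs.lineno} {fs.name}" for fs in raw]
    print(f"--- python stack at {tag} ---")
    for line in frames:
        print(line)
    return frames


def inner():
    # Call this at the point where py-spy reports the merge failure.
    return log_python_stack("inner")


if __name__ == "__main__":
    inner()
```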