Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Native profiling (cython) -- merge error #210

Open
SyamGadde opened this issue Dec 31, 2019 · 16 comments
Open

Native profiling (cython) -- merge error #210

SyamGadde opened this issue Dec 31, 2019 · 16 comments

Comments

@SyamGadde
Copy link

I think this tool is pretty incredible. Unfortunately, it has worked only sporadically in my current project, and I'm not sure why it works/doesn't work. My issues resemble some of the errors reported in #2, especially the "Failed to merge native and python frames" message. I am using py-spy version 0.3.1 installed using pip, on Centos 7.5 x86_64. I am compiling several Cython extensions (some of whom call each other), and the command-line for the compile stage indicates debugging is enabled:

gcc -pthread -Wno-unused-result -Wsign-compare -DDYNAMIC_ANNOTATIONS_ENABLED=1 -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv -fPIC -DCYTHON_TRACE=0 -I/usr/include/python3.6m -c build/lib/project/lossy_compress_nodebug.c -o build/temp.linux-x86_64-3.6/build/lib/project/lossy_compress_nodebug.o

I then run the base python script through py-spy:

~/VENV3/bin/py-spy record  -o ~/tmp12.pyspy.svg --native  -- ~/VENV3/bin/python ~/tmp12.py --exitafterone

And the output is:

Sampling process 100 times a second. Press Control-C to exit.
1.01s behind in sampling, results may be inaccurate. Try reducing the sampling rate..
  Collected 57612 samples (56884 errors)
Stopped sampling because process exitted
Wrote flamegraph data to '/home/sg9/tmp12.pyspy.svg'. Samples: 57612 Errors: 56884

The successful samples are at the very beginning, before it hits any Cython extensions, and the failed ones all report the following errors (using RUST_LOG=info):

⠒ Collected 77 samples (22 errors)
[2019-12-31T21:43:47.435024222Z WARN  py_spy] Failed to get stack trace from 26617: Failed to merge native and python frames (Have 7 native and 4 python)
[2019-12-31T21:43:47.465658980Z WARN  py_spy] Failed to get stack trace from 26617: Failed to merge native and python frames (Have 7 native and 4 python)
[2019-12-31T21:43:47.480930106Z WARN  py_spy] Failed to get stack trace from 26617: Failed to merge native and python frames (Have 7 native and 4 python)
⠠ Collected 81 samples (26 errors)
[2019-12-31T21:43:47.513377821Z WARN  py_spy] Failed to get stack trace from 26617: Failed to merge native and python frames (Have 6 native and 3 python)
[2019-12-31T21:43:47.529921084Z WARN  py_spy] Failed to get stack trace from 26617: Failed to merge native and python frames (Have 7 native and 4 python)
[2019-12-31T21:43:47.544238935Z WARN  py_spy] Failed to get stack trace from 26617: Failed to merge native and python frames (Have 7 native and 4 python)
[2019-12-31T21:43:47.558172214Z WARN  py_spy] Failed to get stack trace from 26617: Failed to merge native and python frames (Have 7 native and 4 python)
⠉ Collected 86 samples (31 errors)
[2019-12-31T21:43:47.587753753Z WARN  py_spy] Failed to get stack trace from 26617: Failed to merge native and python frames (Have 7 native and 4 python)
[2019-12-31T21:43:47.602930263Z WARN  py_spy] Failed to get stack trace from 26617: Failed to merge native and python frames (Have 8 native and 5 python)
[2019-12-31T21:43:47.617928596Z WARN  py_spy] Failed to get stack trace from 26617: Failed to merge native and python frames (Have 8 native and 5 python)
[2019-12-31T21:43:47.631587547Z WARN  py_spy] Failed to get stack trace from 26617: Failed to merge native and python frames (Have 7 native and 4 python)

Note that the difference between native and python stack lengths is always 3. I have compiled using gcc 4.8.5 and 8.3.1 with the same result. I was not successful building straight from the repo using cargo, but it does not seem loading unwind_ptrace is the issue anyway. I would love to figure out what's going on, and thought a first step might be to figure out what the actual stack frames are, before merging, and having them printed out -- would there be an easy way to do this?

Also, I did mention there were a few blissful moments when it worked perfectly, but I have no idea what I did to made it work, and then it suddenly stopped working again, with no obvious change on my part (aside from recompilation of the Cython extensions, and no software upgrades). Any help appreciated!

@benfred
Copy link
Owner

benfred commented Jan 2, 2020

Thanks for the bug report!

I really need to see both the native/python stack traces at a single moment to figure this out.

Can you run your program - and then pause it with Control-Z (so that its stopped running but still active)? Once it's stopped, you can first get the python stack trace with py-spy dump -p <PID>. The easiest way to get the native stack trace is probably using gdb: gdb --pid <PID>. Once in GDB you can get the stack by typing bt. If you have multiple threads can you get for each one (info threads to list the threads thread 2 to switch to the threads etc). Note that GDB will restart the program after it exits, so you need to run gdb after the py-spy dump command to get a consistent stack traces between the native/python code.

@SyamGadde
Copy link
Author

Thanks, the py-spy dump line was exactly what I needed. Here are the traces.

Python:

Thread 19850 (idle): "MainThread"
    _get_chunk_bytes (tckz/tckz.py:754)
    encode_chunk (tckz/tckz.py:776)
    _tck2tckz (tckz/tckz.py:990)

GDB:

#0  0x00007fa3d4b86033 in PyObject_Malloc () from /lib64/libpython3.6m.so.1.0
#1  0x00007fa3d4b9e857 in PyUnicode_New () from /lib64/libpython3.6m.so.1.0
#2  0x00007fa3d4ba8abd in _PyUnicode_FromASCII () from /lib64/libpython3.6m.so.1.0
#3  0x00007fa3d4bb5b15 in unicode_subscript () from /lib64/libpython3.6m.so.1.0
Python Exception <class 'gdb.error'> There is no member named length.: 
#4  0x00007fa3b198e52e in __Pyx_PyObject_GetSlice (_py_stop=0x0, wraparound=1, has_cstop=0, has_cstart=1, _py_slice=0x7fa3b1c80258 <__pyx_slice__3>, _py_start=0x0, cstop=0, cstart=-8, obj=)
    at build/lib/tckz/lossy_compress_nodebug.c:39173
#5  __pyx_f_4tckz_22lossy_compress_nodebug_BitCollector_push_word (__pyx_v_self=0x7fa3b0d78c60) at build/lib/tckz/lossy_compress_nodebug.c:4786
#6  0x00007fa3b1a18197 in __pyx_f_4tckz_22lossy_compress_nodebug_BitCollector_write_bits (__pyx_v_nbits=4 '\004', __pyx_v_bits=8191, __pyx_v_self=0x7fa3b0d78c60)
    at build/lib/tckz/lossy_compress_nodebug.c:5450
#7  __pyx_f_4tckz_22lossy_compress_nodebug_BitCollector_write_bits_list (__pyx_v_n=20, __pyx_v_nbitsp=0xca37754, __pyx_v_bitsp=0x17c214c8, __pyx_v_self=0x7fa3b0d78c60)
    at build/lib/tckz/lossy_compress_nodebug.c:5237
#8  __pyx_pf_4tckz_22lossy_compress_nodebug_11compress (__pyx_v_pts=<numpy.ndarray at remote 0x7fa3b3ce6990>, __pyx_v_nptslist=<numpy.ndarray at remote 0x7fa3b3ce6760>, __pyx_v_maxdctsize=0L, 
    __pyx_v_tolfunc=<optimized out>, __pyx_self=<optimized out>) at build/lib/tckz/lossy_compress_nodebug.c:19450
#9  0x00007fa3b1a6c62d in __pyx_pw_4tckz_22lossy_compress_nodebug_12compress (__pyx_self=<optimized out>, 
    __pyx_args=(<numpy.ndarray at remote 0x7fa3b3ce6990>, <numpy.ndarray at remote 0x7fa3b3ce6760>, 0L), __pyx_kwds=<optimized out>) at build/lib/tckz/lossy_compress_nodebug.c:6681
#10 0x00007fa3d4b82f9f in _PyCFunction_FastCallDict () from /lib64/libpython3.6m.so.1.0
#11 0x00007fa3d4b83379 in _PyCFunction_FastCallKeywords () from /lib64/libpython3.6m.so.1.0
#12 0x00007fa3d4beefaf in call_function () from /lib64/libpython3.6m.so.1.0
#13 0x00007fa3d4be4605 in _PyEval_EvalFrameDefault () from /lib64/libpython3.6m.so.1.0
#14 0x00007fa3d4bedfec in _PyEval_EvalCodeWithName () from /lib64/libpython3.6m.so.1.0
#15 0x00007fa3d4beee4a in fast_function () from /lib64/libpython3.6m.so.1.0
#16 0x00007fa3d4bef0d3 in call_function () from /lib64/libpython3.6m.so.1.0
#17 0x00007fa3d4be4605 in _PyEval_EvalFrameDefault () from /lib64/libpython3.6m.so.1.0
#18 0x00007fa3d4bedfec in _PyEval_EvalCodeWithName () from /lib64/libpython3.6m.so.1.0
#19 0x00007fa3d4bf02ae in _PyFunction_FastCallDict () from /lib64/libpython3.6m.so.1.0
#20 0x00007fa3d4b44e9e in _PyObject_FastCallDict () from /lib64/libpython3.6m.so.1.0
#21 0x00007fa3d4b44fb1 in _PyObject_Call_Prepend () from /lib64/libpython3.6m.so.1.0
---Type <return> to continue, or q <return> to quit---
#22 0x00007fa3d4b44c23 in PyObject_Call () from /lib64/libpython3.6m.so.1.0
#23 0x00007fa3d4be505a in _PyEval_EvalFrameDefault () from /lib64/libpython3.6m.so.1.0
#24 0x00007fa3d4bef3d2 in PyEval_EvalCodeEx () from /lib64/libpython3.6m.so.1.0
#25 0x00007fa3d4b5d172 in function_call () from /lib64/libpython3.6m.so.1.0
Python Exception <class 'gdb.error'> There is no member named length.: 
#26 0x00007fa3bb96742f in __Pyx_PyObject_Call (kw=<optimized out>, arg=, func=<function at remote 0x7fa3bbbce0d0>) at src/gevent/greenlet.c:20528
#27 __pyx_pf_6gevent_9_greenlet_8Greenlet_38run (__pyx_v_self=0x7fa3b3d3e448) at src/gevent/greenlet.c:11215
#28 __pyx_pw_6gevent_9_greenlet_8Greenlet_39run (__pyx_v_self=<gevent._greenlet.Greenlet at remote 0x7fa3b3d3e448>, unused=<optimized out>) at src/gevent/greenlet.c:11120
#29 0x00007fa3bc429cfe in __Pyx_CyFunction_CallMethod (kw=0x0, arg=(), self=<gevent._greenlet.Greenlet at remote 0x7fa3b3d3e448>, func=<cython_function_or_method at remote 0x7fa3bbba5100>)
    at src/gevent/_hub_local.c:4407
#30 __Pyx_CyFunction_CallAsMethod (func=<cython_function_or_method at remote 0x7fa3bbba5100>, args=<optimized out>, kw=0x0) at src/gevent/_hub_local.c:4465
#31 0x00007fa3d4b44e20 in _PyObject_FastCallDict () from /lib64/libpython3.6m.so.1.0
#32 0x00007fa3d4b44fb1 in _PyObject_Call_Prepend () from /lib64/libpython3.6m.so.1.0
#33 0x00007fa3d4b44c23 in PyObject_Call () from /lib64/libpython3.6m.so.1.0
#34 0x00007fa3bc21e5cf in g_initialstub (mark=mark@entry=0x7ffefe9a3520) at greenlet.c:823
#35 0x00007fa3bc21df76 in g_switch (target=0x7fa3b3d3e448, args=(), kwargs=<optimized out>) at greenlet.c:597
#36 0x00007fa3bc21ee13 in green_switch (self=<optimized out>, args=<optimized out>, kwargs=<optimized out>) at greenlet.c:1124
#37 0x00007fa3d4b8324e in PyCFunction_Call () from /lib64/libpython3.6m.so.1.0
#38 0x00007fa3d4b44c23 in PyObject_Call () from /lib64/libpython3.6m.so.1.0
#39 0x00007fa3bc673947 in gevent_call (loop=0x7fa3b3d0dac8, cb=0x7fa3b3ce95c8) at src/gevent/libev/callbacks.c:152
#40 0x00007fa3bc670cfe in __pyx_f_6gevent_5libev_8corecext_4loop__run_callbacks (__pyx_v_self=0x7fa3b3d0dac8) at src/gevent/libev/corecext.c:5586
#41 0x00007fa3bc67387a in gevent_loop_run_callbacks (__pyx_v_loop=__pyx_v_loop@entry=0x7fa3b3d0dac8) at src/gevent/libev/corecext.c:15744
#42 0x00007fa3bc675700 in gevent_run_callbacks (_loop=<optimized out>, watcher=0x7fa3b3d0dae0, revents=<optimized out>) at src/gevent/libev/callbacks.c:194
#43 0x00007fa3bc646115 in ev_invoke_pending (loop=0x7fa3bc8868a0 <default_loop_struct>) at /tmp/build/gevent/deps/libev/ev.c:3314
#44 0x00007fa3bc672060 in ev_run (loop=0x7fa3bc8868a0 <default_loop_struct>, flags=flags@entry=0) at /tmp/build/gevent/deps/libev/ev.c:3606
Python Exception <class 'gdb.error'> There is no member named ob_ival.: 
#45 0x00007fa3bc67272a in __pyx_pf_6gevent_5libev_8corecext_4loop_14run (__pyx_v_once=, __pyx_v_nowait=<optimized out>, __pyx_v_self=0x7fa3b3d0dac8) at src/gevent/libev/corecext.c:6882
#46 __pyx_pw_6gevent_5libev_8corecext_4loop_15run (__pyx_v_self=<gevent.libev.corecext.loop at remote 0x7fa3b3d0dac8>, __pyx_args=<optimized out>, __pyx_kwds=<optimized out>)
    at src/gevent/libev/corecext.c:6842
---Type <return> to continue, or q <return> to quit---
#47 0x00007fa3d4b82f9f in _PyCFunction_FastCallDict () from /lib64/libpython3.6m.so.1.0
#48 0x00007fa3d4beefaf in call_function () from /lib64/libpython3.6m.so.1.0
#49 0x00007fa3d4be3377 in _PyEval_EvalFrameDefault () from /lib64/libpython3.6m.so.1.0
#50 0x00007fa3d4bf01ba in _PyFunction_FastCallDict () from /lib64/libpython3.6m.so.1.0
#51 0x00007fa3d4b44e9e in _PyObject_FastCallDict () from /lib64/libpython3.6m.so.1.0
#52 0x00007fa3d4b44fb1 in _PyObject_Call_Prepend () from /lib64/libpython3.6m.so.1.0
#53 0x00007fa3d4b44c23 in PyObject_Call () from /lib64/libpython3.6m.so.1.0
#54 0x00007fa3bc21e5cf in g_initialstub (mark=mark@entry=0x7ffefe9a3c70) at greenlet.c:823
#55 0x00007fa3bc21df76 in g_switch (target=target@entry=0x7fa3bc8ca198, args=args@entry=(), kwargs=kwargs@entry=0x0) at greenlet.c:597
#56 0x00007fa3bc21eebb in PyGreenlet_Switch (g=0x7fa3bc8ca198, args=(), kwargs=0x0) at greenlet.c:1441
#57 0x00007fa3bc01411f in __pyx_f_6gevent_21__greenlet_primitives__greenlet_switch (__pyx_v_self=0x7fa3bc8ca198) at src/gevent/_greenlet_primitives.c:2679
#58 __pyx_f_6gevent_21__greenlet_primitives_25SwitchOutGreenletWithLoop_switch (__pyx_v_self=0x7fa3bc8ca198, __pyx_skip_dispatch=__pyx_skip_dispatch@entry=1) at src/gevent/_greenlet_primitives.c:1958
#59 0x00007fa3bc0147ce in __pyx_pf_6gevent_21__greenlet_primitives_25SwitchOutGreenletWithLoop_switch (__pyx_v_self=<optimized out>) at src/gevent/_greenlet_primitives.c:2008
#60 __pyx_pw_6gevent_21__greenlet_primitives_25SwitchOutGreenletWithLoop_1switch (__pyx_v_self=<optimized out>, unused=<optimized out>) at src/gevent/_greenlet_primitives.c:1995
Python Exception <class 'gdb.error'> There is no member named ma_mask.: 
#61 0x00007fa3bc429cfe in __Pyx_CyFunction_CallMethod (kw=0x0, arg=(), self=, func=<cython_function_or_method at remote 0x7fa3bc8e7498>) at src/gevent/_hub_local.c:4407
#62 __Pyx_CyFunction_CallAsMethod (func=<cython_function_or_method at remote 0x7fa3bc8e7498>, args=<optimized out>, kw=0x0) at src/gevent/_hub_local.c:4465
Python Exception <class 'gdb.error'> There is no member named ma_mask.: 
#63 0x00007fa3bb960819 in __Pyx_PyObject_Call (func=<cython_function_or_method at remote 0x7fa3bc8e7498>, arg=, kw=0x0) at src/gevent/greenlet.c:20528
Python Exception <class 'gdb.error'> There is no member named ma_mask.: 
#64 0x00007fa3bb9561bd in __Pyx__PyObject_CallOneArg (func=<cython_function_or_method at remote 0x7fa3bc8e7498>, arg=) at src/gevent/greenlet.c:20567
Python Exception <class 'gdb.error'> There is no member named ma_mask.: 
#65 0x00007fa3bb977c8e in __Pyx_PyObject_CallOneArg (arg=, func=<optimized out>) at src/gevent/greenlet.c:20586
#66 __pyx_f_6gevent_9_greenlet_8Greenlet_join (__pyx_v_self=0x7fa3b3d3e448, __pyx_skip_dispatch=<optimized out>, __pyx_optional_args=<optimized out>) at src/gevent/greenlet.c:10124
#67 0x00007fa3bb962a4b in __pyx_pf_6gevent_9_greenlet_8Greenlet_36join (__pyx_v_timeout=<optimized out>, __pyx_v_self=0x7fa3b3d3e448) at src/gevent/greenlet.c:10612
#68 __pyx_pw_6gevent_9_greenlet_8Greenlet_37join (__pyx_v_self=<gevent._greenlet.Greenlet at remote 0x7fa3b3d3e448>, __pyx_args=<optimized out>, __pyx_kwds=0x0) at src/gevent/greenlet.c:10596
#69 0x00007fa3bc429cb4 in __Pyx_CyFunction_CallMethod (kw=0x0, arg=(), self=<gevent._greenlet.Greenlet at remote 0x7fa3b3d3e448>, func=<cython_function_or_method at remote 0x7fa3bbba5048>)
    at src/gevent/_hub_local.c:4402
#70 __Pyx_CyFunction_CallAsMethod (func=<cython_function_or_method at remote 0x7fa3bbba5048>, args=<optimized out>, kw=0x0) at src/gevent/_hub_local.c:4465
#71 0x00007fa3d4b44e20 in _PyObject_FastCallDict () from /lib64/libpython3.6m.so.1.0
#72 0x00007fa3d4bef15c in call_function () from /lib64/libpython3.6m.so.1.0
---Type <return> to continue, or q <return> to quit---
#73 0x00007fa3d4be3377 in _PyEval_EvalFrameDefault () from /lib64/libpython3.6m.so.1.0
#74 0x00007fa3d4bedfec in _PyEval_EvalCodeWithName () from /lib64/libpython3.6m.so.1.0
#75 0x00007fa3d4beee4a in fast_function () from /lib64/libpython3.6m.so.1.0
#76 0x00007fa3d4bef0d3 in call_function () from /lib64/libpython3.6m.so.1.0
#77 0x00007fa3d4be4605 in _PyEval_EvalFrameDefault () from /lib64/libpython3.6m.so.1.0
#78 0x00007fa3d4bef3d2 in PyEval_EvalCodeEx () from /lib64/libpython3.6m.so.1.0
#79 0x00007fa3d4bf005b in PyEval_EvalCode () from /lib64/libpython3.6m.so.1.0
#80 0x00007fa3d4c7863e in run_mod () from /lib64/libpython3.6m.so.1.0
#81 0x00007fa3d4b22fd2 in PyRun_FileExFlags () from /lib64/libpython3.6m.so.1.0
#82 0x00007fa3d4b23392 in PyRun_SimpleFileExFlags () from /lib64/libpython3.6m.so.1.0
#83 0x00007fa3d4c7ef0c in Py_Main () from /lib64/libpython3.6m.so.1.0
#84 0x0000000000400a99 in main ()

I forgot to mention that this is using Python 3.6.6 (I guess you could tell that from the trace), but similar issues occurred with Python 2.7.5. Only one Python thread (BLAS may be using others). An earlier version that didn't use gevent/greenlet used multiple threads, with the same py-spy errors. Thanks again!

@SyamGadde
Copy link
Author

SyamGadde commented Jan 6, 2020

To add a little more info, even though I did see this issue before I started using greenlets, there is definitely something funny going on with the reported Python stack traces after we're in a gevent loop. Here is a toy example:

# tmp98.py
import time
import gevent

def test():

    def sleeper():
        print("after spawn")
        return sleeper1()

    def sleeper1():
        time.sleep(30)

    print("before spawn")
    time.sleep(30)

    gl = gevent.spawn(sleeper)
    gl.join()

    print("after join")
    time.sleep(30)

test()

I ran it like this to extract the py-spy and gdb stack traces (I didn't need to suspend it because it's sleep()ing most of the time and is very deterministic):

( python ~/tmp98.py & savedpid=$! ; sleep 10 && for i in 1 2 3 ; do py-spy dump -p "${savedpid}" > dump"${i}" ; (echo bt | gdb --pid "${savedpid}" > stack"${i}" ); sleep 30 ; done )

The py-spy dumps:

# dump1 (before spawn)
Process 16814: python /home/sg9/tmp98.py
Python v3.6.6 (/home/sg9/VENV3/bin/python36)

Thread 16814 (idle)
    test (tmp98.py:14)
    <module> (tmp98.py:22)
# dump2 (after spawn)
Process 16814: python /home/sg9/tmp98.py
Python v3.6.6 (/home/sg9/VENV3/bin/python36)

Thread 16814 (idle)
    sleeper1 (tmp98.py:11)
    sleeper (tmp98.py:8)
# dump3 (after join)
Process 16814: python /home/sg9/tmp98.py
Python v3.6.6 (/home/sg9/VENV3/bin/python36)

Thread 16814 (idle)
    test (tmp98.py:20)
    <module> (tmp98.py:22)

Clearly after we spawn and are in a gevent loop, the reported Python stack from py-spy is missing all the parent frames of the currently running greenlet. I don't know if gevent or greenlet are actually manipulating the Python stack to look like that or if they are just doing something unexpected and py-spy is parsing the Python stack incorrectly.

The oldest frames in all three gdb stacks are exactly the same, in the "after spawn" case there are, as expected, many more frames.

This is the "before spawn" and "after join" gdb trace (exactly the same):

#0  0x00007fe5ee53a973 in __select_nocancel () from /lib64/libc.so.6
#1  0x00007fe5ef2d573a in time_sleep () from /lib64/libpython3.6m.so.1.0
#2  0x00007fe5ef21ff66 in _PyCFunction_FastCallDict () from /lib64/libpython3.6m.so.1.0
#3  0x00007fe5ef28bfaf in call_function () from /lib64/libpython3.6m.so.1.0
#4  0x00007fe5ef280377 in _PyEval_EvalFrameDefault () from /lib64/libpython3.6m.so.1.0
#5  0x00007fe5ef28b467 in _PyEval_EvalCodeWithName () from /lib64/libpython3.6m.so.1.0
#6  0x00007fe5ef28be4a in fast_function () from /lib64/libpython3.6m.so.1.0
#7  0x00007fe5ef28c0d3 in call_function () from /lib64/libpython3.6m.so.1.0
#8  0x00007fe5ef280377 in _PyEval_EvalFrameDefault () from /lib64/libpython3.6m.so.1.0
#9  0x00007fe5ef28c3d2 in PyEval_EvalCodeEx () from /lib64/libpython3.6m.so.1.0
#10 0x00007fe5ef28d05b in PyEval_EvalCode () from /lib64/libpython3.6m.so.1.0
#11 0x00007fe5ef31563e in run_mod () from /lib64/libpython3.6m.so.1.0
#12 0x00007fe5ef1bffd2 in PyRun_FileExFlags () from /lib64/libpython3.6m.so.1.0
#13 0x00007fe5ef1c0392 in PyRun_SimpleFileExFlags () from /lib64/libpython3.6m.so.1.0
#14 0x00007fe5ef31bf0c in Py_Main () from /lib64/libpython3.6m.so.1.0
#15 0x0000000000400a99 in main ()

This is the "after spawn" gdb trace (note that from frames # 57 down, it is the same as the above trace from # 3 down)

#0  0x00007fe5ee53a973 in __select_nocancel () from /lib64/libc.so.6
#1  0x00007fe5ef2d573a in time_sleep () from /lib64/libpython3.6m.so.1.0
#2  0x00007fe5ef21ff66 in _PyCFunction_FastCallDict () from /lib64/libpython3.6m.so.1.0
#3  0x00007fe5ef28bfaf in call_function () from /lib64/libpython3.6m.so.1.0
#4  0x00007fe5ef280377 in _PyEval_EvalFrameDefault () from /lib64/libpython3.6m.so.1.0
#5  0x00007fe5ef28afec in _PyEval_EvalCodeWithName () from /lib64/libpython3.6m.so.1.0
#6  0x00007fe5ef28be4a in fast_function () from /lib64/libpython3.6m.so.1.0
#7  0x00007fe5ef28c0d3 in call_function () from /lib64/libpython3.6m.so.1.0
#8  0x00007fe5ef280377 in _PyEval_EvalFrameDefault () from /lib64/libpython3.6m.so.1.0
#9  0x00007fe5ef28c61a in PyEval_EvalCodeEx () from /lib64/libpython3.6m.so.1.0
#10 0x00007fe5ef1fa172 in function_call () from /lib64/libpython3.6m.so.1.0
#11 0x00007fe5e6d3642f in __Pyx_PyObject_Call (kw=<optimized out>, arg=, func=<function at remote 0x7fe5ef6f71e0>) at src/gevent/greenlet.c:20528
#12 __pyx_pf_6gevent_9_greenlet_8Greenlet_38run (__pyx_v_self=0x7fe5e7c7b348) at src/gevent/greenlet.c:11215
#13 __pyx_pw_6gevent_9_greenlet_8Greenlet_39run (__pyx_v_self=<gevent._greenlet.Greenlet at remote 0x7fe5e7c7b348>, unused=<optimized out>)
    at src/gevent/greenlet.c:11120
#14 0x00007fe5e77b8cfe in __Pyx_CyFunction_CallMethod (kw=0x0, arg=(), self=<gevent._greenlet.Greenlet at remote 0x7fe5e7c7b348>, 
    func=<cython_function_or_method at remote 0x7fe5e79f1328>) at src/gevent/_hub_local.c:4407
#15 __Pyx_CyFunction_CallAsMethod (func=<cython_function_or_method at remote 0x7fe5e79f1328>, args=<optimized out>, kw=0x0) at src/gevent/_hub_local.c:4465
#16 0x00007fe5ef1e1e20 in _PyObject_FastCallDict () from /lib64/libpython3.6m.so.1.0
#17 0x00007fe5ef1e1fb1 in _PyObject_Call_Prepend () from /lib64/libpython3.6m.so.1.0
#18 0x00007fe5ef1e1c23 in PyObject_Call () from /lib64/libpython3.6m.so.1.0
#19 0x00007fe5e75ad5cf in g_initialstub (mark=mark@entry=0x7ffefd8cb390) at greenlet.c:823
#20 0x00007fe5e75acf76 in g_switch (target=0x7fe5e7c7b348, args=(), kwargs=<optimized out>) at greenlet.c:597
#21 0x00007fe5e75ade13 in green_switch (self=<optimized out>, args=<optimized out>, kwargs=<optimized out>) at greenlet.c:1124
#22 0x00007fe5ef22024e in PyCFunction_Call () from /lib64/libpython3.6m.so.1.0
#23 0x00007fe5ef1e1c23 in PyObject_Call () from /lib64/libpython3.6m.so.1.0
#24 0x00007fe5e7a42947 in gevent_call (loop=0x7fe5ef6c4a08, cb=0x7fe5e79e7d08) at src/gevent/libev/callbacks.c:152
#25 0x00007fe5e7a3fcfe in __pyx_f_6gevent_5libev_8corecext_4loop__run_callbacks (__pyx_v_self=0x7fe5ef6c4a08) at src/gevent/libev/corecext.c:5586
#26 0x00007fe5e7a4287a in gevent_loop_run_callbacks (__pyx_v_loop=__pyx_v_loop@entry=0x7fe5ef6c4a08) at src/gevent/libev/corecext.c:15744
#27 0x00007fe5e7a44700 in gevent_run_callbacks (_loop=<optimized out>, watcher=0x7fe5ef6c4a20, revents=<optimized out>) at src/gevent/libev/callbacks.c:194
#28 0x00007fe5e7a15115 in ev_invoke_pending (loop=0x7fe5e7c558a0 <default_loop_struct>) at /tmp/build/gevent/deps/libev/ev.c:3314
#29 0x00007fe5e7a41060 in ev_run (loop=0x7fe5e7c558a0 <default_loop_struct>, flags=flags@entry=0) at /tmp/build/gevent/deps/libev/ev.c:3606
#30 0x00007fe5e7a4172a in __pyx_pf_6gevent_5libev_8corecext_4loop_14run (__pyx_v_once=, __pyx_v_nowait=<optimized out>, __pyx_v_self=0x7fe5ef6c4a08)
    at src/gevent/libev/corecext.c:6882
#31 __pyx_pw_6gevent_5libev_8corecext_4loop_15run (__pyx_v_self=<gevent.libev.corecext.loop at remote 0x7fe5ef6c4a08>, __pyx_args=<optimized out>, 
    __pyx_kwds=<optimized out>) at src/gevent/libev/corecext.c:6842
#32 0x00007fe5ef21ff9f in _PyCFunction_FastCallDict () from /lib64/libpython3.6m.so.1.0
#33 0x00007fe5ef28bfaf in call_function () from /lib64/libpython3.6m.so.1.0
#34 0x00007fe5ef280377 in _PyEval_EvalFrameDefault () from /lib64/libpython3.6m.so.1.0
#35 0x00007fe5ef28d1ba in _PyFunction_FastCallDict () from /lib64/libpython3.6m.so.1.0
#36 0x00007fe5ef1e1e9e in _PyObject_FastCallDict () from /lib64/libpython3.6m.so.1.0
#37 0x00007fe5ef1e1fb1 in _PyObject_Call_Prepend () from /lib64/libpython3.6m.so.1.0
#38 0x00007fe5ef1e1c23 in PyObject_Call () from /lib64/libpython3.6m.so.1.0
#39 0x00007fe5e75ad5cf in g_initialstub (mark=mark@entry=0x7ffefd8cbae0) at greenlet.c:823
#40 0x00007fe5e75acf76 in g_switch (target=target@entry=0x7fe5ef78a240, args=args@entry=(), kwargs=kwargs@entry=0x0) at greenlet.c:597
#41 0x00007fe5e75adebb in PyGreenlet_Switch (g=0x7fe5ef78a240, args=(), kwargs=0x0) at greenlet.c:1441
#42 0x00007fe5e73a311f in __pyx_f_6gevent_21__greenlet_primitives__greenlet_switch (__pyx_v_self=0x7fe5ef78a240) at src/gevent/_greenlet_primitives.c:2679
#43 __pyx_f_6gevent_21__greenlet_primitives_25SwitchOutGreenletWithLoop_switch (__pyx_v_self=0x7fe5ef78a240, __pyx_skip_dispatch=__pyx_skip_dispatch@entry=1)
    at src/gevent/_greenlet_primitives.c:1958
#44 0x00007fe5e73a37ce in __pyx_pf_6gevent_21__greenlet_primitives_25SwitchOutGreenletWithLoop_switch (__pyx_v_self=<optimized out>)
    at src/gevent/_greenlet_primitives.c:2008
#45 __pyx_pw_6gevent_21__greenlet_primitives_25SwitchOutGreenletWithLoop_1switch (__pyx_v_self=<optimized out>, unused=<optimized out>)
    at src/gevent/_greenlet_primitives.c:1995
#46 0x00007fe5e77b8cfe in __Pyx_CyFunction_CallMethod (kw=0x0, arg=(), self=, func=<cython_function_or_method at remote 0x7fe5e7c96608>)
    at src/gevent/_hub_local.c:4407
#47 __Pyx_CyFunction_CallAsMethod (func=<cython_function_or_method at remote 0x7fe5e7c96608>, args=<optimized out>, kw=0x0) at src/gevent/_hub_local.c:4465
#48 0x00007fe5e6d2f819 in __Pyx_PyObject_Call (func=<cython_function_or_method at remote 0x7fe5e7c96608>, arg=, kw=0x0) at src/gevent/greenlet.c:20528
#49 0x00007fe5e6d251bd in __Pyx__PyObject_CallOneArg (func=<cython_function_or_method at remote 0x7fe5e7c96608>, arg=) at src/gevent/greenlet.c:20567
#50 0x00007fe5e6d46c8e in __Pyx_PyObject_CallOneArg (arg=, func=<optimized out>) at src/gevent/greenlet.c:20586
#51 __pyx_f_6gevent_9_greenlet_8Greenlet_join (__pyx_v_self=0x7fe5e7c7b348, __pyx_skip_dispatch=<optimized out>, __pyx_optional_args=<optimized out>)
    at src/gevent/greenlet.c:10124
#52 0x00007fe5e6d31a4b in __pyx_pf_6gevent_9_greenlet_8Greenlet_36join (__pyx_v_timeout=<optimized out>, __pyx_v_self=0x7fe5e7c7b348) at src/gevent/greenlet.c:10612
#53 __pyx_pw_6gevent_9_greenlet_8Greenlet_37join (__pyx_v_self=<gevent._greenlet.Greenlet at remote 0x7fe5e7c7b348>, __pyx_args=<optimized out>, __pyx_kwds=0x0)
    at src/gevent/greenlet.c:10596
#54 0x00007fe5e77b8cb4 in __Pyx_CyFunction_CallMethod (kw=0x0, arg=(), self=<gevent._greenlet.Greenlet at remote 0x7fe5e7c7b348>, 
    func=<cython_function_or_method at remote 0x7fe5e79f1270>) at src/gevent/_hub_local.c:4402
#55 __Pyx_CyFunction_CallAsMethod (func=<cython_function_or_method at remote 0x7fe5e79f1270>, args=<optimized out>, kw=0x0) at src/gevent/_hub_local.c:4465
#56 0x00007fe5ef1e1e20 in _PyObject_FastCallDict () from /lib64/libpython3.6m.so.1.0
#57 0x00007fe5ef28c15c in call_function () from /lib64/libpython3.6m.so.1.0
#58 0x00007fe5ef280377 in _PyEval_EvalFrameDefault () from /lib64/libpython3.6m.so.1.0
#59 0x00007fe5ef28b467 in _PyEval_EvalCodeWithName () from /lib64/libpython3.6m.so.1.0
#60 0x00007fe5ef28be4a in fast_function () from /lib64/libpython3.6m.so.1.0
#61 0x00007fe5ef28c0d3 in call_function () from /lib64/libpython3.6m.so.1.0
#62 0x00007fe5ef280377 in _PyEval_EvalFrameDefault () from /lib64/libpython3.6m.so.1.0
#63 0x00007fe5ef28c3d2 in PyEval_EvalCodeEx () from /lib64/libpython3.6m.so.1.0
#64 0x00007fe5ef28d05b in PyEval_EvalCode () from /lib64/libpython3.6m.so.1.0
#65 0x00007fe5ef31563e in run_mod () from /lib64/libpython3.6m.so.1.0
#66 0x00007fe5ef1bffd2 in PyRun_FileExFlags () from /lib64/libpython3.6m.so.1.0
#67 0x00007fe5ef1c0392 in PyRun_SimpleFileExFlags () from /lib64/libpython3.6m.so.1.0
#68 0x00007fe5ef31bf0c in Py_Main () from /lib64/libpython3.6m.so.1.0
#69 0x0000000000400a99 in main ()

Hope that is helpful, and hope that gevent isn't opening up a huge can of worms. (Again, I got py-spy merging errors even without gevent, but I'm going to be using gevent going forward, so I'd like that to work)

@SyamGadde
Copy link
Author

Just another update, I have been able to temporarily resolve this by compiling my own version of greenlet where it copies over the current stored stack when creating a greenlet (otherwise every greenlet looks like it was conjured out of thin air):

diff --git a/greenlet.c b/greenlet.c
index ec738b9..340c94c 100644
--- a/greenlet.c
+++ b/greenlet.c
@@ -766,7 +766,8 @@ static int GREENLET_NOINLINE(g_initialstub)(void* mark)
        else {
                self->stack_prev = ts_current;
        }
-       self->top_frame = NULL;
+       /* self->top_frame = NULL; */
+       self->top_frame = PyThreadState_GET()->frame;
        green_clear_exc(self);
        self->recursion_depth = PyThreadState_GET()->recursion_depth;

If creating a greenlet using gevent, it stores the "spawning stack" in its Greenlet wrapper, so conceivably there is a way for py-spy to find it, but it is probably complicated. Furthermore, both options may still run into problems with the Python/greenlet stacks and native stacks being out of sync if a greenlet spawns another greenlet. On the other hand, maybe it would work out fine since the greenlet saves the current native stack and Python stacks at the same time? Don't know if there is a legitimate reason for greenlets to keep their ancestor stacks hidden. Fun times!

@benfred
Copy link
Owner

benfred commented Feb 9, 2020

Interesting! thanks for tracking this down.

It does look like the issue is with gevent hiding the ancestor stack trace, causing us to think we're not merging correctly. I think the correct thing for py-spy to do here is to detect this case and just merge up to the top of the python stack.

@synodriver
Copy link

I have the same issue on windows 10 in a conda environment with python 3.10 installed. When profiling my cython extension, it just print Error: Failed to merge native and python frames (Have 4 native and 2 python) when it meets the native part of the code and exits.

The detail environment

python -c "import sys; print(sys.version)"
3.10.4 | packaged by conda-forge | (main, Mar 30 2022, 08:38:02) [MSC v.1916 64 bit (AMD64)]

@hintron
Copy link

hintron commented Jan 27, 2023

I'm getting the same thing on Windows 11 in a conda environment:

$ py-spy record -n -- python app.py
...
Error: Failed to merge native and python frames (Have 22 native and 11 python)

My Python:

$ python -c "import sys; print(sys.version)"
3.10.8 | packaged by conda-forge | (main, Nov 22 2022, 08:16:33) [MSC v.1929 64 bit (AMD64)]

@jonashaag
Copy link

I have this problem when debugging something that uses Turbodbc:

(lldb) bt
* thread #1, name = 'python', stop reason = signal SIGSTOP
  * frame #0: 0x00007f90fe343c8f turbodbc_arrow_support.cpython-311-x86_64-linux-gnu.so`turbodbc_arrow::arrow_result_set::fetch_all_native(std::shared_ptr<arrow::Table>*, bool) + 223
    frame #1: 0x00007f90fe344a63 turbodbc_arrow_support.cpython-311-x86_64-linux-gnu.so`turbodbc_arrow::arrow_result_set::fetch_next_batch() + 99
    frame #2: 0x00007f90fe34ea9c turbodbc_arrow_support.cpython-311-x86_64-linux-gnu.so`void pybind11::cpp_function::initialize<pybind11::cpp_function::cpp_function<pybind11::object, turbodbc_arrow::arrow_result_set, pybind11::name, pybind11::is_method, pybind11::sibling>(pybind11::object (turbodbc_arrow::arrow_result_set::*)(), pybind11::name const&, pybind11::is_method const&, pybind11::sibling const&)::'lambda'(turbodbc_arrow::arrow_result_set*), pybind11::object, turbodbc_arrow::arrow_result_set*, pybind11::name, pybind11::is_method, pybind11::sibling>(pybind11::object&&, turbodbc_arrow::arrow_result_set (*)(), pybind11::name const&, pybind11::is_method const&, pybind11::sibling const&)::'lambda1'(pybind11::detail::function_call&)::_FUN(pybind11::detail::function_call&) + 204
    frame #3: 0x00007f90fe33fdee turbodbc_arrow_support.cpython-311-x86_64-linux-gnu.so`pybind11::cpp_function::dispatcher(_object*, _object*, _object*) + 1214
    frame #4: 0x000055abfd719dc6 python3.11`cfunction_call + 54
    frame #5: 0x000055abfd6fa07b python3.11`_PyObject_MakeTpCall + 603
    frame #6: 0x000055abfd706992 python3.11`_PyEval_EvalFrameDefault + 2002
    frame #7: 0x000055abfd746b32 python3.11`gen_iternext + 146
    frame #8: 0x000055abfd706762 python3.11`_PyEval_EvalFrameDefault + 1442
    frame #9: 0x000055abfd7bed36 python3.11`_PyEval_Vector + 294
    frame #10: 0x000055abfd7be3ef python3.11`PyEval_EvalCode + 159
    frame #11: 0x000055abfd7dcf2a python3.11`run_eval_code_obj + 74
    frame #12: 0x000055abfd7d9343 python3.11`run_mod + 99
    frame #13: 0x000055abfd7ee300 python3.11`pyrun_file + 128
    frame #14: 0x000055abfd7edc5e python3.11`_PyRun_SimpleFileObject + 430
    frame #15: 0x000055abfd7eda44 python3.11`_PyRun_AnyFileObject + 68
    frame #16: 0x000055abfd7e7bdf python3.11`Py_RunMain + 735
    frame #17: 0x000055abfd7acf97 python3.11`Py_BytesMain + 55
    frame #18: 0x00007f91e3ba0083 libc.so.6`__libc_start_main + 243
    frame #19: 0x000055abfd7ace3d python3.11`_start + 41
    frame #20: 0x000055abfd7ace3d python3.11`_start + 41
py-spy dump -n -p ...
Python v3.11.5

Error: Failed to merge native and python frames (Have 1 native and 5 python)
$ py-spy dump -p ...
Python v3.11.5

Thread 25003 (idle): "MainThread"
    fetcharrowbatches (turbodbc/cursor.py:362)
    <module> (bench.py:14)
Thread 25386 (idle): "Thread-1"
    wait (threading.py:324)
    wait (threading.py:622)
    run (tqdm/_monitor.py:60)
    _bootstrap_inner (threading.py:1038)
    _bootstrap (threading.py:995)

If I remove the usage of tqdm (and with it, the second thread), it works with top -n -p <pid> but not with top -n python ....

@Samreay
Copy link

Samreay commented Oct 30, 2023

Just commenting with my own woes in case anyone has figured out solutions for theirs. Using cython==3.0.3 and python==3.11.5 on WSL, ie Ubuntu 22.04.2 LTS

>>> py-spy record -f speedscope -o benchmark/baseline1_lineno.json --native  -- python benchmark/benchmark.py 1
Error: Failed to merge native and python frames (Have 5 native and 9 python)

@Crispy13
Copy link

Crispy13 commented Jan 1, 2024

I had the same issue.

@dmitry-kabanov
Copy link

I have the same issue. I do not use Cython, but have a C extension that I am trying to profile with py-spy, but it fails with the same error message about merging stacks. I use py-spy 0.3.14 on Python 3.11.

@emlys
Copy link

emlys commented Apr 12, 2024

I have the same issue with py-spy 0.3.14 and python 3.11. The output file is still produced, but I assume I can't trust the results given all the errors? Also, all the results point to .so files rather than the .pyx files.

@gubenkoved
Copy link

gubenkoved commented Nov 22, 2024

I also observe the same issue for the processes that use greenlet in their runtime.

py-spy 0.4.0
greenlet 1.1.3.post0
python 3.10.12
ubuntu 22.04.5 LTS

@olejorgenb
Copy link

olejorgenb commented Dec 12, 2024

Same error message but without any cython extension that I'm aware of

$ py-spy record --native --pid 11 --rate 20 --format speedscope -o profile.json
Error: Failed to merge native and python frames (Have 40 native and 30 python)

Profiling this script using python 3.12 (conda environment)

import numpy as np
import pandas as pd
from scipy.stats import linregress

def calc_slope(window: np.ndarray) -> float:
    slope, intercept, r_value, p_value, std_err = linregress(
        np.arange(len(window)), window
    )
    return float(slope)

if __name__ == "__main__":
    for i in range(50):
        ts = pd.Series(data=np.random.random(672) * 100.0)
        slope_ts = ts.rolling(window=24).apply(calc_slope, raw=True)

@GStechschulte
Copy link

GStechschulte commented Dec 20, 2024

Getting the same error on Ubuntu 24.04 LTS profiling a Python application with a Rust backend (using pyo3). Related to #673

Error: Failed to merge native and python frames (Have 59 native and 54 python)
Python 3.12.3
py-spy 0.40

@GStechschulte
Copy link

GStechschulte commented Dec 20, 2024

Update: I compiled py-spy from source using cargo and recompiled my application in release mode (make sure your Cargo.toml file has debugging enabled in release mode).

Native profiling seems to "work" now. Albeit with a lot of errors.

py-spy record --native -o profile.svg -- my_cool_python_app.py
py-spy> Stopped sampling because process exited
py-spy> Wrote flamegraph data to 'profile.svg'. Samples: 1192 Errors: 8984
Error: No child process (os error 10)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests