Make pdb asyncio aware #121468

Open
gaogaotiantian opened this issue Jul 7, 2024 · 6 comments
Labels
topic-asyncio type-feature A feature request or enhancement

Comments

@gaogaotiantian
Member

gaogaotiantian commented Jul 7, 2024

Feature or enhancement

Proposal:

Currently, pdb knows nothing about asyncio, even though asyncio has been officially supported for a long time. I will add support for asyncio in roughly 3 steps (not necessarily exactly 3 PRs):

  1. pdb will generate some passive info for asyncio. If the user hits a breakpoint inside an asyncio task, a one-line message will be shown with the current stack, indicating which task the user is in. A convenience variable $_asynctask will be added to refer to the current task.
  2. pdb will provide a new command for examining all the tasks in the current event loop. The user can see how many tasks there are and what their status is. Maybe they can even check the frames of each task.
  3. pdb will let users switch between tasks, that is, specify "run until I'm in this task". pdb may also provide a way for users to cancel a task.

In theory, many of the features are already possible with manual code as pdb can execute arbitrary code anyway. However, I believe it's helpful for the users to have a more convenient way to debug their async programs.
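
As an illustration of the "manual code" route, here is a minimal sketch of listing the tasks of the running event loop with the public asyncio API. The helper name describe_tasks and the output format are made up for this example; they are not part of pdb or of the proposal.

```python
import asyncio


def describe_tasks():
    """Return one short line per unfinished task in the running event loop."""
    current = asyncio.current_task()
    lines = []
    for task in sorted(asyncio.all_tasks(), key=lambda t: t.get_name()):
        # Mark the task we are currently executing in with "*".
        marker = "*" if task is current else " "
        coro_name = getattr(task.get_coro(), "__name__", "?")
        lines.append(f"{marker} {task.get_name()}: {coro_name}")
    return lines


async def worker():
    await asyncio.sleep(0.01)


async def main():
    asyncio.current_task().set_name("main-task")
    background = asyncio.create_task(worker(), name="worker-task")
    lines = describe_tasks()
    await background
    return lines


if __name__ == "__main__":
    for line in asyncio.run(main()):
        print(line)
```

Something like this can already be typed at a (Pdb) prompt while stopped inside a coroutine; the proposal is about not having to.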

This was briefly mentioned at the language summit at PyCon this year, and no one objected (maybe because there were other things they were more against :) ).

This should not affect the user experience for existing code that does not involve asyncio.

If you have any other suggestions or objections, please let me know.

Has this already been discussed elsewhere?

No response given

Links to previous discussion of this feature:

No response

Linked PRs

@gaogaotiantian gaogaotiantian added the type-feature A feature request or enhancement label Jul 7, 2024
@kumaraditya303
Contributor

pdb will let users switch between tasks, that is, specify "run until I'm in this task". pdb may also provide a way for users to cancel a task.

I have some questions:

  • How would you implement this?
  • Will it also switch the current context?
  • What if the task runs in a separate event loop in a different thread?
  • How would this interact with async generators if any?

@gaogaotiantian
Member Author

Switching basically means waiting until execution is in that task. pdb gets a callback for each function call (including entering a generator), and we just need to check whether the callee is the one we want.
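
A rough sketch of that idea, assuming a sys.settrace-style callback: on every "call" event (which also fires when a coroutine is resumed), ask asyncio which task is current and compare it with the target. The helper record_entries_into is hypothetical, and it only records function names where a real debugger would stop; it is not how pdb implements this.

```python
import asyncio
import sys


def record_entries_into(target_name, hits):
    """Install a trace function that notes each call made while the target task runs."""

    def tracer(frame, event, arg):
        if event == "call":
            try:
                task = asyncio.current_task()
            except RuntimeError:
                task = None  # no running event loop in this thread
            if task is not None and task.get_name() == target_name:
                # A debugger would stop here; the sketch just records the callee.
                hits.append(frame.f_code.co_name)
        return None  # no per-line tracing inside the frame

    sys.settrace(tracer)


async def worker():
    await asyncio.sleep(0)
    return "done"


async def main():
    hits = []
    task = asyncio.create_task(worker(), name="target")
    record_entries_into("target", hits)
    try:
        await task
    finally:
        sys.settrace(None)  # always remove the trace function
    return hits


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Frames that run in other tasks or in the event loop itself are ignored, because the current task is then either a different task or None.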

Multi-threading is not supported at all in pdb. Anything involving multi-threading is a non-blocker because, well, it won't work anyway.

I did not think about async generators specifically, but as far as I understand, they are not that special, are they? Maybe we can deal with them when there's a concrete example.

What I'm trying to do now is to add some level of support for asyncio, because we have none now (actually, I don't think the debuggers out there support it much either). There will be issues that are difficult to solve, and there might be some things that are not achievable in the short term. There are things that we can't do in a single-process, single-threaded sync Python program. We may not be able to build a perfect do-it-all debugger for async, but we should still try.

@kumaraditya303
Contributor

What I'm trying to do now is to add some level of support for asyncio, because we have none now (actually, I don't think the debuggers out there support it much either). There will be issues that are difficult to solve, and there might be some things that are not achievable in the short term. There are things that we can't do in a single-process, single-threaded sync Python program. We may not be able to build a perfect do-it-all debugger for async, but we should still try.

I look forward to trying a prototype of this!

@gaogaotiantian
Member Author

@kumaraditya303 could you take a quick look at #124367? It's just a draft for now because I need to change all the asyncio-related pdb doctests before merging it. It implements step 1. One thing I'd really like to know is whether the information Task-1: <main pending> is enough. asyncio tasks have their own repr, and of course we can use that in pdb directly, but the repr is really long and includes a lot of information. The short version is something we want to show users every time they stop inside an async task, so I think it should be more concise. They can always evaluate $_asynctask to get the currently running task. Do you think there is more critical information that users will very often want to know about their task? I'm not the heaviest asyncio user in the world, and this is an area where I need suggestions from experts.

@kumaraditya303
Contributor

Task-1: <main pending> isn't so useful; running would be better than pending, and it should print the callbacks of the task too.

As for convenience variable, maybe we can have one for the current contextvars used by the task?

@gaogaotiantian
Member Author

I can remember the task that was entered and mark that task as running; that's doable. All the other tasks would be pending, cancelling, cancelled or finished, I think.

For the callbacks, one of the issues I had was that there could be many of them and the output could be long. Do you think something like Task-1: <main pending> cb=[_run_until_complete_cb()] would be helpful? Again, this is the information we display immediately (and always) when the user hits a breakpoint or steps, so only the most important information should be displayed here. In normal cases, it would be

> /home/gaogaotiantian/programs/mycpython/example.py(9)main()
-> breakpoint()

The user can always get the detailed information with $_asynctask:

<Task pending name='Task-1' coro=<main() running at /home/gaogaotiantian/programs/mycpython/example.py:9> cb=[_run_until_complete_cb() at /home/gaogaotiantian/programs/mycpython/Lib/asyncio/base_events.py:182]>

But this line is too long to show in the debugger every time the user stops (it could be even longer).
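
For comparison, a minimal sketch of how such a short line could be derived from a task using only public Task methods. The function short_task_line and its state names are hypothetical; the draft PR may format things differently, and the "cancelling" state mentioned above is left out to keep the example small.

```python
import asyncio


def short_task_line(task, current=None):
    """Format a task as 'NAME: <CORO STATE>', much shorter than repr(task)."""
    if task.cancelled():
        state = "cancelled"
    elif task.done():
        state = "finished"
    elif task is current:
        state = "running"  # the task the debugger stopped in
    else:
        state = "pending"
    coro_name = getattr(task.get_coro(), "__name__", "?")
    return f"{task.get_name()}: <{coro_name} {state}>"


async def main():
    me = asyncio.current_task()
    me.set_name("Task-demo")
    other = asyncio.create_task(asyncio.sleep(0), name="sleeper")
    lines = [short_task_line(me, current=me), short_task_line(other, current=me)]
    await other
    lines.append(short_task_line(other, current=me))
    return lines


if __name__ == "__main__":
    for line in asyncio.run(main()):
        print(line)
```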

As for the convenience variable, we have a few ways to show that. The simplest is just

$_asyncvars = $_asynctask.get_context()

Then you can do

$_asyncvars.get(foo)

which is always correct but a bit boring.

Say we have foo = contextvars.ContextVar('var', default='default'); what do you want in a debugger? You could have a dedicated command like

(Pdb) asyncvars
<ContextVar name='var' default='default' at 0x7f4442d70ef0>: 'is value'

Or you can have a proxy to support

$_asyncvars[foo] = None
$_asyncvars['var'] = None  # Ignore if there's ambiguity, aka two context vars having the same name

Getting the current context is trivial; the question is what interface the user needs to interact with the context.
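
To make the lookup-by-name idea concrete, here is a small sketch that turns a contextvars.Context into a name-to-value mapping and sets aside names used by more than one ContextVar. The helper context_snapshot is hypothetical. The demo takes the Context from contextvars.copy_context(); inside a debugger it could equally come from Task.get_context(), which is available on Python 3.12 and later.

```python
import contextvars


def context_snapshot(ctx):
    """Map each ContextVar name in ctx to its value.

    Names shared by several ContextVars are returned separately as ambiguous.
    """
    values = {}
    ambiguous = set()
    for var, value in ctx.items():
        if var.name in values or var.name in ambiguous:
            values.pop(var.name, None)
            ambiguous.add(var.name)
        else:
            values[var.name] = value
    return values, ambiguous


foo = contextvars.ContextVar("var", default="default")


def demo():
    foo.set("is value")
    return context_snapshot(contextvars.copy_context())


if __name__ == "__main__":
    # Run in a copied context so the demo does not change the caller's context.
    values, ambiguous = contextvars.copy_context().run(demo)
    print(values["var"], ambiguous)
```

Note that a Context only contains variables that have been set, so a ContextVar that still has its default value does not show up in the snapshot.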
