
dap: rework auto-resume logic when connection is closed in multi-client mode #2958

Open
polinasok opened this issue Apr 5, 2022 · 9 comments

Comments

@polinasok
Collaborator

polinasok commented Apr 5, 2022

If a client connection is closed while a running command is in progress, that command is not interrupted, so the auto-resume goroutine continues as if nothing has happened. The auto-resume code therefore needs to handle any stops it encounters after that correctly.

The auto-resumer might get interrupted by the following events:

  1. A halt from a new connection on entry. The auto-resumer should log the stop reason and not resume.
  2. A breakpoint. (These are not cleared on disconnect.) The auto-resumer should log the stop reason and not resume. A user can connect afterwards to inspect state.
  3. An unrecoverable error. The auto-resumer should log the problem and not resume. The user can connect afterwards to inspect state.
  4. A logpoint. The auto-resumer can log the logpoint via stderr and resume execution, unless another session issues a halt (see 1) while the logpoint is being processed.
  5. Server shutdown (e.g. by a kill command). The auto-resumer should log the halt and not resume.
  6. (There cannot be a halt as part of setting breakpoints, normally another reason to auto-resume, because 1 would happen first.)
  • Problem: We can have a potential conflict between 1 & 4. If the halt arrives while the debugger is running, Debugger.StopReason() will reflect it. But if it happens while the auto-resumer is processing another stop, the reason will be lost. This is because our halt-tracking mechanism is per Session, so when a halt is triggered from a new Session, the paused auto-resumer in the old Session has no way of detecting it.
    • Solution 1: Yet another vote for pushing this state into the Debugger object, which is shared between Sessions.
    • Solution 2: Another option is to clear logpoints as part of connection closure, so there is never a reason to auto-resume from a dying Session again.
    • Solution 3: Or we can treat any interruption, even on a logpoint, as a reason to quit resuming once the connection is closed. That's the current behavior (from the PR).
  • Problem: Debugger pointer gets reset as part of connection closure, so it is no longer possible to check stop reasons or resume.
    • Solution 1: See if we can avoid clearing s.debugger.
    • Solution 2: Save a copy of the pointer in the auto-resumer.
    • Solution 3: If we never auto-resume, just set halt-requested; the debugger's stop reason won't be needed.
  • Problem: Current checks for a closed connection can race. The disconnect might arrive after we have confirmed that the connection is open but before the logpoint is processed, so the logpoint would be handled as if the connection were still open.
  • Problem: Because the disconnect code can be called more than once, the current way of keeping track might run into a channel double-closure issue.
  • Note: keeping track of the closed connection also allows us to avoid sending any events to it. Those errors are not fatal, but can look unnecessarily alarming in the server logs.
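As a rough illustration of Solution 1 (shared halt tracking) and of guarding against the channel double-closure problem above, the state could live on a Debugger-like object shared by all sessions. The type and method names below mirror the discussion but are purely hypothetical, not Delve's actual API:

```go
package main

import (
	"fmt"
	"sync"
)

// Debugger is an illustrative stand-in for the shared object the discussion
// proposes. Because the halt flag lives here rather than on a Session, a halt
// issued from a new session is visible to the paused auto-resumer in the old one.
type Debugger struct {
	mu            sync.Mutex
	haltRequested bool
}

// RequestHalt records a halt request from any session.
func (d *Debugger) RequestHalt() {
	d.mu.Lock()
	defer d.mu.Unlock()
	d.haltRequested = true
}

// ConsumeHalt reports and clears a pending halt atomically, so the
// auto-resumer cannot miss a halt that arrived while it was busy
// processing another stop (the conflict between points 1 and 4 above).
func (d *Debugger) ConsumeHalt() bool {
	d.mu.Lock()
	defer d.mu.Unlock()
	h := d.haltRequested
	d.haltRequested = false
	return h
}

// Session owns a done channel; closeOnce guards against the channel
// double-closure noted above, since disconnect code can run more than once.
type Session struct {
	closeOnce sync.Once
	done      chan struct{}
}

// Disconnect is safe to call repeatedly: sync.Once ensures the channel
// is closed exactly once.
func (s *Session) Disconnect() {
	s.closeOnce.Do(func() { close(s.done) })
}

func main() {
	d := &Debugger{}
	d.RequestHalt()
	fmt.Println(d.ConsumeHalt()) // true: halt observed exactly once
	fmt.Println(d.ConsumeHalt()) // false: already consumed

	s := &Session{done: make(chan struct{})}
	s.Disconnect()
	s.Disconnect() // no panic: second close is suppressed
	fmt.Println("disconnect idempotent")
}
```

The key design point is that ConsumeHalt both reads and clears the flag under one lock acquisition, so there is no window in which a concurrent halt can be observed twice or lost.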
@aarzilli
Member

aarzilli commented Apr 6, 2022

Two things:

  • logpoints could be reimplemented as [breaklet callbacks](https://github.com/go-delve/delve/blob/1669711c0e7c95eea04f7b2ce71ce53f980babc5/pkg/proc/breakpoints.go#L117), which would simplify the auto-resume logic a lot, since it would only have to support stops induced by setting breakpoints
  • does it make sense to keep logpoints after a client has disconnected?
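To sketch the breaklet-callback idea in isolation: each breaklet carries a callback that decides whether the target should actually suspend, so a logpoint breaklet can emit its message and vote "keep running", and the auto-resume loop never sees a logpoint stop at all. The types below are hypothetical stand-ins, not Delve's real proc.Breaklet:

```go
package main

import "fmt"

// Breaklet is an illustrative stand-in for a per-breakpoint action.
// Callback returns true if the target should suspend.
type Breaklet struct {
	Kind     string
	Callback func() bool
}

// Breakpoint aggregates breaklets set at one address.
type Breakpoint struct {
	Breaklets []Breaklet
}

// ShouldStop runs every breaklet's callback; the target suspends only
// if at least one breaklet votes to stop. A logpoint breaklet does its
// logging inside the callback and always returns false.
func (bp *Breakpoint) ShouldStop() bool {
	stop := false
	for _, b := range bp.Breaklets {
		if b.Callback() {
			stop = true
		}
	}
	return stop
}

func main() {
	logpoint := Breaklet{
		Kind:     "logpoint",
		Callback: func() bool { fmt.Println("logpoint: x hit"); return false },
	}
	bp := &Breakpoint{Breaklets: []Breaklet{logpoint}}
	fmt.Println("stop:", bp.ShouldStop()) // stop: false — execution continues
}
```

With this shape, the only stops the auto-resumer must handle are the ones deliberately induced while setting breakpoints, as the comment above suggests.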

@polinasok
Collaborator Author

polinasok commented Apr 6, 2022

  • does it make sense to keep logpoints after a client has disconnected?

I have been wondering about that myself. If a dap client reconnects, all breakpoints will be cleared and reset anyway. But an rpc client would inherit them, wouldn't it? And if the user has logging, the logpoints would log to stderr with no client and just the server running. So there are some small reasons, but no strong ones. We could go either way and should pick whatever is easiest, I think. There is also a difference between treating them as regular breakpoints with no client (i.e. stopping auto-resume until a client reconnects) and actually clearing them, letting the process just run uninterrupted.

@hyangah
Contributor

hyangah commented Apr 9, 2022

does it make sense to keep logpoints after a client has disconnected?

I have been wondering about that myself. If a dap client reconnects, all breakpoints will be cleared and reset anyway. But an rpc client would inherit them, won't it?

Shouldn't we clear all breakpoints if a client disconnects (at least when a user asks to resume the target before disconnecting)?

I couldn't find a convincing use case of leaving breakpoints (or logpoints) after the client is disconnected yet.

If one wants to leave an app running with breakpoints/logpoints overnight in the hope of catching a rare event, couldn't the person leave the debugging session active overnight too?

Interestingly, in GDB remote debugging, I think the following behavior effectively causes all breakpoints to be cleared by the time a user of the terminal-based CLI detaches/disconnects from the remote GDB server.

The GDB command to set breakpoints, break does not immediately cause a RSP interaction. GDB only actually sets breakpoints immediately before execution (for example by a continue or step command) and immediately clears them when a breakpoint is hit. This minimizes the risk of a program being left with breakpoints inserted, for example when a serial link fails.

Separately, I found that exit -c not clearing breakpoints is unintuitive* and can be dangerous.

[*]: In my case, my expectation of dlv connect is somewhat similar to dlv attach (but the remote version). When I disconnect and quit my dlv CLI with exit -c, my intention is "I am done debugging. Clean up my mess and resume normal processing", even though behind the scenes my program can still be ptraced by the headless dlv (accept-multiclient), unlike dlv attach.

@hyangah
Contributor

hyangah commented Jul 27, 2022

In golang/vscode-go#2368 a user wants to clear all the breakpoints and resume when disconnecting. I agree with the user; this is a regression compared to the legacy adapter.

Reading the original issue again, I am afraid the scope of this issue is broader than changing the disconnect behavior around breakpoints/logpoints handling. Do you want us to open a narrower issue (change the disconnect behavior to clear all breakpoints) and fix the issue?

@polinasok
Collaborator Author

Yes, the scope of this issue was intended to be broader - it was meant to sort out all the race conditions caused by the complexity of disconnecting and reconnecting while managing different types of breakpoints. If we simply clear all breakpoints, this becomes very simple. So the prerequisite question was: are the surviving breakpoints valuable enough to deal with the complexity?

There are definitely users that don't want their server to get unexpectedly stuck on some breakpoint after they are done debugging. The UI has a toggle to deactivate/remove all breakpoints in one click, but a user would have to remember to use it. On the other hand, there could be users who lose their connection due to flakiness and might appreciate a seamless reconnect. Or maybe they are chasing a rare condition and want things to keep running until it is hit, then connect to investigate more.

We had a VC discussion about this a couple of months ago. From the notes:

  • Decision on logpoints: clear the logpoints - nowhere useful to log anyway and solves complexities of auto-resuming between sessions
  • Decision on breakpoints: pros and cons for clearing/disabling them. The best solution would be to offer both as a prompt at disconnect and/or as a flag to set with dlv cli and launch.json in vscode.
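As a rough illustration, the launch.json side of the flag discussed above might look like the fragment below. The attribute name clearBreakpointsOnDisconnect is hypothetical, not an implemented option (the comment is JSONC-style, which VS Code's launch.json accepts):

```json
{
    "name": "Connect to headless dlv",
    "type": "go",
    "request": "attach",
    "mode": "remote",
    "port": 2345,
    // Hypothetical attribute illustrating the proposed opt-in behavior.
    "clearBreakpointsOnDisconnect": true
}
```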

And that's why this issue is still open :) In our attempt to maybe agree to simplify this, we ended up agreeing on more work, not less (both dealing with races while having breakpoints AND adding flags to clear them). Happy to revisit :)

@briandealwis What would CloudCode users prefer?

@hyangah
Contributor

hyangah commented Aug 2, 2022

From CloudCode's use case here (#2772 (comment)) I think the good default for CloudCode is to clear all the breakpoints + continue. "Continue" is already the default, and "Clearing all breakpoints" is another missing feature.

Decision on breakpoints: pros and cons for clearing/disabling them. The best solution would be to offer both as a prompt at disconnect and/or as a flag to set with dlv cli and launch.json in vscode.

What I remember from the discussion is that nobody came up with convincing, strong reasons to keep the breakpoints (and I still see no answer to my question #2958 (comment)). The decision was made based on an assumption that this could be implemented without too much burden/time, or that this issue was under active work. Since that assumption is no longer true, why not rescope and focus on #3083?

According to what @OrBin reported in golang/vscode-go#2368, GoLand clears breakpoints before tearing down the session. The GDB protocol clears breakpoints by design. Why not just follow suit?

there could be users who loose connection due to flakiness, who might appreciate a seamless reconnect.

I don't think this is worth all the complexities.

Or maybe they are chasing a rare condition and want things to keep running until it is hit, then connect to investigate more.

Users can keep the debug session running.

@briandealwis
Contributor

I hesitate to make a statement on behalf of all Cloud Code users, but like @hyangah, I can’t think of a compelling reason to retain breakpoints on detach. I think it would be unlikely for a user to want to set breakpoints, detach, and then re-attach later in the hopes of finding some session paused in a breakpoint, as it would lead to hangs in each process hitting the breakpoint.

@polinasok
Collaborator Author

@briandealwis Thanks for the input.

We have never come up with any particularly strong reasons for keeping the breakpoints. It's just been the historical behavior that was copied over when we ported the legacy adapter, which in turn was probably modeled after the CLI. Such historical behaviors tend to stick because there is a chance that somebody depends on them, so nobody revisits the status quo until it causes friction. Interestingly, I can't find any user complaints about this behavior in the old issues, but we have one now (golang/vscode-go#2368). Clearing breakpoints also simplifies this issue considerably. So the status quo has been challenged, and there are good reasons to do it the other way. If some user pops in and complains that we took away their favorite feature, we can ask them to use the CLI.

@rosti-il

Hello. This bug report is more than two years old and still in the discussion stage, with no progress since August 2022. What is stopping you from deciding how to implement this obvious feature and making remote debugging logically correct? I came to Go development from Java, where disconnecting the debugger from a remote process disables all breakpoints there and lets that remote process continue running.
