
Bug in syncWithSender (?) #151

Closed · edsko opened this issue Nov 11, 2024 · 4 comments

edsko (Collaborator) commented Nov 11, 2024

The initial symptom I noticed was that my test suite would sometimes fail with

Network.Socket.accept: resource exhausted (Too many open files)

I managed to construct an example (through grapesy) that reliably reproduces this. In this example the server handler terminates immediately, sends a response to the client, and the client receives that response without ever sending any data of its own. This is an unusual situation, but one that can arise if the handler throws an exception (which in gRPC is sent as a regular response to the client). Curiously, a minor variation of the same test in which the client first sends one message to the server did not result in the same socket leak. (Note that both the server and the client use streaming bodies.)
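
As an aside, one low-tech way to make a leak like this visible is to count the process's open file descriptors across repeated connections. The sketch below is a minimal, Linux-only illustration (the helper names are made up and this is not part of the actual test suite); it simply counts the entries under /proc/self/fd before and after a batch of round trips.

```haskell
import Control.Monad (replicateM_)
import System.Directory (listDirectory)

-- | Number of file descriptors currently open in this process
-- (Linux-specific: counts the entries under /proc/self/fd).
countOpenFds :: IO Int
countOpenFds = length <$> listDirectory "/proc/self/fd"

-- | Run one client/server round trip @n@ times and report how many
-- descriptors were left open afterwards; a leak shows up as a count
-- that grows with @n@.
fdLeakCheck :: Int -> IO () -> IO Int
fdLeakCheck n oneRoundTrip = do
  before <- countOpenFds
  replicateM_ n oneRoundTrip
  after  <- countOpenFds
  return (after - before)
```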

Debugging this took me into quite a deep rabbit hole.

  1. The call to accept happens in runTCPServerWithSocket in network-run. This closes the socket when the specified server terminates; conclusion: the server wasn't terminating.

  2. The server in question in this case is Network.HTTP2.Server.Run.runH2 in http2. It terminates when its background threads terminate, and in particular, when the frame receiver terminates. The frame receiver terminates under one of two conditions:

    • It terminates when the counterparty sends a GOAWAY message.
    • It terminates when the socket is closed (more specifically, when confReadN returns a null result).

    For reasons I don't yet fully understand, in my test case where the client does send at least one message (that is, part of its streaming output), a GOAWAY message was sent, but in the case where the server handler terminates without the client sending anything, it was not. That by itself should not be a problem; after all, the client terminates the connection, so the socket should be closed anyway. Conclusion: it wasn't the server failing to close its socket, it was the client. (A simplified model of these two termination conditions is sketched right after this list.)

  3. The client socket is created in runTCPClientWithSettings, again in network-run. As on the server side, this socket is closed after the client terminates; conclusion: the client wasn't terminating.

  4. The client in question in this case is Network.HTTP2.Client.Run.runH2 in http2.

  5. I tracked this down to wrapClinet (presumably a typo for wrapClient?). In particular, it was the call to waitCounter0 on the thread manager that was blocking indefinitely. Conclusion: some managed thread was not terminating.

  6. For this particular example, two managed threads were created in the client: H2 streaming supporter for stream 1 and H2 worker; I saw a call to decCounter for the former, but not for the latter. (A simplified model of this counter mechanism is sketched at the end of this comment.)

  7. The H2 worker thread is spawned in sendHeaderBody in Network.HTTP2.Client.Run, and runs syncWithSender.
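
For reference, here is a deliberately simplified model of the two termination conditions from step 2 (the one referenced above). All names below are invented for illustration only; the real receiver loop in http2 is considerably more involved.

```haskell
import qualified Data.ByteString as BS

-- Toy stand-in for the real frame types; only the distinction that
-- matters for termination is modelled.
data Frame = GoAway | OtherFrame

-- | Simplified receiver loop: it stops either when the peer sends
-- GOAWAY or when the read callback reports end-of-input (a null read,
-- i.e. the socket was closed).
receiverLoop
  :: (Int -> IO BS.ByteString)    -- confReadN-style read callback
  -> (BS.ByteString -> IO Frame)  -- toy frame decoder
  -> IO ()
receiverLoop readN decode = loop
  where
    frameHeaderLen = 9  -- an HTTP/2 frame header is 9 bytes
    loop = do
      bs <- readN frameHeaderLen
      if BS.null bs
        then return ()              -- a null read means the socket was closed
        else do
          frame <- decode bs
          case frame of
            GoAway     -> return () -- the peer told us to stop
            OtherFrame -> loop      -- keep receiving
```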

At this point I am not sure how to continue debugging this, or whether I should propose a fix. Perhaps the solution is that closeAllStreams must somehow be able to indicate to the H2 worker thread that it is no longer required? I don't fully understand what all this Sync stuff is for, and moreover it looks like #148 will change some of it, although I'm not sure if that will affect this. So I'm opening this ticket instead of proposing a PR; please let me know how you'd like to continue or what you'd like me to do.
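
To make steps 5 and 6 a bit more concrete, here is a minimal model of the counter mechanism as I understand it (all names are made up; this is not http2's actual thread manager): each managed thread bumps a counter when it is forked and decrements it when it finishes, and teardown waits for the counter to reach zero, so a single worker that never finishes blocks teardown, and with it the closing of the socket, indefinitely.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.STM
import Control.Exception (finally)

newtype ThreadCounter = ThreadCounter (TVar Int)

newThreadCounter :: IO ThreadCounter
newThreadCounter = ThreadCounter <$> newTVarIO 0

-- | Fork an action as a "managed" thread: the counter goes up when the
-- thread starts and back down when the action finishes (or throws).
forkManaged :: ThreadCounter -> IO () -> IO ()
forkManaged (ThreadCounter ref) action = do
  atomically $ modifyTVar' ref (+ 1)
  _ <- forkIO $ action `finally` atomically (modifyTVar' ref (subtract 1))
  return ()

-- | Block until every managed thread has finished. If even one managed
-- thread (such as the H2 worker above) never terminates, this blocks
-- forever.
waitCounterZero :: ThreadCounter -> IO ()
waitCounterZero (ThreadCounter ref) = atomically $ do
  n <- readTVar ref
  check (n == 0)
```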

kazu-yamamoto (Owner) commented Nov 12, 2024

> presumably a typo for wrapClient?

Fixed.

kazu-yamamoto (Owner) commented

If this happens with v5.3.5 but does not happen with v5.3.4, this is due to the removal of unliftio.
master does not have a counter integer anymore because it uses the size of a Map instead.
Would you try both v5.3.4 and master and tell me what happens?
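
Roughly, the idea is the following (a sketch only, with invented names; the actual code on master differs): instead of maintaining a separate integer, keep the managed threads in a Map and wait for that Map to become empty.

```haskell
import Control.Concurrent (ThreadId)
import Control.Concurrent.STM
import Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map

-- | Block until the map of managed threads is empty, i.e. until every
-- managed thread has removed itself from the map.
waitAllThreadsDone :: TVar (Map ThreadId a) -> IO ()
waitAllThreadsDone var = atomically $ do
  m <- readTVar var
  check (Map.null m)
```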

edsko added a commit to well-typed/grapesy that referenced this issue Nov 16, 2024:
The problem is fixed in latest `http2`
(`7036a3429fb08bfcd5947230c37d1f3e63dfb3a6`).  See
kazu-yamamoto/http2#151 for the `http2` bug report.

Closes #257.
edsko (Collaborator, Author) commented Nov 16, 2024

@kazu-yamamoto I can confirm that this happens with both 5.3.4 and 5.3.5, but not with main/master (7036a34). I'll close this issue, thanks! (I'll keep the issue open on our side for just a bit longer, until the next http2 release.)

edsko closed this as completed Nov 16, 2024
kazu-yamamoto (Owner) commented

Thanks.
v5.3.6 has been released.
