Fix TcpListener to not stop from accepting further connections on transient accept errors #7970

Merged
Aaronontheweb merged 5 commits into akkadotnet:dev from petrikero:tcp-listener-fix
Jan 7, 2026

Conversation

@petrikero
Contributor

Fixes #7969

Changes

  • Handle more transient errors in TcpListener.HandleAccept(). This fixes the issue where the TcpListener stops accepting incoming connections after a transient error during accept.
  • Be explicit about handling the known fatal errors in TcpListener.HandleAccept().
  • Unknown errors are still treated as fatal. Arguably, they could be treated as transient errors but there's some risk in that as well (some additional context at the end of this description).
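
The split described above can be sketched as a small classifier. This is an illustration only, not the code merged in this PR: `AcceptErrorKind` and `AcceptErrorClassifier` are hypothetical names, and the only cases listed are the ones visible in the review excerpts in this thread (`ConnectionReset` and `ConnectionAborted` assumed transient, `OperationAborted` fatal). The actual change handles more transient cases than shown here.

```csharp
using System;
using System.Net.Sockets;

// Hypothetical names for illustration; not part of Akka.NET.
public enum AcceptErrorKind { Transient, Fatal }

public static class AcceptErrorClassifier
{
    public static AcceptErrorKind Classify(SocketError error)
    {
        switch (error)
        {
            // Assumed transient: the incoming connection failed,
            // but the listener socket itself is still usable.
            case SocketError.ConnectionReset:
            case SocketError.ConnectionAborted:
                return AcceptErrorKind.Transient;

            // Known fatal: the listener socket itself is broken.
            case SocketError.OperationAborted:
                return AcceptErrorKind.Fatal;

            // Unknown errors are still treated as fatal.
            default:
                return AcceptErrorKind.Fatal;
        }
    }
}
```

A listener would keep accepting on `Transient` and stop on `Fatal`. Keeping the known fatal case explicit, even though it shares the default outcome, documents that the choice was deliberate.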

Disclaimer: I'm not an expert in the domain and got help from AI to craft this PR, so a thorough review would be appreciated.

Checklist

For significant changes, please ensure that the following have been completed (delete if not relevant):

Additional notes

  • I believe it's possible to work around the issue here by watching the TcpListener actor and re-binding on fatal failures, but any active TCP connections would still be terminated if the issue occurs.
  • If the TcpListener stops on a fatal error, the Tcp.Unbound does not seem to be sent. I.e., this code in the TcpEchoService.Server sample does not trigger: Receive<Tcp.Unbound>(_ => _stopCommander?.Tell("Done"));
  • I was not able to find the watch + re-bind pattern (to protect against unexpected termination) in any of the samples or the docs. I suspect many apps running in production might not have this and are thus susceptible.
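
For readers looking for the watch + re-bind workaround mentioned above, here is a minimal sketch, assuming the standard Akka.IO flow in which the sender of `Tcp.Bound` is the listener actor. `RebindingServer` is a hypothetical name, connection handling is omitted, and a production version would likely add a backoff delay before re-binding. As noted above, existing connections are still dropped when the listener dies.

```csharp
using System.Net;
using Akka.Actor;
using Akka.IO;

// Hypothetical actor: binds a TCP listener, watches it, and re-binds if it terminates.
public sealed class RebindingServer : ReceiveActor
{
    private readonly EndPoint _endpoint;
    private IActorRef _listener = ActorRefs.Nobody;

    public RebindingServer(EndPoint endpoint)
    {
        _endpoint = endpoint;

        Receive<Tcp.Bound>(_ =>
        {
            // Assumption: the sender of Tcp.Bound is the listener actor.
            _listener = Sender;
            Context.Watch(_listener);
        });

        Receive<Terminated>(t =>
        {
            // The listener stopped (for example on a fatal accept error): bind again.
            if (t.ActorRef.Equals(_listener))
                Bind();
        });

        // Handling of Tcp.Connected (registering a connection handler via
        // Tcp.Register) and of Tcp.CommandFailed is omitted from this sketch.
    }

    protected override void PreStart() => Bind();

    private void Bind() =>
        Context.System.Tcp().Tell(new Tcp.Bind(Self, _endpoint));
}
```

Such an actor could be started with `system.ActorOf(Props.Create(() => new RebindingServer(new IPEndPoint(IPAddress.Any, 9000))))`; the port number is only an example.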

Regarding whether the unknown errors should be treated as fatal or transient:

  • If the errors are treated as fatal, it increases the likelihood of applications being affected by this. Treating them as transient instead would make it less likely for production apps to stop accepting incoming connections.
  • If the errors are treated as transient, the TcpListener can end up in an infinite loop of retrying what is really a fatal error. This prevents the surrounding app from seeing the issue and reacting to it.
  • Perhaps a better approach would be to keep retrying on unknown errors for a few seconds and then terminate so the app can detect and handle the error. Anyway, it's outside the scope of this PR.
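
The "retry for a few seconds, then terminate" idea from the last bullet could look roughly like the sketch below. It is not part of this PR; `UnknownErrorBudget` is a hypothetical helper, and the window length is left to the caller. The clock is passed in as a parameter so the logic is easy to test.

```csharp
using System;

// Hypothetical helper: allows retries on unknown errors for a limited time window.
public sealed class UnknownErrorBudget
{
    private readonly TimeSpan _window;
    private DateTime? _firstFailureUtc;

    public UnknownErrorBudget(TimeSpan window)
    {
        _window = window;
    }

    // Returns true while unknown errors have persisted for less than the window;
    // false once the caller should give up and treat the error as fatal.
    public bool ShouldRetry(DateTime nowUtc)
    {
        if (_firstFailureUtc == null)
            _firstFailureUtc = nowUtc;

        return nowUtc - _firstFailureUtc.Value < _window;
    }

    // Call after a successful accept so the next failure starts a fresh window.
    public void OnSuccess()
    {
        _firstFailureUtc = null;
    }
}
```

A listener would call `ShouldRetry(DateTime.UtcNow)` on each unknown accept error and stop itself once it returns false, so the surrounding app can detect the failure instead of the listener looping forever.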

@petrikero changed the title from "TcpListener: Improve handling of Accept errors" to "Fix TcpListener to not stop from accepting further connections on transient accept errors" on Dec 22, 2025
Member

@Aaronontheweb Aaronontheweb left a comment

Overall looks fine but I left some comments - I'll need to pull up the source and look at the TCP error codes before approving though

Comment thread src/core/Akka/IO/TcpListener.cs
Comment thread src/core/Akka/IO/TcpListener.cs Outdated
break;

// Fatal errors - the listener socket itself is broken
case SocketError.OperationAborted:
Member

What's the difference between OperationAborted and ConnectionAborted?

Member

Reason I'm asking is that this might be a transient error - I'd want to know the systems-level explanation of the code first though.

Contributor Author

I'm really not an expert on low-level socket code, but my understanding is that OperationAborted maps to WSA_OPERATION_ABORTED, which indicates that the listener socket (not the incoming connection socket) was closed/disposed/interrupted.

Member

Ok, that's great to know - I'll pull down the branch and check it out locally but if what you said is true then yes, this is correct.

break;

case SocketError.ConnectionReset:
case SocketError.ConnectionAborted:
Member

LGTM

@petrikero
Contributor Author

@Aaronontheweb Just a quick update: we're still seeing this happen about once a week in our production environments. We did deploy a hotfix that restarts the actor when we see this happen, so we're mostly shielded from the impact (all the connections are dropped temporarily but we have a recovery mechanism for that), but it's quite possible that many Akka-using apps don't have this and could be impacted.

@Aaronontheweb
Member

@petrikero I've talked about this with @Arkatufus this morning and we'll do a new v1.5 release with this fix today.

Member

@Aaronontheweb Aaronontheweb left a comment

LGTM

@Aaronontheweb Aaronontheweb enabled auto-merge (squash) January 7, 2026 16:13
@petrikero
Contributor Author

> @petrikero I've talked about this with @Arkatufus this morning and we'll do a new v1.5 release with this fix today.

I greatly appreciate you handling this so quickly (given the holiday season)!

@Aaronontheweb Aaronontheweb disabled auto-merge January 7, 2026 17:20
@Aaronontheweb Aaronontheweb merged commit 4efb6f6 into akkadotnet:dev Jan 7, 2026
6 of 11 checks passed
Arkatufus pushed a commit to Arkatufus/akka.net that referenced this pull request Jan 7, 2026
…ransient accept errors (akkadotnet#7970)

* Minimal fix to handle SocketError.ConnectionAborted in TcpListener.

* Handle more transient error cases. Handle known fatal errors explicitly.

* Join code paths for known and unknown fatal errors.

---------

Co-authored-by: Aaron Stannard <aaron@petabridge.com>
Aaronontheweb added a commit that referenced this pull request Jan 8, 2026
…ransient accept errors (#7970)

* Minimal fix to handle SocketError.ConnectionAborted in TcpListener.

* Handle more transient error cases. Handle known fatal errors explicitly.

* Join code paths for known and unknown fatal errors.

---------

Co-authored-by: Aaron Stannard <aaron@petabridge.com>
This was referenced May 21, 2026

Development

Successfully merging this pull request may close these issues.

TcpListener terminating with ConnectionAborted error

2 participants