
fix: treat io.ErrUnexpectedEOF as driver.ErrBadConn to prevent connection pool poisoning#1299

Merged
arp242 merged 1 commit into lib:master from henhouse:fix/connection-poisoning-unexpected-eof on Apr 1, 2026

@henhouse
Contributor

Fixes #1298

When a TCP read is interrupted mid-message (partial DataRow), pq's recvMessage returns io.ErrUnexpectedEOF. Prior to this fix, handleError did not classify this as driver.ErrBadConn, so:

  1. cn.err was never set, and IsValid() returned true
  2. The inProgress atomic flag remained stuck at true (since ReadyForQuery
    never arrived)
  3. database/sql kept handing out the broken connection
  4. The CompareAndSwap guard rejected every subsequent query with
    "there is already a query being processed on this connection"

This is especially impactful with CockroachDB, where node drains and rebalances routinely produce partial reads.
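
For readers unfamiliar with the guard, the failure sequence listed above can be sketched in isolation. This is a minimal illustration and not pq's code: `fakeConn`, `query`, and `stuckAfterFailure` are hypothetical names, and the only details taken from the description are the atomic `inProgress` flag, the `CompareAndSwap` check, and the error text.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"sync/atomic"
)

// errQueryInProgress reuses the message quoted in the description.
var errQueryInProgress = errors.New("there is already a query being processed on this connection")

// fakeConn is a hypothetical stand-in for a driver connection.
type fakeConn struct {
	inProgress atomic.Bool
}

// query simulates a single query. When the read fails mid-message, the flag
// is never cleared, mirroring the case where ReadyForQuery never arrives.
func (c *fakeConn) query(readFails bool) error {
	if !c.inProgress.CompareAndSwap(false, true) {
		return errQueryInProgress
	}
	if readFails {
		// Early return: inProgress stays true.
		return io.ErrUnexpectedEOF
	}
	// Normal path: the flag is reset once the query completes.
	c.inProgress.Store(false)
	return nil
}

// stuckAfterFailure reports whether a later, otherwise healthy query is
// rejected by the guard after one partial read.
func stuckAfterFailure() bool {
	c := &fakeConn{}
	_ = c.query(true)
	return errors.Is(c.query(false), errQueryInProgress)
}

func main() {
	fmt.Println("rejected after partial read:", stuckAfterFailure())
}
```

If the pool keeps handing out such a connection, every caller hits the guard, which is why the connection has to be reported as bad rather than reused.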

Two changes:

  1. error.go: Treat io.ErrUnexpectedEOF the same as io.EOF in handleError - both indicate a dead connection.
  2. conn.go: Wrap errQueryInProgress with driver.ErrBadConn so database/sql evicts the connection instead of recycling it. This acts as a defense against any future code path that leaves inProgress stuck.
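
The classification in the first change can be sketched roughly as follows. This is an assumption-laden illustration, not the actual body of handleError: `classify` and the two boolean helpers are hypothetical names, and only the mapping of io.EOF and io.ErrUnexpectedEOF to driver.ErrBadConn comes from the description.

```go
package main

import (
	"database/sql/driver"
	"errors"
	"fmt"
	"io"
)

// classify is a hypothetical helper. It maps both EOF variants to
// driver.ErrBadConn and passes every other error through unchanged.
func classify(err error) error {
	if errors.Is(err, io.EOF) || errors.Is(err, io.ErrUnexpectedEOF) {
		return driver.ErrBadConn
	}
	return err
}

// truncatedReadIsBadConn reports whether a partial read is classified
// as a bad connection.
func truncatedReadIsBadConn() bool {
	return errors.Is(classify(io.ErrUnexpectedEOF), driver.ErrBadConn)
}

// otherErrorsPassThrough reports whether an unrelated error is returned
// unchanged rather than being turned into driver.ErrBadConn.
func otherErrorsPassThrough() bool {
	other := errors.New("syntax error")
	return classify(other) == other
}

func main() {
	// The two sentinels are distinct values, so checking for io.EOF alone
	// does not catch a partial read.
	fmt.Println(errors.Is(io.ErrUnexpectedEOF, io.EOF))
	fmt.Println(truncatedReadIsBadConn(), otherErrorsPassThrough())
}
```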

Includes a unit test for handleError and an integration test that uses a TCP fault-injection proxy to truncate a DataRow mid-body, reproducing the exact failure mode.

@arp242
Collaborator

arp242 commented Mar 30, 2026

Can't this use the existing pqtest.Fake to test this error? Over 250 lines of code for a single test is quite a lot to maintain going forward, and it's not immediately obvious what it does exactly either.

I'm also okay with just running this against a real database. I made #1304 to run it in the CI, although I need to finish up some bits.

Is changing errQueryInProgress intentional or a left-over from some previous debugging?

@henhouse henhouse force-pushed the fix/connection-poisoning-unexpected-eof branch from fa858df to e2a6281 Compare March 31, 2026 18:09
@henhouse
Contributor Author

@arp242 Oops yeah that was left behind, thanks.

Pushed an update that uses pqtest instead, which shrinks the test a lot. If you'd prefer not to have a test for this case at all, I can just remove it. It was mostly there to provide an in-repo example of the demo I linked to, which reproduces the case in full. The check in error_test.go is minimally sufficient to catch a regression here.

@arp242 arp242 force-pushed the fix/connection-poisoning-unexpected-eof branch from e2a6281 to 205ea6b Compare April 1, 2026 11:45
When recvMessage does an io.ReadFull on a partially-received message
body and the connection drops mid-read, the result is
io.ErrUnexpectedEOF. handleError classifies io.EOF as driver.ErrBadConn
but not io.ErrUnexpectedEOF, so cn.err is never set, IsValid() returns
true, and database/sql keeps recycling the broken connection. The
inProgress flag stays stuck at true (ReadyForQuery never arrived), and
the CAS guard rejects every subsequent query with "there is already a
query being processed on this connection" — permanently poisoning the
pool.

So treat io.ErrUnexpectedEOF the same as io.EOF in handleError: both
indicate a dead connection.

Fixes lib#1298
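
The io.ReadFull behavior the commit message relies on can be checked with a small standalone program. This is only a sketch: `readBody` and the helper functions are hypothetical, and a bytes.Reader stands in for the network connection.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"io"
)

// readBody reads exactly n bytes from r, as a message-body read would.
func readBody(r io.Reader, n int) ([]byte, error) {
	buf := make([]byte, n)
	_, err := io.ReadFull(r, buf)
	return buf, err
}

// truncatedBodyErr simulates a stream that ends after 4 of 10 expected bytes.
func truncatedBodyErr() error {
	_, err := readBody(bytes.NewReader([]byte("abcd")), 10)
	return err
}

// emptyStreamErr simulates a stream that ends before any byte is read.
func emptyStreamErr() error {
	_, err := readBody(bytes.NewReader(nil), 10)
	return err
}

// partialReadIsUnexpectedEOF reports whether the truncated read yields
// io.ErrUnexpectedEOF and not plain io.EOF.
func partialReadIsUnexpectedEOF() bool {
	err := truncatedBodyErr()
	return errors.Is(err, io.ErrUnexpectedEOF) && !errors.Is(err, io.EOF)
}

// emptyReadIsEOF reports whether a read that gets no bytes yields io.EOF.
func emptyReadIsEOF() bool {
	return errors.Is(emptyStreamErr(), io.EOF)
}

func main() {
	fmt.Println("partial read:", truncatedBodyErr())
	fmt.Println("empty read:", emptyStreamErr())
}
```

io.ReadFull returns io.EOF only when no bytes were read at all, so a connection that drops partway through a body surfaces as the other sentinel and slips past a check for io.EOF alone.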
@arp242 arp242 force-pushed the fix/connection-poisoning-unexpected-eof branch from 205ea6b to 6cc34d3 Compare April 1, 2026 12:00
@arp242 arp242 merged commit 6d77ced into lib:master Apr 1, 2026
13 checks passed
@arp242
Collaborator

arp242 commented Apr 1, 2026

Thanks!


Development

Successfully merging this pull request may close these issues.

io.ErrUnexpectedEOF causes permanent connection pool poisoning via stuck inProgress flag
