-
Notifications
You must be signed in to change notification settings - Fork 25
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ProgrammingError
after reconnecting.
#32
Comments
Hi, First of all, thanks for the report! Which version of Twisted are you using? The culprit seems to be this exception: ValueError: file descriptor cannot be a negative integer (-1) which is something that has been plaguing me for a long time - there's at least one Twisted ticket that has been opened in connection with this problem: http://twistedmatrix.com/trac/ticket/4539 As you can see, the ticket is now solved, but I've seen this error happen on our production system once, even though we're running recent Twisted. OTOH, we've had a master/slave fallback recently and everything kept working. It would be very good if you could come up with a way to reproduce the problem. At this point, I'm not even sure if it's Twisted's or psycopg2's fault... The -1 file descriptor error is most probably Twisted, but the asynchronous query underway one - that's coming from psycopg2. txpostgres calls Without a reproducible test case it's hard to debug; I might try to do some tests with hammering an instance with queries while restarting the database, but in the meantime I'm flying blind :( |
I'm using Twisted 13.2.0. Not the most recent, but within the past year.
I should note that I've also had failovers that have worked fine, this is the first time something went wrong.
I'll play around when I get a chance and see if I can come up with anything. |
After doing some mild investigation: It looks like Ticket #4539 is broken again. The |
It appears that the tests created for the previous fix (removedFromReactor, negativeOneFileDescriptor) are passing. |
Thanks for the investigation! I just tried with Twisted trunk and I tried with various reactors and both I'll try digging deeper into that, although I'm also a bit concerned about this problem being a red herring. I'm not 100% positive that it's what has been causing the reconnection problem you got; it would be great to have a reproducible test case for that so we can be sure that fixing the -1 FD problem fixes the txpostgres bug. |
OK, that one was easy: for epoll, I'll try to write a program to bombard a Postgres instance with queries and kill the database - let's see if I manage to get a similar error to yours. |
I did a few more tests with txpostgres and Twisted trunk and have been able to reproduce the -1 error, but it still managed to recover and continue querying. Here's the code: https://gist.github.com/fa25fbae2855e8e5d2d6 I'm running it in one terminal and in another one doing: sudo pg_ctlcluster -m fast 9.1 main restart This gives me
but after that I get a reconnection and the queries keep running. I start to suspect that the -1 FD is indeed a red herring. We should focus on reproducing the actual problem with the |
I'm using
ConnectionPool
withDeadConnectionDetector
. Here's my connection function, if it helps.This morning, my primary database failed, which happened to be during a period of high-frequency querying. So of course,
psycopg2.OperationalError
appears several times, before it was replaced withConnectionDead
. Shortly thereafter, my standby server was promoted to master. txpostgres was then able to reconnect, which [I assume] it did. But then, immediately, the following error starts:psycopg2.ProgrammingError: execute cannot be used while an asynchronous query is underway
. As far as I can tell, no queries were successfully executed until I restarted the app.I've been digging around the txpostgres code to try to find how this could happen, but I haven't had any enlightenment so far. I was wondering if you might have any ideas.
Here are some tracebacks, but—as is usual for Twisted—they aren't very helpful.
The text was updated successfully, but these errors were encountered: