Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

sql transactions are not retried on connection failure #9779

Open
richvdh opened this issue Apr 9, 2021 · 3 comments
Open

sql transactions are not retried on connection failure #9779

richvdh opened this issue Apr 9, 2021 · 3 comments
Labels
S-Major Major functionality / product severely impaired, no satisfactory workaround. T-Defect Bugs, crashes, hangs, security vulnerabilities, or other reported issues.

Comments

@richvdh
Copy link
Member

richvdh commented Apr 9, 2021

if the connection to the postgres db drops, the sql transaction is not retried (and the HTTP request(s) in question end up failing with a 500).

It also makes an awful lot of noise in the logs.

For example:

2021-04-09 16:20:41,158 - synapse.storage.txn - 543 - WARNING - PUT-1087 - [TXN OPERROR] {set_received_txn_response-1511} terminating connection due to administrator command
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
 0/5
2021-04-09 16:20:41,161 - twisted - 260 - CRITICAL - PUT-1087 - Rollback failed
Traceback (most recent call last):
  File "/home/synapse/env/lib/python3.7/site-packages/twisted/enterprise/adbapi.py", line 284, in _runWithConnection
    result = func(conn, *args, **kw)
  File "/home/synapse/src/synapse/storage/database.py", line 738, in inner_func
    return func(db_conn, *args, **kwargs)
  File "/home/synapse/src/synapse/storage/database.py", line 548, in new_transaction
    conn.rollback()
  File "/home/synapse/src/synapse/storage/database.py", line 160, in rollback
    self.conn.rollback()
--- <exception caught here> ---
  File "/home/synapse/env/lib/python3.7/site-packages/twisted/enterprise/adbapi.py", line 53, in rollback
    self._connection.rollback()
psycopg2.InterfaceError: connection already closed

2021-04-09 16:20:41,162 - twisted - 260 - INFO - PUT-1087 - adbapi closing: psycopg2
2021-04-09 16:20:41,162 - twisted - 260 - INFO - PUT-1087 - Connection lost.
2021-04-09 16:20:41,164 - twisted - 260 - CRITICAL - sentinel - Rollback failed
Traceback (most recent call last):
  File "/home/synapse/env/lib/python3.7/site-packages/twisted/python/threadpool.py", line 255, in <lambda>
    ctx, func, *args, **kw
  File "/home/synapse/env/lib/python3.7/site-packages/twisted/python/context.py", line 118, in callWithContext
    return self.currentContext().callWithContext(ctx, func, *args, **kw)
  File "/home/synapse/env/lib/python3.7/site-packages/twisted/python/context.py", line 83, in callWithContext
    return func(*args, **kw)
  File "/home/synapse/env/lib/python3.7/site-packages/twisted/enterprise/adbapi.py", line 290, in _runWithConnection
    conn.rollback()
--- <exception caught here> ---
  File "/home/synapse/env/lib/python3.7/site-packages/twisted/enterprise/adbapi.py", line 53, in rollback
    self._connection.rollback()
psycopg2.InterfaceError: connection already closed

2021-04-09 16:20:41,164 - twisted - 260 - CRITICAL - sentinel - Rollback failed
Traceback (most recent call last):
  File "/home/synapse/env/lib/python3.7/site-packages/twisted/python/threadpool.py", line 238, in inContext
    result = inContext.theWork()  # type: ignore[attr-defined]
  File "/home/synapse/env/lib/python3.7/site-packages/twisted/python/threadpool.py", line 255, in <lambda>
    ctx, func, *args, **kw
  File "/home/synapse/env/lib/python3.7/site-packages/twisted/python/context.py", line 118, in callWithContext
    return self.currentContext().callWithContext(ctx, func, *args, **kw)
  File "/home/synapse/env/lib/python3.7/site-packages/twisted/python/context.py", line 83, in callWithContext
    return func(*args, **kw)
--- <exception caught here> ---
  File "/home/synapse/env/lib/python3.7/site-packages/twisted/enterprise/adbapi.py", line 290, in _runWithConnection
    conn.rollback()
  File "/home/synapse/env/lib/python3.7/site-packages/twisted/enterprise/adbapi.py", line 62, in rollback
    self._pool.disconnect(self._connection)
  File "/home/synapse/env/lib/python3.7/site-packages/twisted/enterprise/adbapi.py", line 429, in disconnect
    raise Exception("wrong connection for thread")
builtins.Exception: wrong connection for thread

2021-04-09 16:20:41,165 - synapse.federation.transport.server - 433 - ERROR - PUT-1087 - on_incoming_transaction failed
Capture point (most recent call last):
  File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/synapse/src/synapse/app/homeserver.py", line 502, in <module>
    main()
  File "/home/synapse/src/synapse/app/homeserver.py", line 498, in main
    run(hs)
  File "/home/synapse/src/synapse/app/homeserver.py", line 489, in run
    logger=logger,
  File "/home/synapse/src/synapse/app/_base.py", line 137, in start_reactor
    run()
  File "/home/synapse/src/synapse/app/_base.py", line 121, in run
    run_command()
  File "/home/synapse/env/lib/python3.7/site-packages/twisted/internet/base.py", line 1423, in run
    self.mainLoop()
  File "/home/synapse/env/lib/python3.7/site-packages/twisted/internet/base.py", line 1433, in mainLoop
    reactorBaseSelf.runUntilCurrent()
  File "/home/synapse/src/synapse/metrics/__init__.py", line 559, in f
    ret = func(*args, **kwargs)
  File "/home/synapse/env/lib/python3.7/site-packages/twisted/internet/base.py", line 972, in runUntilCurrent
    f(*a, **kw)
  File "/home/synapse/env/lib/python3.7/site-packages/twisted/internet/defer.py", line 517, in errback
    self._startRunCallbacks(fail)
  File "/home/synapse/env/lib/python3.7/site-packages/twisted/internet/defer.py", line 580, in _startRunCallbacks
    self._runCallbacks()
  File "/home/synapse/env/lib/python3.7/site-packages/twisted/internet/defer.py", line 662, in _runCallbacks
    current.result = callback(current.result, *args, **kw)
  File "/home/synapse/env/lib/python3.7/site-packages/twisted/internet/defer.py", line 1511, in gotResult
    current_context.run(_inlineCallbacks, r, g, status)
  File "/home/synapse/env/lib/python3.7/site-packages/twisted/internet/defer.py", line 1501, in _inlineCallbacks
    status.deferred.errback()
  File "/home/synapse/env/lib/python3.7/site-packages/twisted/internet/defer.py", line 517, in errback
    self._startRunCallbacks(fail)
  File "/home/synapse/env/lib/python3.7/site-packages/twisted/internet/defer.py", line 580, in _startRunCallbacks
    self._runCallbacks()
  File "/home/synapse/env/lib/python3.7/site-packages/twisted/internet/defer.py", line 662, in _runCallbacks
    current.result = callback(current.result, *args, **kw)
  File "/home/synapse/src/synapse/util/async_helpers.py", line 100, in errback
    observer.errback(f)
  File "/home/synapse/env/lib/python3.7/site-packages/twisted/internet/defer.py", line 517, in errback
    self._startRunCallbacks(fail)
  File "/home/synapse/env/lib/python3.7/site-packages/twisted/internet/defer.py", line 580, in _startRunCallbacks
    self._runCallbacks()
  File "/home/synapse/env/lib/python3.7/site-packages/twisted/internet/defer.py", line 662, in _runCallbacks
    current.result = callback(current.result, *args, **kw)
  File "/home/synapse/env/lib/python3.7/site-packages/twisted/internet/defer.py", line 1511, in gotResult
    current_context.run(_inlineCallbacks, r, g, status)
  File "/home/synapse/env/lib/python3.7/site-packages/twisted/internet/defer.py", line 1443, in _inlineCallbacks
    result = current_context.run(result.throwExceptionIntoGenerator, g)
  File "/home/synapse/env/lib/python3.7/site-packages/twisted/python/failure.py", line 500, in throwExceptionIntoGenerator
    return g.throw(self.type, self.value, self.tb)
Traceback (most recent call last):
  File "/home/synapse/src/synapse/storage/database.py", line 532, in new_transaction
    r = func(cursor, *args, **kwargs)
  File "/home/synapse/src/synapse/storage/database.py", line 846, in simple_insert_txn
    txn.execute(sql, vals)
  File "/home/synapse/src/synapse/storage/database.py", line 294, in execute
    self._do_execute(self.txn.execute, sql, *args)
  File "/home/synapse/src/synapse/storage/database.py", line 320, in _do_execute
    return func(sql, *args)
psycopg2.errors.AdminShutdown: terminating connection due to administrator command
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.


During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/synapse/src/synapse/federation/transport/server.py", line 430, in on_PUT
    origin, transaction_data
  File "/home/synapse/src/synapse/federation/federation_server.py", line 207, in on_incoming_transaction
    request_time,
  File "/home/synapse/env/lib/python3.7/site-packages/twisted/internet/defer.py", line 1443, in _inlineCallbacks
    result = current_context.run(result.throwExceptionIntoGenerator, g)
  File "/home/synapse/env/lib/python3.7/site-packages/twisted/python/failure.py", line 500, in throwExceptionIntoGenerator
    return g.throw(self.type, self.value, self.tb)
  File "/home/synapse/src/synapse/federation/federation_server.py", line 220, in _on_incoming_transaction_inner
    origin, transaction, request_time
  File "/home/synapse/src/synapse/federation/federation_server.py", line 269, in _handle_incoming_transaction
    await self.transaction_actions.set_response(origin, transaction, 200, response)
  File "/home/synapse/src/synapse/federation/persistence.py", line 65, in set_response
    transaction_id, origin, code, response
  File "/home/synapse/src/synapse/storage/databases/main/transactions.py", line 146, in set_received_txn_response
    desc="set_received_txn_response",
  File "/home/synapse/src/synapse/storage/database.py", line 825, in simple_insert
    await self.runInteraction(desc, self.simple_insert_txn, table, values)
  File "/home/synapse/src/synapse/storage/database.py", line 668, in runInteraction
    **kwargs,
  File "/home/synapse/src/synapse/storage/database.py", line 744, in runWithConnection
    self._db_pool.runWithConnection(inner_func, *args, **kwargs)
  File "/home/synapse/env/lib/python3.7/site-packages/twisted/python/threadpool.py", line 238, in inContext
    result = inContext.theWork()  # type: ignore[attr-defined]
  File "/home/synapse/env/lib/python3.7/site-packages/twisted/python/threadpool.py", line 255, in <lambda>
    ctx, func, *args, **kw
  File "/home/synapse/env/lib/python3.7/site-packages/twisted/python/context.py", line 118, in callWithContext
    return self.currentContext().callWithContext(ctx, func, *args, **kw)
  File "/home/synapse/env/lib/python3.7/site-packages/twisted/python/context.py", line 83, in callWithContext
    return func(*args, **kw)
  File "/home/synapse/env/lib/python3.7/site-packages/twisted/enterprise/adbapi.py", line 293, in _runWithConnection
    compat.reraise(excValue, excTraceback)
  File "/home/synapse/env/lib/python3.7/site-packages/twisted/python/deprecate.py", line 298, in deprecatedFunction
    return function(*args, **kwargs)
  File "/home/synapse/env/lib/python3.7/site-packages/twisted/python/compat.py", line 403, in reraise
    raise exception.with_traceback(traceback)
  File "/home/synapse/env/lib/python3.7/site-packages/twisted/enterprise/adbapi.py", line 284, in _runWithConnection
    result = func(conn, *args, **kw)
  File "/home/synapse/src/synapse/storage/database.py", line 738, in inner_func
    return func(db_conn, *args, **kwargs)
  File "/home/synapse/src/synapse/storage/database.py", line 548, in new_transaction
    conn.rollback()
  File "/home/synapse/src/synapse/storage/database.py", line 160, in rollback
    self.conn.rollback()
  File "/home/synapse/env/lib/python3.7/site-packages/twisted/enterprise/adbapi.py", line 67, in rollback
    raise ConnectionLost()
twisted.enterprise.adbapi.ConnectionLost
2021-04-09 16:20:41,172 - synapse.http.server - 98 - ERROR - PUT-1087 - Failed handle request via 'FederationSendServlet': <XForwardedForRequest at 0x7f2590f00f28 method='PUT' uri='/_matrix/federation/v1/send/1617898562914' clientproto='HTTP/1.0' site='8008'>
Capture point (most recent call last):
  File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/synapse/src/synapse/app/homeserver.py", line 502, in <module>
    main()
  File "/home/synapse/src/synapse/app/homeserver.py", line 498, in main
    run(hs)
  File "/home/synapse/src/synapse/app/homeserver.py", line 489, in run
    logger=logger,
  File "/home/synapse/src/synapse/app/_base.py", line 137, in start_reactor
    run()
  File "/home/synapse/src/synapse/app/_base.py", line 121, in run
    run_command()
  File "/home/synapse/env/lib/python3.7/site-packages/twisted/internet/base.py", line 1423, in run
    self.mainLoop()
  File "/home/synapse/env/lib/python3.7/site-packages/twisted/internet/base.py", line 1433, in mainLoop
    reactorBaseSelf.runUntilCurrent()
  File "/home/synapse/src/synapse/metrics/__init__.py", line 559, in f
    ret = func(*args, **kwargs)
  File "/home/synapse/env/lib/python3.7/site-packages/twisted/internet/base.py", line 972, in runUntilCurrent
    f(*a, **kw)
  File "/home/synapse/env/lib/python3.7/site-packages/twisted/internet/defer.py", line 517, in errback
    self._startRunCallbacks(fail)
  File "/home/synapse/env/lib/python3.7/site-packages/twisted/internet/defer.py", line 580, in _startRunCallbacks
    self._runCallbacks()
  File "/home/synapse/env/lib/python3.7/site-packages/twisted/internet/defer.py", line 662, in _runCallbacks
    current.result = callback(current.result, *args, **kw)
  File "/home/synapse/env/lib/python3.7/site-packages/twisted/internet/defer.py", line 1511, in gotResult
    current_context.run(_inlineCallbacks, r, g, status)
Traceback (most recent call last):
  File "/home/synapse/src/synapse/storage/database.py", line 532, in new_transaction
    r = func(cursor, *args, **kwargs)
  File "/home/synapse/src/synapse/storage/database.py", line 846, in simple_insert_txn
    txn.execute(sql, vals)
  File "/home/synapse/src/synapse/storage/database.py", line 294, in execute
    self._do_execute(self.txn.execute, sql, *args)
  File "/home/synapse/src/synapse/storage/database.py", line 320, in _do_execute
    return func(sql, *args)
psycopg2.errors.AdminShutdown: terminating connection due to administrator command
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.


During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/synapse/env/lib/python3.7/site-packages/twisted/internet/defer.py", line 1443, in _inlineCallbacks
    result = current_context.run(result.throwExceptionIntoGenerator, g)
  File "/home/synapse/env/lib/python3.7/site-packages/twisted/python/failure.py", line 500, in throwExceptionIntoGenerator
    return g.throw(self.type, self.value, self.tb)
  File "/home/synapse/src/synapse/federation/federation_server.py", line 220, in _on_incoming_transaction_inner
    origin, transaction, request_time
  File "/home/synapse/src/synapse/federation/federation_server.py", line 269, in _handle_incoming_transaction
    await self.transaction_actions.set_response(origin, transaction, 200, response)
  File "/home/synapse/src/synapse/federation/persistence.py", line 65, in set_response
    transaction_id, origin, code, response
  File "/home/synapse/src/synapse/storage/databases/main/transactions.py", line 146, in set_received_txn_response
    desc="set_received_txn_response",
  File "/home/synapse/src/synapse/storage/database.py", line 825, in simple_insert
    await self.runInteraction(desc, self.simple_insert_txn, table, values)
  File "/home/synapse/src/synapse/storage/database.py", line 668, in runInteraction
    **kwargs,
  File "/home/synapse/src/synapse/storage/database.py", line 744, in runWithConnection
    self._db_pool.runWithConnection(inner_func, *args, **kwargs)
  File "/home/synapse/env/lib/python3.7/site-packages/twisted/python/threadpool.py", line 238, in inContext
    result = inContext.theWork()  # type: ignore[attr-defined]
  File "/home/synapse/env/lib/python3.7/site-packages/twisted/python/threadpool.py", line 255, in <lambda>
    ctx, func, *args, **kw
  File "/home/synapse/env/lib/python3.7/site-packages/twisted/python/context.py", line 118, in callWithContext
    return self.currentContext().callWithContext(ctx, func, *args, **kw)
  File "/home/synapse/env/lib/python3.7/site-packages/twisted/python/context.py", line 83, in callWithContext
    return func(*args, **kw)
  File "/home/synapse/env/lib/python3.7/site-packages/twisted/enterprise/adbapi.py", line 293, in _runWithConnection
    compat.reraise(excValue, excTraceback)
  File "/home/synapse/env/lib/python3.7/site-packages/twisted/python/deprecate.py", line 298, in deprecatedFunction
    return function(*args, **kwargs)
  File "/home/synapse/env/lib/python3.7/site-packages/twisted/python/compat.py", line 403, in reraise
    raise exception.with_traceback(traceback)
  File "/home/synapse/env/lib/python3.7/site-packages/twisted/enterprise/adbapi.py", line 284, in _runWithConnection
    result = func(conn, *args, **kw)
  File "/home/synapse/src/synapse/storage/database.py", line 738, in inner_func
    return func(db_conn, *args, **kwargs)
  File "/home/synapse/src/synapse/storage/database.py", line 548, in new_transaction
    conn.rollback()
  File "/home/synapse/src/synapse/storage/database.py", line 160, in rollback
    self.conn.rollback()
  File "/home/synapse/env/lib/python3.7/site-packages/twisted/enterprise/adbapi.py", line 67, in rollback
    raise ConnectionLost()
twisted.enterprise.adbapi.ConnectionLost
2021-04-09 16:20:41,806 - twisted - 260 - INFO - sentinel - adbapi connecting: psycopg2
2021-04-09 16:20:41,174 - synapse.access.http.8008 - 393 - INFO - PUT-1087 - 95.216.197.211 - 8008 - {jboi.nl} Processed request: 0.030sec/-0.000sec (0.017sec, 0.000sec) (0.001sec/0.006sec/2) 55B 500 "PUT /_matrix/federation/v1/send/1617898562914 HTTP/1.0" "Synapse/1.31.0" [0 dbevts]
@richvdh
Copy link
Member Author

richvdh commented Apr 9, 2021

our retry logic is inside the "run with connection" logic, so to fix this, we'd have to add another layer of retries further up the stack.

@richvdh
Copy link
Member Author

richvdh commented Apr 9, 2021

#9780 is a small step in the right direction for the logs

@richvdh
Copy link
Member Author

richvdh commented Apr 9, 2021

The logging also isn't helped by the fact that rollback is being called twice (once from within synapse.storage.database.new_transaction, at database.py:160), and then again by twisted's _runWithConnection at adbapi.py:284.

I'm rather wondering if we should move all the fancy exception/retry handling out of new_transaction and into (say) runInteraction.

@anoadragon453 anoadragon453 added S-Major Major functionality / product severely impaired, no satisfactory workaround. T-Defect Bugs, crashes, hangs, security vulnerabilities, or other reported issues. labels Apr 14, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
S-Major Major functionality / product severely impaired, no satisfactory workaround. T-Defect Bugs, crashes, hangs, security vulnerabilities, or other reported issues.
Projects
None yet
Development

No branches or pull requests

2 participants