Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Make sqlite database migrations transactional again #14910

Merged
merged 1 commit into from
Jan 25, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions changelog.d/14910.bugfix
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
Fix a regression introduced in Synapse 1.69.0 which can result in database corruption when database migrations are interrupted on sqlite.
3 changes: 3 additions & 0 deletions synapse/storage/engines/_base.py
Original file line number Diff line number Diff line change
Expand Up @@ -132,6 +132,9 @@ def executescript(cursor: CursorType, script: str) -> None:
"""Execute a chunk of SQL containing multiple semicolon-delimited statements.

This is not provided by DBAPI2, and so needs engine-specific support.

Some database engines may automatically COMMIT the ongoing transaction both
before and after executing the script.
"""
...

Expand Down
5 changes: 3 additions & 2 deletions synapse/storage/engines/sqlite.py
Original file line number Diff line number Diff line change
Expand Up @@ -135,13 +135,14 @@ def executescript(cursor: sqlite3.Cursor, script: str) -> None:
> than one statement with it, it will raise a Warning. Use executescript() if
> you want to execute multiple SQL statements with one call.

Though the docs for `executescript` warn:
The script is wrapped in transaction control statemnets, since the docs for
`executescript` warn:

> If there is a pending transaction, an implicit COMMIT statement is executed
> first. No other implicit transaction control is performed; any transaction
> control must be added to sql_script.
"""
cursor.executescript(script)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, script will not contain the statement

# Mark as done.
cur.execute(
"INSERT INTO applied_schema_deltas (version, file) VALUES (?,?)",
(v, relative_path),
)

which adds a row to applied_schema_deltas.

It's possible that we are interrupted after running the migration and the COMMIT added here, but before we mark the transaction as done. Then on the next restart, we'll try to reapply an applied migration, which I expect will fail. It's a small window, and having the COMMIT here is a vast improvement... so I think we should probably leave it as-is?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point.

I wasn't sure how a BEGIN without a corresponding COMMIT would behave in executescript. The implicit commit at the start implies that some level of magic is going on.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we insert that as the last line of the script?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could, it'll just take a bunch of refactoring. I'm tempted to omit the final COMMIT and add some unit tests instead.

Copy link
Contributor Author

@squahtx squahtx Jan 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

executescript is defined here:
https://github.com/python/cpython/blob/7e3f09cad9b783d8968aa79ff6a8ee57beb8b83e/Modules/_sqlite/cursor.c#L1017

and it looks like there's no reason leaving the transaction open shouldn't work.

cursor.executescript(f"BEGIN TRANSACTION;\n{script}\nCOMMIT;")


# Following functions taken from: https://github.com/coleifer/peewee
Expand Down