Compare ported to unported PG schemas in portdb test job #13808
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -2,27 +2,27 @@ | |
# | ||
# Test script for 'synapse_port_db'. | ||
# - configures synapse and a postgres server. | ||
# - runs the port script on a prepopulated test sqlite db | ||
# - also runs it against an new sqlite db | ||
# - runs the port script on a prepopulated test sqlite db. Checks that the | ||
# return code is zero. | ||
# - reruns the port script on the same sqlite db, targeting the same postgres db. | |
# Checks that the return code is zero. | ||
# - runs the port script against a new sqlite db. Checks the return code is zero. | ||
# | ||
# Expects Synapse to have been already installed with `poetry install --extras postgres`. | ||
# Expects `poetry` to be available on the `PATH`. | ||
|
||
set -xe | ||
set -xe -o pipefail | ||
cd "$(dirname "$0")/../.." | ||
|
||
echo "--- Generate the signing key" | ||
|
||
# Generate the server's signing key. | ||
poetry run synapse_homeserver --generate-keys -c .ci/sqlite-config.yaml | ||
|
||
echo "--- Prepare test database" | ||
|
||
# Make sure the SQLite3 database is using the latest schema and has no pending background update. | ||
# Make sure the SQLite3 database is using the latest schema and has no pending background updates. | ||
poetry run update_synapse_database --database-config .ci/sqlite-config.yaml --run-background-updates | ||
|
||
# Create the PostgreSQL database. | ||
poetry run .ci/scripts/postgres_exec.py "CREATE DATABASE synapse" | ||
psql -c "CREATE DATABASE synapse" | ||
|
||
echo "+++ Run synapse_port_db against test database" | ||
# TODO: this invocation of synapse_port_db (and others below) used to be prepended with `coverage run`, | ||
|
@@ -45,9 +45,23 @@ rm .ci/test_db.db | |
poetry run update_synapse_database --database-config .ci/sqlite-config.yaml --run-background-updates | ||
|
||
# re-create the PostgreSQL database. | ||
poetry run .ci/scripts/postgres_exec.py \ | ||
"DROP DATABASE synapse" \ | ||
"CREATE DATABASE synapse" | ||
psql \ | ||
-c "DROP DATABASE synapse" \ | ||
-c "CREATE DATABASE synapse" | ||
|
||
echo "+++ Run synapse_port_db against empty database" | ||
poetry run synapse_port_db --sqlite-database .ci/test_db.db --postgres-config .ci/postgres-config.yaml | ||
|
||
echo "--- Create a brand new postgres database from schema" | ||
cp .ci/postgres-config.yaml .ci/postgres-config-unported.yaml | ||
sed -i -e 's/database: synapse/database: synapse_unported/' .ci/postgres-config-unported.yaml | ||
psql -c "CREATE DATABASE synapse_unported" | ||
poetry run update_synapse_database --database-config .ci/postgres-config-unported.yaml --run-background-updates | ||
|
||
echo "+++ Comparing ported schema with unported schema" | ||
# Ignore the tables that portdb creates. (Should it tidy them up when the porting is completed?) | ||
psql synapse -c "DROP TABLE port_from_sqlite3;" | ||
pg_dump --format=plain --schema-only --no-tablespaces --no-acl --no-owner synapse_unported > unported.sql | ||
pg_dump --format=plain --schema-only --no-tablespaces --no-acl --no-owner synapse > ported.sql | ||
# By default, `diff` returns zero if there are no changes and nonzero otherwise | ||
diff -u unported.sql ported.sql | tee schema_diff | ||
Obviously this is a syntactic diff. Do we have reasonable confidence that this will be OK for semantically equivalent databases? (does

The schema is apparently dumped in a deterministic order, see https://www.postgresql.org/message-id/1303223266.24799.13.camel%40fsopti579.F-Secure.com. https://stackoverflow.com/a/2179376/5252017 claims it's dumped in a "fairly deterministic" order. No guarantees about the data. I suggest we leave this be for now: there are very few rows inserted into a brand new synapse database; IIRC they're just stream positions and schema metadata.

Oh, I only dump the schema in the PR anyway. I suppose we could also dump data and check they agree, but I'm tempted to leave this PR as is. WDYT?
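For readers wondering how the `diff ... | tee schema_diff` step can fail the job at all, here is a minimal sketch, assuming bash with `diff`, `tee` and `mktemp` on the `PATH`. The `compare` helper and the temporary files are hypothetical stand-ins for the two `pg_dump` outputs and are not part of this PR. With `set -o pipefail`, a pipeline reports the nonzero status of `diff` instead of the zero status of `tee`.

```shell
#!/usr/bin/env bash
# Sketch only: `compare` and the temp files are hypothetical stand-ins
# for the pg_dump outputs compared in the CI script.
set -o pipefail

workdir="$(mktemp -d)"
printf 'CREATE TABLE a (id int);\n' > "$workdir/unported.sql"
printf 'CREATE TABLE a (id int);\n' > "$workdir/same.sql"
printf 'CREATE TABLE a (id bigint);\n' > "$workdir/different.sql"

# Pipe diff through tee, as the CI script does. The function's exit status
# is the pipeline's status, which pipefail makes nonzero if diff is nonzero.
compare() {
  diff -u "$1" "$2" | tee "$workdir/schema_diff" > /dev/null
}

compare "$workdir/unported.sql" "$workdir/same.sql"
same_status=$?

compare "$workdir/unported.sql" "$workdir/different.sql"
different_status=$?

echo "identical schemas: exit $same_status"
echo "differing schemas: exit $different_status"
```

In the test script itself, `set -xe -o pipefail` is in effect, so a nonzero status from this pipeline aborts the job while `tee` still leaves the diff in `schema_diff` for inspection.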
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
Check that portdb generates the same postgres schema as that in the source tree. |
Hmm. What do you call 'completed', though? The script is meant to be re-run multiple times.
I think it's OK as it is. It may be useful to have a trace that the database is a ported one anyway.
My understanding: portdb is a one-time migration process that can be suspended (cancel the script) and resumed (rerun the script). Once it has finished porting across all rows, you're supposed to write only to the postgres DB and no longer to the sqlite DB.
More precisely: the portdb script assumes the following for each table to be ported.
I am slightly paranoid that this assumption is false. https://sqlite.org/rowidtable.html notes that rowids can change if you vacuum the database. If you have an `INTEGER PRIMARY KEY` then you can insert any rowid you like in any order you like. (Related: #13226.)
To be explicit: the above is mostly paranoia and extreme caution rather than anything concrete. I plan to leave the script and comment as it is. How does that sound?
My understanding was that you can re-run the portdb script many times online (it's an incremental process), but then you finalise it by running it once again with Synapse offline. Hence there isn't really a 'completion' condition, unless we make the admin specify some flag to say 'this is the last time I intend to run the script'.