Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Address problems in concurrent sqlite access #10706

Merged
merged 8 commits into from
Mar 15, 2022

Conversation

espadolini
Copy link
Contributor

@espadolini espadolini commented Mar 1, 2022

This PR addresses an issue related to our use of sqlite during restarts, both in-process (typically as a result of CA rotation) and across processes (as a result of a user-initiated graceful restart, with SIGHUP or SIGUSR2/SIGQUIT): the sqlite backend can't handle multiple processes working over the same database, as the "busy" error that's returned when a transaction must be retried due to concurrent access is treated as a fatal database error.

The issue is fixed by changing a go-sqlite3 parameter that changes how locks are acquired in transactions; by starting every transaction with BEGIN IMMEDIATE, the write lock is acquired unconditionally at the very beginning of the transaction, so there's never any need for a forced rollback and the busy timeout can handle concurrency while multiple processes are handling the same database (which only happens during restarts).

In addition, this PR cleans up some of the sqlite backend code (including the removal of in-memory sqlite, which has significantly worse performance than our radix tree in-memory backend), adds a way for users to specify the journal mode for sqlite auth backends, and sets the sync mode for process storage to FULL, as process storage is quite important and seldom written to (a corrupted process storage db requires wiping the db and rejoining the cluster with a token).

No changes are made to the default sync mode of OFF and to the default journal mode (not specified by the code, and thus defaulting to the deferred DELETE mode) which applies both to test code that spawns sqlite backends and, more importantly, to sqlite auth backends, because of performance concerns; we ought to document that "light" production use with the sqlite backend should specify a stronger filesystem sync mode, however.

Partially fixes the sqlite issues in #10332, #9815, #5083 and most other occurrences of database is closed, however our use of the same cache backend storage across restarts is still problematic, and will be addressed in #11022.

@espadolini espadolini force-pushed the espadolini/sqlite-corruption branch from 82c606b to 0644267 Compare March 1, 2022 20:46
@programmerq
Copy link
Contributor

@espadolini is it possible that this will also address #5083?

@espadolini
Copy link
Contributor Author

Hopefully yes; if the underlying cause is the same (and I think it is), it should also get fixed by this. There's still some other problems that we'll address in this PR related specifically to caches rather than just the backend, which could also be contributing to the issue.

@espadolini
Copy link
Contributor Author

espadolini commented Mar 2, 2022

After just 0644267 I still got a failure after about 300 runs of TestIntegrations/RotateTrustedCluster (massive improvements over the spurious failures after 5 or so runs); with 7e74595 I'm at over 2000 successful runs on arm64 macos

edit: 1200 successful runs in a row on arm64 linux

@espadolini espadolini force-pushed the espadolini/sqlite-corruption branch 6 times, most recently from ed89316 to 582627e Compare March 7, 2022 13:19
@espadolini espadolini marked this pull request as ready for review March 7, 2022 15:14
@github-actions github-actions bot requested review from xacrimon and zmb3 March 7, 2022 15:14
@espadolini espadolini requested review from fspmarshall, xacrimon, zmb3 and rosstimothy and removed request for zmb3 and xacrimon March 7, 2022 15:15
@xacrimon
Copy link
Contributor

xacrimon commented Mar 7, 2022

Otherwise LGTM

xacrimon
xacrimon previously approved these changes Mar 7, 2022
@espadolini
Copy link
Contributor Author

espadolini commented Mar 7, 2022

edit: moved questions in the PR description

@espadolini espadolini force-pushed the espadolini/sqlite-corruption branch from 474ee2e to 3d387b3 Compare March 7, 2022 17:33
@espadolini
Copy link
Contributor Author

espadolini commented Mar 7, 2022

Cache directory cleanup while a sqlite database is still open can cause errors during (*TeleportProcess).Close(), as shown by repeatedly testing RotateTrustedCluster in on-disk cache mode; it might be possible to just try to delete the directory a few times while RemoveAll reports unlinkat datadir/cache-XXXX/component: directory not empty.

edit: mitigated by 4243228 and 6bb6398

@espadolini espadolini force-pushed the espadolini/sqlite-corruption branch 3 times, most recently from 38500c3 to a66f0a3 Compare March 8, 2022 16:32
@espadolini espadolini force-pushed the espadolini/sqlite-corruption branch from 1ba14ea to 409cf65 Compare March 9, 2022 16:16
@espadolini espadolini enabled auto-merge (squash) March 15, 2022 15:05
@espadolini espadolini merged commit d83886e into master Mar 15, 2022
@espadolini espadolini deleted the espadolini/sqlite-corruption branch March 15, 2022 16:54
espadolini added a commit that referenced this pull request Mar 16, 2022
* Use BEGIN IMMEDIATE to start transactions

This makes it so all transactions grab a write lock
rather than a read lock that can be upgraded in case of
a write; in case of multiple writers (which, in our
case, can only happen during a restart as the new
process reopens the same sqlite database) this will
prevent two transactions from attempting to upgrade
their lock, which would cause a SQLITE_BUSY error in
one of them. In regular operation this shouldn't cause
a performance hit, as we're using a single connection
to the sqlite database (guarded by locks in the go side)
anyway.

* Escape path in sqlite connection URL

This makes it so that the sqlite backend supports paths with ? in them.

* Close process storage on TeleportProcess shutdown

This aligns the behavior of Shutdown with that of Close.

* Allow specifying the journal mode in sqlite

This will let sqlite backend users specify WAL mode in their config
file, and will allow us to specify alternate journal modes for our
on-disk caches in the future.

This also removes sqlite memory mode, as it's not used anywhere because
of its poor query performance compared to our in-memory backend, and
cleans up a bit of old cruft, and runs process storage in FULL sync
mode - it's very seldom written to and holds important data.
espadolini added a commit that referenced this pull request Mar 16, 2022
* Use BEGIN IMMEDIATE to start transactions

This makes it so all transactions grab a write lock
rather than a read lock that can be upgraded in case of
a write; in case of multiple writers (which, in our
case, can only happen during a restart as the new
process reopens the same sqlite database) this will
prevent two transactions from attempting to upgrade
their lock, which would cause a SQLITE_BUSY error in
one of them. In regular operation this shouldn't cause
a performance hit, as we're using a single connection
to the sqlite database (guarded by locks in the go side)
anyway.

* Escape path in sqlite connection URL

This makes it so that the sqlite backend supports paths with ? in them.

* Close process storage on TeleportProcess shutdown

This aligns the behavior of Shutdown with that of Close.

* Allow specifying the journal mode in sqlite

This will let sqlite backend users specify WAL mode in their config
file, and will allow us to specify alternate journal modes for our
on-disk caches in the future.

This also removes sqlite memory mode, as it's not used anywhere because
of its poor query performance compared to our in-memory backend, and
cleans up a bit of old cruft, and runs process storage in FULL sync
mode - it's very seldom written to and holds important data.
espadolini added a commit that referenced this pull request Mar 16, 2022
* Use BEGIN IMMEDIATE to start transactions

This makes it so all transactions grab a write lock
rather than a read lock that can be upgraded in case of
a write; in case of multiple writers (which, in our
case, can only happen during a restart as the new
process reopens the same sqlite database) this will
prevent two transactions from attempting to upgrade
their lock, which would cause a SQLITE_BUSY error in
one of them. In regular operation this shouldn't cause
a performance hit, as we're using a single connection
to the sqlite database (guarded by locks in the go side)
anyway.

* Escape path in sqlite connection URL

This makes it so that the sqlite backend supports paths with ? in them.

* Close process storage on TeleportProcess shutdown

This aligns the behavior of Shutdown with that of Close.

* Allow specifying the journal mode in sqlite

This will let sqlite backend users specify WAL mode in their config
file, and will allow us to specify alternate journal modes for our
on-disk caches in the future.

This also removes sqlite memory mode, as it's not used anywhere because
of its poor query performance compared to our in-memory backend, and
cleans up a bit of old cruft, and runs process storage in FULL sync
mode - it's very seldom written to and holds important data.
espadolini added a commit that referenced this pull request Mar 17, 2022
* Use BEGIN IMMEDIATE to start transactions

This makes it so all transactions grab a write lock
rather than a read lock that can be upgraded in case of
a write; in case of multiple writers (which, in our
case, can only happen during a restart as the new
process reopens the same sqlite database) this will
prevent two transactions from attempting to upgrade
their lock, which would cause a SQLITE_BUSY error in
one of them. In regular operation this shouldn't cause
a performance hit, as we're using a single connection
to the sqlite database (guarded by locks in the go side)
anyway.

* Escape path in sqlite connection URL

This makes it so that the sqlite backend supports paths with ? in them.

* Close process storage on TeleportProcess shutdown

This aligns the behavior of Shutdown with that of Close.

* Allow specifying the journal mode in sqlite

This will let sqlite backend users specify WAL mode in their config
file, and will allow us to specify alternate journal modes for our
on-disk caches in the future.

This also removes sqlite memory mode, as it's not used anywhere because
of its poor query performance compared to our in-memory backend, and
cleans up a bit of old cruft, and runs process storage in FULL sync
mode - it's very seldom written to and holds important data.
espadolini added a commit that referenced this pull request Mar 17, 2022
* Use BEGIN IMMEDIATE to start transactions

This makes it so all transactions grab a write lock
rather than a read lock that can be upgraded in case of
a write; in case of multiple writers (which, in our
case, can only happen during a restart as the new
process reopens the same sqlite database) this will
prevent two transactions from attempting to upgrade
their lock, which would cause a SQLITE_BUSY error in
one of them. In regular operation this shouldn't cause
a performance hit, as we're using a single connection
to the sqlite database (guarded by locks in the go side)
anyway.

* Escape path in sqlite connection URL

This makes it so that the sqlite backend supports paths with ? in them.

* Close process storage on TeleportProcess shutdown

This aligns the behavior of Shutdown with that of Close.

* Allow specifying the journal mode in sqlite

This will let sqlite backend users specify WAL mode in their config
file, and will allow us to specify alternate journal modes for our
on-disk caches in the future.

This also removes sqlite memory mode, as it's not used anywhere because
of its poor query performance compared to our in-memory backend, and
cleans up a bit of old cruft, and runs process storage in FULL sync
mode - it's very seldom written to and holds important data.
espadolini added a commit that referenced this pull request Mar 17, 2022
* Use BEGIN IMMEDIATE to start transactions

This makes it so all transactions grab a write lock
rather than a read lock that can be upgraded in case of
a write; in case of multiple writers (which, in our
case, can only happen during a restart as the new
process reopens the same sqlite database) this will
prevent two transactions from attempting to upgrade
their lock, which would cause a SQLITE_BUSY error in
one of them. In regular operation this shouldn't cause
a performance hit, as we're using a single connection
to the sqlite database (guarded by locks in the go side)
anyway.

* Escape path in sqlite connection URL

This makes it so that the sqlite backend supports paths with ? in them.

* Close process storage on TeleportProcess shutdown

This aligns the behavior of Shutdown with that of Close.

* Allow specifying the journal mode in sqlite

This will let sqlite backend users specify WAL mode in their config
file, and will allow us to specify alternate journal modes for our
on-disk caches in the future.

This also removes sqlite memory mode, as it's not used anywhere because
of its poor query performance compared to our in-memory backend, and
cleans up a bit of old cruft, and runs process storage in FULL sync
mode - it's very seldom written to and holds important data.
@webvictim webvictim mentioned this pull request Apr 19, 2022
@webvictim webvictim mentioned this pull request Jun 8, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
robustness Resistance to crashes and reliability
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants