Kine skips compact intervals if compaction fails to complete #357

Closed
brandond opened this issue Nov 8, 2024 · 1 comment
@brandond
Member

brandond commented Nov 8, 2024

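If the compaction loop fails for any reason partway through an interval, the rows handled by the earlier successful iterations have already been compacted and the compact-rev key in the database has been updated, but the expected compact-rev value stored in memory is not updated. On the next interval kine sees the mismatch, concludes that some other node has compacted, and skips that interval.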

This has been reported several times:

The first instance was in an odd multi-master Galera cluster, but the second was on plain old sqlite.

This is because if any compaction fails, we restart the outer loop:

logrus.Errorf("Compact failed: %v", err)
metrics.CompactTotal.WithLabelValues(metrics.ResultError).Inc()
continue outer

without recording any of the work done by prior successful iterations of the inner loop:

// Record the final results for the outer loop
compactRev = compactedRev
targetCompactRev = currentRev

We should fix that, but we should also figure out how to better handle locking errors when trying to compact.
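
One way to address the first half, sketched as a simplified model rather than kine's actual code: write the progress from completed batches back to the in-memory state before bailing out. The names `tracker`, `compactFunc`, `compactTo` and `errLocked` are made up for illustration, and leaving the target revision untouched on failure is an assumption, not something decided in this issue.

```go
package main

import (
	"errors"
	"fmt"
)

// errLocked simulates a locking error returned by the database (hypothetical).
var errLocked = errors.New("database is locked")

// compactFunc is a hypothetical stand-in for the backend compact call. It
// compacts the range (start, end] and returns the revision actually reached.
type compactFunc func(start, end int64) (int64, error)

// tracker holds the in-memory copy of the compact revision (hypothetical).
type tracker struct {
	compactRev int64
}

// compactTo compacts up to target in batches. Unlike the loop described
// above, it records the progress of earlier successful batches even when a
// later batch fails, so the in-memory value keeps matching the database.
func (t *tracker) compactTo(target, batch int64, compact compactFunc) error {
	compactedRev := t.compactRev
	for compactedRev < target {
		end := compactedRev + batch
		if end > target {
			end = target
		}
		rev, err := compact(compactedRev, end)
		if err != nil {
			// Record what was already done before giving up on this interval.
			t.compactRev = compactedRev
			return err
		}
		compactedRev = rev
	}
	// Record the final result for the next interval.
	t.compactRev = compactedRev
	return nil
}

func main() {
	var dbRev int64 // simulated compact-rev key stored in the database
	calls := 0
	flaky := func(start, end int64) (int64, error) {
		calls++
		if calls == 3 {
			return 0, errLocked // third batch fails
		}
		dbRev = end
		return end, nil
	}

	t := &tracker{}
	err := t.compactTo(5000, 1000, flaky)
	fmt.Println(err, t.compactRev, dbRev)
}
```

In this model the third batch fails after two have succeeded, and the program prints `database is locked 2000 2000`: the in-memory revision and the simulated database key agree, so the next interval would not be mistaken for another node's compaction.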

For sqlite at least, this may be related to go-sqlite3's BeginTx ignoring TxOptions:

This is BAD, as the default behavior of sqlite transactions is to... not actually start a transaction:
https://sqlite.org/forum/info/c3cb9524bef62b67#forum11484

A bare BEGIN (as in BEGIN DEFERRED) does not start a transaction. It turns off the auto-commit machinery so that the transaction commenced by the next statement is not automatically committed at the end of the execution of that statement. If that statement is a "read" statement, then the transaction is a read transaction. If that statement is a "write" statement, then the transaction is a write transaction. BEGIN IMMEDIATE and BEGIN EXCLUSIVE both turn off the auto-commit machinery and start a transaction (write or exclusive respectively)

@brandond
Member Author

brandond commented Nov 8, 2024

For sqlite3, I guess all we can do is change the global transaction lock type by setting _txlock=immediate in the DSN parameters:
https://pkg.go.dev/github.com/mattn/go-sqlite3#readme-connection-string
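
A rough sketch of what that could look like, using only the standard library to add the parameter to an existing DSN. The helper name `withTxLock` and the example DSN are assumptions for illustration; `_txlock` and its `deferred`, `immediate` and `exclusive` values come from the go-sqlite3 connection string docs linked above.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// withTxLock returns dsn with the go-sqlite3 _txlock parameter set to mode,
// replacing any existing value. Hypothetical helper, not part of kine.
func withTxLock(dsn, mode string) (string, error) {
	switch mode {
	case "deferred", "immediate", "exclusive":
	default:
		return "", fmt.Errorf("unsupported _txlock mode %q", mode)
	}

	// Split the file path from the query parameters, if any.
	path, rawQuery, _ := strings.Cut(dsn, "?")
	params, err := url.ParseQuery(rawQuery)
	if err != nil {
		return "", err
	}
	params.Set("_txlock", mode)

	// Encode sorts parameters by key, so the original order is not preserved.
	return path + "?" + params.Encode(), nil
}

func main() {
	// Example DSN only; the real one depends on how kine is configured.
	dsn, err := withTxLock("./db/state.db?_journal=WAL&cache=shared", "immediate")
	if err != nil {
		panic(err)
	}
	fmt.Println(dsn)
}
```

Note that this changes the lock type for every transaction opened through that DSN, not just the ones used for compaction.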
