Improve journal delete performance #538
  _useCloneDataConnection = config.UseCloneConnection;

- if (_opts.RetryPolicyOptions.RetryPolicy is null && _opts.ConnectionOptions.ProviderName!.ToLowerInvariant().StartsWith("sqlserver"))
+ if (_opts.RetryPolicyOptions.RetryPolicy is null && _opts.RetryPolicyOptions.Factory is null && _opts.ConnectionOptions.ProviderName!.ToLowerInvariant().StartsWith("sqlserver"))
Add the missing retry policy factory check. If the Factory is not null, Linq2Db uses it internally to populate DataConnection.RetryPolicy when DataOptions.RetryPolicyOptions.RetryPolicy is null.
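The guard described in this comment can be sketched in isolation. This is a minimal illustration, not the plugin's code: RetryOptions, IRetryPolicy, CustomPolicy and RetryGuard are hypothetical stand-ins that model only the two members mentioned above, and the factory signature is simplified (it takes no connection argument). The sketch also compares the provider name case-insensitively instead of lowercasing it first.

```csharp
#nullable enable
using System;

// No policy and no factory configured: the default may be applied.
var none = new RetryOptions(null, null);
// A user-supplied factory: the default must not override it.
var withFactory = new RetryOptions(null, () => new CustomPolicy());

Console.WriteLine(RetryGuard.ShouldApplyDefault(none, "SqlServer.2019"));        // True
Console.WriteLine(RetryGuard.ShouldApplyDefault(withFactory, "SqlServer.2019")); // False
Console.WriteLine(RetryGuard.ShouldApplyDefault(none, "PostgreSQL.15"));         // False

// Hypothetical stand-in types (assumptions, not the linq2db API).
public interface IRetryPolicy { }
public sealed class CustomPolicy : IRetryPolicy { }
public sealed record RetryOptions(IRetryPolicy? RetryPolicy, Func<IRetryPolicy?>? Factory);

public static class RetryGuard
{
    // True only when neither an explicit policy nor a factory is set
    // and the provider name starts with "sqlserver" (ignoring case).
    public static bool ShouldApplyDefault(RetryOptions options, string providerName)
        => options.RetryPolicy is null
           && options.Factory is null
           && providerName.StartsWith("sqlserver", StringComparison.OrdinalIgnoreCase);
}
```

Without the Factory check, the second case would also return true, and a default policy would silently replace the one the user's factory was meant to produce.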
  await using (var connection = ConnectionFactory.GetConnection())
  {
      maxMarkedDeletion = await MaxMarkedForDeletionMaxPersistenceIdQuery(connection, persistenceId, maxSequenceNr).FirstOrDefaultAsync(ShutdownToken);
  }
Retrieve the highest sequence number affected by the delete operation. No transaction is needed because this is strictly a read operation.
  if (maxMarkedDeletion is 0)
      return;
Early bail-out condition: return immediately if there is nothing to delete.
  r =>
      r.PersistenceId == persistenceId &&
-     r.SequenceNumber <= maxSequenceNr)
+     r.SequenceNumber == maxMarkedDeletion)
Instead of marking every record at or below the requested sequence number as deleted, we mark only the highest affected record. This updates 1 record instead of many.
  await journalTable
      .Where(
          r =>
              r.PersistenceId == persistenceId &&
              r.SequenceNumber < maxMarkedDeletion)
      .DeleteAsync(token);
Simplify the deletion query: delete every row for this persistence id whose sequence number is lower than that of the highest affected row.
  => connection
      .GetTable<JournalRow>()
-     .Where(r => r.PersistenceId == persistenceId && r.Deleted)
+     .Where(r => r.PersistenceId == persistenceId && r.SequenceNumber <= maxSequenceNr)
Optimization: instead of filtering on persistence id and the deleted field, filter on persistence id and sequence number, since both are indexed.
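Taken together, the review comments describe a four-step delete flow. The following is a minimal in-memory sketch of that flow using LINQ to Objects rather than Linq2Db, so it runs without a database. The JournalRow class here is a simplified stand-in, JournalSketch.DeleteTo is a hypothetical helper, and the descending sort before FirstOrDefault is an assumption about how the "max" query picks its row.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

var journal = new List<JournalRow>
{
    new JournalRow("actor-1", 1), new JournalRow("actor-1", 2),
    new JournalRow("actor-1", 3), new JournalRow("actor-1", 4),
    new JournalRow("actor-2", 1),
};

JournalSketch.DeleteTo(journal, "actor-1", maxSequenceNr: 3);

foreach (var row in journal)
    Console.WriteLine($"{row.PersistenceId} #{row.SequenceNumber} deleted={row.Deleted}");
// actor-1 #3 deleted=True
// actor-1 #4 deleted=False
// actor-2 #1 deleted=False

// Simplified stand-in for the journal table row (assumption).
public sealed class JournalRow
{
    public JournalRow(string persistenceId, long sequenceNumber)
    {
        PersistenceId = persistenceId;
        SequenceNumber = sequenceNumber;
    }

    public string PersistenceId { get; }
    public long SequenceNumber { get; }
    public bool Deleted { get; set; }
}

public static class JournalSketch
{
    public static void DeleteTo(List<JournalRow> journal, string persistenceId, long maxSequenceNr)
    {
        // Step 1: highest stored sequence number at or below the requested bound.
        // FirstOrDefault yields 0 when no row matches.
        long maxMarkedDeletion = journal
            .Where(r => r.PersistenceId == persistenceId && r.SequenceNumber <= maxSequenceNr)
            .OrderByDescending(r => r.SequenceNumber)
            .Select(r => r.SequenceNumber)
            .FirstOrDefault();

        // Step 2: nothing to delete, bail out early.
        if (maxMarkedDeletion is 0)
            return;

        // Step 3: mark only the highest affected row as deleted.
        foreach (var row in journal.Where(r =>
                     r.PersistenceId == persistenceId && r.SequenceNumber == maxMarkedDeletion))
            row.Deleted = true;

        // Step 4: physically remove everything strictly below that row.
        journal.RemoveAll(r =>
            r.PersistenceId == persistenceId && r.SequenceNumber < maxMarkedDeletion);
    }
}
```

In the real journal, steps 3 and 4 run against the database inside a transaction; the sketch only mirrors which rows each step touches.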
# Conflicts: # src/Akka.Persistence.Sql/Journal/Dao/BaseByteArrayJournalDao.cs
Note
These changes are safe because we're using transactions. The queries in the old JVM implementation were designed that way because they had to account for possible failures of non-transactional deletes.
We need the transactional guarantee because we have to write to multiple tables to support backward compatibility and tag table indexing.
Changes
Optimize event journal delete code