Use safe cache for Delta Lake Transaction Log #11562
63faa0a to a4eac5d
There was a problem hiding this comment.
By the time we get to this point, the snapshot could already exist in the tableSnapshots cache.
A possible alternative would be to safely detect whether the snapshot is already cached and act accordingly.
AtomicBoolean isTableSnapshotAlreadyCached = new AtomicBoolean(true);
TableSnapshot cachedTableSnapshot = tableSnapshots.get(location, () ->
{
    isTableSnapshotAlreadyCached.set(false);
    return TableSnapshot.load(
            table,
            fileSystem,
            tableLocation,
            parquetReaderOptions,
            checkpointRowStatisticsWritingEnabled);
});
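Editor's sketch of the single-lookup idea in the snippet above, reduced to JDK classes so it runs on its own. This is not the PR's code: ConcurrentHashMap.computeIfAbsent stands in for the tableSnapshots cache, and SingleLookupSketch, Lookup, getSnapshot and the placeholder string are hypothetical names used only for illustration.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicBoolean;

final class SingleLookupSketch
{
    // Assumption: a plain concurrent map stands in for the tableSnapshots cache.
    private static final ConcurrentMap<String, String> SNAPSHOTS = new ConcurrentHashMap<>();

    private SingleLookupSketch() {}

    // Hypothetical holder: the value plus whether it was already cached.
    static final class Lookup
    {
        final String snapshot;
        final boolean alreadyCached;

        Lookup(String snapshot, boolean alreadyCached)
        {
            this.snapshot = snapshot;
            this.alreadyCached = alreadyCached;
        }
    }

    static Lookup getSnapshot(String location)
    {
        // Starts as true; the loader flips it only if it actually runs.
        AtomicBoolean alreadyCached = new AtomicBoolean(true);
        String snapshot = SNAPSHOTS.computeIfAbsent(location, key -> {
            alreadyCached.set(false);
            return "snapshot-of-" + key; // placeholder for TableSnapshot.load(...)
        });
        return new Lookup(snapshot, alreadyCached.get());
    }
}
```

The cache is consulted exactly once per call, and the flag tells the caller whether the value was loaded or reused, so no separate "is it present?" check is needed before the lookup.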
There was a problem hiding this comment.
By the time we get to this point, the snapshot could already exist in the tableSnapshots cache.
And if that is the case, we will get it from there instead of loading it.
There was a problem hiding this comment.
AtomicBoolean isTableSnapshotAlreadyCached = new AtomicBoolean(false);
I assume you made a mistake and the initial value should be true, right?
And the point of this part is to change the implementation so that we call cache.get only once? Do I understand correctly?
There was a problem hiding this comment.
I assume you made a mistake and the initial value should be true, right?
Yes, I've corrected the snippet.
we call cache.get only once
Yes, that's my point.
There was a problem hiding this comment.
I can do that, but I am not sure it will be cleaner than the current one. I find them both similarly ugly to follow ;)
@findepi WDYT?
There was a problem hiding this comment.
will take another look once #11562 (review) gets applied
There was a problem hiding this comment.
parquetReaderOptions and checkpointRowStatisticsWritingEnabled are used only for TableSnapshot.load within the TransactionLogAccess class.
I recommend extracting the logic of the static method into a class that can be injected into TransactionLogAccess. This would give us the opportunity to thoroughly test the fix that you are applying in this PR.
There was a problem hiding this comment.
Sorry I don't understand this comment. Which static method are you talking about?
There was a problem hiding this comment.
io.trino.plugin.deltalake.transactionlog.TableSnapshot#load
There was a problem hiding this comment.
Do you mean io.trino.plugin.deltalake.transactionlog.TableSnapshot.loadSnapshot? It is not static, which is why I am confused.
Also, if you extract this to another class, what about io.trino.plugin.deltalake.transactionlog.TableSnapshot.getActiveFiles? Would you also try to extract it?
There was a problem hiding this comment.
While reading the PR, I noticed that these settings look a bit out of place.
Obviously, extracting this logic is not mandatory for testing the race conditions.
Ideally, this PR should provide appropriate tests to ensure that the class deals correctly with race conditions.
There was a problem hiding this comment.
I agree that this kind of test would be good, but I am not sure how to implement it. Running some simple operations hundreds of times and waiting for failure/success seems like a flaky test to me.
I could maybe simulate something with heavy mocking, but my impression is that this kind of test is not really used here.
There was a problem hiding this comment.
I could maybe simulate something with heavy mocking
This is what I had in mind as well.
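Editor's sketch of one way such a test could be made deterministic rather than repeated hundreds of times: release all callers with a latch and count how often the loader runs. It uses only JDK classes; ConcurrentHashMap.computeIfAbsent is an assumed stand-in for the cache under test, and ConcurrentLoadSketch and countLoads are hypothetical names, not part of the PR.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

final class ConcurrentLoadSketch
{
    private ConcurrentLoadSketch() {}

    // Returns how many times the loader ran when `threads` callers request the same key.
    static int countLoads(int threads)
    {
        // Assumption: a concurrent map stands in for the snapshot cache.
        ConcurrentMap<String, String> cache = new ConcurrentHashMap<>();
        AtomicInteger loads = new AtomicInteger();
        CountDownLatch start = new CountDownLatch(1);
        ExecutorService executor = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> results = new ArrayList<>();
            for (int i = 0; i < threads; i++) {
                results.add(executor.submit(() -> {
                    start.await(); // block until the latch is released
                    return cache.computeIfAbsent("table", key -> {
                        loads.incrementAndGet(); // counts real loads only
                        return "snapshot";
                    });
                }));
            }
            start.countDown(); // release all submitted callers
            for (Future<String> result : results) {
                result.get(); // surface any worker failure
            }
            return loads.get();
        }
        catch (Exception e) {
            throw new RuntimeException(e);
        }
        finally {
            executor.shutdownNow();
        }
    }
}
```

The assertion is on the number of loads, not on timing, so the outcome does not depend on how the threads happen to interleave. A mocked loader in the real test would play the role of the counter here.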
There was a problem hiding this comment.
Off-topic: I was trying to understand where the method getActiveFiles is used and noticed that the tableSnapshot parameter is always produced by a previous call to TransactionLogAccess.
transactionLogAccess.getMetadataEntry also depends on a tableSnapshot previously obtained from TransactionLogAccess.
It would be worth considering exposing a method which retrieves both the active metadata and files:
// pseudocode
MetadataAndAddFileEntryList getActiveMetadataAndAddFileEntryList(ConnectorSession session)
There was a problem hiding this comment.
I think it makes sense, but maybe not in the scope of this PR. Please create a separate ticket for that.
a4eac5d to 2376fd4
There was a problem hiding this comment.
I think it would be better to unwrap the ExecutionException.
(also, watch out for UncheckedExecutionException and handle the same way)
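Editor's sketch of what unwrapping could look like, using JDK classes only. getFromCache is a hypothetical stand-in for a cache lookup that wraps loader failures in ExecutionException, and UnwrapSketch and getUnwrapped are illustrative names. Guava's UncheckedExecutionException is not on the JDK classpath, so it is only mentioned in a comment; the assumption is that it would be caught and unwrapped in the same way.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;

final class UnwrapSketch
{
    private UnwrapSketch() {}

    // Hypothetical stand-in for a cache lookup that wraps loader failures.
    static <T> T getFromCache(Callable<T> loader)
            throws ExecutionException
    {
        try {
            return loader.call();
        }
        catch (Exception e) {
            throw new ExecutionException(e);
        }
    }

    // Rethrows the original failure instead of the wrapper.
    static <T> T getUnwrapped(Callable<T> loader)
    {
        try {
            return getFromCache(loader);
        }
        // Assumption: with Guava on the classpath, UncheckedExecutionException
        // would be caught here too and handled identically.
        catch (ExecutionException e) {
            Throwable cause = e.getCause();
            if (cause instanceof RuntimeException) {
                throw (RuntimeException) cause;
            }
            if (cause instanceof Error) {
                throw (Error) cause;
            }
            // Checked causes are wrapped once so callers need no throws clause.
            throw new RuntimeException(cause);
        }
    }
}
```

Callers then see the loader's own exception type, for example an IllegalStateException, rather than having to dig through getCause().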
There was a problem hiding this comment.
in tests this should be throws Exception
There was a problem hiding this comment.
(in this PR, you can just undo these changes, see the other comment)
cd6161b to d6909d0
@findepi comments addressed
There was a problem hiding this comment.
Restoring this is not that simple. I would need to check whether the data was loaded from the cache or from the file system. I would need to add an atomic boolean, etc. (in a similar fashion to what Marius suggested). Do you think it makes sense to add all this just to log a simple warning?
There was a problem hiding this comment.
I think it belongs in the
if (cachedTable.getVersion() > tableSnapshot.getVersion()) {
    return loadActiveFiles(tableSnapshot, session, fileSystem);
block? Please double check me on this
d6909d0 to a2aed4a
Description
Related issues, pull requests, and links
Documentation
(x) No documentation is needed.
( ) Sufficient documentation is included in this PR.
( ) Documentation PR is available with #prnumber.
( ) Documentation issue #issuenumber is filed, and can be handled later.
Release notes
(x) No release notes entries required.
( ) Release notes entries required with the following suggested text: