Pass allowFailure flag to statistics access#29006
Pass allowFailure flag to statistics access#29006chenjian2664 wants to merge 2 commits intotrinodb:masterfrom
allowFailure flag to statistics access#29006Conversation
036cb3b to
4a9095d
Compare
4a9095d to
6045d9b
Compare
We can fix concurrent access/modification quite easily. How will we fix concurrent writes?
This is not great
|
What's would that be like? there will be still a conflict when we are (concurrent) baking up, or still a chance reading an updating backup. but If you've see an easy way I can have a try on it
We don't fix the concurrent writes here, like previously we are passing the Could you clarify what kind of "concurrent writes" you have in mind -- are we talking about within a single cluster or across multiple clusters? But it wouldn't matter if it is just a optimization we could just ignore it.
I might be missing something, but why do we have an |
findinpath
left a comment
There was a problem hiding this comment.
This contribution is in line with #18506
While the approach of "ignoring failures" is certainly not ideal, pragmatically this contribution serves at least in avoiding test flakiness.
Let's use the opportunity and come up with a rough action plan on how to deal with Delta Lake table extended statistics before we go forward and merge and forget about the underlying issue (the fact that multiple readers & writers are trying to read & write from the very same extended stats file sometimes concurrently).
Metadata statistics are treated as a best-effort optimization and should not fail query execution. This change does not fully address concurrency issues in metadata statistics access, but allows the primary execution path to proceed when statistics are non-critical.
6045d9b to
feb6973
Compare
|
temporary solution #29170 |
Description
Concurrent access to metadata statistics may lead to errors:
as the current implementation does not provide concurrency guarantees.
Since statistics access is typically non-essential to query execution, this PR
introducespass a flag to define the failure semantics, enabling best-effort handling where metadata statistics access failures do not interrupt the main execution path.Fixes #21725
Fixes #22455
Additional context and related issues
INSERTin Delta Lake #18506 the flag was introducedRelease notes
(x) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
( ) Release notes are required, with the following suggested text: