Skip to content

Fix FaultTolerantPartitioningScheme obtaining logic#19979

Merged
losipiuk merged 4 commits intotrinodb:masterfrom
losipiuk:lo/fix-fte-part-scheme
Dec 1, 2023
Merged

Fix FaultTolerantPartitioningScheme obtaining logic#19979
losipiuk merged 4 commits intotrinodb:masterfrom
losipiuk:lo/fix-fte-part-scheme

Conversation

@losipiuk
Copy link
Member

@losipiuk losipiuk commented Nov 30, 2023

With caching logic used when FaultTolerantPartitioningScheme was obtained it was possible to get FaultTolerantPartitioningScheme with wrongly set partition count. In cache keyed on PartitioningHandle we could store a FaultTolerantPartitioningScheme object obtained for specific partitionCount value. Later on when we asked for same handle but without specifying partitionCount we still got cached FaultTolerantPartitioningScheme.

This commit changes cache key to inclue partitionCount.

Description

Additional context and related issues

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
(x) Release notes are required, with the following suggested text:

# Section
* Fix possible query failure for MERGE queries when `retry-policy` set to `TASK` 
  and `query.determine-partition-count-for-write-enabled` set to `true`.  ({issue}`19979`)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

withPartitionCount became unused, remove

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

before the change, we guaranteed that "if value is taken from the cache, the partitionCount is made to be exactly as requested".
i don't know why we didn't guarantee that for new values.

maybe we could add this

if (partitionCount.isPresent()) {
  verify(result.getPartitionCount() == partitionCount.get(), "...");
}

however, create doesn't seem to guarantee this property -- return new FaultTolerantPartitioningScheme(1, ....

Thus this PR (1) fixes caching logic
and introduces a pretty subtle side-effect (maybe correct). If this side effect is intentional, it would be good to call it out (perhaps as separate commit)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am pretty sure this method should not be called with non-empty partition count for case when create would return scheme with partitionCount == 1.
verify seems like a good change.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move below create, maybe bottom of the class

With caching logic used when FaultTolerantPartitioningScheme was
obtained it was possible to get FaultTolerantPartitioningScheme with
wrongly set partition count. In cache keyed on PartitioningHandle we
could store a FaultTolerantPartitioningScheme object obtained for
specific partitionCount value. Later on when we asked for same handle
but without specifying partitionCount we still got cached
FaultTolerantPartitioningScheme.

This commit changes cache key to inclue partitionCount.
@losipiuk losipiuk force-pushed the lo/fix-fte-part-scheme branch from 2d382d1 to 6eefc88 Compare December 1, 2023 12:15
@losipiuk losipiuk requested a review from findepi December 1, 2023 12:15
@github-actions github-actions bot added the iceberg Iceberg connector label Dec 1, 2023
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a data provider would produce much more readable failure messages
otherwise when inspecting CI error message you need to hope the line numbers didn't change between CI and your local copy

out of scope for this pr

The FTE configuration used for tests was missing
fault-tolerant-execution-min-partition-count-for-write, the default
value (50) was not compatible with
fault-tolerant-execution-max-partition-count which was explicitly set to
5. Adding fault-tolerant-execution-min-partition-count-for-write set to
2 and also fault-tolerant-execution-min-partition-count for
completeness
@losipiuk losipiuk force-pushed the lo/fix-fte-part-scheme branch from 6eefc88 to 6aa13dc Compare December 1, 2023 13:23
@losipiuk losipiuk merged commit d8ddf9d into trinodb:master Dec 1, 2023
@github-actions github-actions bot added this to the 435 milestone Dec 1, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cla-signed iceberg Iceberg connector

Development

Successfully merging this pull request may close these issues.

3 participants