Skip to content

Fix GlueHiveMetastore#batchGetPartition to handle unprocessedKeys#10696

Merged
findepi merged 1 commit intotrinodb:masterfrom
cacts:handle-glue-unprocessed-keys
Jan 27, 2022
Merged

Fix GlueHiveMetastore#batchGetPartition to handle unprocessedKeys#10696
findepi merged 1 commit intotrinodb:masterfrom
cacts:handle-glue-unprocessed-keys

Conversation

@cacts
Copy link
Copy Markdown
Member

@cacts cacts commented Jan 19, 2022

Cherry-picked and updated for presto -> trino changes from original PR #6155. @rentaow please let me know if you'd like to update yours instead and I'll close this :)

BatchGetPartitionsRequest could potentially not return all requested partition values. These unprocessed keys are returned in a separate field and can be retrieved successfully on subsequent requests. If these keys are ignored, this results in a HIVE_PARTITION_DROPPED_DURING_QUERY exception.

@cla-bot
Copy link
Copy Markdown

cla-bot bot commented Jan 19, 2022

Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: Rentao Wu.
This is most likely caused by a git client misconfiguration; please make sure to:

  1. check if your git client is configured with an email to sign commits git config --list | grep email
  2. If not, set it up using git config --global user.email email@example.com
  3. Make sure that the git commit email is configured in your GitHub account settings, see https://github.com/settings/emails

@cacts cacts force-pushed the handle-glue-unprocessed-keys branch from 5279525 to aab1128 Compare January 19, 2022 16:10
@cla-bot
Copy link
Copy Markdown

cla-bot bot commented Jan 19, 2022

Thank you for your pull request and welcome to our community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file. In order for us to review and merge your code, please submit the signed CLA to cla@trino.io. For more information, see https://github.com/trinodb/cla.

@findepi
Copy link
Copy Markdown
Member

findepi commented Jan 19, 2022

@cactauz thank you for your PR. Please ping me once you have the CLA signed.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would it make sense here to log the fact that Glue API returned UnprocessedKeys ?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, to log on debug.

BatchGetPartitionsRequest could potentially not return all requested
partition values. If Glue returns incomplete partitions, the query
will error out. This change is to handle unprocessedKeys from
BatchGetParititonsResults.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, to log on debug.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

partitionValueLists -> pendingPartitions

Comment on lines 933 to 934
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

batchGetPartitionResult.getPartitions().isEmpty() implies that !batchGetPartitionResult.getUnprocessedKeys().isEmpty(), or Glue service is cheating (or we have wrong assumptions how it works)

@findepi findepi force-pushed the handle-glue-unprocessed-keys branch from aab1128 to cf160e9 Compare January 20, 2022 08:48
@cla-bot
Copy link
Copy Markdown

cla-bot bot commented Jan 20, 2022

Thank you for your pull request and welcome to our community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file. In order for us to review and merge your code, please submit the signed CLA to cla@trino.io. For more information, see https://github.com/trinodb/cla.

@martint
Copy link
Copy Markdown
Member

martint commented Jan 24, 2022

@cla-bot check

@cla-bot cla-bot bot added the cla-signed label Jan 24, 2022
@cla-bot
Copy link
Copy Markdown

cla-bot bot commented Jan 24, 2022

The cla-bot has been summoned, and re-checked this pull request!

// In the unlikely scenario where batchGetPartition call cannot make progress on retrieving partitions, avoid infinite loop
if (partitions.isEmpty()) {
verify(!unprocessedKeys.isEmpty(), "Empty unprocessedKeys for non-empty BatchGetPartitionRequest and empty partitions result");
throw new TrinoException(HIVE_METASTORE_ERROR, "Cannot make progress retrieving partitions. Unable to retrieve partitions: " + unprocessedKeys);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like Glue doesn't guarantee there will be at least one partition processed.
See #12516

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

4 participants