ImmutableMap.Builder<String, Optional<Partition>> resultBuilder = ImmutableMap.builder();
for (Entry<String, List<String>> entry : partitionNameToPartitionValuesMap.entrySet()) {
    Partition partition = partitionValuesToPartitionMap.get(entry.getValue());
    if (partition == null) {
This logging strategy may be rather naive.
If many partitions are deleted between the time their names are retrieved and the time their metadata is retrieved, we could end up with a lot of useless warnings in the logs.
If the partition is gone during query planning, isn't that an error?
Indeed. This is an error.
Would it make more sense to list ALL the missing partitions in the exception? Currently we just throw an exception for the first partition found missing.
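A minimal sketch of what that could look like, assuming the same two maps as in the snippet under review. `MissingPartitionCheck`, `findMissing`, and `verifyAllPresent` are hypothetical names used only for illustration, and the type parameter `P` stands in for `Partition`; none of this is taken from the PR.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the PR: collects every missing partition
// name before failing, instead of throwing on the first one.
final class MissingPartitionCheck
{
    private MissingPartitionCheck() {}

    // Returns the names whose partition values have no entry in the metadata map.
    // P stands in for the metastore Partition type.
    static <P> List<String> findMissing(
            Map<String, List<String>> partitionNameToPartitionValuesMap,
            Map<List<String>, P> partitionValuesToPartitionMap)
    {
        List<String> missing = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : partitionNameToPartitionValuesMap.entrySet()) {
            if (partitionValuesToPartitionMap.get(entry.getValue()) == null) {
                missing.add(entry.getKey());
            }
        }
        return missing;
    }

    // Throws a single exception that names all missing partitions.
    static <P> void verifyAllPresent(
            Map<String, List<String>> partitionNameToPartitionValuesMap,
            Map<List<String>, P> partitionValuesToPartitionMap)
    {
        List<String> missing = findMissing(partitionNameToPartitionValuesMap, partitionValuesToPartitionMap);
        if (!missing.isEmpty()) {
            throw new IllegalStateException(missing.size() + " partition(s) not found: " + missing);
        }
    }
}
```

Reporting everything in one exception would also avoid one warning per partition in the logs, at the cost of a potentially long message when many partitions are gone.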
Why not throw here, then?
BTW the code here seems to come from
The signature of the method
Map<String, Optional<Partition>> getPartitionsByNames(String databaseName, String tableName, List<String> partitionNames);
has existed since the first commit recorded for HiveMetastore.
In this context I think it is considered valid to deal with NULL partitions.
The partitions were missing because of a particularity of the AWS Glue batch_get_partition call: when the response payload exceeds a certain threshold (~5 MB), the partition keys that were not processed as part of the request are returned in the UnprocessedKeys field of the response. See #10696
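One way to handle this kind of response is to keep re-requesting the unprocessed keys until none remain. The sketch below illustrates that loop without the AWS SDK: `BatchRetry`, `BatchResult`, and `fetchAll` are hypothetical names, and the `batchCall` function stands in for the Glue request. It is not necessarily what #10696 does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch: drains a batch API that may return only part of the
// requested keys and reports the rest as "unprocessed".
final class BatchRetry
{
    private BatchRetry() {}

    // Stand-in for a batch response: fetched items plus the keys left unprocessed.
    static final class BatchResult<K, V>
    {
        final List<V> items;
        final List<K> unprocessedKeys;

        BatchResult(List<V> items, List<K> unprocessedKeys)
        {
            this.items = items;
            this.unprocessedKeys = unprocessedKeys;
        }
    }

    // Calls batchCall repeatedly, feeding back the unprocessed keys each time.
    static <K, V> List<V> fetchAll(List<K> keys, Function<List<K>, BatchResult<K, V>> batchCall)
    {
        List<V> results = new ArrayList<>();
        List<K> pending = new ArrayList<>(keys);
        while (!pending.isEmpty()) {
            BatchResult<K, V> response = batchCall.apply(pending);
            results.addAll(response.items);
            // Guard against looping forever if a call makes no progress.
            if (response.unprocessedKeys.size() >= pending.size()) {
                throw new IllegalStateException("No progress: " + pending.size() + " keys still unprocessed");
            }
            pending = new ArrayList<>(response.unprocessedKeys);
        }
        return results;
    }
}
```

With a loop like this, a partition that is still absent after all keys have been processed is genuinely missing rather than merely cut off by the payload limit.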