Closed
Labels
- area:aws (AWS ecosystem support)
- area:query-engine (Query engine integrations)
- priority:critical (Production degraded; pipelines stalled)
Description
Describe the problem you faced
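After a DELETE_PARTITION on a non-existing partition, a later UPSERT to a different partition leaves the non-existing partition registered in the Glue Catalog, and Athena queries then fail because its path is missing in S3.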
To Reproduce
Steps to reproduce the behavior:
- Run DELETE_PARTITION for a non-existing partition (e.g. org_id=55555).
  - This raises an exception, so you have to wrap the Spark write to catch it.
  - The operation creates org_id=55555_$folder$ in the Hudi table path (BTW, why is it even created?).
- UPSERT to another partition (e.g. org_id=24).
- Check the current status.
  - You will see that the org_id=55555 partition is in the Glue Catalog.
- Go to Athena and run a query.
  - The query fails because the path org_id=55555 is missing in S3.
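The steps above can be sketched in PySpark roughly as follows. This is a minimal sketch, not the reporter's actual job: the record key `id`, the precombine field `ts`, the use of Hive-style partitioning, the enabled catalog sync, and the helper names `hudi_options` and `reproduce` are all assumptions made for illustration. Only the partition field `org_id` and the partition values come from the report.

```python
from typing import Dict, Optional, Sequence


def hudi_options(
    table: str,
    operation: str,
    partitions_to_delete: Optional[Sequence[str]] = None,
) -> Dict[str, str]:
    """Build Hudi datasource write options for one write (hypothetical helper)."""
    opts = {
        "hoodie.table.name": table,
        "hoodie.datasource.write.operation": operation,
        "hoodie.datasource.write.recordkey.field": "id",   # assumed record key
        "hoodie.datasource.write.precombine.field": "ts",  # assumed precombine field
        "hoodie.datasource.write.partitionpath.field": "org_id",
        "hoodie.datasource.write.hive_style_partitioning": "true",  # assumed
        "hoodie.datasource.hive_sync.enable": "true",  # assumed: catalog sync is on
    }
    if partitions_to_delete:
        opts["hoodie.datasource.write.partitions.to.delete"] = ",".join(
            partitions_to_delete
        )
    return opts


def reproduce(df, base_path: str, table: str) -> None:
    """Run the two writes from the report against an existing Hudi table.

    `df` is any Spark DataFrame matching the table schema.
    """
    # Step 1: delete a partition that does not exist; the report says this raises.
    delete_opts = hudi_options(table, "delete_partition", ["org_id=55555"])
    try:
        df.write.format("hudi").options(**delete_opts).mode("append").save(base_path)
    except Exception as exc:
        print(f"delete_partition raised: {exc}")

    # Step 2: upsert rows that belong to another partition (e.g. org_id=24).
    upsert_opts = hudi_options(table, "upsert")
    df.write.format("hudi").options(**upsert_opts).mode("append").save(base_path)
```

After `reproduce` runs, the partition list in the Glue Catalog and the table path in S3 can be compared by hand, as described in the remaining steps.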
Expected behavior
org_id=55555 MUST NOT be registered in the Glue Catalog.
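One way to check this expectation is to compare the partitions registered in the catalog with the partition paths that actually exist under the table path. The sketch below works on plain lists, so how those lists are obtained (for example from the Glue and S3 APIs) is left open; the function name `partitions_missing_from_storage` is hypothetical.

```python
from typing import Iterable, List


def partitions_missing_from_storage(
    catalog_partitions: Iterable[str],
    storage_prefixes: Iterable[str],
) -> List[str]:
    """Return catalog partitions that have no matching path in storage.

    Both inputs are paths relative to the table root, e.g. "org_id=24".
    A marker object such as "org_id=55555_$folder$" does not count as the
    partition path "org_id=55555", because the names differ.
    """
    existing = {prefix.strip("/") for prefix in storage_prefixes}
    return sorted(
        partition
        for partition in set(catalog_partitions)
        if partition.strip("/") not in existing
    )


if __name__ == "__main__":
    missing = partitions_missing_from_storage(
        ["org_id=24", "org_id=55555"],
        ["org_id=24/", "org_id=55555_$folder$"],
    )
    print(missing)
```

With the expected behavior in place, this check would return an empty list for the table in the report.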
Environment Description
- Hudi version : 0.10.1
- Spark version : 3.1.1-amzn-0
- Hive version : 2.3.7-amzn-4
- Hadoop version : 3.2.1-amzn-3
- Storage (HDFS/S3/GCS..) : S3
- Running on Docker? (yes/no) : NO