@parisni
Contributor

@parisni parisni commented Nov 7, 2025

Describe the issue this Pull Request addresses

PR #9347 introduced a regression in the AWS Glue comment sync when moving to the AWS SDK v2. This PR restores the original behavior.

Summary and Changelog

Impact

Risk Level

Documentation Update

Contributor's checklist

  • Read through contributor's guide
  • Enough context is provided in the sections above
  • Adequate tests were added if applicable

@github-actions github-actions bot added the size:XS PR with lines of changes in <= 10 label Nov 7, 2025
@parisni parisni marked this pull request as draft November 7, 2025 22:46
Contributor

@yihua yihua left a comment

LGTM. Would it be possible to add a test case for this?

@github-actions github-actions bot added size:L PR with lines of changes in (300, 1000] and removed size:XS PR with lines of changes in <= 10 labels Nov 8, 2025
@parisni
Contributor Author

parisni commented Nov 8, 2025

@yihua

  • added a base IT with moto to emulate AWS Glue, plus Hive test tools to emulate a local Hudi table
  • to make the Glue client connect to moto, I introduced STS configs
  • tested and fixed the setComment implementation
  • dropped/improved the useless partition pushdown tests
  • removed the moto/dynamodb platform pin and switched to the official moto image (I'm working on arm64 and the current setup couldn't work)

@parisni parisni force-pushed the feat-aws-glue-comment branch from bb2fda9 to abf8e3d Compare November 8, 2025 22:37
@parisni parisni changed the title from "feat: fix aws glue sync set comment" to "fix: aws glue sync set comment" Nov 8, 2025
@parisni parisni requested a review from yihua November 8, 2025 22:49
@parisni parisni marked this pull request as ready for review November 8, 2025 22:49
@parisni
Contributor Author

parisni commented Nov 8, 2025

 Error:  testRemoveTableComments  Time elapsed: 0.03 s  <<< ERROR!
java.lang.NoSuchMethodError: org.apache.parquet.schema.Types$PrimitiveBuilder.as(Lorg/apache/parquet/schema/LogicalTypeAnnotation;)Lorg/apache/parquet/schema/Types$Builder;
	at org.apache.hudi.aws.sync.ITTestAWSGlueComments.setUp(ITTestAWSGlueComments.java:58)

Locally I ran the IT successfully with the Spark 3.3 profile. The CI runs Spark 3.5. It sounds like there is a Parquet dependency conflict on that profile.
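One way to narrow down this kind of NoSuchMethodError is to check, on the failing classpath, which jar a class was loaded from and whether the expected method is present. The sketch below uses only JDK reflection. The `ClasspathProbe` class and its method names are made up for illustration and are not part of Hudi or Parquet; the two Parquet class names are taken from the stack trace above. Note that `getMethod` matches on name and parameter types only, so a return-type difference in the method descriptor would not be detected this way.

```java
import java.security.CodeSource;
import java.security.ProtectionDomain;

public class ClasspathProbe {

  /** Describes where a class was loaded from; JDK bootstrap classes usually expose no location. */
  public static String locationOf(String className) {
    try {
      ProtectionDomain domain = Class.forName(className).getProtectionDomain();
      CodeSource source = domain == null ? null : domain.getCodeSource();
      if (source == null || source.getLocation() == null) {
        return "<no code source>";
      }
      return source.getLocation().toString();
    } catch (ClassNotFoundException e) {
      return "<not on classpath>";
    }
  }

  /** True if the class is loadable and has a public method with this name and parameter types. */
  public static boolean hasPublicMethod(String className, String methodName, String... paramTypeNames) {
    try {
      Class<?>[] paramTypes = new Class<?>[paramTypeNames.length];
      for (int i = 0; i < paramTypeNames.length; i++) {
        paramTypes[i] = Class.forName(paramTypeNames[i]);
      }
      // getMethod also finds public methods inherited from superclasses.
      Class.forName(className).getMethod(methodName, paramTypes);
      return true;
    } catch (ClassNotFoundException | NoSuchMethodException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    // Class names copied from the stack trace; run this with the test classpath of the failing profile.
    String builder = "org.apache.parquet.schema.Types$PrimitiveBuilder";
    String annotation = "org.apache.parquet.schema.LogicalTypeAnnotation";
    System.out.println("loaded from: " + locationOf(builder));
    System.out.println("has as(LogicalTypeAnnotation): " + hasPublicMethod(builder, "as", annotation));
  }
}
```

Running the same probe under both Spark profiles and comparing the reported jar locations should show whether a different Parquet artifact wins on the classpath.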

@parisni parisni force-pushed the feat-aws-glue-comment branch 2 times, most recently from 5d5d659 to 7bafc7a Compare November 10, 2025 08:54
parisni added a commit to leboncoin/hudi that referenced this pull request Nov 10, 2025
parisni added a commit to leboncoin/hudi that referenced this pull request Nov 10, 2025
parisni added a commit to leboncoin/hudi that referenced this pull request Nov 10, 2025
fix problem with array

fix array of struct
@parisni
Contributor Author

parisni commented Nov 12, 2025

@yihua I added a test framework that will be useful for validating the AWS binding. What about landing this fix in 1.1?

@parisni
Contributor Author

parisni commented Nov 12, 2025

also maybe @danny0405 ?

@danny0405
Contributor

There is a test failure:

[ERROR] Errors: 
[ERROR]   TestAwsGlueSyncTool.validateInitThroughSyncTool:96 » Hoodie Unable to instanti...


private void setComments(List<Column> columns, Map<String, Option<String>> commentsMap) {
columns.forEach(column -> {
private List<Column> setComments(List<Column> columns, Map<String, Option<String>> commentsMap) {
Contributor

The original logic has no effect, right?

Contributor Author

That's the point. BTW, the test proves that the original method fails. The new implementation fixes the tests.
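For readers following the thread, here is a minimal sketch of how a `void` method over immutable model objects can silently do nothing, and why returning a new list fixes it. It assumes the column type is immutable and only changes through `toBuilder()`, as AWS SDK v2 model classes are. `Col` is a hypothetical stand-in rather than the Glue `Column` class, `Optional` replaces Hudi's `Option`, and the method bodies are illustrative, not the code from this PR. Mapping a missing entry to a null comment is also an assumption.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Collectors;

public class SetCommentsSketch {

  /** Hypothetical immutable column: no setters, changes only via toBuilder(). */
  public static final class Col {
    private final String name;
    private final String comment; // null means "no comment"

    private Col(String name, String comment) {
      this.name = name;
      this.comment = comment;
    }

    public static Col of(String name, String comment) {
      return new Col(name, comment);
    }

    public String name() {
      return name;
    }

    public String comment() {
      return comment;
    }

    public Builder toBuilder() {
      return new Builder(name, comment);
    }

    public static final class Builder {
      private final String name;
      private String comment;

      private Builder(String name, String comment) {
        this.name = name;
        this.comment = comment;
      }

      public Builder comment(String newComment) {
        this.comment = newComment;
        return this;
      }

      public Col build() {
        return new Col(name, comment);
      }
    }
  }

  /** No-op variant: each rebuilt column is discarded, so the input list never changes. */
  public static void setCommentsInPlace(List<Col> columns, Map<String, Optional<String>> commentsMap) {
    columns.forEach(column -> column.toBuilder()
        .comment(commentsMap.getOrDefault(column.name(), Optional.empty()).orElse(null))
        .build());
  }

  /** Working variant: returns a new list that holds the rebuilt columns. */
  public static List<Col> setComments(List<Col> columns, Map<String, Optional<String>> commentsMap) {
    return columns.stream()
        .map(column -> column.toBuilder()
            .comment(commentsMap.getOrDefault(column.name(), Optional.empty()).orElse(null))
            .build())
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    List<Col> columns = Arrays.asList(Col.of("id", null), Col.of("name", "stale comment"));
    Map<String, Optional<String>> comments = new HashMap<>();
    comments.put("id", Optional.of("primary key"));

    setCommentsInPlace(columns, comments);
    System.out.println("after in-place call: " + columns.get(0).comment());
    System.out.println("after returning call: " + setComments(columns, comments).get(0).comment());
  }
}
```

The caller has to use the returned list for the change to take effect, which matches the signature change from `void` to `List<Column>` shown in the diff above.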

Contributor

Can you fix the test failures?

)
.build())
.build()).build()).get();
public void setUpPushdownTest() throws Exception {
Contributor

So can we be sure that the superclass setUp() is executed before this one?

Contributor

The answer is yes, as long as the method is not overridden.

Contributor

@danny0405 danny0405 left a comment

+1, let's fix the test failures.

add tests

fix deps

restore partition pushd down it

fix checkstyle

fix test
@parisni parisni force-pushed the feat-aws-glue-comment branch from 7bafc7a to 5613bfe Compare November 13, 2025 08:58
@parisni
Copy link
Contributor Author

parisni commented Nov 13, 2025

@danny0405 I fixed the tests; let's wait for the CI to confirm. Note that without #14235 the comments still won't be added. The reason is that the comments come from the tableResolver, i.e. the Avro schema of the last commit. While Hudi stores the comments in the Avro schema, the tableResolver doesn't extract them. That's what #14235 provides. Let me fix the CI there.

</image>
<image>
<name>apachehudi/moto:${moto.version}</name>
<name>motoserver/moto:${moto.version}</name>
Contributor

Not sure what the exact effect of this change is.

Contributor Author

The Hudi image is not cross-platform; it is amd64 only. The motoserver image is cross-platform. Not sure who is allowed to push Hudi images to Docker Hub.

<name>amazon/dynamodb-local:${dynamodb-local.version}</name>
<alias>it-database</alias>
<run>
<platform>linux/amd64</platform>
Contributor

Why remove this?

<name>motoserver/moto:${moto.version}</name>
<alias>it-aws</alias>
<run>
<platform>linux/amd64</platform>
Contributor

Why remove this?

Contributor Author

This makes the image platform-agnostic. Pinning to amd64 prevented the test from running on arm64 or Apple silicon.

@danny0405
Contributor

@hudi-bot run azure

@danny0405
Contributor

@hudi-bot run azure

@danny0405
Contributor

@parisni can you retrigger this PR's CI with a dummy commit? I cannot trigger it through the command.

@parisni
Contributor Author

parisni commented Nov 18, 2025

@danny0405 Done. Also in the other PR.

@danny0405
Contributor

@hudi-bot run azure

Contributor

@yihua yihua left a comment

@parisni Could you rebase your branch on top of the latest master? A few fixes on CI flakiness have been merged.

Contributor

@yihua yihua left a comment

LGTM overall. @parisni Could you check CI failures and rebase the PR on master?

@parisni
Contributor Author

parisni commented Dec 22, 2025

Hi @yihua @danny0405, I rebased on master again.

@hudi-bot
Collaborator

CI report:

Bot commands @hudi-bot supports the following commands:
  • @hudi-bot run azure re-run the last Azure build

Labels

size:L PR with lines of changes in (300, 1000]

4 participants