Conversation

@kadai0308
Contributor

@kadai0308 kadai0308 commented May 17, 2025

Spark expects all StringType fields to be castable to CharSequence, but Iceberg's
readable_metrics lower_bound/upper_bound may decode to java.util.UUID for UUID-typed
columns. This causes a runtime ClassCastException when Spark tries to read those
metrics as UTF8String.

This commit fixes the issue by converting UUID values to string when generating
readable metric values for Spark metadata tables.
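
A minimal sketch of the failure and the conversion described above, in plain Java with no Spark or Iceberg on the classpath. The class `ReadableMetricValues` and the helper `toReadable` are hypothetical names used for illustration only; they are not the code changed in this PR.

```java
import java.util.UUID;

final class ReadableMetricValues {
  // Hypothetical helper: turn a decoded bound into something a string column can hold.
  // UUIDs become their canonical string form; every other value passes through unchanged.
  static Object toReadable(Object decoded) {
    return decoded instanceof UUID ? decoded.toString() : decoded;
  }

  public static void main(String[] args) {
    // Stand-in for a lower_bound decoded from a UUID-typed column.
    Object bound = UUID.fromString("123e4567-e89b-12d3-a456-426614174000");

    try {
      // java.util.UUID does not implement CharSequence, so this cast fails at runtime.
      CharSequence broken = (CharSequence) bound;
      System.out.println(broken);
    } catch (ClassCastException e) {
      System.out.println("cast failed: " + e.getClass().getSimpleName());
    }

    // After conversion the value is a String, which is a CharSequence.
    CharSequence ok = (CharSequence) toReadable(bound);
    System.out.println(ok);
  }
}
```

The pass-through branch matters: bounds for non-UUID columns (integers, strings, and so on) should keep whatever type they already decode to.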

Closes: #13077 (comment)

Contributor

@singhpk234 singhpk234 left a comment

Thank you for the fix, @kadai0308! Can you please add a UT for this as well?

@kadai0308
Contributor Author

kadai0308 commented May 20, 2025

> Thank you for the fix, @kadai0308! Can you please add a UT for this as well?

@Fokko
updated! Please help me review the PR. Thx!

@Fokko Fokko self-requested a review May 20, 2025 20:21
@github-actions

This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the dev@iceberg.apache.org list. Thank you for your contributions.

@github-actions github-actions bot added the stale label Jul 10, 2025
@kadai0308
Contributor Author

@Fokko
Hi, I'm not sure what the next step for this PR is. Could you advise?

@github-actions github-actions bot removed the stale label Jul 13, 2025
@aperture147

I'm having the same problem when I run catalog_name.system.rewrite_manifests. When will this be merged, and will it be included in the next release of Iceberg/Spark? Thanks.

@kadai0308
Contributor Author

@Fokko @singhpk234 Just a friendly reminder about this PR.

Contributor

@Fokko Fokko left a comment

This makes sense to me @kadai0308 👍

@Fokko Fokko merged commit 7fefc46 into apache:main Aug 4, 2025
42 checks passed
@ebyhr
Contributor

ebyhr commented Aug 6, 2025

This PR broke one of the tests (TestIcebergParquetSystemTables.testFilesTableReadableMetrics) in the Trino Iceberg connector.

I understand this change is mainly intended for Spark, but it impacts Java API users as well. I had to introduce a redundant cast in our connector. Ideally, a Spark-specific change should be made in the Spark module, not in the API.
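
The layering concern can be sketched in plain Java, under the assumption that the engine-neutral layer keeps returning the typed value and only an engine-specific adapter converts it. `EngineBoundaryConversion`, `decodeBound`, and `forStringColumn` are hypothetical names for illustration; they do not correspond to actual Iceberg, Spark, or Trino classes.

```java
import java.util.UUID;

final class EngineBoundaryConversion {
  // Stand-in for the engine-neutral API: the decoded bound keeps its natural type.
  static Object decodeBound(UUID stored) {
    return stored;
  }

  // Hypothetical engine-side adapter: converts only where a string column requires it.
  static Object forStringColumn(Object value) {
    return value instanceof UUID ? value.toString() : value;
  }

  public static void main(String[] args) {
    UUID stored = UUID.fromString("123e4567-e89b-12d3-a456-426614174000");
    Object apiValue = decodeBound(stored);

    // Other Java API consumers still receive a UUID and need no extra cast or parsing.
    UUID typed = (UUID) apiValue;
    System.out.println(typed.equals(stored));

    // The engine that needs a CharSequence converts at its own boundary.
    CharSequence engineValue = (CharSequence) forStringColumn(apiValue);
    System.out.println(engineValue);
  }
}
```

With the conversion kept at the engine boundary, a change made for one engine does not alter the types other API consumers observe.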

@Fokko
Contributor

Fokko commented Aug 6, 2025

@ebyhr Good catch, and I agree that we should fix this on the Spark side 👍

Fokko added a commit to Fokko/iceberg that referenced this pull request Aug 6, 2025
Fokko added a commit that referenced this pull request Aug 6, 2025
Development

Successfully merging this pull request may close these issues.

Cannot cast java.util.UUID to java.lang.CharSequence

5 participants