Spark <> Iceberg bug integration test #482

Closed
kevinjqliu opened this issue Feb 29, 2024 · 0 comments · Fixed by #501

Apache Iceberg version

None

Please describe the bug 🐞

While working on #444, I ran into a weird bug in the Spark integration test.

Specifically, here:
https://github.com/apache/iceberg-python/compare/main...kevinjqliu:iceberg-python:kevinjqliu/weird-spark-bug?expand=1#diff-ae89704e133e5eb800112d7a84557f2976819b2c5d989a62af97bf922865631bR459

The current snapshot contains 10 data files, as verified in the assert statement just above.

assert tbl.current_snapshot().summary['added-data-files'] == '10'

But the Spark metadata table still returns only 1 file.

The issue goes away completely if I use a new table identifier in L452.
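
For anyone trying to reproduce this, here is a rough sketch of the comparison described above: read the count from the PyIceberg snapshot summary, then count rows in the Iceberg `files` metadata table through Spark. The helper names (`spark_data_file_count`, `check_file_counts`) and the `identifier` argument are made up for illustration and are not taken from the test suite; `spark` is assumed to be an existing `SparkSession` configured with the same catalog, and `tbl` a PyIceberg table.

```python
from typing import Tuple


def spark_data_file_count(spark, identifier: str) -> int:
    """Count rows in the Iceberg `files` metadata table as seen by Spark."""
    # `identifier` is assumed to be a Spark-resolvable name such as "default.my_table".
    rows = spark.sql(f"SELECT COUNT(*) AS n FROM {identifier}.files").collect()
    return int(rows[0]["n"])


def check_file_counts(tbl, spark, identifier: str) -> Tuple[int, int]:
    """Return (count from the PyIceberg snapshot summary, count reported by Spark)."""
    # Same summary key as the assert in the test; the summary value is a string.
    expected = int(tbl.current_snapshot().summary["added-data-files"])
    actual = spark_data_file_count(spark, identifier)
    return expected, actual
```

In the failing case described here, this would come back as `(10, 1)` rather than two equal numbers. Note that `added-data-files` only counts files added by the current snapshot, so the two values are expected to match only when that snapshot wrote every file in the table.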
