[Bug] Deletion of older Iceberg metadata files on new writes causes Iceberg Readers to fail #5189
Paimon version
In Paimon Iceberg compatibility mode, I have observed that Iceberg metadata files are completely replaced (old ones are deleted as soon as new ones are created) as new data is written from either Spark or Flink. This causes existing readers that are based on the old Iceberg metadata files to fail, making the tables unusable in scenarios where multiple readers are involved.
I was looking at the code, and it appears we delete the old metadata files every time a new one is created. There can be two different scenarios for that.
Compute Engine
I have verified this with both Spark and Flink, but it should be applicable to any compute engine.
Minimal reproduce step
Reproducing steps:
1. Create a Paimon table with Iceberg compatibility enabled and insert some data. At this point an Iceberg metadata file should be created.
2. Insert some more data into the Paimon table.
3. Observe that a new Iceberg metadata file is written and the older one is deleted, so any reader still pointing at the old file fails.
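As a rough illustration, here is a minimal Flink SQL sketch of the above, assuming a Paimon catalog is already in use and that Iceberg compatibility is enabled for the table through the metadata.iceberg.storage option; the table name, columns, and values are made up for this example:

```sql
-- Hypothetical table; 'metadata.iceberg.storage' = 'table-location' is assumed here to be
-- the option that enables Paimon's Iceberg compatibility mode for this table
CREATE TABLE t_orders (
    order_id BIGINT,
    amount   DOUBLE
) WITH (
    'metadata.iceberg.storage' = 'table-location'
);

-- First write: an Iceberg metadata file is produced alongside the Paimon snapshot
INSERT INTO t_orders VALUES (1, 10.0);

-- Second write: a new Iceberg metadata file is produced and the previous one is deleted,
-- which is what breaks readers that still reference the old file
INSERT INTO t_orders VALUES (2, 20.0);
```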
What doesn't meet your expectations?
Right now, the code doesn't leverage the Iceberg options `write.metadata.previous-versions-max` and `write.metadata.delete-after-commit.enabled`, as mentioned here. In my view, we can support these Iceberg options for Iceberg-compatible Paimon tables and delete old metadata files based on those configurations.
Anything else?
Let me know your thoughts here.