
Conversation

@xloya
Contributor

@xloya xloya commented Mar 11, 2022

When we write through Flink with the upsert option enabled and then specify non-equality-column conditions in the query, some of the equality delete files are lost due to the wrong usage of the filter.

@xloya xloya changed the title Core: Fix delete file index with non equality column filter lost equality delete file Core: Fix delete file index with non equality column filter lost equality delete files Mar 11, 2022
Contributor

@kbendick kbendick left a comment


Thanks for providing a patch @xloya.

It would help if you could provide more context. Is this something that needs to be special-cased for Flink upsert mode in some way? Modifying core and removing tests causes me concern at first sight.

It's possible I missed something on the mailing list or some other discussion, forgive me if so. Is it possible there's an example scenario that can be given to help me understand?

@xloya
Contributor Author

xloya commented Mar 11, 2022

Thanks for providing a patch @xloya.

It would help if you could provide more context. Is this something that needs to be special-cased for Flink upsert mode in some way? Modifying core and removing tests causes me concern at first sight.

It's possible I missed something on the mailing list or some other discussion, forgive me if so. Is it possible there's an example scenario that can be given to help me understand?

Of course. We have a scenario where we write data to Iceberg's v2 tables through Flink CDC, and there are non-primary-key query scenarios. The current implementation in core adds a filter, which may lose the equality delete files with the latest sequence number written by Flink streaming.
E.g:
Table schema: (id int (primary key), date date)
When seq num=1, Flink writes a record with id=1, date='2021-01-01': it inserts a data record (id=1, date='2021-01-01') and an equality delete record (id=1, date='2021-01-01');
When seq num=2, Flink writes a record with id=1, date='2022-01-01' to update the row: it inserts a data record (id=1, date='2022-01-01') and an equality delete record (id=1, date='2022-01-01');
At this point, when querying with select * from xxx where date < '2022-01-01', the added filter causes the equality delete file written at seq num=2 to be filtered out.
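
To make the pruning concrete, here is a small self-contained model of what happens (just an illustration; DeleteFileStats and keptByRowFilter are made-up names, not Iceberg's actual DeleteFileIndex code):

```java
import java.time.LocalDate;

public class EqualityDeletePruningExample {

  // Hypothetical stand-in for a delete file's column bounds; lower == upper here
  // because each equality delete file in the scenario above holds a single row.
  record DeleteFileStats(int idBound, LocalDate dateBound, long sequenceNumber) {}

  // Models the problematic behavior: the query's row filter (date < '2022-01-01')
  // is applied to the delete file's metrics, even though the delete file's
  // equality ids only cover the id column.
  static boolean keptByRowFilter(DeleteFileStats stats, LocalDate upperExclusive) {
    return stats.dateBound().isBefore(upperExclusive);
  }

  public static void main(String[] args) {
    DeleteFileStats seq1 = new DeleteFileStats(1, LocalDate.parse("2021-01-01"), 1);
    DeleteFileStats seq2 = new DeleteFileStats(1, LocalDate.parse("2022-01-01"), 2);
    LocalDate queryBound = LocalDate.parse("2022-01-01"); // date < '2022-01-01'

    System.out.println("seq=1 delete kept: " + keptByRowFilter(seq1, queryBound)); // true
    System.out.println("seq=2 delete kept: " + keptByRowFilter(seq2, queryBound)); // false
    // Dropping the seq=2 delete means the stale row (id=1, date='2021-01-01')
    // is incorrectly returned by the query.
  }
}
```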

This is currently the easiest way to fix the problem. If we want to optimize for Flink upsert instead, I think we may need to read the latest record that already exists in the table for the primary key and write that record to the equality delete file at write time, instead of writing the newly inserted data to the equality delete file.
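
Roughly, that alternative would look something like this (just a sketch of the idea; the lookup and writer interfaces here are hypothetical, not the actual Flink sink API):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

class UpsertWriterSketch {
  record Row(int id, String date) {}

  interface EqualityDeleteWriter { void delete(Row row); }
  interface DataWriter { void insert(Row row); }

  // Instead of writing the incoming row itself into the equality delete file,
  // look up the latest row already committed for the primary key and delete
  // that one, so the delete file's column metrics (e.g. date) describe the row
  // that is actually being replaced.
  static void upsert(Row incoming,
                     Map<Integer, Row> latestCommittedByKey, // hypothetical lookup
                     EqualityDeleteWriter deletes,
                     DataWriter inserts) {
    Optional.ofNullable(latestCommittedByKey.get(incoming.id()))
        .ifPresent(deletes::delete);
    inserts.insert(incoming);
  }

  public static void main(String[] args) {
    Map<Integer, Row> committed = new HashMap<>();
    committed.put(1, new Row(1, "2021-01-01"));
    upsert(new Row(1, "2022-01-01"), committed,
        row -> System.out.println("equality delete: " + row),
        row -> System.out.println("insert: " + row));
  }
}
```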

Contributor

@kbendick kbendick left a comment


Thanks for adding the test and explaining the situation more @xloya.

However, I don't think this is necessarily the appropriate solution. There are several places that call into the delete file index.

Additionally, this removes the ability to have non-conflicting delete transactions go through.

I think we should reconsider how the filter is built in the case of Flink upsert mode (or anything that is affected by this), and likely check for any other places where this issue exists and do the same.

Possibly we can add the seq number into the filter or something? I'd have to take a closer look, but throwing that out there as a possible idea.

I can confirm though that the test provided will fail if filterRows is added back in to the DeleteFileIndex.

cc @RussellSpitzer @flyrain @aokolnychyi who have all worked on the delete file index or helped review it.

@szehon-ho
Member

szehon-ho commented Mar 11, 2022

I agree with @kbendick, changing core will break other engines like Spark that depend on this for things like the validation check for the serializable isolation mode. I also agree with his suggestion to reconsider how the Flink side uses the filter, and to see if we can add some other option to DeleteFileIndex to support this use case.

@hililiwei
Contributor

hililiwei commented Mar 13, 2022

We seem to have the same issue.

And it only happens on our Parquet tables (it doesn't happen on our Avro tables). After analysis, we found that the problem occurs during the metrics filtering (such as upper_bounds / lower_bounds) of the manifest files (the Avro tables did not generate these metrics).

Our solution is different from this PR. We try to trim the row filter fields used for metrics filtering: for delete manifest files, only the metric fields whose ids are in the equality_ids are used for filtering; please refer to #4316 for details.
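
Roughly, the trimming idea looks like this (a simplified sketch in plain Java, not the actual code in #4316; the Predicate record is made up for illustration):

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

class EqualityIdFilterTrimSketch {
  // Simplified stand-in for one predicate of the row filter: the field id of
  // the referenced column plus a rendered condition.
  record Predicate(int fieldId, String condition) {}

  // Keep only the predicates whose column is part of the delete file's
  // equality_ids; predicates on other columns must not be used to prune
  // equality delete files.
  static List<Predicate> trimForEqualityDeletes(List<Predicate> rowFilter, Set<Integer> equalityIds) {
    return rowFilter.stream()
        .filter(p -> equalityIds.contains(p.fieldId()))
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    // id has field id 1, date has field id 2; the delete file's equality_ids = {1}.
    List<Predicate> rowFilter = List.of(
        new Predicate(1, "id IS NOT NULL"),
        new Predicate(2, "date < '2022-01-01'"));
    System.out.println(trimForEqualityDeletes(rowFilter, Set.of(1)));
    // Only the id predicate remains, so the seq num=2 delete file is no longer
    // pruned by its date bounds.
  }
}
```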

I'm not sure which way is better, or whether there is another better solution.

I'm sorry if your PR addresses a different issue.

Thx. 😄

}

@Test
public void testConcurrentNonConflictingEqualityDeletes() {
Contributor


What is invalid about this test? It is suspicious that it was removed.

@xloya
Contributor Author

xloya commented Mar 14, 2022

We seem to have the same issue.

And it only happens on our Parquet tables (it doesn't happen on our Avro tables). After analysis, we found that the problem occurs during the metrics filtering (such as upper_bounds / lower_bounds) of the manifest files (the Avro tables did not generate these metrics).

Our solution is different from this PR. We try to trim the row filter fields used for metrics filtering: for delete manifest files, only the metric fields whose ids are in the equality_ids are used for filtering; please refer to #4316 for details.

I'm not sure which way is better, or whether there is another better solution.

I'm sorry if your PR addresses a different issue.

Thx. 😄

Thank you for your attention to the same issue! I will take a look at the patch you committed later. In fact, I think my patch is only a temporary solution; I just want to get more opinions from the community to find the most appropriate way to fix the problem.

