Skip to content

[native] Add positional deletes to Iceberg native tpcds e2e tests#22867

Merged
yingsu00 merged 1 commit intoprestodb:masterfrom
nmahadevuni:iceberg_tpcds_e2etest_pos_delete
Jul 4, 2024
Merged

[native] Add positional deletes to Iceberg native tpcds e2e tests#22867
yingsu00 merged 1 commit intoprestodb:masterfrom
nmahadevuni:iceberg_tpcds_e2etest_pos_delete

Conversation

@nmahadevuni
Copy link
Member

Description

Added delete phase to the Iceberg tpc-ds native e2e tests.

Motivation and Context

Goal to add full data maintenance phases described in the tpc-ds spec to the e2e tests.

Impact

No impact

Test Plan

Added a new test to run deletes, verify and run all the tpc-ds queries.

== NO RELEASE NOTE ==

@nmahadevuni nmahadevuni requested review from a team, feilong-liu and jaystarshot as code owners May 30, 2024 10:42
@nmahadevuni nmahadevuni requested a review from presto-oss May 30, 2024 10:42
@nmahadevuni nmahadevuni changed the title Add positional deletes to Iceberg native tpcds e2e tests [native] Add positional deletes to Iceberg native tpcds e2e tests May 31, 2024
@nmahadevuni
Copy link
Member Author

@yingsu00 Please review.

@nmahadevuni nmahadevuni requested a review from yingsu00 May 31, 2024 04:32
@nmahadevuni nmahadevuni force-pushed the iceberg_tpcds_e2etest_pos_delete branch from 30aa3b9 to b3a6e4a Compare June 10, 2024 10:50
Map<String, Long> deletedRowsMap = new HashMap<>();

@Override
protected void doDeletes()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These delete queries are the same for positional delete and equality delete, and . I think they better be in the parent class TestPrestoNativeIcebergTpcdsQueriesParquetUsingThrift.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Presto doesn't support writing equality delete yet, so it's ok to remove this class after moving all common queries to the parent class. In the future if equality delete yet is supported, then we can subclass TestPrestoNativeIcebergTpcdsQueriesParquetUsingThrift to TestPrestoNativeIcebergTpcdsQueriesParquetUsingThriftWithPositionalDelete and TestPrestoNativeIcebergTpcdsQueriesParquetUsingThriftWithEqualityDelete, and these two classes should be only a few lines which is used to set the delete method.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are doing the equality deletes using the Iceberg API, since there is no support to enforce equality deletes using DELETE. So this function will have a different implementation for equality delete test case.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will move it now to base class. If there is support to enforce the type of delete in the future, we can keep it here. Or else, we will move it back to derived classes.


protected void dropTables()
{
for (String table : tpcdsFactTableNames) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the implementation in this class? Can any of AbstractTestNativeTpcdsQueries's children have different implementation than this? If not, let's move it to AbstractTestNativeTpcdsQueries's

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will move it to base class.

@nmahadevuni nmahadevuni force-pushed the iceberg_tpcds_e2etest_pos_delete branch from b3a6e4a to 7899740 Compare June 28, 2024 10:51
@nmahadevuni
Copy link
Member Author

@yingsu00 addressed the comments here. Please check.

@yingsu00 yingsu00 merged commit eb53161 into prestodb:master Jul 4, 2024
@ajaygeorge
Copy link
Contributor

ajaygeorge commented Jul 9, 2024

@nmahadevuni Looks like this PR has introduced a test failure which is blocking PR merges. Can you please take a look.

[ERROR]   TestPrestoNativeIcebergTpcdsQueriesParquetUsingThrift.doDeletesAndQuery:399->verifyDeletes:390->AbstractTestQueryFramework.computeScalar:149->AbstractTestQueryFramework.computeActual:139 ? Runtime size <= capacity_ (18446744073709551499 vs. 1440) Split [Hive: file:/tmp/iceberg_data/HIVE/tpcds/store_sales/data/593eb530-2b2c-4b34-8bf9-caba66ad83d3.parquet 0 - 2071074] Task 20240709_152552_00114_sdniy.1.0.0.0 Operator: TableScan[0] 0
[INFO] 
[ERROR] Tests run: 132, Failures: 1, Errors: 0, Skipped: 0

See https://app.circleci.com/pipelines/github/prestodb/presto/17633/workflows/314944a1-d3c2-4aba-a170-b3fd171d9cff/jobs/69606
https://app.circleci.com/pipelines/github/prestodb/presto/17611/workflows/926ddb6e-1de6-4e0c-b06b-36f6d7a84f4b/jobs/69499

@majetideepak
Copy link
Collaborator

@ajaygeorge, Looks like the test started failing with this #23138 merge.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants