Skip to content

Refactor Delta's DROP TABLE support for corrupted tables#16651

Merged
findepi merged 4 commits intomasterfrom
findepi/drop-delta-corrupted
Mar 24, 2023
Merged

Refactor Delta's DROP TABLE support for corrupted tables#16651
findepi merged 4 commits intomasterfrom
findepi/drop-delta-corrupted

Conversation

@findepi
Copy link
Copy Markdown
Member

@findepi findepi commented Mar 21, 2023

Previously, a corrupted table handle was represented with a
DeltaLakeTableHandle missing a MetadataEntry. When drop support for
corrupted tables was implemented, a connector could have only one table
handle class. This commit improves the distinction by introducing a
dedicated table handle class to represent corrupted tables. As a
necessity, this class is explicitly handled in multiple
DeltaLakeMetadata methods. This sets viable example to follow for
implementing drop support for corrupted Iceberg tables as a follow-up.

Note: testGetInsertLayoutTableNotFound test was removed, instead of
being updated, since ConnectorMetadata.getInsertLayout cannot be
reached for a corrupted table, as getting column list will fail earlier.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I expected that the target of #16248 is only Iceberg connector. Could you replace the link with #16662?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i linked to the PR that adds syntax.
Once we have syntax supported we should have an uncommented test line here, even before the functionality is supported in Delta

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: NullPointerException here (before the refactor)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: before the change we allowed rename. I think it was incidental

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: testGetInsertLayoutTableNotFound test was removed, instead of
being updated, since ConnectorMetadata.getInsertLayout cannot be
reached for a corrupted table, as getting column list will fail earlier.

@findepi findepi force-pushed the findepi/drop-delta-corrupted branch from bb00968 to 9afbe9f Compare March 21, 2023 16:26
@github-actions github-actions bot added the delta-lake Delta Lake connector label Mar 21, 2023
@findepi
Copy link
Copy Markdown
Member Author

findepi commented Mar 21, 2023

CI #16658

@findepi
Copy link
Copy Markdown
Member Author

findepi commented Mar 21, 2023

CI ... and some bigquery failures which i don't understand. cc @ebyhr @hashhar

@ebyhr
Copy link
Copy Markdown
Member

ebyhr commented Mar 21, 2023

The BigQuery CI failures are handled in #16603

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I expected that the target of #16248 is only Iceberg connector. Could you replace the link with #16662?

@findepi
Copy link
Copy Markdown
Member Author

findepi commented Mar 22, 2023

Conflict with #16466, rebasing

Comment on lines 619 to 621
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: you can extract throwIfCourrupted(ConnectorTableHandle handle) method

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i wanted to have throw in the calling code so that code flow analysis works, but i don't think there is much gain from that.

will change

Previously, a corrupted table handle was represented with a
`DeltaLakeTableHandle` missing a `MetadataEntry`.  When drop support for
corrupted tables was implemented, a connector could have only one table
handle class.  This commit improves the distinction by introducing a
dedicated table handle class to represent corrupted tables. As a
necessity, this class is explicitly handled in multiple
`DeltaLakeMetadata` methods. This sets viable example to follow for
implementing drop support for corrupted Iceberg tables as a follow-up.

Note: `testGetInsertLayoutTableNotFound` test was removed, instead of
being updated, since `ConnectorMetadata.getInsertLayout` cannot be
reached for a corrupted table, as getting column list will fail earlier.
@findepi
Copy link
Copy Markdown
Member Author

findepi commented Mar 23, 2023

CI #13995 and #16694 (fix, not issue)

@findepi findepi force-pushed the findepi/drop-delta-corrupted branch from 6045f41 to c23ec5e Compare March 23, 2023 12:18

// DROP TABLE should succeed so that users can remove their corrupted table
getQueryRunner().execute("DROP TABLE " + tableName);
assertFalse(getQueryRunner().tableExists(getSession(), tableName));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we also verify that the table location does not have any files after DROP TABLE execute?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as discussed offline, in this test we already deleted all the files

       // break the table by deleting all its files including transaction log
        deleteRecursively(tableLocation, ALLOW_INSECURE);

i like the idea to improve the test. let's delete only transaction log and then add assertion that data files got removed.
since the PR is otherwise ready to go let's handle this as a follow-up

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Follow up PR #16764

@findepi
Copy link
Copy Markdown
Member Author

findepi commented Mar 24, 2023

CI #7535 (will address with #16707)

@findepi findepi merged commit d2534ab into master Mar 24, 2023
@findepi findepi deleted the findepi/drop-delta-corrupted branch March 24, 2023 10:14
@github-actions github-actions bot added this to the 411 milestone Mar 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cla-signed delta-lake Delta Lake connector

Development

Successfully merging this pull request may close these issues.

4 participants