Skip to content

Add Databricks 11.3 test environment to Delta Lake#14848

Merged
ebyhr merged 1 commit intomasterfrom
ebi/delta-databricks-11.3
Nov 15, 2022
Merged

Add Databricks 11.3 test environment to Delta Lake#14848
ebyhr merged 1 commit intomasterfrom
ebi/delta-databricks-11.3

Conversation

@ebyhr
Copy link
Copy Markdown
Member

@ebyhr ebyhr commented Nov 1, 2022

Description

Non-technical explanation

Release notes

(x) This is not user-visible or docs only and no release notes are required.
( ) Release notes are required, please propose a release note for me.
( ) Release notes are required, with the following suggested text:

# Section
* Fix some things. ({issue}`issuenumber`)

@cla-bot cla-bot bot added the cla-signed label Nov 1, 2022
@ebyhr ebyhr force-pushed the ebi/delta-databricks-11.3 branch 3 times, most recently from f799b4a to a40fa2f Compare November 8, 2022 09:08
@ebyhr ebyhr marked this pull request as ready for review November 8, 2022 09:09
@findinpath
Copy link
Copy Markdown
Contributor

checkstyle issue: src/main/java/io/trino/tests/product/deltalake/util/DatabricksVersion.java:[20] (regexp) RegexpSinglelineJava: No new line before extends/implements

@ebyhr ebyhr force-pushed the ebi/delta-databricks-11.3 branch 2 times, most recently from 95eff6f to 2e2a25e Compare November 8, 2022 10:26
private String getDatabricksDefaultTableProperties()
{
return "TBLPROPERTIES (\n" +
" 'Type' = 'EXTERNAL',\n" +
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did 11.3 get rid of the type param?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes from SHOW CREATE TABLE result. DESCRIBE EXTENDED still provides such information.

@ebyhr ebyhr force-pushed the ebi/delta-databricks-11.3 branch from 2e2a25e to 7c73ff8 Compare November 9, 2022 02:25
@findinpath
Copy link
Copy Markdown
Contributor

Checkstyle failure The field 'databricksRuntimeVersion' is never read.

@ebyhr ebyhr force-pushed the ebi/delta-databricks-11.3 branch from 7c73ff8 to c5c5de6 Compare November 9, 2022 05:17
Comment on lines 741 to 745
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this two entries?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's under exclude section. These two entries are required to ignore hdp3 always and then enable another one when the secret exists in my understanding.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nineinchnick @hashhar please review the ci.yml changes

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Note it adds quite a lot of time to the total CI runtime:
obraz

Do we need to test all LTSes?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've brought up EOLing the tests for version 7.3. Maybe it's time, though DB is supporting it through March

Comment on lines 230 to 231
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But that doesn't mean we should exclude the test, do it?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What we can do in 11.3 is verifying a checkpoint file wasn't created in CREATE OR REPLACE TABLE and skip, but I don't feel such logic is helpful in this test method. The test objective is accomplished by other versions. If we verify Databricks behavior (which operation creates a checkpoint file or not), adding dedicated tests look better to me.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can CREATE OR REPLACE TABLE create a checkpoint in 11.3?
Can we force it to do that?

Note that the test specificially wants to test the scenario when checkpoint changes the schema

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, I couldn't find a way to create a checkpoint using the statement, such options, alternative procedures or something in 11.3.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I missed that. Does Databricks 10.4 add ZSTD support out of the box?
🎉 finally we can switch Delta's default compression to ZSTD, right?

cc @losipiuk @alexjo2144

@ebyhr ebyhr force-pushed the ebi/delta-databricks-11.3 branch 2 times, most recently from 5a41dde to 8937957 Compare November 10, 2022 00:14
@github-actions github-actions bot added the docs label Nov 10, 2022
@ebyhr
Copy link
Copy Markdown
Member Author

ebyhr commented Nov 10, 2022

Addressed comments.

@ebyhr ebyhr requested a review from findepi November 10, 2022 22:17
" 'delta.minReaderVersion' = '1',\n" +
" 'delta.minWriterVersion' = '2')\n";
}
else if (databricksRuntimeVersion.equals(DATABRICKS_104_RUNTIME_VERSION)) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

redundant else

Comment on lines 230 to 231
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can CREATE OR REPLACE TABLE create a checkpoint in 11.3?
Can we force it to do that?

Note that the test specificially wants to test the scenario when checkpoint changes the schema

" 'delta.minReaderVersion' = '1',\n" +
" 'delta.minWriterVersion' = '2')\n";
}
else if (databricksRuntimeVersion.equals(DATABRICKS_104_RUNTIME_VERSION)) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

else is redundant

.containsOnly(expected);

if ("ZSTD".equals(compressionCodec) && !databricksRuntimeVersion.equals(DATABRICKS_104_RUNTIME_VERSION)) {
if ("ZSTD".equals(compressionCodec) && databricksRuntimeVersion.orElseThrow().isOlderThan(DATABRICKS_104_RUNTIME_VERSION)) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if databricksRuntimeVersion must be present, can you make the field non-Optional, as you did in other tests?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This class has DELTA_LAKE_OSS group, so we can't make it non-Optional in the field.

.result();
}

public static DatabricksVersion createFromString(String versionString)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

parse

@ebyhr ebyhr force-pushed the ebi/delta-databricks-11.3 branch from 8937957 to 9336f7e Compare November 14, 2022 22:56
@ebyhr ebyhr merged commit 715c3fb into master Nov 15, 2022
@ebyhr ebyhr deleted the ebi/delta-databricks-11.3 branch November 15, 2022 05:00
@github-actions github-actions bot added this to the 403 milestone Nov 15, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

5 participants