Skip to content

Add Iceberg properties in SHOW CREATE MATERIALIZED VIEW#12522

Merged
findepi merged 2 commits intotrinodb:masterfrom
polaris6:show-create-mv-add-properties
Jun 6, 2022
Merged

Add Iceberg properties in SHOW CREATE MATERIALIZED VIEW#12522
findepi merged 2 commits intotrinodb:masterfrom
polaris6:show-create-mv-add-properties

Conversation

@polaris6
Copy link
Copy Markdown
Member

@polaris6 polaris6 commented May 24, 2022

Description

Show Iceberg location and format_version in SHOW CREATE MATERIALIZED VIEW.

Related issues, pull requests, and links

Fixes #12504

Documentation

(x) No documentation is needed.

Release notes

(x) Release notes entries required with the following suggested text:

# Iceberg
* Show Iceberg location and format_version in `SHOW CREATE MATERIALIZED VIEW`. ({issue}`12504 `)

@cla-bot cla-bot bot added the cla-signed label May 24, 2022
@findepi findepi requested a review from alexjo2144 May 24, 2022 07:06
Copy link
Copy Markdown
Member

@alexjo2144 alexjo2144 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Create the MV as it used to be created and update only the SHOW CREATE ... assertion:

Path schemaDirectory = getDistributedQueryRunner().getCoordinator().getBaseDataDir().resolve("iceberg_data/tpch");
        assertThat((String) computeScalar("show create materialized view  materialized_view_window"))
                .matches("\\QCREATE MATERIALIZED VIEW " + qualifiedMaterializedViewName + "\n" +
                        "WITH (\n" +
                        "   format = 'ORC',\n" +
                        "   format_version = 2,\n" +
                        "   location = '" + schemaDirectory + "/st_\\E[0-9a-f]+\\Q',\n" +
                        "   partitioning = ARRAY['_date']\n" +
                        ") AS\n" +
                        "SELECT\n" +
                        "  _date\n" +
                        ", sum(_bigint) OVER (PARTITION BY _date ORDER BY _date ASC) sum_ints\n" +
                        "FROM\n" +
                        "  base_table1");

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same in the other test

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your advice, I updated the commit.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can icebergTable.location() be an empty string? when?

Copy link
Copy Markdown
Member Author

@polaris6 polaris6 May 26, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This location is actually the location of the storage table, when the storage table is dropped and a storage table with an empty location with the same name is created, the location will be empty.
Full Screenshot:
image

In addition, does it need to be consistent with spark? Show location regardless of whether location is empty or not.
image

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This location is actually the location of the storage table, when the storage table is dropped and a storage table with an empty location with the same name is created, the location will be empty.

This is not a valid situation we need to care about.
If a storage table is manually replaced with something else, all sorts of problems may occur.
SHOW CREATE MATERIALIZED VIEW may fail in such case.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thus, please remove the condition

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@polaris6 polaris6 force-pushed the show-create-mv-add-properties branch 2 times, most recently from 7029d2c to 45600b4 Compare May 27, 2022 12:25
Copy link
Copy Markdown
Member

@raunaqmorarka raunaqmorarka May 27, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On latest mater we're now also missing ORC_BLOOM_FILTER_COLUMNS and ORC_BLOOM_FILTER_FPP.
I would suggest extracting common code for this from IcebergMetadata#getTableMetadata and re-using it here to avoid the need to update in two places.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your advice!
I have another question, does SHOW CREATE TABLE need to show orc_bloom_filter_columns and orc_bloom_filter_fpp properties? These two properties are specific to the orc format and are not generic, I personally think they should be shown through $properties instead of SHOW CREATE TABLE, of course $properties can already show them now.
https://trino.io/docs/current/connector/iceberg.html#properties-table

Copy link
Copy Markdown
Member

@raunaqmorarka raunaqmorarka May 30, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like the bloom filter properties already don't show up in SHOW CREATE MV if bloom filter is not configured (correct me if I'm wrong). That behaviour seems ok to me.
It would be great if we also include a test in BaseIcebergMaterializedViewTest which uses the orc bloom filter properties.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Currently, SHOW CREATE TABLE shows orc_bloom_filter_columns and orc_bloom_filter_fpp :
image

I will add test cases in BaseIcebergMaterializedViewTest.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@raunaqmorarka I updated the testShowCreate() as I felt no need to add a separate test case to test this situation.

@polaris6 polaris6 force-pushed the show-create-mv-add-properties branch from 45600b4 to 8e444a6 Compare May 28, 2022 08:41
@polaris6 polaris6 force-pushed the show-create-mv-add-properties branch from 8e444a6 to 8cca94b Compare May 30, 2022 10:51
@polaris6
Copy link
Copy Markdown
Member Author

polaris6 commented Jun 1, 2022

Hi @findepi, could you please help me review or merge this PR when you have time? Thank you very much!

@polaris6 polaris6 requested a review from findepi June 6, 2022 12:45
@findepi findepi merged commit 6ec04ff into trinodb:master Jun 6, 2022
@findepi findepi mentioned this pull request Jun 6, 2022
@github-actions github-actions bot added this to the 385 milestone Jun 6, 2022
@polaris6 polaris6 deleted the show-create-mv-add-properties branch June 16, 2022 11:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

Show Iceberg location and format_version in SHOW CREATE MATERIALIZED VIEW

4 participants