Support materialized view creation, drop and query #15589
highker merged 3 commits into prestodb:master
highker left a comment
Let's have meetings for this PR. There are a lot of details to figure out. Also, we need a lot of tests for this.
presto-hive-metastore/src/main/java/com/facebook/presto/hive/metastore/MetastoreUtil.java
presto-hive/src/main/java/com/facebook/presto/hive/HiveMetadata.java
presto-main/src/main/java/com/facebook/presto/execution/CreateMaterializedViewTask.java
We should make it a generic interface:
alterTable(ConnectorSession session, String databaseName, String tableName, Table newTable)
Then, in this function, we should verify that everything else is the same and update the table params only. If we find that something else is different (e.g., owner, columns), we should fail. Likewise, even for table params, we should only touch PRESTO_DEPENDENT_MATERIALIZED_VIEW_LIST.
Add a Javadoc to indicate that the functionality is very limited.
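A minimal sketch of the limited alterTable check described above, for illustration only. SimpleTable is a hypothetical, simplified stand-in for the metastore Table class, and the string value used for PRESTO_DEPENDENT_MATERIALIZED_VIEW_LIST is an assumption, since the thread does not show it.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical, simplified stand-in for the metastore Table class.
final class SimpleTable
{
    final String owner;
    final List<String> columns;
    final Map<String, String> parameters;

    SimpleTable(String owner, List<String> columns, Map<String, String> parameters)
    {
        this.owner = owner;
        this.columns = columns;
        this.parameters = parameters;
    }
}

final class LimitedAlterTable
{
    // Assumed key value; the real constant is defined in the PR, not in this thread.
    static final String DEPENDENT_MV_LIST = "presto_dependent_materialized_view_list";

    /**
     * Very limited alter: only the dependent-materialized-view parameter may change.
     * Returns the parameters to store, or throws if anything else differs.
     */
    static Map<String, String> validateAlter(SimpleTable oldTable, SimpleTable newTable)
    {
        if (!Objects.equals(oldTable.owner, newTable.owner)) {
            throw new IllegalArgumentException("alterTable cannot change the owner");
        }
        if (!oldTable.columns.equals(newTable.columns)) {
            throw new IllegalArgumentException("alterTable cannot change the columns");
        }
        // Compare all parameters except the one key that is allowed to change.
        Map<String, String> oldOthers = new HashMap<>(oldTable.parameters);
        Map<String, String> newOthers = new HashMap<>(newTable.parameters);
        oldOthers.remove(DEPENDENT_MV_LIST);
        newOthers.remove(DEPENDENT_MV_LIST);
        if (!oldOthers.equals(newOthers)) {
            throw new IllegalArgumentException("alterTable can only update " + DEPENDENT_MV_LIST);
        }
        return new HashMap<>(newTable.parameters);
    }
}
```

Failing fast on any other difference keeps the interface generic in signature while making its narrow behavior explicit, which is what the Javadoc request is meant to document.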
ae2354d to cea67a3
4e9afee to 59d3299
highker left a comment
- We need a lot more real tests. TestHiveDistributedQueries is a good place to add them.
- I'm only halfway through the first commit.
presto-hive-metastore/src/main/java/com/facebook/presto/hive/metastore/MetastoreUtil.java
presto-hive/src/main/java/com/facebook/presto/hive/HiveMetadata.java
presto-spi/src/main/java/com/facebook/presto/spi/ConnectorMaterializedViewDefinition.java
presto-main/src/test/java/com/facebook/presto/execution/TestCreateMaterializedViewTask.java
@gggrace14 can you please rebase these changes?
highker left a comment
Separate out the first commit as a separate PR. We probably don't want to merge the second commit at the moment. We will need DROP, REPLACE, SHOW CREATE, and REFRESH logic first.
presto-hive/src/main/java/com/facebook/presto/hive/HiveMetadata.java
presto-main/src/main/java/com/facebook/presto/sql/analyzer/Analysis.java
presto-hive/src/main/java/com/facebook/presto/hive/HiveClientModule.java
presto-main/src/main/java/com/facebook/presto/execution/CreateMaterializedViewTask.java
presto-main/src/main/java/com/facebook/presto/sql/analyzer/StatementAnalyzer.java
presto-tests/src/main/java/com/facebook/presto/tests/AbstractTestDistributedQueries.java
Oh, forgot to comment. They're used in multiple test cases (and initialized in setUp()), so they might need to be class fields to be accessible from the test cases.
Oh, forgot to reply. They're used by multiple test cases (and initialized in setUp()), so they might need to be class members.
I don't think they are used. Please check the IntelliJ warnings. They can all be local variables rather than class members.
Got what you mean. Revised. Good to see the IntelliJ highlight.
presto-main/src/test/java/com/facebook/presto/execution/TestCreateMaterializedViewTask.java
presto-hive/src/main/java/com/facebook/presto/hive/HiveMetadata.java
presto-main/src/main/java/com/facebook/presto/sql/analyzer/StatementAnalyzer.java
presto-hive/src/test/java/com/facebook/presto/hive/TestHiveIntegrationSmokeTest.java
We need to add a TODO here, right? This is where the partition stitching happens. Without a TODO, it's very easy to miss this part.
Is this correct? What would the column mapping look like?
test_region_nation_join_mv.nationkey = test_customer_base.nationkey
but test_orders_base has orderstatus as the partition key, right?
I'm curious why this passes the assert. It should fail, right? It doesn't look like a legitimate use case.
Suppose that, for the use case below, exchange_rate is a small dimension table partitioned by 'country'. If we include country as an MV partition key, a reload of a country will trigger a refresh of the impacted revenue_aggregation_mv partitions. If we don't include country as an MV partition key, a reload of a country won't affect the past revenue_aggregation_mv data. We need this flexibility to handle different customer needs.
CREATE MATERIALIZED VIEW IF NOT EXISTS revenue_aggregation_mv
WITH (partitioned_by = ARRAY['ds'])
AS SELECT
SUM(l.legal_budget * r.to_usd) AS total_legal_budget_usd, country, ds
FROM revenue_aggregation_imp l JOIN exchange_rate r ON l.country = r.country
GROUP BY country, ds
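One way to picture the flexibility argued for above, as a rough sketch rather than the PR's actual logic: treat a base-table partition column as a refresh trigger only when it is also an MV partition key. The class and method names here are hypothetical, and the assumption that revenue_aggregation_imp is partitioned by ds is not stated in the thread.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class RefreshTriggerSketch
{
    /**
     * Returns the base-table partition columns whose reload would trigger a refresh
     * of MV partitions, assuming only columns that are also MV partition keys are tracked.
     */
    static Set<String> trackedPartitionColumns(List<String> mvPartitionKeys, List<String> basePartitionColumns)
    {
        Set<String> tracked = new LinkedHashSet<>(basePartitionColumns);
        tracked.retainAll(mvPartitionKeys);
        return tracked;
    }
}
```

With partitioned_by = ARRAY['ds'], the exchange_rate column country is not tracked, so reloading a country leaves past MV data untouched. Adding country to partitioned_by would make it a trigger.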
presto-hive/src/test/java/com/facebook/presto/hive/TestHiveIntegrationSmokeTest.java
63a6b44 to 4b48c96
The comment is not addressed. We need assertQueryFails checks to verify that inserting into a materialized view fails and that deleting from a materialized view fails.
Can we add a test where a join query will fail? In particular, add more tests to cover partition column mapping.
This doesn't look right. We shouldn't throw a MISSING_TABLE error. We should tell the user that we cannot alter a materialized view.
Same here: we should throw an error other than MISSING_TABLE.
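A small sketch of the distinction requested above, under stated assumptions: ErrorKind and SketchException are hypothetical placeholders for whichever error codes and exception type the PR ends up using, and the table type is passed as a plain string for brevity.

```java
import java.util.Optional;

final class AlterCheckSketch
{
    // Hypothetical error kinds standing in for the real error codes.
    enum ErrorKind { MISSING_TABLE, NOT_SUPPORTED }

    static final class SketchException extends RuntimeException
    {
        final ErrorKind kind;

        SketchException(ErrorKind kind, String message)
        {
            super(message);
            this.kind = kind;
        }
    }

    // An absent table is "missing"; an existing materialized view is "not supported".
    static void checkCanAlter(String tableName, Optional<String> tableType)
    {
        if (!tableType.isPresent()) {
            throw new SketchException(ErrorKind.MISSING_TABLE, "Table not found: " + tableName);
        }
        if (tableType.get().equals("MATERIALIZED_VIEW")) {
            throw new SketchException(ErrorKind.NOT_SUPPORTED, "Cannot alter materialized view: " + tableName);
        }
    }
}
```

Separating the two cases lets a test assert on the message ("Cannot alter materialized view") instead of a misleading "table not found".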
@@ -172,15 +173,15 @@ public synchronized List<String> getAllDatabases()
     public synchronized void createTable(Table table)
     {
         TableType tableType = TableType.valueOf(table.getTableType());
In this class, along with other implementations at this level, org.apache.hadoop.hive.metastore.TableType is used, while Hive 2.0 doesn't have TableType.MATERIALIZED_VIEW. So I need to move the table type casting logic one level up, to ThriftMetastoreUtil.fromMetastoreApiTable() and toMetastoreApiTable().
Put parentheses around !table.getTableType().equals(MANAGED_TABLE) && !table.getTableType().equals(MATERIALIZED_VIEW)
Add a TODO here to remove the type change after the Hive 3.0 upgrade.
Don't mention Facebook. This is open source; we should be neutral. Add a TODO here to remove the type change after the Hive 3.0 upgrade.
Accidental change? We don't have to modify this.
Here's where we cast managed_table back to materialized_view for getTable().
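The round trip discussed in these comments can be sketched roughly as follows. This is an illustration, not the PR's code: table types are plain strings, and the marker parameter name MV_FLAG is an assumption about how a materialized view stored as a managed table is recognized on read.

```java
import java.util.Map;

final class TableTypeRoundTrip
{
    // Assumed marker parameter; the actual key used by the PR is not shown in this thread.
    static final String MV_FLAG = "presto_materialized_view";

    // On write: the older metastore API has no MATERIALIZED_VIEW type, so store a managed table.
    // TODO: remove this type change after the Hive 3.0 upgrade.
    static String toMetastoreType(String prestoType)
    {
        return "MATERIALIZED_VIEW".equals(prestoType) ? "MANAGED_TABLE" : prestoType;
    }

    // On read: cast a flagged managed table back to a materialized view.
    static String fromMetastoreType(String metastoreType, Map<String, String> parameters)
    {
        boolean flagged = "true".equalsIgnoreCase(parameters.get(MV_FLAG));
        return ("MANAGED_TABLE".equals(metastoreType) && flagged) ? "MATERIALIZED_VIEW" : metastoreType;
    }
}
```

Keeping both directions in one utility means the rest of the code only ever sees MATERIALIZED_VIEW, which should make the later cleanup a local change.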
6eded58 to 8d9b1c1
Add assertUpdate("DROP TABLE test_customer_base"); and the same for test_orders_base. We want to clean up the created tables after the unit test is done.
Same here: drop table test_customer_base_1 on exit.
Add createMaterializedView API to MetadataManager and add CreateMaterializedViewTask that calls the API to perform MV creation. HiveMetadata implementation of the API is to add essential MV parameters and create MV as a standard table.
ConnectorMaterializedViewDefinition is the json structure that contains all essential metadata we will save to Metastore. It will serialize into the viewOriginalText field of table metadata.
Add MV partition. At the moment, we only support partitioned MV defined on partitioned base tables. MV must have one partition directly matched (selected) from base tables.
Add MATERIALIZED_VIEW table type. We don't support alter schema of materialized view at the moment.
We don't support MVs defined across different catalogs, which is enforced in CreateMaterializedViewTask
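The cross-catalog restriction in this commit message could be enforced along these lines. This is only a sketch: the class and method names are hypothetical, and base tables are assumed to be given as fully qualified catalog.schema.table strings rather than the typed names the task would really use.

```java
import java.util.List;

final class CrossCatalogCheck
{
    // Rejects a materialized view whose base tables live in a different catalog.
    static void checkSameCatalog(String viewCatalog, List<String> baseTables)
    {
        for (String name : baseTables) {
            int dot = name.indexOf('.');
            if (dot <= 0) {
                throw new IllegalArgumentException("Expected catalog.schema.table but got: " + name);
            }
            String catalog = name.substring(0, dot);
            if (!catalog.equals(viewCatalog)) {
                throw new IllegalArgumentException(
                        "Materialized view in catalog " + viewCatalog + " cannot reference table " + name);
            }
        }
    }
}
```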
Add dropMaterializedView API to MetadataManager, and add DropMaterializedViewTask that calls the API to perform MV drop. HiveMetadata implementation of the API is to directly drop the MV table.
kaikalur left a comment
Did the language spec get reviewed? I had concerns earlier about VIEW vs MATERIALIZED VIEW. Were those addressed?
Chatted offline with @rongrong once early this year. We are going to model MATERIALIZED VIEW as VIEW with strong consistency. MATERIALIZED VIEW is VIEW with or without backfilled/materialized data. We will have a lot of code in Presto to ensure that. Also, we will have metastore support to guarantee transactions so that base tables and materialized views are always in sync.
Depended on by https://github.com/facebookexternal/presto-facebook/pull/1438