Added custom materialization for materialized view #198
Jay-code0 wants to merge 3 commits into starburstdata:master
mdesmet left a comment
Please also add tests. You can have a look in the tests directory and the contribution guide on how to get started with testing.
{%-set existing_type = 'CTE'%}
{% endif %}

{%- if existing_type is not none and config_drop_any_existing == 'False' %}
config flags should use boolean type and not strings.
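A minimal sketch of why string flags are fragile, written in Python rather than Jinja (Jinja uses the same truthiness rules for strings). The helper name `keep_existing` is hypothetical and is not part of this PR. It only mirrors the condition on the reviewed line, assuming the flag arrives as a real boolean.

```python
def keep_existing(existing_type, drop_any_existing):
    """Return True when a relation exists and the drop flag is off.

    drop_any_existing is assumed to be a real boolean taken from the
    model config, not the string 'True' or 'False'.
    """
    return existing_type is not None and not drop_any_existing


# A non-empty string is always truthy, so the string 'False' acts like True.
assert bool("False") is True
# A string comparison only matches one exact spelling.
assert ("false" == "False") is False

print(keep_existing("view", False))    # boolean flag
print(keep_existing("view", "False"))  # string flag gives the opposite answer
```

With a boolean flag the condition needs no string comparison, and a config value such as `false` or `False` in YAML behaves the same way.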
{%- set build_sql = build_materialized_view(target_relation,config_grace_period,config_max_import_duration,config_cron,config_refresh_interval) %}

{% elif existing_type is not none and config_drop_any_existing == 'True' and existing_type != 'table'%} -- theres an issue with starburst thinking that the materialised view is a table in the information schema, hence the logic on this line
{%- set drop_existing_sql = "DROP " ~ existing_type ~ " IF EXISTS " ~ existing_relation %}
Makes me wonder if this should not be more generally applied. For example a MV can be replaced by a table.
This line is something I need to look into further at some point. There is an issue where Starburst reports an existing materialized view as a table, which is why this line excludes tables. Without the exclusion an error occurs because the wrong type is received. The error flow looks like this:
- Finds an existing item
- Existing item is a materialized view, but the schema tables in Starburst have incorrectly marked it as a table
- Decides to run this code to drop a table
- Error occurs as it runs DROP TABLE against a MATERIALIZED VIEW
So this currently ignores table types and displays an error in the console telling the user that a table already exists and must be handled manually (drop, move, etc.). If the item is really a materialized view that was incorrectly typed, it will run fine because of the CREATE OR REPLACE part of the SQL statement further on.
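The drop step described above can be sketched roughly as follows. This is a Python illustration of the Jinja logic, not the macro itself. The function name `build_drop_sql` and the relation name `analytics.orders_mv` are made up for the example.

```python
def build_drop_sql(existing_type, existing_relation):
    """Build the DROP statement for an existing relation, or None to skip it.

    Relations reported as 'table' are skipped, because a materialized view
    may be reported as a table and DROP TABLE would fail against it.
    """
    if existing_type is None or existing_type == "table":
        return None
    return "DROP " + existing_type + " IF EXISTS " + existing_relation


# A view is dropped before the materialized view is created.
print(build_drop_sql("VIEW", "analytics.orders_mv"))
# Anything reported as a table is left alone, so no statement is built.
print(build_drop_sql("table", "analytics.orders_mv"))
```

Skipping the drop for tables trades convenience for safety: a real table is never dropped by accident, but the user has to remove it by hand.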
These are all good cases that should be integration tested.
See the following PR, trinodb/trino#15350, which will improve the tables system table so that it reports the type.
{%- set sqlcode= "CREATE OR REPLACE MATERIALIZED VIEW " ~ target_relation ~ " WITH (
" ~ schedule_to_use ~
"grace_period = \'" ~ config_grace_period ~ "\',
Why are the single quotes escaped? I don't think that is required.
It's so they land correctly in Starburst. With the escape it looks like this:
WITH(
config_item = '1234',
)
Without the single quotes it will produce something like:
WITH(
config_item=1234,
)
I've chosen this format because it works consistently. I noticed issues with some values when the code is run manually on Starburst, for example a value of 5.33h, so I thought it was safer to keep the '' in every query sent.
My point is that within a double-quoted string, I don't think single quotes need to be escaped. We should definitely include the single quotes based on the type of the property.
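Both points can be illustrated with a short Python sketch. As far as I know, Jinja string literals handle `\'` inside double quotes the same way Python does, but treat that as an assumption to verify. The helper `render_property` is hypothetical, and the property values are example data only.

```python
def render_property(name, value):
    """Render one WITH (...) property, quoting only string values."""
    if isinstance(value, bool):
        # bool is checked before int because bool is a subclass of int
        return f"{name} = {str(value).lower()}"
    if isinstance(value, (int, float)):
        return f"{name} = {value}"
    escaped = str(value).replace("'", "''")  # double any embedded single quote
    return f"{name} = '{escaped}'"


# The backslash is redundant inside a double-quoted literal.
assert "grace_period = \'" == "grace_period = '"

print(render_property("grace_period", "5.33h"))        # string value is quoted
print(render_property("max_import_duration", 1234))    # number stays bare
```

Choosing the quoting from the value's type keeps duration strings such as 5.33h quoted while leaving numbers and booleans unquoted.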
@damian3031, @hovaesco, @findinpath: As a general question: should we also do a
Thanks @Jay-code0 for opening this PR.
Yes, just put it under a new config flag.
@Jay-code0 thank you for your contribution. It would be helpful to understand what problem we are trying to solve with this new materialization. As far as I know, a materialized view is supposedly created once and then eventually refreshed, if needed, either explicitly by the user or on the fly while doing a

If the problem that you're tackling is being able to refresh MVs at the end of the dbt flow, shouldn't this be part of a post hook that runs when the dbt transformations have completed successfully? This could be modelled in Airflow / Argo.

Another approach to refreshing materialized views could be dbt hooks: https://docs.getdbt.com/reference/resource-configs/pre-hook-post-hook

In any case, let's think about what kind of problem we are trying to solve, and whether a new materialization is the answer to it, before proceeding.
The use case @Jay-code0 mentioned on dbt Slack is being able to do MV refreshes without having to go through an external orchestration framework. I think in most cases people will want to refresh the MV when they do a

PTAL at https://github.com/dbt-labs/dbt-labs-experimental-features/tree/main/materialized-views for inspiration
d08d755 to 463ea85
Hi @Jay-code0, do you need any help to move this PR forward?
Hi @hovaesco, I can't progress at the moment due to lack of access to the tools on my side. I'd appreciate it if someone could pick this up and develop the tests to move it forward :) (I believe only the tests are left to be done)
We have added support for MV in current master and added you as a co-author. Thanks!
Overview
This new materialization lets users create dbt models that send the relevant command to Starburst to create and schedule a materialized view. The current code handles the common config items. It also requires the model to set a config item for dropping an existing item with the same name if it is not of the same type (view, CTE, table).
Checklist
- README.md updated and added information about my change
- changie new to create a changelog entry