From 95be281f6896343d84eeb9e9f8a1601a1bf69147 Mon Sep 17 00:00:00 2001 From: Will Sargent <109557847+will-sargent-dbtlabs@users.noreply.github.com> Date: Tue, 2 Apr 2024 08:30:27 -0600 Subject: [PATCH 01/12] Update constraints.md --- website/docs/reference/resource-properties/constraints.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/website/docs/reference/resource-properties/constraints.md b/website/docs/reference/resource-properties/constraints.md index 83b5563cadb..4b5f4af51a3 100644 --- a/website/docs/reference/resource-properties/constraints.md +++ b/website/docs/reference/resource-properties/constraints.md @@ -15,6 +15,8 @@ Constraints require the declaration and enforcement of a model [contract](/refer Constraints may be defined for a single column, or at the model level for one or more columns. As a general rule, we recommend defining single-column constraints directly on those columns. +If you are defining multiple `primary_key` constraints for a single model, those MUST be defined at the model level. Defining multiple `primary_key` constraints at the column level is not supported. + The structure of a constraint is: - `type` (required): one of `not_null`, `unique`, `primary_key`, `foreign_key`, `check`, `custom` - `expression`: Free text input to qualify the constraint. Required for certain constraint types, and optional for others. From 5e8bd05ab3a7e33f64168ed813d5809bc4ce115c Mon Sep 17 00:00:00 2001 From: Doug Beatty <44704949+dbeatty10@users.noreply.github.com> Date: Tue, 2 Apr 2024 10:02:20 -0600 Subject: [PATCH 02/12] Documenting custom materializations --- website/docs/faqs/Docs/documenting-macros.md | 23 ++++++++++++++++++++ 1 file changed, 23 insertions(+) diff --git a/website/docs/faqs/Docs/documenting-macros.md b/website/docs/faqs/Docs/documenting-macros.md index 9a2036cd6bf..1710ced0f1a 100644 --- a/website/docs/faqs/Docs/documenting-macros.md +++ b/website/docs/faqs/Docs/documenting-macros.md @@ -27,3 +27,26 @@ macros: ``` + +## Document a custom materialization + +When you create a [custom materialization](/guides/create-new-materializations), dbt creates an associated macro with the following format: +``` +materialization_{materialization_name}_{adapter} +``` + +To document a custom materialization, use the format above to determine the associated macro name(s) to document. + + + +```yaml +version: 2 + +macros: + - name: materialization_my_materialization_name_default + description: A custom materialization to insert records into an append-only table and track when they were added. + - name: materialization_my_materialization_name_xyz + description: A custom materialization to insert records into an append-only table and track when they were added. +``` + + From 26ce6e202fc7d6108bb8dd9ab173620531bd1a32 Mon Sep 17 00:00:00 2001 From: Doug Beatty <44704949+dbeatty10@users.noreply.github.com> Date: Thu, 4 Apr 2024 02:55:13 -0600 Subject: [PATCH 03/12] Fix heading and links for `loaded_at_field` (#5207) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit [Preview 1.6](https://docs-getdbt-com-git-dbeatty10-patch-3-dbt-labs.vercel.app//reference/resource-properties/freshness?version=1.6#definition) [Preview 1.7](https://docs-getdbt-com-git-dbeatty10-patch-3-dbt-labs.vercel.app//reference/resource-properties/freshness?version=1.7#definition) ## What are you changing in this pull request and why? 
The changes in this PR fix the following two issues:

There are two links on the sidebar for `loaded_at_field` (screenshot omitted).

Another issue is that links to the `#loaded_at_field` anchor tag won't always go where you intend, depending on the version the user has selected.

## 🎩

(Before and after screenshots for versions 1.6 and 1.7 omitted.)

## Checklist

- [x] Review the [Content style guide](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/content-style-guide.md) so my content adheres to these guidelines.
- [x] For [docs versioning](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/single-sourcing-content.md#about-versioning), review how to [version a whole page](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/single-sourcing-content.md#adding-a-new-version) and [version a block of content](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/single-sourcing-content.md#versioning-blocks-of-content).
---
 website/docs/reference/resource-properties/freshness.md | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/website/docs/reference/resource-properties/freshness.md b/website/docs/reference/resource-properties/freshness.md
index 4db726e5581..03037e7b681 100644
--- a/website/docs/reference/resource-properties/freshness.md
+++ b/website/docs/reference/resource-properties/freshness.md
@@ -62,8 +62,6 @@ To exclude a source from freshness calculations, you have two options:
- Don't add a `freshness:` block.
- Explicitly set `freshness: null`.

-## loaded_at_field
-(Optional on adapters that support pulling freshness from warehouse metadata tables, required otherwise.)

@@ -75,11 +73,19 @@ Freshness blocks are applied hierarchically:
- A `freshness` and `loaded_at_field` property added to a source _table_ will override any properties applied to the source. This is useful when all of the tables in a source have the same `loaded_at_field`, as is often the case.

+ ## loaded_at_field
+
+
+(Optional on adapters that support pulling freshness from warehouse metadata tables, required otherwise.)
+
+
+
(Required)

A column name (or expression) that returns a timestamp indicating freshness. If using a date field, you may have to cast it to a timestamp: ```yml From b668c7ab235e85ad1f757f898d093046b029da44 Mon Sep 17 00:00:00 2001 From: Mirna Wong <89008547+mirnawong1@users.noreply.github.com> Date: Thu, 4 Apr 2024 09:56:56 +0100 Subject: [PATCH 04/12] Update website/docs/faqs/Docs/documenting-macros.md --- website/docs/faqs/Docs/documenting-macros.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/website/docs/faqs/Docs/documenting-macros.md b/website/docs/faqs/Docs/documenting-macros.md index 1710ced0f1a..2111276baa0 100644 --- a/website/docs/faqs/Docs/documenting-macros.md +++ b/website/docs/faqs/Docs/documenting-macros.md @@ -35,7 +35,7 @@ When you create a [custom materialization](/guides/create-new-materializations), materialization_{materialization_name}_{adapter} ``` -To document a custom materialization, use the format above to determine the associated macro name(s) to document. +To document a custom materialization, use the previously mentioned format to determine the associated macro name(s) to document. From 92be9b8c8c57c01abe81f16e66aaa4a0edf73410 Mon Sep 17 00:00:00 2001 From: Mirna Wong <89008547+mirnawong1@users.noreply.github.com> Date: Thu, 4 Apr 2024 10:28:01 +0100 Subject: [PATCH 05/12] Update website/docs/reference/resource-properties/constraints.md --- website/docs/reference/resource-properties/constraints.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/website/docs/reference/resource-properties/constraints.md b/website/docs/reference/resource-properties/constraints.md index 4b5f4af51a3..939c9dbad0d 100644 --- a/website/docs/reference/resource-properties/constraints.md +++ b/website/docs/reference/resource-properties/constraints.md @@ -15,7 +15,7 @@ Constraints require the declaration and enforcement of a model [contract](/refer Constraints may be defined for a single column, or at the model level for one or more columns. As a general rule, we recommend defining single-column constraints directly on those columns. -If you are defining multiple `primary_key` constraints for a single model, those MUST be defined at the model level. Defining multiple `primary_key` constraints at the column level is not supported. +If you are defining multiple `primary_key` constraints for a single model, those _must_ be defined at the model level. Defining multiple `primary_key` constraints at the column level is not supported. The structure of a constraint is: - `type` (required): one of `not_null`, `unique`, `primary_key`, `foreign_key`, `check`, `custom` From d208eefcd8c520a4301ecb530b6008d0b6f6bdc8 Mon Sep 17 00:00:00 2001 From: winnie <91998347+gwenwindflower@users.noreply.github.com> Date: Thu, 4 Apr 2024 07:45:24 -0500 Subject: [PATCH 06/12] Fix syntax in mf Dimensions page Noticed some more old syntax in the SL/MF docs, this should fix it (I think I've looked at every metricflow/SL page in the past 24 hours so this is hopefully it!) --- website/docs/docs/build/dimensions.md | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/website/docs/docs/build/dimensions.md b/website/docs/docs/build/dimensions.md index 3c4edd9aef0..22ac053bc44 100644 --- a/website/docs/docs/build/dimensions.md +++ b/website/docs/docs/build/dimensions.md @@ -112,10 +112,10 @@ You can use multiple time groups in separate metrics. 
For example, the `users_cr ```bash # dbt Cloud users -dbt sl query --metrics users_created,users_deleted --dimensions metric_time --order metric_time +dbt sl query --metrics users_created,users_deleted --group-by metric_time__year --order-by metric_time__year # dbt Core users -mf query --metrics users_created,users_deleted --dimensions metric_time --order metric_time +mf query --metrics users_created,users_deleted --group-by metric_time__year --order-by metric_time__year ``` @@ -133,10 +133,10 @@ MetricFlow enables metric aggregation during query time. For example, you can ag ```bash # dbt Cloud users -dbt sl query --metrics messages_per_month --dimensions metric_time --order metric_time --time-granularity year +dbt sl query --metrics messages_per_month --group-by metric_time__year --order-by metric_time__year # dbt Core users -mf query --metrics messages_per_month --dimensions metric_time --order metric_time --time-granularity year +mf query --metrics messages_per_month --group-by metric_time__year --order metric_time__year ``` ```yaml @@ -361,10 +361,10 @@ The following command or code represents how to return the count of transactions ```bash # dbt Cloud users -dbt sl query --metrics transactions --dimensions metric_time__month,sales_person__tier --order metric_time__month --order sales_person__tier +dbt sl query --metrics transactions --group-by metric_time__month,sales_person__tier --order-by metric_time__month,sales_person__tier # dbt Core users -mf query --metrics transactions --dimensions metric_time__month,sales_person__tier --order metric_time__month --order sales_person__tier +mf query --metrics transactions --group-by metric_time__month,sales_person__tier --order-by metric_time__month,sales_person__tier ``` From 7d82a9999f7bc4bba56e288dc5ffbd59ce076329 Mon Sep 17 00:00:00 2001 From: Ben Cassell <98852248+benc-db@users.noreply.github.com> Date: Thu, 4 Apr 2024 13:47:59 -0700 Subject: [PATCH 07/12] Update databricks-configs.md to discuss tblproperties. --- .../resource-configs/databricks-configs.md | 53 +++++++++++++------ 1 file changed, 37 insertions(+), 16 deletions(-) diff --git a/website/docs/reference/resource-configs/databricks-configs.md b/website/docs/reference/resource-configs/databricks-configs.md index 908adbc687d..fe513cfec0f 100644 --- a/website/docs/reference/resource-configs/databricks-configs.md +++ b/website/docs/reference/resource-configs/databricks-configs.md @@ -9,30 +9,51 @@ When materializing a model as `table`, you may include several optional configs -| Option | Description | Required? | Example | -|---------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------|----------------| -| file_format | The file format to use when creating tables (`parquet`, `delta`, `hudi`, `csv`, `json`, `text`, `jdbc`, `orc`, `hive` or `libsvm`). | Optional | `delta` | -| location_root | The created table uses the specified directory to store its data. The table alias is appended to it. | Optional | `/mnt/root` | -| partition_by | Partition the created table by the specified columns. A directory is created for each partition. | Optional | `date_day` | -| liquid_clustered_by | Cluster the created table by the specified columns. Clustering method is based on [Delta's Liquid Clustering feature](https://docs.databricks.com/en/delta/clustering.html). 
Available since dbt-databricks 1.6.2. | Optional | `date_day` | -| clustered_by | Each partition in the created table will be split into a fixed number of buckets by the specified columns. | Optional | `country_code` | -| buckets | The number of buckets to create while clustering | Required if `clustered_by` is specified | `8` | +| Option | Description | Required? | Example | +|---------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------|------------------------| +| file_format | The file format to use when creating tables (`parquet`, `delta`, `hudi`, `csv`, `json`, `text`, `jdbc`, `orc`, `hive` or `libsvm`). | Optional | `delta` | +| location_root | The created table uses the specified directory to store its data. The table alias is appended to it. | Optional | `/mnt/root` | +| partition_by | Partition the created table by the specified columns. A directory is created for each partition. | Optional | `date_day` | +| liquid_clustered_by | Cluster the created table by the specified columns. Clustering method is based on [Delta's Liquid Clustering feature](https://docs.databricks.com/en/delta/clustering.html). Available since dbt-databricks 1.6.2. | Optional | `date_day` | +| clustered_by | Each partition in the created table will be split into a fixed number of buckets by the specified columns. | Optional | `country_code` | +| buckets | The number of buckets to create while clustering | Required if `clustered_by` is specified | `8` | +| tblproperties | [Tblproperties](https://docs.databricks.com/en/sql/language-manual/sql-ref-syntax-ddl-tblproperties.html) to be set on the created table | Optional + | {'this.is.my.key': 12} | -| Option | Description | Required? | Model Support | Example | -|---------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------|---------------|----------------| -| file_format | The file format to use when creating tables (`parquet`, `delta`, `hudi`, `csv`, `json`, `text`, `jdbc`, `orc`, `hive` or `libsvm`). | Optional | SQL, Python | `delta` | -| location_root | The created table uses the specified directory to store its data. The table alias is appended to it. | Optional | SQL, Python | `/mnt/root` | -| partition_by | Partition the created table by the specified columns. A directory is created for each partition. | Optional | SQL, Python | `date_day` | -| liquid_clustered_by | Cluster the created table by the specified columns. Clustering method is based on [Delta's Liquid Clustering feature](https://docs.databricks.com/en/delta/clustering.html). Available since dbt-databricks 1.6.2. | Optional | SQL | `date_day` | -| clustered_by | Each partition in the created table will be split into a fixed number of buckets by the specified columns. | Optional | SQL, Python | `country_code` | -| buckets | The number of buckets to create while clustering | Required if `clustered_by` is specified | SQL, Python | `8` | +| Option | Description | Required? 
| Model Support | Example | +|---------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------|---------------|------------------------| +| file_format | The file format to use when creating tables (`parquet`, `delta`, `hudi`, `csv`, `json`, `text`, `jdbc`, `orc`, `hive` or `libsvm`). | Optional | SQL, Python | `delta` | +| location_root | The created table uses the specified directory to store its data. The table alias is appended to it. | Optional | SQL, Python | `/mnt/root` | +| partition_by | Partition the created table by the specified columns. A directory is created for each partition. | Optional | SQL, Python | `date_day` | +| liquid_clustered_by | Cluster the created table by the specified columns. Clustering method is based on [Delta's Liquid Clustering feature](https://docs.databricks.com/en/delta/clustering.html). Available since dbt-databricks 1.6.2. | Optional | SQL | `date_day` | +| clustered_by | Each partition in the created table will be split into a fixed number of buckets by the specified columns. | Optional | SQL, Python | `country_code` | +| buckets | The number of buckets to create while clustering | Required if `clustered_by` is specified | SQL, Python | `8` | +| tblproperties | [Tblproperties](https://docs.databricks.com/en/sql/language-manual/sql-ref-syntax-ddl-tblproperties.html) to be set on the created table | Optional + | SQL | {'this.is.my.key': 12} | + + + + +| Option | Description | Required? | Model Support | Example | +|---------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------|---------------|------------------------| +| file_format | The file format to use when creating tables (`parquet`, `delta`, `hudi`, `csv`, `json`, `text`, `jdbc`, `orc`, `hive` or `libsvm`). | Optional | SQL, Python | `delta` | +| location_root | The created table uses the specified directory to store its data. The table alias is appended to it. | Optional | SQL, Python | `/mnt/root` | +| partition_by | Partition the created table by the specified columns. A directory is created for each partition. | Optional | SQL, Python | `date_day` | +| liquid_clustered_by | Cluster the created table by the specified columns. Clustering method is based on [Delta's Liquid Clustering feature](https://docs.databricks.com/en/delta/clustering.html). Available since dbt-databricks 1.6.2. | Optional | SQL | `date_day` | +| clustered_by | Each partition in the created table will be split into a fixed number of buckets by the specified columns. | Optional | SQL, Python | `country_code` | +| buckets | The number of buckets to create while clustering | Required if `clustered_by` is specified | SQL, Python | `8` | +| tblproperties | [Tblproperties](https://docs.databricks.com/en/sql/language-manual/sql-ref-syntax-ddl-tblproperties.html) to be set on the created table | Optional + | SQL, Python* | {'this.is.my.key': 12} | + +\* - Beginning in 1.7.12, we have added tblproperties to Python models via an alter statement that runs after table creation. 
We do not yet have a PySpark API to set tblproperties at table creation, so this feature is primarily to allow users to annotate their Python-derived tables with tblproperties.

## Incremental models

From 6ca2831952c48f5001c6c9d9fc7d5f9ebf686f5a Mon Sep 17 00:00:00 2001
From: Ben Cassell <98852248+benc-db@users.noreply.github.com>
Date: Thu, 4 Apr 2024 13:51:57 -0700
Subject: [PATCH 08/12] Update databricks-configs.md - formatting

---
 .../resource-configs/databricks-configs.md | 58 +++++++++----------
 1 file changed, 28 insertions(+), 30 deletions(-)

diff --git a/website/docs/reference/resource-configs/databricks-configs.md b/website/docs/reference/resource-configs/databricks-configs.md
index fe513cfec0f..470b5f302f8 100644
--- a/website/docs/reference/resource-configs/databricks-configs.md
+++ b/website/docs/reference/resource-configs/databricks-configs.md
@@ -9,48 +9,46 @@ When materializing a model as `table`, you may include several optional configs

| Option | Description | Required? | Example |
|---------------------|------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------|--------------------------|
| file_format | The file format to use when creating tables (`parquet`, `delta`, `hudi`, `csv`, `json`, `text`, `jdbc`, `orc`, `hive` or `libsvm`). | Optional | `delta` |
| location_root | The created table uses the specified directory to store its data. The table alias is appended to it. | Optional | `/mnt/root` |
| partition_by | Partition the created table by the specified columns. A directory is created for each partition. | Optional | `date_day` |
| liquid_clustered_by | Cluster the created table by the specified columns. Clustering method is based on [Delta's Liquid Clustering feature](https://docs.databricks.com/en/delta/clustering.html). 
Available since dbt-databricks 1.6.2. | Optional | `date_day` | +| clustered_by | Each partition in the created table will be split into a fixed number of buckets by the specified columns. | Optional | `country_code` | +| buckets | The number of buckets to create while clustering | Required if `clustered_by` is specified | `8` | +| tblproperties | [Tblproperties](https://docs.databricks.com/en/sql/language-manual/sql-ref-syntax-ddl-tblproperties.html) to be set on the created table | Optional + | \{'this.is.my.key': 12\} | -| Option | Description | Required? | Model Support | Example | -|---------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------|---------------|------------------------| -| file_format | The file format to use when creating tables (`parquet`, `delta`, `hudi`, `csv`, `json`, `text`, `jdbc`, `orc`, `hive` or `libsvm`). | Optional | SQL, Python | `delta` | -| location_root | The created table uses the specified directory to store its data. The table alias is appended to it. | Optional | SQL, Python | `/mnt/root` | -| partition_by | Partition the created table by the specified columns. A directory is created for each partition. | Optional | SQL, Python | `date_day` | -| liquid_clustered_by | Cluster the created table by the specified columns. Clustering method is based on [Delta's Liquid Clustering feature](https://docs.databricks.com/en/delta/clustering.html). Available since dbt-databricks 1.6.2. | Optional | SQL | `date_day` | -| clustered_by | Each partition in the created table will be split into a fixed number of buckets by the specified columns. | Optional | SQL, Python | `country_code` | -| buckets | The number of buckets to create while clustering | Required if `clustered_by` is specified | SQL, Python | `8` | -| tblproperties | [Tblproperties](https://docs.databricks.com/en/sql/language-manual/sql-ref-syntax-ddl-tblproperties.html) to be set on the created table | Optional - | SQL | {'this.is.my.key': 12} | +| Option | Description | Required? | Model Support | Example | +|---------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------|---------------|--------------------------| +| file_format | The file format to use when creating tables (`parquet`, `delta`, `hudi`, `csv`, `json`, `text`, `jdbc`, `orc`, `hive` or `libsvm`). | Optional | SQL, Python | `delta` | +| location_root | The created table uses the specified directory to store its data. The table alias is appended to it. | Optional | SQL, Python | `/mnt/root` | +| partition_by | Partition the created table by the specified columns. A directory is created for each partition. | Optional | SQL, Python | `date_day` | +| liquid_clustered_by | Cluster the created table by the specified columns. Clustering method is based on [Delta's Liquid Clustering feature](https://docs.databricks.com/en/delta/clustering.html). Available since dbt-databricks 1.6.2. | Optional | SQL | `date_day` | +| clustered_by | Each partition in the created table will be split into a fixed number of buckets by the specified columns. 
| Optional | SQL, Python | `country_code` | +| buckets | The number of buckets to create while clustering | Required if `clustered_by` is specified | SQL, Python | `8` | +| tblproperties | [Tblproperties](https://docs.databricks.com/en/sql/language-manual/sql-ref-syntax-ddl-tblproperties.html) to be set on the created table | Optional | SQL | \{'this.is.my.key': 12\} | -| Option | Description | Required? | Model Support | Example | -|---------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------|---------------|------------------------| -| file_format | The file format to use when creating tables (`parquet`, `delta`, `hudi`, `csv`, `json`, `text`, `jdbc`, `orc`, `hive` or `libsvm`). | Optional | SQL, Python | `delta` | -| location_root | The created table uses the specified directory to store its data. The table alias is appended to it. | Optional | SQL, Python | `/mnt/root` | -| partition_by | Partition the created table by the specified columns. A directory is created for each partition. | Optional | SQL, Python | `date_day` | -| liquid_clustered_by | Cluster the created table by the specified columns. Clustering method is based on [Delta's Liquid Clustering feature](https://docs.databricks.com/en/delta/clustering.html). Available since dbt-databricks 1.6.2. | Optional | SQL | `date_day` | -| clustered_by | Each partition in the created table will be split into a fixed number of buckets by the specified columns. | Optional | SQL, Python | `country_code` | -| buckets | The number of buckets to create while clustering | Required if `clustered_by` is specified | SQL, Python | `8` | -| tblproperties | [Tblproperties](https://docs.databricks.com/en/sql/language-manual/sql-ref-syntax-ddl-tblproperties.html) to be set on the created table | Optional - | SQL, Python* | {'this.is.my.key': 12} | +| Option | Description | Required? | Model Support | Example | +|---------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------|---------------|--------------------------| +| file_format | The file format to use when creating tables (`parquet`, `delta`, `hudi`, `csv`, `json`, `text`, `jdbc`, `orc`, `hive` or `libsvm`). | Optional | SQL, Python | `delta` | +| location_root | The created table uses the specified directory to store its data. The table alias is appended to it. | Optional | SQL, Python | `/mnt/root` | +| partition_by | Partition the created table by the specified columns. A directory is created for each partition. | Optional | SQL, Python | `date_day` | +| liquid_clustered_by | Cluster the created table by the specified columns. Clustering method is based on [Delta's Liquid Clustering feature](https://docs.databricks.com/en/delta/clustering.html). Available since dbt-databricks 1.6.2. | Optional | SQL | `date_day` | +| clustered_by | Each partition in the created table will be split into a fixed number of buckets by the specified columns. 
| Optional | SQL, Python | `country_code` |
| buckets | The number of buckets to create while clustering | Required if `clustered_by` is specified | SQL, Python | `8` |
| tblproperties | [Tblproperties](https://docs.databricks.com/en/sql/language-manual/sql-ref-syntax-ddl-tblproperties.html) to be set on the created table | Optional | SQL, Python* | \{'this.is.my.key': 12\} |

\* Beginning in 1.7.12, we have added tblproperties to Python models via an alter statement that runs after table creation.
We do not yet have a PySpark API to set tblproperties at table creation, so this feature is primarily to allow users to annotate their Python-derived tables with tblproperties.

From 5063c1798c567ca0588361c46e437d771e3b216b Mon Sep 17 00:00:00 2001
From: Ben Cassell <98852248+benc-db@users.noreply.github.com>
Date: Thu, 4 Apr 2024 13:58:49 -0700
Subject: [PATCH 09/12] Update databricks-configs.md - more formatting

---
 .../resource-configs/databricks-configs.md | 24 +++++++++----------
 1 file changed, 11 insertions(+), 13 deletions(-)

diff --git a/website/docs/reference/resource-configs/databricks-configs.md b/website/docs/reference/resource-configs/databricks-configs.md
index 470b5f302f8..935963b26cb 100644
--- a/website/docs/reference/resource-configs/databricks-configs.md
+++ b/website/docs/reference/resource-configs/databricks-configs.md
@@ -9,16 +9,14 @@ When materializing a model as `table`, you may include several optional configs

| Option | Description | Required? | Example |
|---------------------|------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------|--------------------------|
| file_format | The file format to use when creating tables (`parquet`, `delta`, `hudi`, `csv`, `json`, `text`, `jdbc`, `orc`, `hive` or `libsvm`). | Optional | `delta` |
| location_root | The created table uses the specified directory to store its data. 
The table alias is appended to it. | Optional | `/mnt/root` | +| partition_by | Partition the created table by the specified columns. A directory is created for each partition. | Optional | `date_day` | +| clustered_by | Each partition in the created table will be split into a fixed number of buckets by the specified columns. | Optional | `country_code` | +| buckets | The number of buckets to create while clustering | Required if `clustered_by` is specified | `8` | +| tblproperties | [Tblproperties](https://docs.databricks.com/en/sql/language-manual/sql-ref-syntax-ddl-tblproperties.html) to be set on the created table | Optional | `{'this.is.my.key': 12}` | @@ -33,7 +31,7 @@ When materializing a model as `table`, you may include several optional configs | liquid_clustered_by | Cluster the created table by the specified columns. Clustering method is based on [Delta's Liquid Clustering feature](https://docs.databricks.com/en/delta/clustering.html). Available since dbt-databricks 1.6.2. | Optional | SQL | `date_day` | | clustered_by | Each partition in the created table will be split into a fixed number of buckets by the specified columns. | Optional | SQL, Python | `country_code` | | buckets | The number of buckets to create while clustering | Required if `clustered_by` is specified | SQL, Python | `8` | -| tblproperties | [Tblproperties](https://docs.databricks.com/en/sql/language-manual/sql-ref-syntax-ddl-tblproperties.html) to be set on the created table | Optional | SQL | \{'this.is.my.key': 12\} | +| tblproperties | [Tblproperties](https://docs.databricks.com/en/sql/language-manual/sql-ref-syntax-ddl-tblproperties.html) to be set on the created table | Optional | SQL | `{'this.is.my.key': 12}` | @@ -48,9 +46,9 @@ When materializing a model as `table`, you may include several optional configs | liquid_clustered_by | Cluster the created table by the specified columns. Clustering method is based on [Delta's Liquid Clustering feature](https://docs.databricks.com/en/delta/clustering.html). Available since dbt-databricks 1.6.2. | Optional | SQL | `date_day` | | clustered_by | Each partition in the created table will be split into a fixed number of buckets by the specified columns. | Optional | SQL, Python | `country_code` | | buckets | The number of buckets to create while clustering | Required if `clustered_by` is specified | SQL, Python | `8` | -| tblproperties | [Tblproperties](https://docs.databricks.com/en/sql/language-manual/sql-ref-syntax-ddl-tblproperties.html) to be set on the created table | Optional | SQL, Python* | \{'this.is.my.key': 12\} | +| tblproperties | [Tblproperties](https://docs.databricks.com/en/sql/language-manual/sql-ref-syntax-ddl-tblproperties.html) to be set on the created table | Optional | SQL, Python* | `{'this.is.my.key': 12}` | -\* - Beginning in 1.7.12, we have added tblproperties to Python models via an alter statement that runs after table creation. +\* Beginning in 1.7.12, we have added tblproperties to Python models via an alter statement that runs after table creation. We do not yet have a PySpark API to set tblproperties at table creation, so this feature is primarily to allow users to anotate their python-derived tables with tblproperties. From af9efda482d4b9be9b3401fa73b9311b29e326bc Mon Sep 17 00:00:00 2001 From: ialdg <39755524+ialdg@users.noreply.github.com> Date: Fri, 5 Apr 2024 08:47:15 +0200 Subject: [PATCH 10/12] Update target_database.md Hi. This modification proposal is intended to correct a typo. Regards. IL. 
--- website/docs/reference/resource-configs/target_database.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/website/docs/reference/resource-configs/target_database.md b/website/docs/reference/resource-configs/target_database.md index 5f65fa79bad..07837824f33 100644 --- a/website/docs/reference/resource-configs/target_database.md +++ b/website/docs/reference/resource-configs/target_database.md @@ -76,7 +76,7 @@ snapshots: Leverage the [`generate_database_name` macro](/docs/build/custom-databases) to build snapshots in databases that follow the same naming behavior as your models. Notes: -* This macro is not available when configuring from the `dbt_project.yml` file, so must be configured in a snapshot config block. +* This macro is not available when configuring from the `dbt_project.yml` file, so it must be configured in a snapshot config block. * Consider whether this use-case is right for you, as downstream `refs` will select from the `dev` version of a snapshot, which can make it hard to validate models that depend on snapshots. From 0a1f7de1c3fb397530ea1d7152580ca86041324b Mon Sep 17 00:00:00 2001 From: Mirna Wong <89008547+mirnawong1@users.noreply.github.com> Date: Fri, 5 Apr 2024 15:04:44 +0100 Subject: [PATCH 11/12] Update access-gdrive-credential.md the solution provided isn't correct. this pr updates the solution to the correct code. raised by [user feedback](https://dbt-labs.slack.com/archives/C02NCQ9483C/p1712325416338229?thread_ts=1711637426.267879&cid=C02NCQ9483C) --- website/docs/faqs/Troubleshooting/access-gdrive-credential.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/website/docs/faqs/Troubleshooting/access-gdrive-credential.md b/website/docs/faqs/Troubleshooting/access-gdrive-credential.md index 64799291ee2..6ed65f92bfb 100644 --- a/website/docs/faqs/Troubleshooting/access-gdrive-credential.md +++ b/website/docs/faqs/Troubleshooting/access-gdrive-credential.md @@ -12,12 +12,12 @@ If you're seeing the below error when you try to query a dataset from a Google D Access denied: BigQuery BigQuery: Permission denied while getting Drive credentials ``` -Usually this errors indicates that you haven't granted the BigQuery service account access to the specific Google Drive document. If you're seeing this error, try giving the service account (client email seen [here](https://docs.getdbt.com/docs/dbt-cloud/cloud-configuring-dbt-cloud/connecting-your-database#connecting-to-bigquery)) you are using for your BigQuery connection in dbt Cloud, permission to your Google Drive or Google Sheet. You'll want to do this directly in your Google Document and click the 'share' button and enter the client email there. +Usually, this error indicates that you haven't granted the BigQuery service account access to the specific Google Drive document. If you're seeing this error, try giving the service account (client email seen [here](https://docs.getdbt.com/docs/dbt-cloud/cloud-configuring-dbt-cloud/connecting-your-database#connecting-to-bigquery)) you are using for your BigQuery connection in dbt Cloud, permission to your Google Drive or Google Sheet. You'll want to do this directly in your Google Document and click the 'share' button and enter the client email there. 
If you are experiencing this error when using OAuth, and you have verified your access to the Google Sheet, you may need to grant permissions for gcloud to access Google Drive:

```
-gcloud auth application-default login --scopes=openid,https://www.googleapis.com/auth/userinfo.email,https://www.googleapis.com/auth/cloud-platform,https://www.googleapis.com/auth/sqlservice.login,https://www.googleapis.com/auth/drive
+gcloud auth application-default login --disable-quota-project
```

For more info see the [gcloud auth application-default documentation](https://cloud.google.com/sdk/gcloud/reference/auth/application-default/login)

From be6d583c865cf10b54006418a397faf6ae2640c8 Mon Sep 17 00:00:00 2001
From: Isabela Sobral <35778239+belasobral93@users.noreply.github.com>
Date: Fri, 5 Apr 2024 09:22:36 -0700
Subject: [PATCH 12/12] Update saved-queries.md

added guide to prepend semantic model name when using dimension object
---
 website/docs/docs/build/saved-queries.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/website/docs/docs/build/saved-queries.md b/website/docs/docs/build/saved-queries.md
index 90a8fbc467c..fdedbbb7f8f 100644
--- a/website/docs/docs/build/saved-queries.md
+++ b/website/docs/docs/build/saved-queries.md
@@ -131,6 +131,7 @@ To define a saved query, refer to the following parameters:
 
 All metrics in a saved query need to use the same dimensions in the `group_by` or `where` clauses.
+When using the `Dimension` object, prepend the semantic model name, for example, `Dimension('user__ds')`.
 
 ## Related docs
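---

To make the one-line rule added in [PATCH 12/12] concrete, here is a minimal sketch of a saved query that follows it. The saved-query name, the metric name (`users_active`), and the date filter are hypothetical illustrations; only the `Dimension('user__ds')` reference comes from the patch itself:

```yaml
saved_queries:
  - name: active_users_per_day          # hypothetical name
    description: Daily active users, for illustration only.
    query_params:
      metrics:
        - users_active                  # hypothetical metric name
      group_by:
        # Prepend the semantic model name (`user`) to the dimension (`ds`):
        - "Dimension('user__ds')"
      where:
        # The same semantic_model__dimension prefix applies in filters:
        - "{{ Dimension('user__ds') }} >= '2024-01-01'"
```

Per the same section of `saved-queries.md`, every metric listed under `metrics` would need to use the dimensions referenced in `group_by` and `where`.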