From c097396b33ed921f48c88fa23f87f212065f0e9f Mon Sep 17 00:00:00 2001 From: Jeremy Cohen Date: Tue, 7 May 2024 00:14:21 +0200 Subject: [PATCH 1/5] Clarifications for Staging envs, 1:1 projects for Mesh --- .../best-practices/how-we-mesh/mesh-4-faqs.md | 8 ++++++ .../govern/project-dependencies.md | 10 +++---- .../docs/docs/deploy/deploy-environments.md | 28 +++++++++++++++++-- 3 files changed, 37 insertions(+), 9 deletions(-) diff --git a/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md b/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md index 9889ebb9f69..948a0decd34 100644 --- a/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md +++ b/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md @@ -251,6 +251,14 @@ If you’re interested in beta access to “Staging” environments, let your db + + +The short answer is "no." Cross-project references assume that each project `name` is unique in your dbt Cloud account. + +There are historical limitations which required customers to "duplicate" projects, so that one actual dbt project (codebase) would map to more than one dbt Cloud project. To that end, we are working to remove the historical limitations that required customers to "duplicate" projects in dbt Cloud — Staging environments for data isolation (beta), environment-level permissions, and environment-level data warehouse connections (coming soon). Once those pieces are in place, it should no longer be necessary to define separate dbt Cloud projects simply to isolate data environments or permissions. + + + ## Compatibility with other features diff --git a/website/docs/docs/collaborate/govern/project-dependencies.md b/website/docs/docs/collaborate/govern/project-dependencies.md index 8265e953839..8ad35b20eff 100644 --- a/website/docs/docs/collaborate/govern/project-dependencies.md +++ b/website/docs/docs/collaborate/govern/project-dependencies.md @@ -32,12 +32,10 @@ Refer to the [FAQs](#faqs) for more info. 
## Prerequisites In order to add project dependencies and resolve cross-project `ref`, you must: -- Use dbt v1.6 or higher for **both** the upstream ("producer") project and the downstream ("consumer") project. -- Define models in an upstream ("producer") project that are configured with [`access: public`](/reference/resource-configs/access). To apply the change, rerun a production job. -- Have a deployment environment in the upstream ("producer") project [that is set to be your production environment](/docs/deploy/deploy-environments#set-as-production-environment) -- Have a successful run of the upstream ("producer") project. -- Define and trigger a job before marking the environment as Staging. Read more about [Staging environments with downstream dependencies](/docs/collaborate/govern/project-dependencies#staging-with-downstream-dependencies). -- Have a multi-tenant or single-tenant [dbt Cloud Enterprise](https://www.getdbt.com/pricing) account (Azure ST is not supported but coming soon.) +- Use a supported version of dbt (v1.6, v1.7, or "Keep on latest version") for both the upstream ("producer") project and the downstream ("consumer") project. +- Define models in an upstream ("producer") project that are configured with [`access: public`](/reference/resource-configs/access). (You need at least one successful job run after defining their `access`.) +- Define a deployment environment in the upstream ("producer") project [that is set to be your Production environment](/docs/deploy/deploy-environments#set-as-production-environment), and ensure it has at least one successful job run in that environment. +- Each project `name` must be unique in your dbt Cloud account. For example, if you have a dbt project (codebase) for the `jaffle_marketing` team, you should not create separate projects for `Jaffle Marketing - Dev` and `Jaffle Marketing - Prod`. That isolation should instead be handled at the environment level. 
To that end, we are working to add support for environment-level permissions and data warehouse connections; reach out to your dbt Labs account team for beta access in May/June 2024. ## Example diff --git a/website/docs/docs/deploy/deploy-environments.md b/website/docs/docs/deploy/deploy-environments.md index 42f34740164..0057c0b07b8 100644 --- a/website/docs/docs/deploy/deploy-environments.md +++ b/website/docs/docs/deploy/deploy-environments.md @@ -47,10 +47,16 @@ For Semantic Layer-eligible customers, the next section of environment settings Currently in limited availability beta. Contact support or your account team if you're interested in beta access. ::: -Use a Staging environment to grant developers access to deployment workflows and tools while controlling access to production data. You can do this in a couple of ways, but the most straightforward is to configure Staging with a long-living branch (for example, `staging`) similar to but separate from the primary branch (for example, `main`). +Use a Staging environment to grant developers access to deployment workflows and tools while controlling access to production data. Staging environments enable you to achieve more granular control over permissions, data warehouse connections, and data isolation — within the purview of a single project in dbt Cloud. + +### Git workflow + +You can do this in a couple of ways, but the most straightforward is to configure Staging with a long-living branch (for example, `staging`) similar to but separate from the primary branch (for example, `main`). In this scenario, the workflows would ideally move upstream from the Development environment -> Staging environment -> Production environment with developer branches feeding into the `staging` branch, then ultimately merging into `main`. In many cases, the `main` and `staging` branches will be identical after a merge and remain until the next batch of changes from the `development` branches are ready to be elevated. 
We recommend setting branch protection rules on `staging` similar to `main`. +Some customers prefer to connect Development and Staging to their `main` branch, and then cutting release branches on a regular cadence (daily or weekly) which feed into Production. + ### Why use a staging environment There are two primary motivations for using a Staging environment: @@ -61,9 +67,25 @@ There are two primary motivations for using a Staging environment: Provide developers with the ability to create, edit, and trigger ad hoc jobs in the Staging environment, while keeping the Production environment locked down. ::: -Let's say you have `Project B` downstream of `Project A` with cross-project refs configured in the models. When developers work in the IDE for `Project B`, cross-project refs will resolve to the Staging environment of `Project A`, rather than production. You'll get the same results with those refs when jobs are run in the Staging environment. Only the Production environment will reference the Production data, keeping the data and access isolated without needing separate projects. +**Conditional configuration of sources** enables you to point to "prod" or "non-prod" source data, depending on the environment you're running in. For example, this source will point to `<DATABASE>.sensitive_source.table_with_pii`, where `<DATABASE>` is dynamically resolved based on an environment variable. + + + +```yaml +sources: + - name: sensitive_source + database: "{{ env_var('SENSITIVE_SOURCE_DATABASE') }}" + tables: + - name: table_with_pii +``` + + + +There is exactly one source (`sensitive_source`), and all downstream dbt models select from it as `{{ source('sensitive_source', 'table_with_pii') }}`. The code in your project and the shape of the DAG remain consistent across environments. By setting it up in this way, rather than duplicating sources, you get some important benefits. 
+ +**Cross-project references in dbt Mesh:** Let's say you have `Project B` downstream of `Project A` with cross-project refs configured in the models. When developers work in the IDE for `Project B`, cross-project refs will resolve to the Staging environment of `Project A`, rather than production. You'll get the same results with those refs when jobs are run in the Staging environment. Only the Production environment will reference the Production data, keeping the data and access isolated without needing separate projects. -If `Project B` also has a Staging deployment, then references to unbuilt upstream models within `Project B` will resolve to that environment, using [deferral](/docs/cloud/about-cloud-develop-defer), rather than resolving to the models in Production. This saves developers time and warehouse spend, while preserving clear separation of environments. +**Faster development, enabled by deferral:** If `Project B` also has a Staging deployment, then references to unbuilt upstream models within `Project B` will resolve to that environment, using [deferral](/docs/cloud/about-cloud-develop-defer), rather than resolving to the models in Production. This saves developers time and warehouse spend, while preserving clear separation of environments. Finally, the Staging environment has its own view in [dbt Explorer](/docs/collaborate/explore-projects), giving you a full view of your prod and pre-prod data. 
From 101d61d57992e3125c591955f068d0c3b38a6d71 Mon Sep 17 00:00:00 2001 From: Jeremy Cohen Date: Mon, 13 May 2024 14:29:49 +0200 Subject: [PATCH 2/5] Fix build error --- website/docs/docs/deploy/deploy-environments.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/website/docs/docs/deploy/deploy-environments.md b/website/docs/docs/deploy/deploy-environments.md index 0057c0b07b8..fd74861391a 100644 --- a/website/docs/docs/deploy/deploy-environments.md +++ b/website/docs/docs/deploy/deploy-environments.md @@ -69,7 +69,7 @@ Provide developers with the ability to create, edit, and trigger ad hoc jobs in **Conditional configuration of sources** enables you to point to "prod" or "non-prod" source data, depending on the environment you're running in. For example, this source will point to `<DATABASE>.sensitive_source.table_with_pii`, where `<DATABASE>` is dynamically resolved based on an environment variable. - + ```yaml sources: From e36567535fa802085e6e4155f44d4d9434619f9e Mon Sep 17 00:00:00 2001 From: Matt Shaver <60105315+matthewshaver@users.noreply.github.com> Date: Mon, 13 May 2024 14:58:27 -0400 Subject: [PATCH 3/5] Apply suggestions from code review --- website/docs/best-practices/how-we-mesh/mesh-4-faqs.md | 4 ++-- website/docs/docs/collaborate/govern/project-dependencies.md | 2 +- website/docs/docs/deploy/deploy-environments.md | 4 ++-- 3 files changed, 5 insertions(+), 5 deletions(-) diff --git a/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md b/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md index 948a0decd34..476eff1ef90 100644 --- a/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md +++ b/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md @@ -253,9 +253,9 @@ If you’re interested in beta access to “Staging” environments, let your db -The short answer is "no." Cross-project references assume that each project `name` is unique in your dbt Cloud account. +The short answer is "no." 
Cross-project references require that each project `name` be unique in your dbt Cloud account. -There are historical limitations which required customers to "duplicate" projects, so that one actual dbt project (codebase) would map to more than one dbt Cloud project. To that end, we are working to remove the historical limitations that required customers to "duplicate" projects in dbt Cloud — Staging environments for data isolation (beta), environment-level permissions, and environment-level data warehouse connections (coming soon). Once those pieces are in place, it should no longer be necessary to define separate dbt Cloud projects simply to isolate data environments or permissions. +Historical limitations required customers to "duplicate" projects so that one actual dbt project (codebase) would map to more than one dbt Cloud project. To that end, we are working to remove the historical limitations that required customers to "duplicate" projects in dbt Cloud — Staging environments for data isolation (beta), environment-level permissions, and environment-level data warehouse connections (coming soon). Once those pieces are in place, it should no longer be necessary to define separate dbt Cloud projects to isolate data environments or permissions. diff --git a/website/docs/docs/collaborate/govern/project-dependencies.md b/website/docs/docs/collaborate/govern/project-dependencies.md index 8ad35b20eff..d9d91025fd7 100644 --- a/website/docs/docs/collaborate/govern/project-dependencies.md +++ b/website/docs/docs/collaborate/govern/project-dependencies.md @@ -35,7 +35,7 @@ In order to add project dependencies and resolve cross-project `ref`, you must: - Use a supported version of dbt (v1.6, v1.7, or "Keep on latest version") for both the upstream ("producer") project and the downstream ("consumer") project. - Define models in an upstream ("producer") project that are configured with [`access: public`](/reference/resource-configs/access). 
(You need at least one successful job run after defining their `access`.) - Define a deployment environment in the upstream ("producer") project [that is set to be your Production environment](/docs/deploy/deploy-environments#set-as-production-environment), and ensure it has at least one successful job run in that environment. -- Each project `name` must be unique in your dbt Cloud account. For example, if you have a dbt project (codebase) for the `jaffle_marketing` team, you should not create separate projects for `Jaffle Marketing - Dev` and `Jaffle Marketing - Prod`. That isolation should instead be handled at the environment level. To that end, we are working to add support for environment-level permissions and data warehouse connections; reach out to your dbt Labs account team for beta access in May/June 2024. +- Each project `name` must be unique in your dbt Cloud account. For example, if you have a dbt project (codebase) for the `jaffle_marketing` team, you should not create separate projects for `Jaffle Marketing - Dev` and `Jaffle Marketing - Prod`. That isolation should instead be handled at the environment level. To that end, we are working on adding support for environment-level permissions and data warehouse connections; reach out to your dbt Labs account team for beta access in May/June 2024. ## Example diff --git a/website/docs/docs/deploy/deploy-environments.md b/website/docs/docs/deploy/deploy-environments.md index fd74861391a..b96d61abae5 100644 --- a/website/docs/docs/deploy/deploy-environments.md +++ b/website/docs/docs/deploy/deploy-environments.md @@ -51,11 +51,11 @@ Use a Staging environment to grant developers access to deployment workflows and ### Git workflow -You can do this in a couple of ways, but the most straightforward is to configure Staging with a long-living branch (for example, `staging`) similar to but separate from the primary branch (for example, `main`). 
+You can approach this in a couple of ways, but the most straightforward is configuring Staging with a long-living branch (for example, `staging`) similar to but separate from the primary branch (for example, `main`). In this scenario, the workflows would ideally move upstream from the Development environment -> Staging environment -> Production environment with developer branches feeding into the `staging` branch, then ultimately merging into `main`. In many cases, the `main` and `staging` branches will be identical after a merge and remain until the next batch of changes from the `development` branches are ready to be elevated. We recommend setting branch protection rules on `staging` similar to `main`. -Some customers prefer to connect Development and Staging to their `main` branch, and then cutting release branches on a regular cadence (daily or weekly) which feed into Production. +Some customers prefer to connect Development and Staging to their `main` branch and then cut release branches on a regular cadence (daily or weekly), which feeds into Production. ### Why use a staging environment From d7b6883253585c5e5ed34041a5b3c258974b6ee3 Mon Sep 17 00:00:00 2001 From: Matt Shaver <60105315+matthewshaver@users.noreply.github.com> Date: Mon, 13 May 2024 14:58:53 -0400 Subject: [PATCH 4/5] Update website/docs/docs/deploy/deploy-environments.md --- website/docs/docs/deploy/deploy-environments.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/website/docs/docs/deploy/deploy-environments.md b/website/docs/docs/deploy/deploy-environments.md index b96d61abae5..7d571dd4b9c 100644 --- a/website/docs/docs/deploy/deploy-environments.md +++ b/website/docs/docs/deploy/deploy-environments.md @@ -85,7 +85,7 @@ There is exactly one source (`sensitive_source`), and all downstream dbt models **Cross-project references in dbt Mesh:** Let's say you have `Project B` downstream of `Project A` with cross-project refs configured in the models. 
When developers work in the IDE for `Project B`, cross-project refs will resolve to the Staging environment of `Project A`, rather than production. You'll get the same results with those refs when jobs are run in the Staging environment. Only the Production environment will reference the Production data, keeping the data and access isolated without needing separate projects. -**Faster development, enabled by deferral:** If `Project B` also has a Staging deployment, then references to unbuilt upstream models within `Project B` will resolve to that environment, using [deferral](/docs/cloud/about-cloud-develop-defer), rather than resolving to the models in Production. This saves developers time and warehouse spend, while preserving clear separation of environments. +**Faster development enabled by deferral:** If `Project B` also has a Staging deployment, then references to unbuilt upstream models within `Project B` will resolve to that environment, using [deferral](/docs/cloud/about-cloud-develop-defer), rather than resolving to the models in Production. This saves developers time and warehouse spend, while preserving clear separation of environments. Finally, the Staging environment has its own view in [dbt Explorer](/docs/collaborate/explore-projects), giving you a full view of your prod and pre-prod data. 
From 378fbecd9479e487158e01c96f168ba8fc595bb4 Mon Sep 17 00:00:00 2001 From: Matt Shaver <60105315+matthewshaver@users.noreply.github.com> Date: Mon, 13 May 2024 14:58:57 -0400 Subject: [PATCH 5/5] Update website/docs/docs/collaborate/govern/project-dependencies.md --- website/docs/docs/collaborate/govern/project-dependencies.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/website/docs/docs/collaborate/govern/project-dependencies.md b/website/docs/docs/collaborate/govern/project-dependencies.md index d9d91025fd7..1ca9a0e5312 100644 --- a/website/docs/docs/collaborate/govern/project-dependencies.md +++ b/website/docs/docs/collaborate/govern/project-dependencies.md @@ -33,7 +33,7 @@ Refer to the [FAQs](#faqs) for more info. In order to add project dependencies and resolve cross-project `ref`, you must: - Use a supported version of dbt (v1.6, v1.7, or "Keep on latest version") for both the upstream ("producer") project and the downstream ("consumer") project. -- Define models in an upstream ("producer") project that are configured with [`access: public`](/reference/resource-configs/access). (You need at least one successful job run after defining their `access`.) +- Define models in an upstream ("producer") project that are configured with [`access: public`](/reference/resource-configs/access). You need at least one successful job run after defining their `access`. - Define a deployment environment in the upstream ("producer") project [that is set to be your Production environment](/docs/deploy/deploy-environments#set-as-production-environment), and ensure it has at least one successful job run in that environment. - Each project `name` must be unique in your dbt Cloud account. For example, if you have a dbt project (codebase) for the `jaffle_marketing` team, you should not create separate projects for `Jaffle Marketing - Dev` and `Jaffle Marketing - Prod`. That isolation should instead be handled at the environment level. 
To that end, we are working on adding support for environment-level permissions and data warehouse connections; reach out to your dbt Labs account team for beta access in May/June 2024.