From 2c08d726ecd8e6f9c212db8ed54c884f42b1772f Mon Sep 17 00:00:00 2001 From: Matt Shaver <60105315+matthewshaver@users.noreply.github.com> Date: Wed, 1 May 2024 19:39:23 -0400 Subject: [PATCH] Staging environment adds (#5356) ## What are you changing in this pull request and why? Adding additional information about Staging environments as outlined [here](https://www.notion.so/dbtlabs/dbt-Cloud-staging-environment-684abeb5baf24b6fbd55bd5c9f56b606) ## Checklist - [ ] Review the [Content style guide](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/content-style-guide.md) so my content adheres to these guidelines. - [ ] For [docs versioning](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/single-sourcing-content.md#about-versioning), review how to [version a whole page](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/single-sourcing-content.md#adding-a-new-version) and [version a block of content](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/single-sourcing-content.md#versioning-blocks-of-content). - [ ] Add a checklist item for anything that needs to happen before this PR is merged, such as "needs technical review" or "change base branch." Adding or removing pages (delete if not applicable): - [ ] Add/remove page in `website/sidebars.js` - [ ] Provide a unique filename for new pages - [ ] Add an entry for deleted pages in `website/vercel.json` - [ ] Run link testing locally with `npm run build` to update the links that point to deleted pages --------- Co-authored-by: Leona B. 
Campbell <3880403+runleonarun@users.noreply.github.com> Co-authored-by: Jeremy Cohen --- ...2-09-13-the-case-against-cherry-picking.md | 4 + .../2023-11-14-specify-prod-environment.md | 4 + .../docs/cloud/about-cloud-develop-defer.md | 4 +- .../docs/collaborate/dbt-explorer-faqs.md | 4 +- .../docs/docs/collaborate/explore-projects.md | 6 ++ .../govern/project-dependencies.md | 19 ++++- .../docs/docs/deploy/deploy-environments.md | 83 +++++++++---------- 7 files changed, 78 insertions(+), 46 deletions(-) diff --git a/website/blog/2022-09-13-the-case-against-cherry-picking.md b/website/blog/2022-09-13-the-case-against-cherry-picking.md index 84a70e08392..580df2be994 100644 --- a/website/blog/2022-09-13-the-case-against-cherry-picking.md +++ b/website/blog/2022-09-13-the-case-against-cherry-picking.md @@ -9,6 +9,10 @@ hide_table_of_contents: false date: 2022-09-13 is_featured: true --- + +:::note You can now use a Staging environment! +This blog post was written before Staging environments. You can now use dbt Cloud to support the patterns discussed here. Read more about [Staging environments](/docs/deploy/deploy-environments#staging-environment). +::: ## Why do people cherry pick into upper branches? diff --git a/website/blog/2023-11-14-specify-prod-environment.md b/website/blog/2023-11-14-specify-prod-environment.md index c6ad2b31027..0e205abd749 100644 --- a/website/blog/2023-11-14-specify-prod-environment.md +++ b/website/blog/2023-11-14-specify-prod-environment.md @@ -14,6 +14,10 @@ is_featured: false --- +:::note You can now use a Staging environment! +This blog post was written before Staging environments. You can now use dbt Cloud to support the patterns discussed here. Read more about [Staging environments](/docs/deploy/deploy-environments#staging-environment). +::: + :::tip The Bottom Line: You should [split your Jobs](#how) across Environments in dbt Cloud based on their purposes (e.g. 
Production and Staging/CI) and set one environment as Production. This will improve your CI experience and enable you to use dbt Explorer. ::: diff --git a/website/docs/docs/cloud/about-cloud-develop-defer.md b/website/docs/docs/cloud/about-cloud-develop-defer.md index 37bfaacfd0c..d5a48ee4654 100644 --- a/website/docs/docs/cloud/about-cloud-develop-defer.md +++ b/website/docs/docs/cloud/about-cloud-develop-defer.md @@ -19,6 +19,8 @@ By default, dbt follows these rules: For a clean slate, it's a good practice to drop the development schema at the start and end of your development cycle. +If you require additional controls over production data, create a [Staging environment](/docs/deploy/deploy-environments#staging-environment) and dbt will use that, rather than the Production environment, to resolve `{{ ref() }}` functions. + ## Required setup - You must select the **[Production environment](/docs/deploy/deploy-environments#set-as-production-environment)** checkbox in the **Environment Settings** page. @@ -42,7 +44,7 @@ For example, if you were to start developing on a new branch with [nothing in yo One key difference between using `--defer` in the dbt Cloud CLI and the dbt Cloud IDE is that `--defer` is *automatically* enabled in the dbt Cloud CLI for all invocations, compared with production artifacts. You can disable it with the `--no-defer` flag. -The dbt Cloud CLI offers additional flexibility by letting you choose the source environment for deferral artifacts. You can set a `defer-env-id` key in either your `dbt_project.yml` or `dbt_cloud.yml` file. If you do not provide a `defer-env-id` setting, the dbt Cloud CLI will use artifacts from your dbt Cloud environment marked "Production". +The dbt Cloud CLI offers additional flexibility by letting you choose the source environment for deferral artifacts. You can manually set a `defer-env-id` key in either your `dbt_project.yml` or `dbt_cloud.yml` file. 
By default, the dbt Cloud CLI will prefer metadata from the project's "Staging" environment (if defined), otherwise "Production." diff --git a/website/docs/docs/collaborate/dbt-explorer-faqs.md b/website/docs/docs/collaborate/dbt-explorer-faqs.md index e214a9735de..ef13b697994 100644 --- a/website/docs/docs/collaborate/dbt-explorer-faqs.md +++ b/website/docs/docs/collaborate/dbt-explorer-faqs.md @@ -37,7 +37,9 @@ No. dbt Explorer and all of its features are only available as a dbt Cloud user -dbt Explorer defaults to the latest production state of a project. Support for staging and development (Cloud CLI and IDE) environments is coming soon. Users can only assign a single production and staging environment per dbt Cloud project. +dbt Explorer defaults to the latest production state of a project. Support for staging is now in . + +development (Cloud CLI and IDE) environments is coming soon. Users can only assign one production and one staging environment per dbt Cloud project. diff --git a/website/docs/docs/collaborate/explore-projects.md b/website/docs/docs/collaborate/explore-projects.md index 4633e86d86c..bfeb0284f69 100644 --- a/website/docs/docs/collaborate/explore-projects.md +++ b/website/docs/docs/collaborate/explore-projects.md @@ -230,6 +230,12 @@ Example of the details view for the model `supplies`: +## Staging environment + +dbt Explorer supports views for [Staging deployment environments](/docs/deploy/deploy-environments#staging-environment), in addition to the Production environment. This gives you a unique view into your pre-production data workflows, with the same tools available in production, while providing an extra layer of scrutiny. Once the Staging environment is configured and has a successful job run, it will be visible on the dbt Explorer landing page. 
+ + + ## Related content - [Enterprise permissions](/docs/cloud/manage-access/enterprise-permissions) - [About model governance](/docs/collaborate/govern/about-model-governance) diff --git a/website/docs/docs/collaborate/govern/project-dependencies.md b/website/docs/docs/collaborate/govern/project-dependencies.md index adb4e1f9df8..8265e953839 100644 --- a/website/docs/docs/collaborate/govern/project-dependencies.md +++ b/website/docs/docs/collaborate/govern/project-dependencies.md @@ -35,8 +35,9 @@ In order to add project dependencies and resolve cross-project `ref`, you must: - Use dbt v1.6 or higher for **both** the upstream ("producer") project and the downstream ("consumer") project. - Define models in an upstream ("producer") project that are configured with [`access: public`](/reference/resource-configs/access). To apply the change, rerun a production job. - Have a deployment environment in the upstream ("producer") project [that is set to be your production environment](/docs/deploy/deploy-environments#set-as-production-environment) -- Have a successful run of the upstream ("producer") project -- Have a multi-tenant or single-tenant [dbt Cloud Enterprise](https://www.getdbt.com/pricing) account (Azure ST is not supported but coming soon) +- Have a successful run of the upstream ("producer") project. +- Define and trigger a job before marking the environment as Staging. Read more about [Staging environments with downstream dependencies](/docs/collaborate/govern/project-dependencies#staging-with-downstream-dependencies). +- Have a multi-tenant or single-tenant [dbt Cloud Enterprise](https://www.getdbt.com/pricing) account (Azure ST is not supported but coming soon.) ## Example @@ -104,6 +105,20 @@ with monthly_revenue as ( For more guidance on how to use dbt Mesh, refer to the dedicated [dbt Mesh guide](/best-practices/how-we-mesh/mesh-1-intro). 
+### Safeguarding production data with staging environments + +When working in a Development environment, cross-project `ref`s normally resolve to the Production environment of the project. However, to protect production data, set up a [Staging deployment environment](/docs/deploy/deploy-environments#staging-environment) within your projects. With a staging environment integrated into the project, any references from external projects during development workflows resolve to the Staging environment. This adds a layer of security between your Development and Production environments by limiting access to production data. + +Read [Why use a staging environment](/docs/deploy/deploy-environments#why-use-a-staging-environment) for more information about the benefits. + +#### Staging with downstream dependencies + +dbt Cloud begins using the Staging environment to resolve cross-project references from downstream projects as soon as it exists in a project without "fail-over" to Production. To avoid causing downtime for downstream developers, you should define and trigger a job before marking the environment as Staging: +1. Create a new environment, but do NOT mark it as **Staging**. +2. Define a job in that environment. +3. Trigger the job to run, and ensure it completes successfully. +4. Update the environment to mark it as **Staging**. + ### Comparison If you were to instead install the `jaffle_finance` project as a `package` dependency, you would instead be pulling down its full source code and adding it to your runtime environment. This means: diff --git a/website/docs/docs/deploy/deploy-environments.md b/website/docs/docs/deploy/deploy-environments.md index e42d39bad82..42f34740164 100644 --- a/website/docs/docs/deploy/deploy-environments.md +++ b/website/docs/docs/deploy/deploy-environments.md @@ -18,7 +18,7 @@ To learn different approaches to managing dbt Cloud environments and recommendat Learn more about development vs. 
deployment environments in [dbt Cloud Environments](/docs/dbt-cloud-environments). -There are three types of deployment environments that serve different needs: +There are three types of deployment environments: - **Production**: Environment for transforming data and building pipelines for production use. - **Staging**: Environment for working with production tools while limiting access to production data. - **General**: General use environment for deployment development. @@ -41,6 +41,46 @@ In dbt Cloud, each project can have one designated deployment environment, which For Semantic Layer-eligible customers, the next section of environment settings is the Semantic Layer configurations. [The Semantic Layer setup guide](/docs/use-dbt-semantic-layer/setup-sl) has the most up-to-date setup instructions! +## Staging environment + +:::note +Currently in limited availability beta. Contact support or your account team if you're interested in beta access. +::: + +Use a Staging environment to grant developers access to deployment workflows and tools while controlling access to production data. You can do this in a couple of ways, but the most straightforward is to configure Staging with a long-living branch (for example, `staging`) similar to but separate from the primary branch (for example, `main`). + +In this scenario, the workflows would ideally move upstream from the Development environment -> Staging environment -> Production environment with developer branches feeding into the `staging` branch, then ultimately merging into `main`. In many cases, the `main` and `staging` branches will be identical after a merge and remain until the next batch of changes from the `development` branches are ready to be elevated. We recommend setting branch protection rules on `staging` similar to `main`. + +### Why use a staging environment + +There are two primary motivations for using a Staging environment: +1. An additional validation layer before changes are deployed into Production. 
You can deploy, test, and explore your dbt models in Staging. +2. Clear isolation between development workflows and production data. It enables developers to work in metadata-powered ways, using features like deferral and cross-project references, without accessing data in production deployments. + +:::info Coming soon: environment-level permissions +Provide developers with the ability to create, edit, and trigger ad hoc jobs in the Staging environment, while keeping the Production environment locked down. +::: + +Let's say you have `Project B` downstream of `Project A` with cross-project refs configured in the models. When developers work in the IDE for `Project B`, cross-project refs will resolve to the Staging environment of `Project A`, rather than production. You'll get the same results with those refs when jobs are run in the Staging environment. Only the Production environment will reference the Production data, keeping the data and access isolated without needing separate projects. + +If `Project B` also has a Staging deployment, then references to unbuilt upstream models within `Project B` will resolve to that environment, using [deferral](/docs/cloud/about-cloud-develop-defer), rather than resolving to the models in Production. This saves developers time and warehouse spend, while preserving clear separation of environments. + +Finally, the Staging environment has its own view in [dbt Explorer](/docs/collaborate/explore-projects), giving you a full view of your prod and pre-prod data. + + + + +### Create a Staging environment + +In dbt Cloud, navigate to **Deploy** -> **Environments** and then click **Create Environment**. Select **Deployment** as the environment type. The option will be greyed out if you already have a development environment. + + + + +Follow the steps outlined in [deployment credentials](#deployment-connection) to complete the remainder of the environment setup. 
+ +We recommend that the data warehouse credentials be for a dedicated user or service principal. + ## Deployment connection @@ -189,47 +229,6 @@ This section allows you to determine the credentials that should be used when co - -## Staging environment - -:::note -Currently in limited availability beta. Contact support or your account team if you're interested in beta access. -::: - -Staging environments are useful ways to grant developers access to deployment workflows and tools while controlling access to production data. They are configured with their own long-living branch (for example, `staging`) that may be very similar to `main` in many ways while potentially limiting the data the developers can access. - -Ideally, the workflows would move upstream from the Development environment -> Staging environment -> Production environment with developer branches feeding into the staging branch, then ultimately `main`. In many cases, the `main` and `staging` branches will be identical after a merge and remain until the next batch of changes from the `development` branches are ready to be elevated. We recommend setting branch protection rules on `staging` similar to `main`. - -### Create a staging environment - -In the dbt Cloud, navigate to **Deploy** -> **Environments** and then click **Create Environment**. Select **Deployment** as the environment type. The option will be greyed out if you already have a development environment. - - - - -Follow the steps outlined in [deployment credentials](#deployment-connection) to complete the remainder of the environment setup. - -We recommend that the data warehouse credentials be for a dedicated user or service principal. - -### Why use a staging environment - -There are two primary motivations for using a Staging environment: -1. An additional validation layer before changes are deployed into Production. You can deploy, test, and explore your dbt models in Staging. -2. 
Clear isolation between development workflows and production data. It enables developers to work in metadata-powered ways, using features like deferral and cross-project references, without accessing data in production deployments. - -:::info Coming soon: environment-level permissions -Provide developers with the ability to create, edit, and trigger ad hoc jobs in the Staging environment, while keeping the Production environment locked down. -::: - -Let's say you have `Project B` downstream of `Project A` with cross-project refs configured in the models. When developers work in the IDE for `Project B`, cross-project refs will resolve to the Staging environment of `Project A`, rather than production. You'll get the same results with those refs when jobs are run in the Staging environment. Only the Production environment will reference the Production data, keeping the data and access isolated without needing separate projects. - -If `Project B` also has a Staging deployment, then references to unbuilt upstream models within `Project B` will resolve to that environment, using [deferral](/docs/cloud/about-cloud-develop-defer), rather than resolving to the models in Production. This saves developers time and warehouse spend, while preserving clear separation of environments. - -Finally, the Staging environment has its own view in [dbt Explorer](/docs/collaborate/explore-projects), giving you a full view of your prod and pre-prod data. - - - - ## Related docs - [dbt Cloud environment best practices](/guides/set-up-ci)