Commit

Merge branch 'nfiann-bigquery-cloud-config' of https://github.com/dbt-labs/docs.getdbt.com into nfiann-bigquery-cloud-config
nataliefiann committed Oct 28, 2024
2 parents c0293ac + 88eb73c commit ff2a9d5
Showing 7 changed files with 33 additions and 18 deletions.
29 changes: 14 additions & 15 deletions website/docs/docs/cloud/connect-data-platform/connnect-bigquery.md
Original file line number Diff line number Diff line change
@@ -61,27 +61,26 @@ To customize your optional configurations in dbt Cloud:
1. Click your name at the bottom left-hand side bar menu in dbt Cloud
2. Select **Your profile** from the menu
3. From there, click **Projects** and select your BigQuery project
4. Select your BigQuery project from the left-hand menu
5. Go to **Development Connection** and select BigQuery
6. Click **Edit** and then scroll down to **Optional settings**

<Lightbox src="/img/bigquery/bigquery-optional-config.png" width="70%" title="BigQuery optional configuration"/>

The following are the optional configurations you can set in dbt Cloud:

| Configuration | Information | Type | Example |
|--------------------------------|------------------------------------------------------------------------------------------------------------------------------|---------|-----------------------------|
| [Priority](#priority) | Sets the priority for BigQuery jobs (either immediate or queued for batch processing) | String | `batch` or `interactive` |
| [Retries](#retries) | Specifies the number of retries for failed jobs due to temporary issues | Integer | `3` |
| [Location](#location) | Location for creating new datasets | String | `US`, `EU`, `us-west2` |
| [Maximum bytes billed](#maximum-bytes-billed) | Limits the maximum number of bytes that can be billed for a query | Integer | `1000000000` |
| [Execution project](#execution-project) | Specifies the project ID to bill for query execution | String | `my-project-id` |
| [Impersonate service account](#impersonate-service-account) | Allows users authenticated locally to access BigQuery resources under a specified service account | String | `[email protected]` |
| [Job retry deadline seconds](#job-retry-deadline-seconds) | Sets the total number of seconds BigQuery will attempt to retry a job if it fails | Integer | `600` |
| [Job creation timeout seconds](#job-creation-timeout-seconds) | Specifies the maximum timeout for the job creation step | Integer | `120` |
| [Google cloud storage-bucket](#google-cloud-storage-bucket) | Location for storing objects in Google Cloud Storage | String | `my-bucket` |
| [Dataproc region](#dataproc-region) | Specifies the cloud region for running data processing jobs | String | `US`, `EU`, `asia-northeast1` |
| [Dataproc cluster name](#dataproc-cluster-name) | Assigns a unique identifier to a group of virtual machines in Dataproc | String | `my-cluster` |
| Configuration | <div style={{width:'250'}}>Information</div> | Type | <div style={{width:'150'}}>Example</div> |
|---------------------------|-----------------------------------------|---------|--------------------|
| [Priority](#priority) | Sets the priority for BigQuery jobs (either `interactive` or queued for `batch` processing) | String | `batch` or `interactive` |
| [Retries](#retries) | Specifies the number of retries for failed jobs due to temporary issues | Integer | `3` |
| [Location](#location) | Location for creating new datasets | String | `US`, `EU`, `us-west2` |
| [Maximum bytes billed](#maximum-bytes-billed) | Limits the maximum number of bytes that can be billed for a query | Integer | `1000000000` |
| [Execution project](#execution-project) | Specifies the project ID to bill for query execution | String | `my-project-id` |
| [Impersonate service account](#impersonate-service-account) | Allows users authenticated locally to access BigQuery resources under a specified service account | String | `[email protected]` |
| [Job retry deadline seconds](#job-retry-deadline-seconds) | Sets the total number of seconds BigQuery will attempt to retry a job if it fails | Integer | `600` |
| [Job creation timeout seconds](#job-creation-timeout-seconds) | Specifies the maximum timeout for the job creation step | Integer | `120` |
| [Google cloud storage-bucket](#google-cloud-storage-bucket) | Location for storing objects in Google Cloud Storage | String | `my-bucket` |
| [Dataproc region](#dataproc-region) | Specifies the cloud region for running data processing jobs | String | `US`, `EU`, `asia-northeast1` |
| [Dataproc cluster name](#dataproc-cluster-name) | Assigns a unique identifier to a group of virtual machines in Dataproc | String | `my-cluster` |
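
As a rough illustration of the types and example values in the table above, the following Python sketch validates a handful of the settings before they are entered in dbt Cloud. The class name, field names, and default values are hypothetical and chosen for this example only; they are not part of dbt Cloud or any dbt API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class BigQueryOptionalSettings:
    """Hypothetical container mirroring some rows of the table (not a dbt API)."""

    priority: str = "interactive"  # assumed default for this sketch only
    retries: int = 3               # assumed default for this sketch only
    location: Optional[str] = None
    maximum_bytes_billed: Optional[int] = None
    execution_project: Optional[str] = None
    job_retry_deadline_seconds: Optional[int] = None
    job_creation_timeout_seconds: Optional[int] = None

    def validate(self) -> List[str]:
        """Return a list of human-readable problems; empty means no issues found."""
        problems: List[str] = []

        # The table lists exactly two priority values.
        if self.priority not in ("batch", "interactive"):
            problems.append("priority must be 'batch' or 'interactive'")

        # The table types these settings as integers; negative values make no sense.
        integer_fields = (
            "retries",
            "maximum_bytes_billed",
            "job_retry_deadline_seconds",
            "job_creation_timeout_seconds",
        )
        for name in integer_fields:
            value = getattr(self, name)
            if value is not None and (not isinstance(value, int) or value < 0):
                problems.append(f"{name} must be a non-negative integer")

        return problems


settings = BigQueryOptionalSettings(
    priority="batch",
    retries=3,
    location="US",
    maximum_bytes_billed=1000000000,
)
print(settings.validate())
```

The check is deliberately shallow: it only reflects what the table states (allowed priority values and integer types) and does not verify that a location or project ID actually exists.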


<Expandable alt_header="Priority">
@@ -158,7 +157,7 @@ Everything you store in Cloud Storage must be placed inside a [bucket](https://c

A designated location in the cloud where you can run your data processing jobs efficiently. This region must match the location of your BigQuery dataset if you want to use Dataproc with BigQuery to ensure data doesn't move across regions, which can be inefficient and costly.

For more information on [dataproc regions](https://cloud.google.com/bigquery/docs/locations), refer to the BigQuery documentation.
For more information on [Dataproc regions](https://cloud.google.com/bigquery/docs/locations), refer to the BigQuery documentation.

</Expandable>

5 changes: 4 additions & 1 deletion website/docs/docs/cloud/secure/about-privatelink.md
@@ -7,10 +7,13 @@ sidebar_label: "About PrivateLink"

import SetUpPages from '/snippets/_available-tiers-privatelink.md';
import PrivateLinkHostnameWarning from '/snippets/_privatelink-hostname-restriction.md';
import CloudProviders from '/snippets/_privatelink-across-providers.md';

<SetUpPages features={'/snippets/_available-tiers-privatelink.md'}/>

PrivateLink enables a private connection from any dbt Cloud Multi-Tenant environment to your data platform hosted on AWS using [AWS PrivateLink](https://aws.amazon.com/privatelink/) technology. PrivateLink allows dbt Cloud customers to meet security and compliance controls as it allows connectivity between dbt Cloud and your data platform without traversing the public internet. This feature is supported in most regions across NA, Europe, and Asia, but [contact us](https://www.getdbt.com/contact/) if you have questions about availability.
PrivateLink enables a private connection from any dbt Cloud Multi-Tenant environment to your data platform hosted on a cloud provider, such as [AWS](https://aws.amazon.com/privatelink/) or [Azure](https://azure.microsoft.com/en-us/products/private-link), using that provider’s PrivateLink technology. PrivateLink allows dbt Cloud customers to meet security and compliance controls as it allows connectivity between dbt Cloud and your data platform without traversing the public internet. This feature is supported in most regions across NA, Europe, and Asia, but [contact us](https://www.getdbt.com/contact/) if you have questions about availability.

<CloudProviders type='a data platform' />

### Cross-region PrivateLink

3 changes: 3 additions & 0 deletions website/docs/docs/cloud/secure/databricks-privatelink.md
@@ -8,11 +8,14 @@ pagination_next: null

import SetUpPages from '/snippets/_available-tiers-privatelink.md';
import PrivateLinkSLA from '/snippets/_PrivateLink-SLA.md';
import CloudProviders from '/snippets/_privatelink-across-providers.md';

<SetUpPages features={'/snippets/_available-tiers-privatelink.md'}/>

The following steps will walk you through the setup of a Databricks AWS PrivateLink or Azure Private Link endpoint in the dbt Cloud multi-tenant environment.

<CloudProviders type='Databricks'/>

## Configure AWS PrivateLink

1. Locate your [Databricks instance name](https://docs.databricks.com/en/workspace/workspace-details.html#workspace-instance-names-urls-and-ids)
5 changes: 4 additions & 1 deletion website/docs/docs/cloud/secure/postgres-privatelink.md
@@ -7,11 +7,14 @@ sidebar_label: "PrivateLink for Postgres"
import SetUpPages from '/snippets/_available-tiers-privatelink.md';
import PrivateLinkTroubleshooting from '/snippets/_privatelink-troubleshooting.md';
import PrivateLinkCrossZone from '/snippets/_privatelink-cross-zone-load-balancing.md';
import CloudProviders from '/snippets/_privatelink-across-providers.md';

<SetUpPages features={'/snippets/_available-tiers-privatelink.md'}/>

A Postgres database, hosted either in AWS or in a properly connected on-prem data center, can be accessed through a private network connection using AWS Interface-type PrivateLink. The type of Target Group connected to the Network Load Balancer (NLB) may vary based on the location and type of Postgres instance being connected, as explained in the following steps.

<CloudProviders type='Postgres' />

## Configuring Postgres interface-type PrivateLink

### 1. Provision AWS resources
@@ -96,4 +99,4 @@ Once dbt Cloud support completes the configuration, you can start creating new c
4. Configure the remaining data platform details.
5. Test your connection and save it.

<PrivateLinkTroubleshooting features={'/snippets/_privatelink-troubleshooting.md'}/>
<PrivateLinkTroubleshooting features={'/snippets/_privatelink-troubleshooting.md'}/>
5 changes: 4 additions & 1 deletion website/docs/docs/cloud/secure/redshift-privatelink.md
@@ -8,6 +8,7 @@ sidebar_label: "PrivateLink for Redshift"
import SetUpPages from '/snippets/_available-tiers-privatelink.md';
import PrivateLinkTroubleshooting from '/snippets/_privatelink-troubleshooting.md';
import PrivateLinkCrossZone from '/snippets/_privatelink-cross-zone-load-balancing.md';
import CloudProviders from '/snippets/_privatelink-across-providers.md';

<SetUpPages features={'/snippets/_available-tiers-privatelink.md'}/>

@@ -17,6 +18,8 @@ AWS provides two different ways to create a PrivateLink VPC endpoint for a Redsh

dbt Cloud supports both types of endpoints, but there are a number of [considerations](https://docs.aws.amazon.com/redshift/latest/mgmt/managing-cluster-cross-vpc.html#managing-cluster-cross-vpc-considerations) to take into account when deciding which endpoint type to use. Redshift-managed provides a far simpler setup with no additional cost, which might make it the preferred option for many, but may not be an option in all environments. Based on these criteria, you will need to determine which is the right type for your system. Follow the instructions from the section below that corresponds to your chosen endpoint type.

<CloudProviders type='Redshift' />

:::note Redshift Serverless
While Redshift Serverless does support Redshift-managed type VPC endpoints, this functionality is not currently available across AWS accounts. Due to this limitation, an Interface-type VPC endpoint service must be used for Redshift Serverless cluster PrivateLink connectivity from dbt Cloud.
:::
@@ -125,4 +128,4 @@ Once dbt Cloud support completes the configuration, you can start creating new c
4. Configure the remaining data platform details.
5. Test your connection and save it.

<PrivateLinkTroubleshooting features={'/snippets/_privatelink-troubleshooting.md'}/>
<PrivateLinkTroubleshooting features={'/snippets/_privatelink-troubleshooting.md'}/>
3 changes: 3 additions & 0 deletions website/docs/docs/cloud/secure/snowflake-privatelink.md
@@ -6,11 +6,14 @@ sidebar_label: "PrivateLink for Snowflake"
---

import SetUpPages from '/snippets/_available-tiers-privatelink.md';
import CloudProviders from '/snippets/_privatelink-across-providers.md';

<SetUpPages features={'/snippets/_available-tiers-privatelink.md'}/>

The following steps walk you through the setup of a Snowflake AWS PrivateLink or Azure Private Link endpoint in a dbt Cloud multi-tenant environment.

<CloudProviders type='Snowflake' />

:::note Snowflake SSO with PrivateLink
Users connecting to Snowflake using SSO over a PrivateLink connection from dbt Cloud will also require access to a PrivateLink endpoint from their local workstation.

1 change: 1 addition & 0 deletions website/snippets/_privatelink-across-providers.md
@@ -0,0 +1 @@
PrivateLink endpoints can't connect across cloud providers. For a PrivateLink connection to work, both dbt Cloud and the server (like {props.type}) must be hosted on the same cloud provider. For example, dbt Cloud hosted on AWS cannot connect via PrivateLink to services hosted on Azure, and dbt Cloud hosted on Azure can’t connect via Private Link to services hosted on AWS.
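
The same-provider rule stated in this snippet can be sketched as a small check. This is a minimal illustration only; the function name and the provider strings are assumptions made for the example and do not correspond to any dbt Cloud API.

```python
def can_use_privatelink(dbt_cloud_provider: str, platform_provider: str) -> bool:
    """Return True when both sides are hosted on the same cloud provider.

    Hypothetical helper: provider names are compared case-insensitively
    after trimming surrounding whitespace.
    """
    return dbt_cloud_provider.strip().lower() == platform_provider.strip().lower()


# dbt Cloud on AWS reaching a data platform on AWS: allowed by the rule.
print(can_use_privatelink("AWS", "aws"))

# dbt Cloud on AWS reaching a data platform on Azure: not allowed by the rule.
print(can_use_privatelink("AWS", "Azure"))
```

The check captures only the rule quoted above; region availability and account-level restrictions mentioned in the individual PrivateLink pages are out of scope for this sketch.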
