Commit

add suggestions for docs
LProcopi15 committed Oct 25, 2024
1 parent e292b7a commit be69a34
Showing 1 changed file with 12 additions and 11 deletions.
23 changes: 12 additions & 11 deletions website/docs/reference/resource-configs/snowflake-configs.md
@@ -13,13 +13,15 @@ To-do:

## Iceberg table format <Lifecycle status="beta"/>

The dbt-snowflake adapter supports the Iceberg table format. It is available for three of the Snowflake materializations:
The dbt-snowflake adapter supports the Iceberg table format. Prior to developing an Iceberg table with dbt, you need to [configure an external volume in Snowflake](https://docs.snowflake.com/en/user-guide/tables-iceberg-configure-external-volume).

In dbt, Iceberg is available for three of the Snowflake materializations:

- [Table](/docs/build/materializations#table)
- [Incremental](/docs/build/materializations#incremental)
- [Dynamic](#dynamic-tables)
- [Dynamic](#dynamic-tables)

For now, to create Iceberg tables, you must implement a [behavior flag](/reference/global-configs/behavior-changes) due to performance impact related to using Iceberg tables. Snowflake does not support `is_iceberg` on the `Show Objects` query, which dbt depends on for metadata.
For now, to create Iceberg tables, you must implement a [behavior flag](/reference/global-configs/behavior-changes) due to performance impact related to using Iceberg tables.

To use Iceberg, set the `enable_iceberg_materializations` flag to `True` in your dbt_project.yml:

@@ -34,15 +36,14 @@ flags:

</File>


The following configurations are supported.
For more information, check out the Snowflake reference for [`CREATE ICEBERG TABLE` (Snowflake as the catalog)](https://docs.snowflake.com/en/sql-reference/sql/create-iceberg-table-snowflake).

| Field | Type | Required | Description | Sample input | Note |
| Parameter | Type | Required | Description | Sample input | Note |
| --------------------- | ------ | -------- | -------------------------------------------------------------------------------------------------------------------------- | ------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Table Format | String | Yes | Configures the objects table format. | `iceberg` | `iceberg` is the only accepted value. |
| External volume | String | Yes(*) | Specifies the identifier (name) of the external volume where Snowflake writes the Iceberg table's metadata and data files. | `my_s3_bucket` | *You don't need to specify this if the account, database, or schema already has an associated external volume. [More info](https://docs.snowflake.com/en/sql-reference/sql/create-iceberg-table-snowflake#:~:text=Snowflake%20Table%20Structures.-,external_volume) |
| Base location Subpath | String | No | An optional suffix to add to the `base_location` path that dbt automatically specifies. | `jaffle_marketing_folder` | We recommend that you do not specify this. Modifying this parameter results in a new Iceberg table. See [Base Location](#base-location) for more info. |
| `table_format` | String | Yes | Configures the objects table format. | `iceberg` | `iceberg` is the only accepted value. |
| `external_volume` | String | Yes(*) | Specifies the identifier (name) of the external volume where Snowflake writes the Iceberg table's metadata and data files. | `my_s3_bucket` | *You don't need to specify this if the account, database, or schema already has an associated external volume. [More info](https://docs.snowflake.com/user-guide/tables-iceberg-configure-external-volume#set-a-default-external-volume-at-the-account-database-or-schema-level) |
| `base_location_subpath` | String | No | An optional suffix to add to the `base_location` path that dbt automatically specifies. | `jaffle_marketing_folder` | We recommend that you **do not specify this**. Modifying this parameter results in a new Iceberg table. See [Base Location](#base-location) for more info. |

### Example configuration

@@ -54,7 +55,7 @@ To configure an Iceberg table materialization in dbt, refer to the example confi

{{
config(
materialized = "table",
materialized="table",
table_format="iceberg",
external_volume="s3_iceberg_snow",
)
@@ -79,14 +80,14 @@ A theoretical (but not recommended) use case is re-using an `EXTERNAL VOLUME` wh

#### Rationale

dbt manages `base_location` on behalf of users to enforce best practices. With Snowflake-managed Iceberg format tables, the user owns and maintains the data storage of the tables in an external storage solution (the declared `external volume`). The `base_ location` parameter declares where to write the data within the external volume. The Snowflake Iceberg catalog keeps track of your Iceberg table regardless of where the data lives within the `external volume` declared and the `base_location` provided. However, Snowflake permits passing anything into the `base_location` field, including an empty string, even reusing the same path across multiple tables. This behavior could result in future technical debt because it will limit the ability to:
dbt manages `base_location` on behalf of users to enforce best practices. With Snowflake-managed Iceberg format tables, the user owns and maintains the data storage of the tables in an external storage solution (the declared `external volume`). The `base_location` parameter declares where to write the data within the external volume. The Snowflake Iceberg catalog keeps track of your Iceberg table regardless of where the data lives within the `external volume` declared and the `base_location` provided. However, Snowflake permits passing anything into the `base_location` field, including an empty string, even reusing the same path across multiple tables. This behavior could result in future technical debt because it will limit the ability to:

- Navigate the underlying object store (S3/Azure blob)
- Read Iceberg tables via an object-store integration
- Grant schema-specific access to tables via object store
- Use a crawler pointed at the tables within the external volume to build a new catalog with another tool

To maintain best practices, we enforce an input. Currently, we do not support overriding the default `base location` input but will consider it based on user feedback.
To maintain best practices, we enforce an input. Currently, we do not support overriding the default `base_location` input but will consider it based on user feedback.

In summary, dbt-snowflake does not support arbitrary definition of `base_location` for Iceberg tables. Instead, dbt, by default, writes your tables within a `_dbt/{SCHEMA_NAME}/{TABLE_NAME}` prefix to ensure easier object-store observability and auditability.
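
As a rough illustration of the default prefix described above, the following Python sketch composes the path from a schema and table name. The function name `sketch_base_location` is hypothetical, and joining `base_location_subpath` to the default path with a `/` is an assumption; this is not dbt-snowflake's actual implementation.

```python
from typing import Optional


def sketch_base_location(
    schema_name: str,
    table_name: str,
    subpath: Optional[str] = None,
) -> str:
    """Build an illustrative base_location for an Iceberg table.

    Hypothetical helper: mirrors the `_dbt/{SCHEMA_NAME}/{TABLE_NAME}`
    prefix from the docs. How the optional subpath is appended is an
    assumption made for this sketch.
    """
    base_location = f"_dbt/{schema_name}/{table_name}"
    if subpath:
        # Assumed: the optional suffix is appended as one more path segment.
        base_location = f"{base_location}/{subpath}"
    return base_location


if __name__ == "__main__":
    print(sketch_base_location("analytics", "orders"))
    print(sketch_base_location("analytics", "orders", "jaffle_marketing_folder"))
```

Because each table gets its own schema-qualified prefix, two models never share a path within the external volume, which is what keeps the object store navigable and auditable.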
