diff --git a/docs/assets/snippets/models/raw_vault/eff_sats/eff_sat_customer_order_incremental_nae.sql b/docs/assets/snippets/models/raw_vault/eff_sats/eff_sat_customer_order_incremental_nae.sql index 5d5a7fe..2ec3338 100644 --- a/docs/assets/snippets/models/raw_vault/eff_sats/eff_sat_customer_order_incremental_nae.sql +++ b/docs/assets/snippets/models/raw_vault/eff_sats/eff_sat_customer_order_incremental_nae.sql @@ -1,3 +1,15 @@ +/* +config level (compatible with dbt Core): +{{ config( + is_auto_end_dating=false +) }} + +meta level (compatible with dbt Core and Fusion): +{{ config( + meta={'is_auto_end_dating': false} +) }} +*/ + {{ config( is_auto_end_dating=false ) }} diff --git a/docs/best_practises/loading.md b/docs/best_practises/loading.md index 7f4d1fc..35a51d8 100644 --- a/docs/best_practises/loading.md +++ b/docs/best_practises/loading.md @@ -77,6 +77,19 @@ If it cannot be guaranteed that a load contains **_new_** deltas (i.e. data whic `apply_source_filter` config in your Satellites. This is done on a per-satellite basis if using config blocks, or can be applied to all Satellites using YAML configs (see the [dbt docs](https://docs.getdbt.com/reference/model-configs#configuring-models)). +!!! note "Configuration: config level vs. meta level" + The `apply_source_filter` config can be provided in two ways: + + - **config level** (compatible with dbt Core): Set directly under `config()` + ```jinja + {{ config(apply_source_filter=true) }} + ``` + + - **meta level** (compatible with dbt Core and Fusion): Place within a `meta` dict under `config()` + ```jinja + {{ config(meta={'apply_source_filter': true}) }} + ``` + This will add an additional guardrail (to those added in v0.10.0) which will filter the data coming from the `source_model` during **_incremental loads_**. 
Please note, that though convenient, this is not a substitution for designing your loading and staging approach correctly and using this diff --git a/docs/macros/index.md b/docs/macros/index.md index 122aad1..436edf1 100644 --- a/docs/macros/index.md +++ b/docs/macros/index.md @@ -846,16 +846,16 @@ This section covers global variables ([var](https://docs.getdbt.com/docs/build/p === "apply_source_filter (config)" !!! tip "Added in v0.10.1" - + This config option adds a WHERE clause (in incremental mode) using an additional CTE in the SQL code to filter the `source_model`'s data - + This ensures that records in the source data are filtered so that only records with `src_ldts` after the MAX ldts in the existing Satellite are processed during the satellite load. - + **It is intended for this config option to be used if you cannot guarantee atomic/idempotent batches i.e. only data which has not been loaded yet in your stage data.** - === "Example (model file)" - + === "config level (compatible with dbt Core)" + ```sql -- sat_customer_details.sql @@ -870,10 +870,37 @@ This section covers global variables ([var](https://docs.getdbt.com/docs/build/p {{ automate_dv.sat(src_pk=src_pk, src_hashdiff=src_hashdiff, src_payload=src_payload, src_extra_columns=src_extra_columns, - src_eff=src_eff, src_ldts=src_ldts, + src_eff=src_eff, src_ldts=src_ldts, src_source=src_source, source_model=source_model) }} - + + ``` + + === "meta level (compatible with dbt Core and Fusion)" + + ```sql + + -- sat_customer_details.sql + {{ + config( + meta={'apply_source_filter': true} + ) + }} + + {% set src_pk = ... %} + ... + + {{ automate_dv.sat(src_pk=src_pk, src_hashdiff=src_hashdiff, src_payload=src_payload, + src_extra_columns=src_extra_columns, + src_eff=src_eff, src_ldts=src_ldts, + src_source=src_source, source_model=source_model) }} + ``` + + !!! note "Configuration: config level vs. 
meta level" + The `apply_source_filter` config can be provided in two ways: + + - **config level** (compatible with dbt Core): Set directly under `config()` + - **meta level** (compatible with dbt Core and Fusion): Place within a `meta` dict under `config()` === "enable_ghost_records (var)" @@ -1231,14 +1258,33 @@ Generates SQL to build an Effectivity Satellite table using the provided paramet Auto end-dating is enabled by providing a config option as below: -``` jinja -{{ config(is_auto_end_dating=true) }} +=== "config level (compatible with dbt Core)" -{{ automate_dv.eff_sat(src_pk=src_pk, src_dfk=src_dfk, src_sfk=src_sfk, - src_start_date=src_start_date, src_end_date=src_end_date, - src_eff=src_eff, src_ldts=src_ldts, src_source=src_source, - source_model=source_model) }} -``` + ``` jinja + {{ config(is_auto_end_dating=true) }} + + {{ automate_dv.eff_sat(src_pk=src_pk, src_dfk=src_dfk, src_sfk=src_sfk, + src_start_date=src_start_date, src_end_date=src_end_date, + src_eff=src_eff, src_ldts=src_ldts, src_source=src_source, + source_model=source_model) }} + ``` + +=== "meta level (compatible with dbt Core and Fusion)" + + ``` jinja + {{ config(meta={'is_auto_end_dating': true}) }} + + {{ automate_dv.eff_sat(src_pk=src_pk, src_dfk=src_dfk, src_sfk=src_sfk, + src_start_date=src_start_date, src_end_date=src_end_date, + src_eff=src_eff, src_ldts=src_ldts, src_source=src_source, + source_model=source_model) }} + ``` + +!!! note "Configuration: config level vs. meta level" + The `is_auto_end_dating` config can be provided in two ways: + + - **config level** (compatible with dbt Core): Set directly under `config()` + - **meta level** (compatible with dbt Core and Fusion): Place within a `meta` dict under `config()` This will enable 3 extra CTEs in the Effectivity Satellite SQL generated by the macro. Examples of this SQL are in the Example Output section above. 
The result of this will be additional effectivity records with end dates included, which diff --git a/docs/materialisations.md b/docs/materialisations.md index 18e31d8..567b63e 100644 --- a/docs/materialisations.md +++ b/docs/materialisations.md @@ -53,51 +53,119 @@ range. More detail on how this works is below. === "Manual Load range #1" - ```jinja - {{ config(materialized='vault_insert_by_period', timestamp_field='LOAD_DATE', period='day', - start_date='2020-01-30') }} - - {{ automate_dv.eff_sat(src_pk=src_pk, src_dfk=src_dfk, src_sfk=src_sfk, - src_start_date=src_start_date, src_end_date=src_end_date, - src_eff=src_eff, src_ldts=src_ldts, src_source=src_source, - source_model=source_model) }} - ``` + Specify only `start_date` - the load will start at this date and the `stop_date` will be set to the **current date**. + + === "config level (compatible with dbt Core)" + + ```jinja + {{ config(materialized='vault_insert_by_period', timestamp_field='LOAD_DATE', period='day', + start_date='2020-01-30') }} + + {{ automate_dv.eff_sat(src_pk=src_pk, src_dfk=src_dfk, src_sfk=src_sfk, + src_start_date=src_start_date, src_end_date=src_end_date, + src_eff=src_eff, src_ldts=src_ldts, src_source=src_source, + source_model=source_model) }} + ``` + + === "meta level (compatible with dbt Core and Fusion)" + + ```jinja + {{ config(materialized='vault_insert_by_period', + meta={'timestamp_field': 'LOAD_DATE', 'period': 'day', 'start_date': '2020-01-30'}) }} + + {{ automate_dv.eff_sat(src_pk=src_pk, src_dfk=src_dfk, src_sfk=src_sfk, + src_start_date=src_start_date, src_end_date=src_end_date, + src_eff=src_eff, src_ldts=src_ldts, src_source=src_source, + source_model=source_model) }} + ``` === "Manual Load range #2" - ```jinja - {{ config(materialized='vault_insert_by_period', timestamp_field='LOAD_DATE', period='day', - start_date='2020-01-30', stop_date='2020-04-30') }} - - {{ automate_dv.eff_sat(src_pk=src_pk, src_dfk=src_dfk, src_sfk=src_sfk, - src_start_date=src_start_date, 
src_end_date=src_end_date, - src_eff=src_eff, src_ldts=src_ldts, src_source=src_source, - source_model=source_model) }} - ``` + Specify both `start_date` and `stop_date` - the load will start at `start_date` and stop at `stop_date`. + + === "config level (compatible with dbt Core)" + + ```jinja + {{ config(materialized='vault_insert_by_period', timestamp_field='LOAD_DATE', period='day', + start_date='2020-01-30', stop_date='2020-04-30') }} + + {{ automate_dv.eff_sat(src_pk=src_pk, src_dfk=src_dfk, src_sfk=src_sfk, + src_start_date=src_start_date, src_end_date=src_end_date, + src_eff=src_eff, src_ldts=src_ldts, src_source=src_source, + source_model=source_model) }} + ``` + + === "meta level (compatible with dbt Core and Fusion)" + + ```jinja + {{ config(materialized='vault_insert_by_period', + meta={'timestamp_field': 'LOAD_DATE', 'period': 'day', + 'start_date': '2020-01-30', 'stop_date': '2020-04-30'}) }} + + {{ automate_dv.eff_sat(src_pk=src_pk, src_dfk=src_dfk, src_sfk=src_sfk, + src_start_date=src_start_date, src_end_date=src_end_date, + src_eff=src_eff, src_ldts=src_ldts, src_source=src_source, + source_model=source_model) }} + ``` === "Manual Load range #3" - ```jinja - {{ config(materialized='vault_insert_by_period', timestamp_field='LOAD_DATE', period='day', - start_date='2020-01-30', stop_date='2020-04-30', date_source_models=var('source_model')) }} - - {{ automate_dv.eff_sat(src_pk=src_pk, src_dfk=src_dfk, src_sfk=src_sfk, - src_start_date=src_start_date, src_end_date=src_end_date, - src_eff=src_eff, src_ldts=src_ldts, src_source=src_source, - source_model=source_model) }} - ``` + Specify all three config options - manually provided configuration acts as an override. The load will start at `start_date` and stop at `stop_date`. 
+ + === "config level (compatible with dbt Core)" + + ```jinja + {{ config(materialized='vault_insert_by_period', timestamp_field='LOAD_DATE', period='day', + start_date='2020-01-30', stop_date='2020-04-30', date_source_models=var('source_model')) }} + + {{ automate_dv.eff_sat(src_pk=src_pk, src_dfk=src_dfk, src_sfk=src_sfk, + src_start_date=src_start_date, src_end_date=src_end_date, + src_eff=src_eff, src_ldts=src_ldts, src_source=src_source, + source_model=source_model) }} + ``` + + === "meta level (compatible with dbt Core and Fusion)" + + ```jinja + {{ config(materialized='vault_insert_by_period', + meta={'timestamp_field': 'LOAD_DATE', 'period': 'day', + 'start_date': '2020-01-30', 'stop_date': '2020-04-30', + 'date_source_models': var('source_model')}) }} + + {{ automate_dv.eff_sat(src_pk=src_pk, src_dfk=src_dfk, src_sfk=src_sfk, + src_start_date=src_start_date, src_end_date=src_end_date, + src_eff=src_eff, src_ldts=src_ldts, src_source=src_source, + source_model=source_model) }} + ``` === "Inferred Load range" - ```jinja - {{ config(materialized='vault_insert_by_period', timestamp_field='LOAD_DATE', period='day', - date_source_models=var('source_model')) }} - - {{ automate_dv.eff_sat(src_pk=src_pk, src_dfk=src_dfk, src_sfk=src_sfk, - src_start_date=src_start_date, src_end_date=src_end_date, - src_eff=src_eff, src_ldts=src_ldts, src_source=src_source, - source_model=source_model) }} - ``` + Specify `date_source_models` only - the models will be unioned together, and the minimum and maximum dates extracted from the data in the `timestamp_field`. 
+ + === "config level (compatible with dbt Core)" + + ```jinja + {{ config(materialized='vault_insert_by_period', timestamp_field='LOAD_DATE', period='day', + date_source_models=var('source_model')) }} + + {{ automate_dv.eff_sat(src_pk=src_pk, src_dfk=src_dfk, src_sfk=src_sfk, + src_start_date=src_start_date, src_end_date=src_end_date, + src_eff=src_eff, src_ldts=src_ldts, src_source=src_source, + source_model=source_model) }} + ``` + + === "meta level (compatible with dbt Core and Fusion)" + + ```jinja + {{ config(materialized='vault_insert_by_period', + meta={'timestamp_field': 'LOAD_DATE', 'period': 'day', + 'date_source_models': var('source_model')}) }} + + {{ automate_dv.eff_sat(src_pk=src_pk, src_dfk=src_dfk, src_sfk=src_sfk, + src_start_date=src_start_date, src_end_date=src_end_date, + src_eff=src_eff, src_ldts=src_ldts, src_source=src_source, + source_model=source_model) }} + ``` #### Initial/Base Load vs. Incremental Load @@ -154,6 +222,13 @@ materialisation is configured, the start and end of the load will get defined di Please refer to the _Usage_ section above to see examples. +!!! note "Configuration: config level vs. meta level" + The configuration elements (`timestamp_field`, `period`, `start_date`, `stop_date`, `date_source_models`) + can be provided in two ways: + + - **config level** (compatible with dbt Core): Set directly under `config()` + - **meta level** (compatible with dbt Core and Fusion): Place within a `meta` dict under `config()` + #### Configuration Options | Configuration | Description | Type | Default | Required? | @@ -267,14 +342,35 @@ column. 
#### Usage -```jinja -{{ config(materialized='vault_insert_by_rank', rank_column='AUTOMATE_DV_RANK', rank_source_models='MY_STAGE') }} +=== "config level (compatible with dbt Core)" -{{ automate_dv.eff_sat(src_pk=src_pk, src_dfk=src_dfk, src_sfk=src_sfk, - src_start_date=src_start_date, src_end_date=src_end_date, - src_eff=src_eff, src_ldts=src_ldts, src_source=src_source, - source_model=source_model) }} -``` + ```jinja + {{ config(materialized='vault_insert_by_rank', rank_column='AUTOMATE_DV_RANK', rank_source_models='MY_STAGE') }} + + {{ automate_dv.eff_sat(src_pk=src_pk, src_dfk=src_dfk, src_sfk=src_sfk, + src_start_date=src_start_date, src_end_date=src_end_date, + src_eff=src_eff, src_ldts=src_ldts, src_source=src_source, + source_model=source_model) }} + ``` + +=== "meta level (compatible with dbt Core and Fusion)" + + ```jinja + {{ config(materialized='vault_insert_by_rank', + meta={'rank_column': 'AUTOMATE_DV_RANK', 'rank_source_models': 'MY_STAGE'}) }} + + {{ automate_dv.eff_sat(src_pk=src_pk, src_dfk=src_dfk, src_sfk=src_sfk, + src_start_date=src_start_date, src_end_date=src_end_date, + src_eff=src_eff, src_ldts=src_ldts, src_source=src_source, + source_model=source_model) }} + ``` + +!!! note "Configuration: config level vs. meta level" + The configuration elements (`rank_column`, `rank_source_models`) + can be provided in two ways: + + - **config level** (compatible with dbt Core): Set directly under `config()` + - **meta level** (compatible with dbt Core and Fusion): Place within a `meta` dict under `config()` #### Configuration Options diff --git a/mkdocs.yml b/mkdocs.yml index 604a30d..24c9938 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -35,6 +35,7 @@ theme: - announce.dismiss - navigation.top - navigation.indexes + - content.tabs.link nav: - Home: 'index.md'
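Reviewer note: the docs in this diff describe two equivalent places for each option (directly under `config()` or inside a `meta` dict under `config()`), but they do not say how an option is looked up when both are present. The sketch below is one way to picture that lookup, using plain Python dictionaries in place of a dbt model config. It is an illustration only, not AutomateDV's implementation: the `resolve_option` helper and the "config level wins over meta level" precedence are assumptions made for the example.

```python
from typing import Any, Mapping

# Sentinel so that an explicit False/None at config level is still respected.
_MISSING = object()


def resolve_option(model_config: Mapping[str, Any], name: str, default: Any = None) -> Any:
    """Hypothetical helper: look up `name` at config level, then at meta level.

    The precedence (config level first) is an assumption for illustration.
    """
    value = model_config.get(name, _MISSING)
    if value is not _MISSING:
        return value
    # Fall back to the `meta` dict; treat a missing or None `meta` as empty.
    meta = model_config.get("meta") or {}
    return meta.get(name, default)


# Stand-ins for the two styles shown in the docs above.
config_level = {
    "materialized": "vault_insert_by_rank",
    "rank_column": "AUTOMATE_DV_RANK",
    "rank_source_models": "MY_STAGE",
}
meta_level = {
    "materialized": "vault_insert_by_rank",
    "meta": {"rank_column": "AUTOMATE_DV_RANK", "rank_source_models": "MY_STAGE"},
}

for cfg in (config_level, meta_level):
    print(resolve_option(cfg, "rank_column"), resolve_option(cfg, "rank_source_models"))
```

Under these assumptions both styles resolve to the same values, which is the behaviour the "config level vs. meta level" notes imply. If the real precedence differs, it may be worth stating it in those notes.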