Redshift unit test limitation #5508

Merged
merged 9 commits into from
May 15, 2024
2 changes: 1 addition & 1 deletion website/docs/docs/build/unit-tests.md
@@ -26,8 +26,8 @@ With dbt Core v1.8 and dbt Cloud environments that opt to "Keep on latest versio
- You must specify all fields in a BigQuery STRUCT in a unit test. You cannot use only a subset of fields in a STRUCT.
- If your model has multiple versions, by default the unit test will run on *all* versions of your model. Read [unit testing versioned models](#unit-testing-versioned-models) for more information.
- Unit tests must be defined in a YML file in your `models/` directory.
- Available to dbt Cloud customers who have selected ["Keep on latest version"](/docs/dbt-versions/upgrade-dbt-version-in-cloud#keep-on-latest-version) and dbt Core v1.8.0 or later.
- Table names must be [aliased](/docs/build/custom-aliases) in order to unit test `join` logic.
- Redshift customers need to be aware of a [limitation when building unit tests](/reference/resource-configs/redshift-configs#unit-test-limitations) that requires a workaround.

Read the [reference doc](/reference/resource-properties/unit-tests) for more details about formatting your unit tests.

85 changes: 85 additions & 0 deletions website/docs/reference/resource-configs/redshift-configs.md
@@ -257,3 +257,88 @@ The workaround is to execute `DROP MATERIALIZED VIEW my_mv CASCADE` on the data
</VersionBlock>

</VersionBlock>

<VersionBlock firstVersion="1.8">

## Unit test limitations

Unit tests aren't supported on Redshift if the SQL in the common table expression (CTE) contains functions such as `LISTAGG`, `MEDIAN`, or `PERCENTILE_CONT`. Redshift requires these functions to be executed against at least one user-created table, but dbt supplies the given rows as part of a CTE, which Redshift doesn't support for these functions. You can reproduce the limitation with the following SQL:

```sql

-- Fails on Redshift: LISTAGG reads from a CTE, not a user-created table
create temporary table "test_tmpxxxxx" as (
    with test_fixture as (
        select
            cast(1000 as integer) as id,
            cast('menu1' as character varying(500)) as name,
            cast(1 as integer) as quantity
        union all
        select
            cast(1001 as integer) as id,
            cast('menu2' as character varying(500)) as name,
            cast(1 as integer) as quantity
        union all
        select
            cast(1003 as integer) as id,
            cast('menu1' as character varying(500)) as name,
            cast(1 as integer) as quantity
    ),
    agg as (
        select
            LISTAGG(name || ' x ' || quantity, ',') as option_name_list,
            id
        from test_fixture
        group by id
    )
    select * from agg
);

```
This results in the error:
Collaborator

Suggested change
- This results in the error:
+ The previous code results in the error:

Contributor Author

@matthewshaver matthewshaver May 14, 2024

Could going from "The following query" to the code example, to "the previous code" be a little redundant?

Collaborator

@matthewshaver how about "This query results in the error:"

Contributor Author

Works for me!


```bash

[XX000] ERROR: One or more of the used functions must be applied on at least one user created tables. Examples of user table only functions are LISTAGG, MEDIAN, PERCENTILE_CONT, etc

```

However, the following query works as expected:

```sql

-- Step 1: create the given rows as a temporary (user-created) table
create temporary table "test_tmp1234" as (
    select
        cast(1000 as integer) as id,
        cast('menu1' as character varying(500)) as name,
        cast(1 as integer) as quantity
    union all
    select
        cast(1001 as integer) as id,
        cast('menu2' as character varying(500)) as name,
        cast(1 as integer) as quantity
    union all
    select
        cast(1000 as integer) as id,
        cast('menu1' as character varying(500)) as name,
        cast(1 as integer) as quantity
);

-- Step 2: run LISTAGG against the temporary table
with agg as (
    select
        LISTAGG(name || ' x ' || quantity, ',') as option_name_list,
        id
    from test_tmp1234
    group by id
)
select * from agg;

```

If you first create all given rows as a temporary table, the test runs successfully when the query refers to that table.

In short, separate the unit test into two steps:
1. Prepare the test fixtures by creating temporary tables.
2. Run the unit test query against those temporary tables.
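
As a rough sketch of the same two steps applied to another of the affected functions, the following example uses `MEDIAN` instead of `LISTAGG`. The table name `test_fixture_tmp` and the sample rows are hypothetical and chosen only for illustration. This is not the SQL that dbt generates, and it assumes `MEDIAN` behaves like `LISTAGG` with respect to this limitation, as the error message suggests.

```sql
-- Step 1: prepare the test fixture as a temporary table
-- (hypothetical table name and rows, for illustration only)
create temporary table test_fixture_tmp as (
    select
        cast(1000 as integer) as id,
        cast(2 as integer) as quantity
    union all
    select
        cast(1000 as integer) as id,
        cast(4 as integer) as quantity
    union all
    select
        cast(1001 as integer) as id,
        cast(1 as integer) as quantity
);

-- Step 2: run the query under test against the temporary table
select
    id,
    median(quantity) as median_quantity
from test_fixture_tmp
group by id;
```

Because the aggregate reads from a user-created temporary table rather than a CTE of literal rows, this form should avoid the `XX000` error described earlier.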

</VersionBlock>
