concat macro can use alternative_concat on Postgres too #296

Merged (2 commits) on Apr 5, 2021


@ChristopheDuong ChristopheDuong commented Nov 16, 2020

This is a:

  • bug fix PR with no breaking changes (please change the base branch to main)
  • new functionality
  • a breaking change

Description & motivation

On Postgres, I have a model that needs to produce a surrogate key (a hash) over a large number of columns.
Unfortunately, it breaks because Postgres does not allow more than 100 arguments to be passed to a function.

The current dbt_utils.surrogate_key calls dbt_utils.concat, which uses the concat() function and passes it the full list of arguments.

However, Postgres also supports string concatenation with the binary operator ||, as we already use on Redshift and Snowflake. Adding a Postgres adapter version to the dispatched dbt_utils.concat macro would avoid crashes when concatenating more than 100 strings.
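
To illustrate the difference, here is a minimal Python sketch of the two SQL shapes the macro could render. It is not the dbt_utils implementation (the real macro is Jinja); the function names and the `col_N` column names are hypothetical, and the limit constant simply restates the 100-argument limit described above.

```python
from typing import Sequence

# Limit described in this PR: Postgres rejects function calls
# with more than 100 arguments.
POSTGRES_MAX_FUNCTION_ARGS = 100


def concat_with_function(fields: Sequence[str]) -> str:
    """Render concat(a, b, ...): one function call, one argument per field."""
    return "concat(" + ", ".join(fields) + ")"


def concat_with_operator(fields: Sequence[str]) -> str:
    """Render a || b || ...: chained binary operators, no argument list."""
    return " || ".join(fields)


def exceeds_function_arg_limit(fields: Sequence[str]) -> bool:
    """True when the concat() form would pass too many arguments on Postgres."""
    return len(fields) > POSTGRES_MAX_FUNCTION_ARGS


# Hypothetical wide model with 150 columns feeding a surrogate key.
columns = [f"col_{i}" for i in range(150)]
print(exceeds_function_arg_limit(columns))
print(concat_with_operator(columns[:3]))
```

One caveat when switching forms: on Postgres, concat() ignores NULL arguments, whereas || returns NULL if any operand is NULL, so callers may need to coalesce each field first.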

Checklist

  • I have verified that these changes work locally on the following warehouses (Note: it's okay if you do not have access to all warehouses, this helps us understand what has been covered)
    • BigQuery
    • Postgres
    • Redshift
    • Snowflake
  • I have updated the README.md (if applicable)
  • I have added tests & descriptions to my models (and macros if applicable)
  • I have added an entry to the changelog


@clrcrl clrcrl left a comment


So it turns out that Postgres has supported this for a long time, and BigQuery has supported it since March.

I think we should just rip out the "alternative" syntax and make || the default. Though, @jtcohen6, I'm curious whether you think that might break any other non-core adapters (e.g. does Spark support ||?)

@clrcrl force-pushed the dev/0.7.0 branch 5 times, most recently from bbba960 to 60a3b3c (January 11, 2021 15:52)
@clrcrl merged commit dd7e24c into dbt-labs:dev/0.7.0 (Apr 5, 2021)
@clrcrl mentioned this pull request (May 19, 2021)
jtcohen6 pushed a commit that referenced this pull request Jun 6, 2021
* Postgres also have an alternative concat binary operation (#296)

* Update default implementation of concat macro

Co-authored-by: Christophe Duong <[email protected]>
jtcohen6 added a commit that referenced this pull request Jun 6, 2021
* Tidy up changelog

* Add 0.7.0 entry to changelog

* Add order_by argument to get_column_values (#349)

* Add slugify macro to utils, use in pivot macro (#314)

* 0.20.0 compatibility (#371)

* Explicitly redefine Redshift -> default

* Upgrade generic tests

* Rm namespaces macro. New dispatch syntax

* Run tests with 0.20.0rc1

* Update changelog, readme

Co-authored-by: Jeremy Cohen <[email protected]>

* Simplify concat (#373)

* Postgres also have an alternative concat binary operation (#296)

* Update default implementation of concat macro

Co-authored-by: Christophe Duong <[email protected]>

Co-authored-by: Jeremy Cohen <[email protected]>
Co-authored-by: Christophe Duong <[email protected]>