-
Notifications
You must be signed in to change notification settings - Fork 496
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Parameterize union_relations macro to support deduplication with UNION #816
Comments
This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please comment on the issue or else it will be closed in 7 days. |
Although we are closing this issue as stale, it's not gone forever. Issues can be reopened if there is renewed community interest. Just add a comment to notify the maintainers. |
I'm interested in this feature as well! Anything I can do to help move it along? |
What do you think about the ergonomics of each of the options below? Option 1: new boolean parameter named Option 2: a new string parameter named Option 3 (like in #819): a new boolean parameter named ExamplesOption 1: {{ dbt_utils.union_relations(
relations=[ref('my_model'), source('my_source', 'my_table')],
distinct=false
) }} Option 2: {{ dbt_utils.union_relations(
relations=[ref('my_model'), source('my_source', 'my_table')],
union='distinct'
) }} Option 3: {{ dbt_utils.union_relations(
relations=[ref('my_model'), source('my_source', 'my_table')],
deduplicate=true
) }} |
Option 2 is closest to my initial thought! I had imagined a new string parameter named |
@dbeatty10, mind if I give option 2 a try? |
Describe the feature
By default, the
union_relations
macro performs aUNION ALL
. However, I think the macro should also support deduplication of rows usingUNION
dbt-utils/macros/sql/union.sql
Line 1 in a17d66f
Describe alternatives you've considered
Users could perform a
SELECT DISTINCT *
after theUNION ALL
in their model which would achieve the same functionality. However, I think it more ideal to include the option as a parameter since it is supported by SQL.Additional context
N/A
Who will this benefit?
While it is better practice to dedupe in staging models, teams may want to use a quick
UNION
for prototyping non-productionized code.Are you interested in contributing this feature?
Yes! I would be happy to make a PR with the change and associated tests.
Tasks
The text was updated successfully, but these errors were encountered: