Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CT-3208] [Feature] Cross-database date macro #192

Closed
3 tasks done
Tracked by #10075
dbeatty10 opened this issue Oct 11, 2023 · 6 comments · Fixed by #191
Closed
3 tasks done
Tracked by #10075

[CT-3208] [Feature] Cross-database date macro #192

dbeatty10 opened this issue Oct 11, 2023 · 6 comments · Fixed by #191
Assignees
Labels

Comments

@dbeatty10
Copy link
Contributor

dbeatty10 commented Oct 11, 2023

Is this your first time submitting a feature request?

  • I have read the expectations for open source contributors
  • I have searched the existing issues, and I could not find an existing issue for this feature
  • I am requesting a straightforward extension of existing dbt functionality, rather than a Big Idea better suited to a discussion

Describe the feature

To support a built-in metricflow_time_spine in dbt-labs/dbt-core#8825, we want to create a to_date macro that converts an ISO 8601-formatted date string into a SQL DATE type.

Rationale for the name to_date

Even though to_date isn't within the SQL standard, databricks, postgres, redshift, and snowflake all have a to_date function that does what we want. Although bigquery is an outlier, nothing we can't solve with a little dispatch magic ✨

Prototype of to_date()

Assuming to_date() is a cross-database macro that takes an ISO 8601 (YYYY-MM-DD) date string as input, here's a completely untested prototype for dbt-postgres:

{% macro to_date(date_str) %}
  {{ return(adapter.dispatch('to_date', 'dbt') (date_str)) }}
{% endmacro %}

{% macro default__to_date(date_str) -%}
    to_date({{ dbt.string_literal(date_str) }})
{%- endmacro %}

Pulling it all together for metricflow_time_spine

The cross-database Jinja template might look like this:

select cast(date_day as date) as date_day
from ({{ dbt.date_spine("day", dbt.to_date("2023-09-01"), dbt.to_date("2023-09-10")) }})

Appendix

Validation and error checking

If we want, we could always add some format validation to the default implementation of to_date() by using the datetime module:

    {%- set dt = modules.datetime.datetime.strptime(date_str, "%Y-%m-%d") -%}
    ...

dbt-bigquery

The implementation for Bigquery may or may not be simple 🤷 If all else fails, something like this might work (completely untested!):

{% macro bigquery__to_date(date_str) -%}
    {%- set dt = modules.datetime.datetime.strptime(date_str, "%Y-%m-%d") -%}
    date({{ dt.year }}, {{ dt.month }}, {{ dt.day }})
{%- endmacro %}

type_date macro

We may (or may not) want to also create a cross-database type_date macro (which doesn't exist today). I haven't seen any database that doesn't call this data type DATE, so that makes it either easy-peasy or extraneous depending how you look at it.

Describe alternatives you've considered

I can't think of a good alternative to creating a cross-database macro that converts a SQL string into a SQL date.

Who will this benefit?

This supports a built-in metricflow_time_spine in dbt-labs/dbt-core#8825, we want a to_date macro that converts an ISO 8601-formatted date string into a SQL DATE type.

Are you interested in contributing this feature?

No response

Anything else?

For refinement

Theoretically, we could choose to add an optional 2nd parameter that specifies the format to use. e.g. something like mm/dd/yyyy.

There are two options for the supported formats for that parameter:

  1. Python datetime format codes
  2. SQL-native format specifications

If we add this 2nd parameter, I would advocate for using the Python format codes since they would give us solid ground to stand on. Then from there, we could convert the resulting datetime using the format specifications that are specific to each adapter as-needed.

I suspect that we could be in a world of hurt if we choose to use SQL-native format specifications instead, because they may differ on a per-adapter basis. I don't know this for sure, just a suspicion.

@dbeatty10 dbeatty10 added enhancement New feature or request triage labels Oct 11, 2023
@github-actions github-actions bot changed the title [Feature] Cross-database to_date macro [CT-3208] [Feature] Cross-database to_date macro Oct 11, 2023
@alison985
Copy link

alison985 commented Oct 25, 2023

FWIW, I had a lot of problems with dates and timestamps(and the lack of :: for casting in general) trying to make cross-dialect compatible code for Redshift and SQL Server. I ended up making this macro:

{%- macro   cross_cast(field_name, data_type) -%}
    {%- if field_name == 'getdate()' -%}
        {{ field_name }}
    {%- elif field_name == 'current_current' -%}
        {{ get_non_string_variable('current_timestamp') }}
    {%- elif data_type in ('datetime', 'timestamp') -%} cast({{ field_name }} as   
        {%- if env_var('DBT_HOSTING') == 'dbt_cloud' %} timestamp{% elif env_var('DBT_HOSTING') == 'local' %} datetime{% else %} timestamp{%- endif -%}
        )
    {%- else -%} cast({{ field_name }} as {{ data_type }})
    {%- endif -%}
{%- endmacro -%}

Examples passed in:

  • {{ cross_cast('null', 'date') }}
  • {{ cross_cast('null', 'timestamp') }}
  • {{ cross_cast('"START__1"', 'date') }}
  • {{ cross_cast('LIMIT', 'double precision') }}

dbt_cloud == redshift. local == SQL Server. get_non_string_variable is my way around no jinja allowed in vars. I never cleaned cross-cast up to be PR ready, but it does work for these 2 dialects. Again, only FWIW, which could be 0.

cc: @dataders and @schlich

Edit to add:

  • note about what get_non_string_variable is
  • And yes I did try dispatch options first, including tsql-utils. There were a lot of things I would have had to hack/patch in either db-sqlserver or tsql-utils to handle my cases. This was a LOT easier and I kept modifying it as I went.

@dbeatty10
Copy link
Contributor Author

dbeatty10 commented Jan 31, 2024

After discussion with @graciegoheen and @Jstein77, we'd like to enable usage like this (with 3 parameters):

{{ dbt.to_date(2023, 10, 4) }}

Putting it all together:

select cast(date_day as date) as date_day
from ({{ dbt.date_spine("day", dbt.to_date(2023, 5, 11), dbt.to_date(2023, 9, 17)) }})

We might be able to just re-use the definition of the date macro from dbt_date and rename it to to_date.

@dbeatty10
Copy link
Contributor Author

After thinking about it a little more, I think we should choose between the names to_date or date depending on the number of arguments and their type(s).

If three arguments, each integers:

date(year, month, day)

If two arguments, both strings:

to_date(expr [, fmt] )

The former would line up with these:

The latter would line up with these:

That way, we can avoid confusion with any of these pre-existing function names.

@martynydbt
Copy link
Contributor

@dbeatty10 need a 🍐 on this or are you good flying solo?

@FishtownBuildBot
Copy link
Collaborator

Opened a new issue in dbt-labs/docs.getdbt.com: dbt-labs/docs.getdbt.com#5427

@dbeatty10 dbeatty10 transferred this issue from dbt-labs/dbt-core May 2, 2024
@dbeatty10 dbeatty10 changed the title [CT-3208] [Feature] Cross-database to_date macro [CT-3208] [Feature] Cross-database date macro May 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants