Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CT-2322/improve contracts error message #7223

Closed
wants to merge 8 commits into from

Conversation

kentkr
Copy link
Contributor

@kentkr kentkr commented Mar 26, 2023

resolves #7209

Description

If a model has a contract error dbt currently prints the yaml columns and sql columns, respectively, as a single string with delimited column names and data type. This is hard to read. The changes made here split the strings and print the yaml and sql columns side by side.

Note that the columns are printed in the order that they are listed in the model or in the yaml file. So columns may not line up correctly. Because of this I decided not to add an extra column that checks if the two columns are the same, a feature requested in the original issue.

I'm not all that familiar with OOP and don't know how self.yaml_columns (and sql columns) is inherited or where it comes from. But surely it can be inherited as something other than a delimited string. If it was a dict ({col : data_type}) it'd be easier to compare and print in an ordered format. That may be more effort than its worth. @jtcohen6 any suggestions?

Checklist

@kentkr kentkr requested a review from a team as a code owner March 26, 2023 03:01
@kentkr kentkr requested review from Fleid and nssalian March 26, 2023 03:01
@cla-bot
Copy link

cla-bot bot commented Mar 26, 2023

Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: Kyle Kent.
This is most likely caused by a git client misconfiguration; please make sure to:

  1. check if your git client is configured with an email to sign commits git config --list | grep email
  2. If not, set it up using git config --global user.email [email protected]
  3. Make sure that the git commit email is configured in your GitHub account settings, see https://github.com/settings/emails

@kentkr
Copy link
Contributor Author

kentkr commented Mar 26, 2023

Just signed the CLA but don't know how to rerun the github check.

@kentkr kentkr force-pushed the CT-2322/improve_contracts_error branch from fa66513 to da35566 Compare March 26, 2023 03:17
@cla-bot
Copy link

cla-bot bot commented Mar 26, 2023

Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: Kyle Kent.
This is most likely caused by a git client misconfiguration; please make sure to:

  1. check if your git client is configured with an email to sign commits git config --list | grep email
  2. If not, set it up using git config --global user.email [email protected]
  3. Make sure that the git commit email is configured in your GitHub account settings, see https://github.com/settings/emails

Copy link
Contributor

@jtcohen6 jtcohen6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kentkr Thanks for picking this up so quickly!! :)

But surely it can be inherited as something other than a delimited string.

Surely it can! I don't think this should be too tricky, and it would let us do much nicer things when formatting the error message, to highlight / show only the diffs.

Re: the CLA bot: unfortunately, you might need to do a quick git rebase where you edit your previous commits to include the same email address that's associated with your GitHub username. If that's too tricky to sort out, no worries - I can see you've signed the CLA - once the changes here are good to go, one of us can take your commits, rebase, and open as a new PR to make the bot happy. (We'd of course still give you all credit for the contribution!)

Comment on lines 2142 to 2145
# print columns pretty
# split yaml and sql strings into list
yaml_list = self.yaml_columns.split(', ')
sql_list = self.sql_columns.split(', ')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah - I didn't realize that these are the stringified versions of yaml_columns and sql_columns, but that is indeed exactly how they're being passed in here (string_yaml_columns and string_sql_columns):

{%- if sql_columns|length != yaml_columns|length -%}
{%- do exceptions.raise_contract_error(string_yaml_columns, string_sql_columns) -%}
{%- endif -%}
{%- for sql_col in sql_columns -%}
{%- set yaml_col = [] -%}
{%- for this_col in yaml_columns -%}
{%- if this_col['name'] == sql_col['name'] -%}
{%- do yaml_col.append(this_col) -%}
{%- break -%}
{%- endif -%}
{%- endfor -%}
{%- if not yaml_col -%}
{#-- Column with name not found in yaml #}
{%- do exceptions.raise_contract_error(string_yaml_columns, string_sql_columns) -%}
{%- endif -%}
{%- if sql_col['formatted'] != yaml_col[0]['formatted'] -%}
{#-- Column data types don't match #}
{%- do exceptions.raise_contract_error(string_yaml_columns, string_sql_columns) -%}
{%- endif -%}
{%- endfor -%}

It is also true that the logic we're using to compare columns one-by-one, and the logic we'd want to use to display pretty output, should be quite similar.

So, a couple of potential approaches here:

  1. Pass the dict (not str) representation of these columns into the Python exception, so that we can do the same sort of comparison over here again
  2. Since the Jinja macro is already detecting differences - pass only the ordered set of disagreeing columns, from the assert_columns_equivalent macro, into this exception. (But what about the case where the two column lengths differ? We'd still want to detect which columns are present in one, and missing from the other)
  3. Move this comparison logic into a Python method, and have assert_columns_equivalent macro call that method instead of writing this logic in Jinja. That method would call this exception if it detects differences.

(1) is probably simplest with the existing implementation - just switch out string_yaml_columns + string_sql_columns for yaml_columns and sql_columns in the macro definition linked above.

@MichelleArk Curious if you have an opinion on what the right way to go here would be!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jtcohen6 Alright, what I just pushed should work. It's not my favorite solution because the columns are actually passed as a list of dictionaries [{col1 : col1 INT}, ...]. I could've removed the list to avoid using so many for loops, but decided mimic the exception call in the .sql file as best I could. Cuz if it aint broke don't fix it. Attached is an image of how the error prints!
Screenshot 2023-04-02 at 6 42 59 PM

As for the rebase issue, if you're happy with these changes I'll open a new PR with the modified files.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jtcohen6 Alright, what I just pushed should work. It's not my favorite solution because the columns are actually passed as a list of dictionaries [{col1 : col1 INT}, ...]. I could've removed the list to avoid using so many for loops, but decided mimic the exception call in the .sql file as best I could. Cuz if it aint broke don't fix it. Attached is an image of how the error prints!
Screenshot 2023-04-02 at 6 42 59 PM

As for the rebase issue, if you're happy with these changes I'll open a new PR with the modified files.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kentkr This is a big improvement!

Sorry for the delay getting back to you. To have a better chance of getting this over the finish line, I'm going to open a new PR with

  • rebased commits to fix the CLA issue
  • a few more aesthetic touch-ups (e.g. highlighting columns with mismatches, and using our existing library for printing markdown tables to stdout)

@kentkr kentkr requested a review from a team as a code owner April 3, 2023 01:45
@cla-bot
Copy link

cla-bot bot commented Apr 3, 2023

Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: Kyle Kent.
This is most likely caused by a git client misconfiguration; please make sure to:

  1. check if your git client is configured with an email to sign commits git config --list | grep email
  2. If not, set it up using git config --global user.email [email protected]
  3. Make sure that the git commit email is configured in your GitHub account settings, see https://github.com/settings/emails

@cla-bot
Copy link

cla-bot bot commented Apr 3, 2023

Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: Kyle Kent.
This is most likely caused by a git client misconfiguration; please make sure to:

  1. check if your git client is configured with an email to sign commits git config --list | grep email
  2. If not, set it up using git config --global user.email [email protected]
  3. Make sure that the git commit email is configured in your GitHub account settings, see https://github.com/settings/emails

@cla-bot
Copy link

cla-bot bot commented Apr 3, 2023

Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: Kyle Kent.
This is most likely caused by a git client misconfiguration; please make sure to:

  1. check if your git client is configured with an email to sign commits git config --list | grep email
  2. If not, set it up using git config --global user.email [email protected]
  3. Make sure that the git commit email is configured in your GitHub account settings, see https://github.com/settings/emails

jtcohen6 added a commit that referenced this pull request Apr 11, 2023
* improved first line of error

* added basic printing of yaml and sql cols as columns

* added changie log

* used listed dictionary as input to match columns

* swapped order of col headers for printing

* used listed dictionary as input to match columns

* removed merge conflict text from file

* Touch-ups

* Update log introspection in functional tests

* Update format_column macro. Case insensitive test

* PR feedback: just data_type, not formatted

---------

Co-authored-by: Kyle Kent <[email protected]>
@jtcohen6
Copy link
Contributor

Incorporated into #7319 !

@jtcohen6 jtcohen6 closed this Apr 11, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[CT-2322] [Feature] Side-by-side diff when an enforced model contract fails
2 participants