-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CT-2322/improve contracts error message #7223
Conversation
Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: Kyle Kent.
|
Just signed the CLA but don't know how to rerun the github check. |
fa66513
to
da35566
Compare
Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: Kyle Kent.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kentkr Thanks for picking this up so quickly!! :)
But surely it can be inherited as something other than a delimited string.
Surely it can! I don't think this should be too tricky, and it would let us do much nicer things when formatting the error message, to highlight / show only the diffs.
Re: the CLA bot: unfortunately, you might need to do a quick git rebase
where you edit your previous commits to include the same email address that's associated with your GitHub username. If that's too tricky to sort out, no worries - I can see you've signed the CLA - once the changes here are good to go, one of us can take your commits, rebase, and open as a new PR to make the bot happy. (We'd of course still give you all credit for the contribution!)
core/dbt/exceptions.py
Outdated
# print columns pretty | ||
# split yaml and sql strings into list | ||
yaml_list = self.yaml_columns.split(', ') | ||
sql_list = self.sql_columns.split(', ') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah - I didn't realize that these are the stringified versions of yaml_columns
and sql_columns
, but that is indeed exactly how they're being passed in here (string_yaml_columns
and string_sql_columns
):
dbt-core/core/dbt/include/global_project/macros/materializations/models/table/columns_spec_ddl.sql
Lines 45 to 65 in 73ff497
{%- if sql_columns|length != yaml_columns|length -%} | |
{%- do exceptions.raise_contract_error(string_yaml_columns, string_sql_columns) -%} | |
{%- endif -%} | |
{%- for sql_col in sql_columns -%} | |
{%- set yaml_col = [] -%} | |
{%- for this_col in yaml_columns -%} | |
{%- if this_col['name'] == sql_col['name'] -%} | |
{%- do yaml_col.append(this_col) -%} | |
{%- break -%} | |
{%- endif -%} | |
{%- endfor -%} | |
{%- if not yaml_col -%} | |
{#-- Column with name not found in yaml #} | |
{%- do exceptions.raise_contract_error(string_yaml_columns, string_sql_columns) -%} | |
{%- endif -%} | |
{%- if sql_col['formatted'] != yaml_col[0]['formatted'] -%} | |
{#-- Column data types don't match #} | |
{%- do exceptions.raise_contract_error(string_yaml_columns, string_sql_columns) -%} | |
{%- endif -%} | |
{%- endfor -%} |
It is also true that the logic we're using to compare columns one-by-one, and the logic we'd want to use to display pretty output, should be quite similar.
So, a couple of potential approaches here:
- Pass the dict (not str) representation of these columns into the Python exception, so that we can do the same sort of comparison over here again
- Since the Jinja macro is already detecting differences - pass only the ordered set of disagreeing columns, from the
assert_columns_equivalent
macro, into this exception. (But what about the case where the two column lengths differ? We'd still want to detect which columns are present in one, and missing from the other) - Move this comparison logic into a Python method, and have
assert_columns_equivalent
macro call that method instead of writing this logic in Jinja. That method would call this exception if it detects differences.
(1) is probably simplest with the existing implementation - just switch out string_yaml_columns
+ string_sql_columns
for yaml_columns
and sql_columns
in the macro definition linked above.
@MichelleArk Curious if you have an opinion on what the right way to go here would be!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jtcohen6 Alright, what I just pushed should work. It's not my favorite solution because the columns are actually passed as a list of dictionaries [{col1 : col1 INT}, ...]
. I could've removed the list to avoid using so many for loops, but decided mimic the exception call in the .sql file as best I could. Cuz if it aint broke don't fix it. Attached is an image of how the error prints!
As for the rebase issue, if you're happy with these changes I'll open a new PR with the modified files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jtcohen6 Alright, what I just pushed should work. It's not my favorite solution because the columns are actually passed as a list of dictionaries [{col1 : col1 INT}, ...]
. I could've removed the list to avoid using so many for loops, but decided mimic the exception call in the .sql file as best I could. Cuz if it aint broke don't fix it. Attached is an image of how the error prints!
As for the rebase issue, if you're happy with these changes I'll open a new PR with the modified files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kentkr This is a big improvement!
Sorry for the delay getting back to you. To have a better chance of getting this over the finish line, I'm going to open a new PR with
- rebased commits to fix the CLA issue
- a few more aesthetic touch-ups (e.g. highlighting columns with mismatches, and using our existing library for printing markdown tables to stdout)
Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: Kyle Kent.
|
Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: Kyle Kent.
|
Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: Kyle Kent.
|
* improved first line of error * added basic printing of yaml and sql cols as columns * added changie log * used listed dictionary as input to match columns * swapped order of col headers for printing * used listed dictionary as input to match columns * removed merge conflict text from file * Touch-ups * Update log introspection in functional tests * Update format_column macro. Case insensitive test * PR feedback: just data_type, not formatted --------- Co-authored-by: Kyle Kent <[email protected]>
Incorporated into #7319 ! |
resolves #7209
Description
If a model has a contract error dbt currently prints the yaml columns and sql columns, respectively, as a single string with delimited column names and data type. This is hard to read. The changes made here split the strings and print the yaml and sql columns side by side.
Note that the columns are printed in the order that they are listed in the model or in the yaml file. So columns may not line up correctly. Because of this I decided not to add an extra column that checks if the two columns are the same, a feature requested in the original issue.
I'm not all that familiar with OOP and don't know how
self.yaml_columns
(and sql columns) is inherited or where it comes from. But surely it can be inherited as something other than a delimited string. If it was a dict ({col : data_type}
) it'd be easier to compare and print in an ordered format. That may be more effort than its worth. @jtcohen6 any suggestions?Checklist
changie new
to create a changelog entry