fix: Edge case with metric not getting quoted in sort by when normalize_columns is enabled #33337
I've completed my review and didn't find any issues.
Files scanned
| File Path | Reviewed |
|---|---|
| superset/models/helpers.py | ✅ |
This mostly changes the order of evaluation (checking whether the column is a metric first). No tests failed, but I'm not 100% sure whether this order of evaluation matters. @betodealmeida @eschutho @mistercrunch @michael-s-molina @villebro any thoughts?
Tagging @hughhhh, as I believe you worked on this logic in the past.
mistercrunch
left a comment
LGTM, though overall this whole section of the code is rough. I had to do a fair amount of thinking/guessing to understand why the evaluation order of the elifs matters here...
    elif col in columns_by_name:
        col = self.convert_tbl_column_to_sqla_col(
            columns_by_name[col], template_processor=template_processor
        )
I'm basically just changing the order to first evaluate the column against metrics, and then across columns.
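To illustrate why the ordering matters, here is a minimal sketch, not the actual code in superset/models/helpers.py. The function `resolve_orderby`, the `Resolved` class, and the `metrics_first` flag are hypothetical names introduced only for this example; it assumes a name can appear in both the metric lookup and the column lookup.

```python
from dataclasses import dataclass


@dataclass
class Resolved:
    kind: str  # "metric", "column", or "unknown"
    name: str


def resolve_orderby(col, metrics_by_name, columns_by_name, metrics_first=True):
    """Hypothetical stand-in for the if/elif chain discussed above.

    Whichever lookup is checked first wins when the same key exists
    in both the metrics and the columns of the dataset.
    """
    checks = [("metric", metrics_by_name), ("column", columns_by_name)]
    if not metrics_first:
        # Mimics the previous ordering: columns are checked before metrics.
        checks.reverse()
    for kind, lookup in checks:
        if col in lookup:
            return Resolved(kind, col)
    return Resolved("unknown", col)


# A metric and a dataset column that share the same key.
metrics = {"count": "COUNT(*)"}
columns = {"count": "count"}

old_order = resolve_orderby("count", metrics, columns, metrics_first=False)
new_order = resolve_orderby("count", metrics, columns, metrics_first=True)
print(old_order.kind, new_order.kind)
```

With the columns-first ordering the shared key is treated as a plain column, so it would go down the column conversion path instead of the metric path.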
My understanding of the issue is that, with the current if/elif ordering, we first check col against columns_by_name, which is true (even though it's a metric). convert_tbl_column_to_sqla_col() then does col = sa.column(tbl_column.column_name, type_=type_), which seems to return a quoted name if the column is uppercase (hence the issue does not happen with normalize_columns disabled):
    import sqlalchemy as sa
    print(sa.column("test", sa.String))
    print(sa.column("TEST", sa.String))

Output:

    test
    "TEST"

With normalize_columns, we send a lowercase column label to sa.column(), which then does not quote it.
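The quoting behavior in the output above can be approximated with a small pure-Python sketch. This is a deliberate simplification and an assumption on my part: it only models the "name is not all lowercase" case, while SQLAlchemy's real rules also cover reserved words and illegal characters. `needs_quotes` and `render` are hypothetical helpers, not SQLAlchemy APIs.

```python
def needs_quotes(name: str) -> bool:
    # Simplified assumption: a name that differs from its lowercase form
    # is treated as case sensitive and therefore gets quoted.
    return name != name.lower()


def render(name: str) -> str:
    # Wrap the identifier in double quotes only when the rule above applies.
    return f'"{name}"' if needs_quotes(name) else name


for label in ("test", "TEST"):
    print(render(label))
```

Under this rule a lowercase label, such as one produced with normalize_columns enabled, is emitted without quotes.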
…ze_columns is enabled (apache#33337)
…ze_columns is enabled (apache#33337) (cherry picked from commit 9f0ae77)
SUMMARY
When using Snowflake, it's possible to enable the normalize_columns setting at the dataset level to have all columns as lowercase. With this setting enabled, if you use a metric in the chart that has the same key as a column that exists in the dataset (but is not used in the chart), you get a SQL error, because the metric is used for sorting by default and the metric name won't be quoted. Superset would generate:
As opposed to:
This only happens with this setting enabled. If you disable it and have the columns in uppercase, the exact same chart configuration works.
BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
No UI changes.
TESTING INSTRUCTIONS
normalize_columns on the dataset.
ORDER BY statement.
ADDITIONAL INFORMATION