@chenkovsky
Contributor

Which issue does this PR close?

Rationale for this change

The UTMSource and UTMCampaign columns are binary in clickbench_partitioned, but they are strings in datafusion/core/tests/data/clickbench_hits_10.parquet.

What changes are included in this PR?

Add a cast to the SQL query so that both columns are passed to levenshtein as strings.
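
To illustrate why a cast resolves the binary/string mismatch, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for DataFusion. The one-row `hits` table and its value are hypothetical, and SQLite's type names differ from DataFusion's (SQLite spells the string type TEXT, not STRING), so this only shows the general idea of casting a binary column before handing it to a string function.

```python
import sqlite3

# SQLite stands in for DataFusion here; the table and its row are made up.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE hits ("UTMSource")')
# Binding a bytes object stores the value as a BLOB (binary).
conn.execute("INSERT INTO hits VALUES (?)", (b"google",))

# Compare the stored type with the type after an explicit cast.
raw_type, cast_type, cast_value = conn.execute(
    'SELECT typeof("UTMSource"), '
    'typeof(CAST("UTMSource" AS TEXT)), '
    'CAST("UTMSource" AS TEXT) '
    "FROM hits"
).fetchone()

print(raw_type)    # blob
print(cast_type)   # text
print(cast_value)  # google
```

The same column is binary before the cast and a string after it, which is the property the query in this PR relies on.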

Are these changes tested?

Yes, by running the clickbench benchmark manually.

Are there any user-facing changes?

No

Member

@Weijun-H Weijun-H left a comment

Thanks @chenkovsky 👍 I tested cargo bench --profile=dev --bench sql_planner -- physical_plan_clickbench_all, and it worked perfectly.

  ELSE 0
  END > 1920 -- Extract and validate resolution parameter
- AND levenshtein("UTMSource", "UTMCampaign") < 3 -- Verify UTM parameter similarity
+ AND levenshtein(CAST("UTMSource" AS STRING), CAST("UTMCampaign" AS STRING)) < 3 -- Verify UTM parameter similarity
Member

@Weijun-H Weijun-H Apr 20, 2025

Suggested change
- AND levenshtein(CAST("UTMSource" AS STRING), CAST("UTMCampaign" AS STRING)) < 3 -- Verify UTM parameter similarity
+ AND levenshtein('UTMSource', 'UTMCampaign') < 3 -- Verify UTM parameter similarity

UPDATE:
Ignore this suggestion.

Contributor Author

@Weijun-H Thank you, but I have a question. Do single quotation marks denote a string literal and double quotation marks a column name? If so, CAST("UTMSource" AS STRING) and 'UTMSource' have different meanings.

Member

Apologies, I misspoke earlier. We can't use 'UTMSource' here because UTMSource is a column name, and single quotes would turn it into a string literal.
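
The quoting rule discussed in this thread can be checked with a small sketch. It uses Python's built-in sqlite3 module rather than DataFusion, and the `hits` table and its values are hypothetical. It assumes only the standard SQL convention that double quotes mark an identifier and single quotes mark a string literal.

```python
import sqlite3

# SQLite stands in for DataFusion here; the table and its row are made up.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE hits ("UTMSource" TEXT, "UTMCampaign" TEXT)')
conn.execute("INSERT INTO hits VALUES ('google', 'spring_sale')")

# Double quotes: an identifier, so the column's stored value is read.
as_column = conn.execute('SELECT "UTMSource" FROM hits').fetchone()[0]

# Single quotes: a string literal, so the text itself comes back.
as_literal = conn.execute("SELECT 'UTMSource' FROM hits").fetchone()[0]

print(as_column)   # google
print(as_literal)  # UTMSource
```

With single quotes, levenshtein would compare the two fixed strings 'UTMSource' and 'UTMCampaign' for every row instead of the column values, which is why the suggestion was withdrawn.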

@xudong963
Member

Thank you

@xudong963 xudong963 merged commit 375189c into apache:main Apr 21, 2025
27 checks passed
@alamb
Contributor

alamb commented Apr 28, 2025

Thank you so much @chenkovsky and @xudong963

nirnayroy pushed a commit to nirnayroy/datafusion that referenced this pull request May 2, 2025
Development

Successfully merging this pull request may close these issues.

Cargo bench --bench sql_planner is failing

4 participants