Add dialect param to use CHAR for Utf8 unparsing for MySQL #16


sgrebnov

Which issue does this PR close?

This PR fixes the Utf8 unparser emitting CAST(col AS TEXT), which is invalid SQL in MySQL.

Rationale for this change

MySQL's CAST function does not accept TEXT as a target type. String casts must use CHAR, and MySQL then chooses the result type (VARCHAR, TEXT, or LONGTEXT) automatically. From the MySQL documentation:

CHAR[(N)] [charset_info]
Produces a string with the VARCHAR data type, unless the expression expr is empty (zero length), in which case the result type is CHAR(0). If the optional length N is given, CHAR(N) causes the cast to use no more than N characters of the argument. No padding occurs for values shorter than N characters. If the optional length N is not given, MySQL calculates the maximum length from the expression. If the supplied or calculated length is greater than an internal threshold, the result type is TEXT. If the length is still too long, the result type is LONGTEXT.

Example query:

 select * from customer where c_custkey = 'building'

Before this change (fails in MySQL)

SELECT `customer`.`c_custkey`, ... FROM `customer` WHERE (CAST(`customer`.`c_custkey` AS TEXT) = 'building')

After this change (works in MySQL)

SELECT `customer`.`c_custkey`,  ... FROM `customer` WHERE (CAST(`customer`.`c_custkey` AS CHAR) = 'building')

What changes are included in this PR?

This PR introduces a configurable use_char_for_utf8_cast dialect parameter that controls whether CHAR or TEXT/VARCHAR is used for Utf8 unparsing.

Are these changes tested?

Yes, added unit tests + manual testing

Are there any user-facing changes?

CustomDialectBuilder now supports use_char_for_utf8_cast, which specifies whether the CHAR or the TEXT/VARCHAR data type should be used for Utf8 unparsing.
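
To illustrate how a flag like use_char_for_utf8_cast can steer the generated SQL, here is a minimal standalone sketch. It does not use the project's actual types: `SketchDialect`, `SketchDialectBuilder`, `with_use_char_for_utf8_cast`, and `unparse_utf8_cast` are hypothetical names made up for this example, and the non-CHAR branch assumes TEXT as the default target, as in the "before" query above.

```rust
/// Hypothetical stand-in for a SQL dialect (not the project's real type).
#[derive(Debug, Clone, Copy, Default)]
struct SketchDialect {
    /// When true, Utf8 casts are written with CHAR, as MySQL requires.
    use_char_for_utf8_cast: bool,
}

/// Hypothetical builder mirroring the builder-style configuration
/// described in the PR; the method name is an assumption.
#[derive(Debug, Default)]
struct SketchDialectBuilder {
    use_char_for_utf8_cast: bool,
}

impl SketchDialectBuilder {
    fn new() -> Self {
        Self::default()
    }

    /// Choose CHAR (true) or the default TEXT (false) for Utf8 casts.
    fn with_use_char_for_utf8_cast(mut self, value: bool) -> Self {
        self.use_char_for_utf8_cast = value;
        self
    }

    fn build(self) -> SketchDialect {
        SketchDialect {
            use_char_for_utf8_cast: self.use_char_for_utf8_cast,
        }
    }
}

/// Render a cast of `expr` to a string type for the given dialect.
/// `expr` is assumed to be an already-quoted SQL expression.
fn unparse_utf8_cast(dialect: &SketchDialect, expr: &str) -> String {
    let target = if dialect.use_char_for_utf8_cast {
        "CHAR"
    } else {
        "TEXT"
    };
    format!("CAST({} AS {})", expr, target)
}
```

With the flag enabled through the builder, the sketch renders the cast as CAST(... AS CHAR). A default dialect keeps CAST(... AS TEXT), so existing non-MySQL output would be unchanged under this assumption.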

@sgrebnov
Author

Closing in favor of apache#11494

@sgrebnov sgrebnov closed this Jul 16, 2024
@sgrebnov sgrebnov self-assigned this Jul 19, 2024
This pull request was closed.