Default quoting needs to be smarter #515

seancorfield · 2023-11-17T06:43:22Z

Derived from discussion around PR #514

I get this. The thought behind this PR was essentially building on top the existing default (don't quote anything), but honeysql will still quote based on this regex ^[A-Za-z0-9_]_+.

It only does that if :quoted nil. If you explicitly specify :quoted false it will turn quoting off. The behavior is:

:quoted true -- always quote using the dialect's stropping
:quoted nil -- default: only quote non-alphanumeric entities
:quoted false -- do not quote anything

I gather from your response that the problem you want to solve here is that:

"always quoting" makes things case-sensitive and you don't want that (because you're working from existing SQL that would fail if entities were quoted -- because it already doesn't match the case of tables/columns in the database)
and the existing "quote non-alphanumerics" isn't enough because you have columns in the existing SQL that are reserved words in your new SQL database (that presumably weren't in the old SQL database)
and you can't use the "never quote" approach either due to the reserved words issue

Is that an accurate expression of the problem?

(yes)

Part of the issue is that different dialects have different "must quote" rules. 2023_11_16 is perfectly acceptable to some databases, without quoting, but would require quoting for others. The same applies to the list of reserved words that would need to be quoted when used as column names.

A wrinkle in this is DDL that uses things like [:varchar :80] and expects that to produce VARCHAR(80) without quoting "80". So, either known DDL contexts must know to override some of the rules (*ddl-no-quote* perhaps?) or it's completely "caveat programmer" and if we let users provide arbitrary overriding rules, and it breaks their DDL, well then too bad, too sad?

The text was updated successfully, but these errors were encountered:

p-himik · 2023-11-17T07:17:33Z

A wrinkle in this is DDL that uses things like [:varchar :80]

Why not [:varchar 80]?
Might be my personal thing but :80 looks icky - much more icky than :a.b even though both aren't explicitly allowed.

seancorfield · 2023-11-17T07:19:44Z

For good or bad, it's existing documented behavior. [:varchar 80] in any other context would become VARCHAR(?) with a parameter of 80 -- which doesn't fly for DDL.

So there are backward compatibility issues here in spades...

kranstrom · 2023-11-20T07:40:42Z

Would it be acceptable to "hardcode" the different data types that expect a numeric input? (e.g. char, varchar, char varying, number, numeric, etc.)

I know honeysql is meant to be flexible and there are many database providers out there that it's already compatible with (we are using several ourselves). However, it seems to be quite beneficial for the current function to separate between knowing whether it's assessing one of these data type examples and anything else that's a quotable identifier.

Happy to conjure up an example in a fork and link it if that's the case. Anyway I can be helpful here - let me know. Thanks

seancorfield · 2023-11-20T18:32:05Z

DDL is out of scope for this issue, so let's stay on topic. I'll just note that the "function-like" behavior in DDL just falls out of general function "syntax" in HoneySQL and is not explicitly programmed at all (because DDL is just too large and irregular a space across multiple databases). If you want to suggest changes to DDL generation, feel free to create a separate issue.

this is a partial solution, intended to catch (and quote) things like `0abc` while not changing the behavior for `80` or `2023_11_20`

seancorfield · 2023-11-20T19:59:55Z

So far, I have updated the code to exclude identifiers like 0abc from the "OK not to quote" rule -- so they will now get quoted -- but 80 and 2023_11_16 will still be "OK not to quote" and be left as-is.

As far as addressing the concerns expressed above and in the PR, I'm still leaning toward an additional, optional regex flag that matches identifies that you want quoted even if they satisfy the existing "OK not to quote" rule. That will definitely come with some caveats: if you provide a regex that matches things like 80 then you'll "break" the DDL behavior mentioned above (at least until that gets changed).

kranstrom · 2023-11-20T20:44:34Z

Thanks for the work/input here Sean.

As far as addressing the concerns expressed above and in the PR, I'm still leaning toward an additional, optional regex flag that matches identifies that you want quoted even if they satisfy the existing "OK not to quote" rule.

So just so I understand, if I (for example) wanted to quote reserved words - I'd pass those in as regex? I do like that suggestion fwiw.

seancorfield · 2023-11-20T20:47:29Z

That is my current thinking, yes.

kranstrom · 2023-12-04T11:27:42Z

Beautiful @seancorfield , thank you very much!

seancorfield added a commit that referenced this issue Nov 20, 2023

#515: exclude some entities from smart quoting

f46dbc5

this is a partial solution, intended to catch (and quote) things like `0abc` while not changing the behavior for `80` or `2023_11_20`

seancorfield self-assigned this Nov 25, 2023

seancorfield added the needs analysis I need to think about this! label Nov 25, 2023

seancorfield closed this as completed in 1e44f82 Dec 4, 2023

seancorfield added a commit that referenced this issue Dec 4, 2023

add test for #515

2e81570

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Default quoting needs to be smarter #515

Default quoting needs to be smarter #515

seancorfield commented Nov 17, 2023

p-himik commented Nov 17, 2023

seancorfield commented Nov 17, 2023

kranstrom commented Nov 20, 2023

seancorfield commented Nov 20, 2023

seancorfield commented Nov 20, 2023

kranstrom commented Nov 20, 2023

seancorfield commented Nov 20, 2023

kranstrom commented Dec 4, 2023

Default quoting needs to be smarter #515

Default quoting needs to be smarter #515

Comments

seancorfield commented Nov 17, 2023

p-himik commented Nov 17, 2023

seancorfield commented Nov 17, 2023

kranstrom commented Nov 20, 2023

seancorfield commented Nov 20, 2023

seancorfield commented Nov 20, 2023

kranstrom commented Nov 20, 2023

seancorfield commented Nov 20, 2023

kranstrom commented Dec 4, 2023