Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Array(String) query parameters don't quote the strings #159

Closed
ewjoachim opened this issue Mar 31, 2023 · 2 comments · Fixed by #161
Closed

Array(String) query parameters don't quote the strings #159

ewjoachim opened this issue Mar 31, 2023 · 2 comments · Fixed by #161
Assignees
Labels
bug Something isn't working

Comments

@ewjoachim
Copy link
Contributor

ewjoachim commented Mar 31, 2023

Describe the bug

When using an Array(String) in server-side params the strings inside the array are sent unquoted, so the query crashes

Steps to reproduce

  1. client.query('SELECT {l:Array(String)}', parameters={"l": ["a"]}).result_rows
  2. Cannot parse quoted string: expected opening quote ''', got 'a'. (CANNOT_PARSE_QUOTED_STRING) (version 22.8.15.23 (official build))

Expected behaviour

Returns ['a']

Code example

import clickhouse_connect
client = clickhouse_connect.get_client(
    dsn=connection_string
)
client.query('SELECT {l:Array(String)}', parameters={"l": ["a"]}).result_rows

clickhouse-connect and/or ClickHouse server logs

Code: 26. DB::ParsingException: Cannot parse quoted string: expected opening quote ''', got 'a'. (CANNOT_PARSE_QUOTED_STRING) (version 22.8.15.23 (official build))

Configuration

Environment

  • clickhouse-connect version: clickhouse-connect==0.5.16
  • Python version: 3.7
  • Operating system: Ubuntu

ClickHouse server

  • ClickHouse Server version:22.8.15.23
  • ClickHouse Server non-default settings, if any:
  • CREATE TABLE statements for tables involved:
  • Sample data for these tables, use clickhouse-obfuscator if necessary

Additional info

It really looks like format_bind_value doesn't quote Strings, because when we use a String at the top level we send param_x="something" and not param_x="'something'", but this logic only holds true at the top level.
I think format_bind_value should have an optional top_level=True parameter, when it calls itself recursively, it should call itself with top_level=False and when the input is a string, if not top_level it should return '{input}' (though with a bit of sanitazing to avoid injections)

@ewjoachim ewjoachim added the bug Something isn't working label Mar 31, 2023
@genzgd
Copy link
Collaborator

genzgd commented Mar 31, 2023

Thanks for the detailed investigation and report! You're correct, format_bind_value doesn't quote strings because ClickHouse doesn't expect them quoted when passed as the top level bind value, but apparently it does want them quoted in nested/contained values.

In this instance I don't know what "sanitizing" would look like, I think ClickHouse itself handles safe substitution?

In any case your solution looks good to me, and it would be great if you could submit a PR, otherwise I'll probably get to it in a week or so.

@genzgd genzgd self-assigned this Mar 31, 2023
@ewjoachim
Copy link
Contributor Author

Sanitizing means "what would happen if I pass ["a", "b', 'c"]

Pretty sure that with the current code, from the python side we're sending 2 elements, but once we'll format it for passing to params, there will be 3 elements.

Except if we already do a broad pass of escaping, but in that case, we'll escape ['a', 'b'] into [''a'', ''b''] and it's no good either.

Escaping needs to take place exactly once, and before we quote the string. We can't quote the string, mix it with other strings and then escape it.

Whereas if, on string element, in format_bind_value we do replace("'", "''"), we should be good and make sure we don't re-escape later, we should be good.

In the tests, we need to check for inputs such as empty string, string consisting of just a single quote, 2 single quotes, and up to 5 single quotes, double quote, string beginning with a single quote, ending with a single quote, containing a single quote, empty array, array containing an int, array containing all the strings described above, array containing an array containing the strings above, other types on containers containing strings.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants