Unconditionally pushdown varchar predicate to Clickhouse#23516

Merged
raunaqmorarka merged 1 commit into trinodb:master from ssheikin:ssheikin/52/trino/clickhouse-PredicatePushdownController
Sep 21, 2024

@ssheikin ssheikin commented Sep 20, 2024

ClickHouse string comparison is case-sensitive, and ClickHouse uses the same sort ordering as Trino.

Per https://clickhouse.com/docs/en/sql-reference/statements/show#show_columns
ClickHouse has no per-column collations.
ClickHouse strings are UTF-8 encoded and compared byte by byte:
https://clickhouse.com/docs/en/sql-reference/statements/select/order-by#collation-support
This is exactly how Trino compares varchars:
https://github.com/airlift/slice/blob/2.2/src/main/java/io/airlift/slice/Slice.java#L1205
That is why all predicates on varchars can be pushed down.
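
The byte-by-byte comparison described above can be illustrated with a minimal sketch. This is not the code from Slice.java or from this PR; the class name `VarcharByteOrder` and the helper `compareUtf8` are hypothetical, and the sketch assumes that comparing the UTF-8 bytes as unsigned values is a fair stand-in for what both engines do.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class VarcharByteOrder {
    // Hypothetical helper: compare two strings by their UTF-8 bytes,
    // treating each byte as an unsigned value (0..255).
    static int compareUtf8(String left, String right) {
        byte[] l = left.getBytes(StandardCharsets.UTF_8);
        byte[] r = right.getBytes(StandardCharsets.UTF_8);
        return Arrays.compareUnsigned(l, r);
    }

    public static void main(String[] args) {
        // Case-sensitive ordering: 'B' (0x42) sorts before 'a' (0x61).
        System.out.println(compareUtf8("B", "a") < 0);
        // Case-sensitive equality: "abc" and "ABC" are different values.
        System.out.println(compareUtf8("abc", "ABC") != 0);
        // Non-ASCII: 'z' (0x7A) sorts before U+00E9 (0xC3 0xA9) when bytes are unsigned.
        System.out.println(compareUtf8("z", "\u00e9") < 0);
    }
}
```

If both sides really compare this way, a predicate such as `x > 'B'` selects the same rows whether it is evaluated in Trino or in ClickHouse, which is the property the pushdown relies on.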

Description

Additional context and related issues

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
(x) Release notes are required, with the following suggested text:

# Section
* Push down varchar predicates to ClickHouse unconditionally. ({issue}`23516`)

@ssheikin ssheikin force-pushed the ssheikin/52/trino/clickhouse-PredicatePushdownController branch from 6a9e660 to 2944caf Compare September 20, 2024 19:34

ebyhr commented Sep 20, 2024

ClickHouse supports collation at an index level. https://clickhouse.com/docs/en/sql-reference/statements/show#show-index
What happens if the pushed-down query uses the index?

ssheikin commented

@ebyhr IIUC, a database does not use an index if the query condition does not match the index parameters.
In ClickHouse's case, the index collation is just the ordering of the values within the index:

> collation - The sorting of the column in the index: A if ascending, D if descending, NULL if unsorted. (Nullable(String))

> if the pushed-down query uses the index

it means that the index ordering matched the order requested by the query, so ClickHouse executes the query faster.

@raunaqmorarka raunaqmorarka merged commit f3751cf into trinodb:master Sep 21, 2024
@github-actions github-actions bot added this to the 459 milestone Sep 21, 2024
@ssheikin ssheikin deleted the ssheikin/52/trino/clickhouse-PredicatePushdownController branch September 21, 2024 11:22

ebyhr commented Sep 21, 2024

> Per https://clickhouse.com/docs/en/sql-reference/statements/show#show_columns
> ClickHouse has no per-column collations

Checking SHOW COLUMNS docs is basically insufficient. We should check if the database supports collation when creating tables, starting the instance and so on.

Actually, ClickHouse supports specifying column collation for new tables:

CREATE TABLE test (x varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL) ENGINE = Memory;

However, this is only accepted at the syntax level; as far as I tested locally, it doesn't affect results.
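
To show why a column collation would matter for pushdown if it did take effect, here is a small sketch. It does not talk to ClickHouse; `CollationMismatch` and `filterEquals` are hypothetical names, and `String.CASE_INSENSITIVE_ORDER` is only an assumed stand-in for a case-insensitive collation such as `utf8mb4_unicode_ci`.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

final class CollationMismatch {
    // Hypothetical helper: evaluate `value = literal` over the rows
    // using the given comparator as the collation.
    static List<String> filterEquals(List<String> rows, String literal, Comparator<String> collation) {
        return rows.stream()
                .filter(row -> collation.compare(row, literal) == 0)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> rows = List.of("abc", "ABC", "abd");
        // Exact, case-sensitive equality, as Trino evaluates varchar predicates.
        System.out.println(filterEquals(rows, "abc", Comparator.naturalOrder()));
        // Assumed case-insensitive remote collation: "ABC" would match too.
        System.out.println(filterEquals(rows, "abc", String.CASE_INSENSITIVE_ORDER));
    }
}
```

The two filters return different row sets, so a pushed-down predicate would only be safe if the remote side ignores the declared collation, which matches the local test result reported above.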
