Support range predicate pushdown for string columns with collation in PostgreSQL connector#9746
Please update io.trino.plugin.postgresql.TestPostgreSqlConnectorTest#hasBehavior to support SUPPORTS_PREDICATE_PUSHDOWN_WITH_VARCHAR_INEQUALITY
hashhar left a comment
The impl looks good. Please adjust the declaration in tests to make sure the tests pass with the new assumption.
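The requested test declaration could look roughly like the sketch below. It is self-contained, so `Behavior`, `BaseConnectorTestSketch` and `PostgreSqlConnectorTestSketch` are hypothetical stand-ins for Trino's `TestingConnectorBehavior` and the real test classes; only the constant name `SUPPORTS_PREDICATE_PUSHDOWN_WITH_VARCHAR_INEQUALITY` comes from this conversation.

```java
// Hypothetical stand-in for Trino's TestingConnectorBehavior enum.
enum Behavior {
    SUPPORTS_PREDICATE_PUSHDOWN_WITH_VARCHAR_INEQUALITY,
    SOME_OTHER_BEHAVIOR // placeholder, not a real Trino constant
}

// Hypothetical stand-in for the shared base connector test.
class BaseConnectorTestSketch {
    // Assumed conservative default: varchar inequality pushdown is not expected.
    protected boolean hasBehavior(Behavior behavior) {
        switch (behavior) {
            case SUPPORTS_PREDICATE_PUSHDOWN_WITH_VARCHAR_INEQUALITY:
                return false;
            default:
                return true;
        }
    }
}

// Connector-specific test: declares the new assumption, defers everything else.
class PostgreSqlConnectorTestSketch extends BaseConnectorTestSketch {
    @Override
    protected boolean hasBehavior(Behavior behavior) {
        switch (behavior) {
            case SUPPORTS_PREDICATE_PUSHDOWN_WITH_VARCHAR_INEQUALITY:
                return true;
            default:
                return super.hasBehavior(behavior);
        }
    }
}
```

Shared test cases in the base class would then branch on `hasBehavior(...)` to decide whether to assert that a predicate is fully pushed down.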
Force-pushed: 69077fa to 968b6cd
Additional updates are necessary to support aggregation pushdown for varchar columns. Since I want to focus on supporting range predicate pushdown in this pull request, I just introduced a new behavior that indicates whether aggregation pushdown is supported for varchar columns.
Let's create a GitHub issue and add a TODO so that we can remove the additional behaviour once it's no longer needed (and other people know that this is something they can work on).
Yeah, that makes sense (as long as this fix can be merged), and I will work on it once this pull request is completed. By the way, supporting type-sensitive aggregation pushdown in JDBC plugins doesn't seem easy. Fundamental interface changes may be required.
Yeah, I ran into some similar issues in #7320.
There were some ideas floated in #7320 (comment) (which we didn't end up doing since it looked like a one-off need at that time).
If you already have some direction in your mind it might be helpful to discuss it on #dev on Slack too (if you think the changes will be large and touch the SPI).
Force-pushed: 968b6cd to ac742f8
I'm trying to find a way to fix it.
Looks like collations have been supported by
plugin/trino-postgresql/src/main/java/io/trino/plugin/postgresql/PostgreSqlClient.java
Force-pushed: dc45a73 to d9c1e3a
testing/trino-testing/src/main/java/io/trino/testing/TestingConnectorBehavior.java
Why is this still isNotFullyPushedDown?
With domain compaction, the original predicate seems to remain on the Trino side even though the compacted predicate is pushed down to PostgreSQL:
SQL on PostgreSQL:
SELECT "nationkey", "name", "regionkey" FROM "tpch"."nation" WHERE ("name" >= ? COLLATE "C" AND "name" <= ? COLLATE "C")
Plan on Trino:
Output[regionkey, nationkey, name]
│   Layout: [regionkey:bigint, nationkey:bigint, name:varchar(25)]
└─ RemoteExchange[GATHER]
   │   Layout: [nationkey:bigint, name:varchar(25), regionkey:bigint]
   └─ ScanFilter[table = postgresql:tpch.nation tpch.nation constraint on [name] columns=[nationkey:bigint:int8, name:varchar(25):varchar, regionkey:bigint:int8], filterPredicate = ("name" IN (CAST('POLAND' AS varchar(25)), CAST('ROMANIA' AS varchar(25)), CAST('VIETNAM' AS varchar(25))))]
          Layout: [nationkey:bigint, name:varchar(25), regionkey:bigint]
          nationkey := nationkey:bigint:int8
          regionkey := regionkey:bigint:int8
          name := name:varchar(25):varchar
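To make the split between the remote and the local filter concrete, here is a minimal sketch of the idea, not the connector's actual code. The class and method names are hypothetical, and `String.compareTo` is assumed to stand in for Trino's varchar ordering (the two agree for the ASCII values used here). The IN list is compacted to a single range, the range is rendered with `COLLATE "C"` as in the SQL above, and the original IN predicate is re-applied to whatever the range returns.

```java
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical illustration of domain compaction plus collated range pushdown.
class CollatedRangePushdownSketch {
    // Collapse discrete values into one [min, max] range (a superset of the values).
    static String[] compactToRange(List<String> values) {
        return new String[] {Collections.min(values), Collections.max(values)};
    }

    // Render the range as a parameterized predicate; "C" collation is forced so
    // the remote ordering does not depend on the database's locale.
    static String toRemoteSql(String column) {
        String quoted = "\"" + column.replace("\"", "\"\"") + "\"";
        return "(" + quoted + " >= ? COLLATE \"C\" AND " + quoted + " <= ? COLLATE \"C\")";
    }

    // Remote side: returns every row inside the range, including non-matching ones.
    static List<String> remoteScan(List<String> rows, String[] range) {
        return rows.stream()
                .filter(row -> row.compareTo(range[0]) >= 0 && row.compareTo(range[1]) <= 0)
                .collect(Collectors.toList());
    }

    // Engine side: the original IN predicate still has to run on the remote result.
    static List<String> localFilter(List<String> rows, List<String> values) {
        return rows.stream().filter(values::contains).collect(Collectors.toList());
    }
}
```

With the values `POLAND`, `ROMANIA`, `VIETNAM` and a table that also contains `RUSSIA`, the remote range returns `RUSSIA` as well, which is why the plan keeps a `ScanFilter` and the test still sees the predicate as not fully pushed down.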
...n/trino-postgresql/src/test/java/io/trino/plugin/postgresql/TestPostgreSqlConnectorTest.java
Force-pushed: 9908b52 to ee99698
@findepi Finished updating for now. Could you take a look again?
hashhar left a comment
Some minor comments.
Can we detect (or apply some heuristics to determine) at runtime whether a given predicate needs the COLLATION applied or not? For example, for equality predicates, adding the collation will lead to a definite performance regression because no indexes can be used anymore.
If it's not possible to do this dynamically at runtime, then let's add an "experimental."-prefixed config property to enable this behaviour and keep it opt-in for now?
WDYT @wendigo ?
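One way to read the heuristic suggested above is sketched below: attach `COLLATE "C"` only to order-sensitive comparisons and leave equality untouched so existing indexes remain usable. This is an illustration under assumptions, not what the pull request implements; `CollationHeuristicSketch` and its members are hypothetical names, and the sketch assumes equality results do not depend on the column's collation (which may not hold for every collation).

```java
// Hypothetical sketch: only order-sensitive comparisons get an explicit collation.
class CollationHeuristicSketch {
    enum Operator {
        EQUAL("="),
        LESS_THAN("<"),
        LESS_THAN_OR_EQUAL("<="),
        GREATER_THAN(">"),
        GREATER_THAN_OR_EQUAL(">=");

        final String symbol;

        Operator(String symbol) {
            this.symbol = symbol;
        }

        // Assumption: equality does not depend on sort order, so it needs no collation.
        boolean isOrderSensitive() {
            return this != EQUAL;
        }
    }

    // Builds one parameterized comparison for an already-quoted column name.
    static String predicate(String quotedColumn, Operator operator) {
        String sql = quotedColumn + " " + operator.symbol + " ?";
        return operator.isOrderSensitive() ? sql + " COLLATE \"C\"" : sql;
    }
}
```

For example, `predicate("\"name\"", Operator.EQUAL)` yields `"name" = ?`, while `Operator.GREATER_THAN_OR_EQUAL` yields `"name" >= ? COLLATE "C"`.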
I think it's possible in some cases, but not in 100% of them. Either way, it should be better than a full scan?
I'm fine with making this optional as long as we can enable it by configuration. Is adding
@takezoe You can add a session property with matching config property in
cc: @findepi @kokosing @wendigo @ebyhr Any opinions? I'd prefer to have this behind a configuration toggle (at least for now) until we can verify that there isn't a performance concern in actual practical usage?
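The toggle being discussed, a catalog config property with a matching session property, can be sketched as follows. The property names and the class are made up for illustration and are not the names used in the pull request; the sketch only shows the opt-in default and the session-level override.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch of an opt-in toggle with a per-session override.
class CollationPushdownToggleSketch {
    // Both names are invented for this example.
    static final String CONFIG_KEY = "experimental.example-collation-pushdown";
    static final String SESSION_KEY = "example_collation_pushdown";

    private final boolean configDefault;

    CollationPushdownToggleSketch(Map<String, String> catalogProperties) {
        // Opt-in: the feature stays off unless the catalog explicitly enables it.
        this.configDefault = Boolean.parseBoolean(catalogProperties.getOrDefault(CONFIG_KEY, "false"));
    }

    // A session value, when present, wins over the catalog-level default.
    boolean isEnabled(Map<String, String> sessionProperties) {
        return Optional.ofNullable(sessionProperties.get(SESSION_KEY))
                .map(Boolean::parseBoolean)
                .orElse(configDefault);
    }
}
```

Keeping the default at `false` gives maintainers a kill switch while the performance impact is evaluated, and the session override lets a single query try the behaviour without changing the catalog.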
hashhar left a comment
Looks good % confirmation about config from other maintainers.
Looks like
hashhar left a comment
LGTM % comments.
Mostly about possible simplifications.
plugin/trino-postgresql/src/main/java/io/trino/plugin/postgresql/PostgreSqlConfig.java
Force-pushed: 587edbe to dbddd0a
Force-pushed: 28313bd to c59edbe
@hashhar Finished updating the pull request. Could you take a look again?
hashhar left a comment
Looks good to me % comment. Thanks for working on this.
Also, it looks like the first two commits should be squashed together.
plugin/trino-postgresql/src/test/java/io/trino/plugin/postgresql/PostgreSqlQueryRunner.java
Force-pushed: 005f947 to c5646cc
Looks like For example, having a flag as a field and changing its value inside Does splitting
Force-pushed: c5646cc to 0721fe8
I adopted this approach for now and added a case for joins on VARCHAR columns.
Force-pushed: e5e96e6 to 7e5ae28
Since only PostgreSQL has this behaviour today (and it is not enabled by default), I think it's fine to not add a new behaviour to
In terms of future evolution, I think once we prove it out with PostgreSQL we can lift the config to BaseJdbcConfig and then add a test in BaseJdbcConnectorTest which sets the session property. At that time we can revisit the best way to structure the
@hashhar Sounds good, although I am not convinced we need a config just yet, except for trying things out (a temporary kill switch).
Force-pushed: 7e5ae28 to 0a1f394
Ah, I see. That makes sense. I updated the test case.
Force-pushed: 6fd6890 to 16b3c0c
Force-pushed: 16b3c0c to 5eb794a
Closing and re-opening to get CI to run (some glitch on GitHub's end).
Unrelated failure. Thanks @takezoe for the feature. Merging it.