Skip to content

Added jdbc fetch size config for pg/oracle/redshift#16269

Closed
chenjian2664 wants to merge 1 commit intotrinodb:masterfrom
chenjian2664:add_fetch_size_pg
Closed

Added jdbc fetch size config for pg/oracle/redshift#16269
chenjian2664 wants to merge 1 commit intotrinodb:masterfrom
chenjian2664:add_fetch_size_pg

Conversation

@chenjian2664
Copy link
Copy Markdown
Contributor

Description

Solve #16153
Removed the magic number 1000 in the pg/oracle/redshift client.
Added jdbc-fetch-size configuration for pg/oracle/redshift connector.

Additional context and related issues

Release notes

( ) This is not user-visible or docs only and no release notes are required.
(x) Release notes are required, please propose a release note for me.
() Release notes are required, with the following suggested text:

# Section
* Allow Postgresql, Oracle, Redshift set jdbc fetch size. ({issue}`16153`)

@cla-bot cla-bot bot added the cla-signed label Feb 25, 2023
@chenjian2664
Copy link
Copy Markdown
Contributor Author

Mark, we need add log.

@ebyhr ebyhr requested review from findepi and hashhar February 27, 2023 12:10
@findepi
Copy link
Copy Markdown
Member

findepi commented Feb 28, 2023

Under what circumstances would someone want to change the fetch size?
in particular, is it that our default fetch size is not right?

@chenjian2664
Copy link
Copy Markdown
Contributor Author

Under what circumstances would someone want to change the fetch size?

Hi, I saw a user shared experience in #16153.

in particular, is it that our default fetch size is not right?

I think most of cases it's works fine, but we can not cover all the user cases for sure. In this PR only allow some connectors set the default fetch size, and still put default value same as originally.

@chenjian2664 chenjian2664 self-assigned this Mar 1, 2023
@chenjian2664 chenjian2664 added the jdbc Relates to Trino JDBC driver label Mar 1, 2023
@findepi
Copy link
Copy Markdown
Member

findepi commented Mar 1, 2023

Hi, I saw a user shared experience in #16153.

If we don't understand when someone would want to set the new toggle, we would be able to document it properly. This would result in a feature that's used by very few users, which generally isn't good.

If we understand when someone would want to set the new toggle, then maybe we do not need a toggle at all. Ideally Trino would make right decisions out of the box, which would result in a feature that works for every user, which generally is good.

@vikingUnet
Copy link
Copy Markdown

vikingUnet commented Mar 1, 2023

Hello!
This toggle will be great useful in cases when the Trino and source databases geographically distributed in different data-centers in many reasons. So ping latency from Trino server to another data-center's databases is significant (in our case it was 10ms instead 0.1ms in the same data-center). Without this feature we are forcing our clients to hold all databases in place near the Trino server, that's sometimes is not easy =(
@findepi

@chenjian2664
Copy link
Copy Markdown
Contributor Author

chenjian2664 commented Mar 1, 2023

Thanks @findepi 's sharing

If we understand when someone would want to set the new toggle, then maybe we do not need a toggle at all. Ideally Trino would make right decisions out of the box, which would result in a feature that works for every user, which generally is good.

Honestly, Trino is pretty good for most of the users in the out of the box point, but default config is not always the best in any scenarios, so 'performance tuning' is always a endless topic.
This PR is not going to 'Fix' the out of box default config, instead it's a enhancement for this configuration.
It's very similar to we do about the JdbcWriteConfig.writeBatchSize, but different is here I did not change the base jdbc part to let the change minimum, so only to the connectors that set the default 'read' size.

@ebyhr ebyhr removed the jdbc Relates to Trino JDBC driver label Mar 8, 2023
@subbareddydagumati
Copy link
Copy Markdown

I think this will definitely improve the performance in some cases. Today I was trying to pull 3 columns from oracle and it's really slow. I think increasing the fetch size in these scenarios will definitely boost the performance.

@findepi
Copy link
Copy Markdown
Member

findepi commented Mar 21, 2023

@subbareddydagumati thanks for your feedback.
As stated previously, we should strive to pick the value automatically. Here's my (simplistic) proposal to improve how things work: #16644

@chenjian2664
Copy link
Copy Markdown
Contributor Author

Will close this at this moment.
We can reopen it once we need it. cc @findepi @tooptoop4

@chenjian2664 chenjian2664 deleted the add_fetch_size_pg branch May 25, 2023 08:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

6 participants