Skip to content

Add S3 connection ttl configuration to reduce http server exception#16007

Merged
ebyhr merged 1 commit intotrinodb:masterfrom
hackeryang:make_s3_connection_ttl_configurable
Feb 15, 2023
Merged

Add S3 connection ttl configuration to reduce http server exception#16007
ebyhr merged 1 commit intotrinodb:masterfrom
hackeryang:make_s3_connection_ttl_configurable

Conversation

@hackeryang
Copy link
Copy Markdown
Member

@hackeryang hackeryang commented Feb 7, 2023

Description

Add Amazon S3 connection ttl configuration to reduce http server errors in high load scenarios.

Additional context and related issues

Fixes #16005

Release notes

( ) This is not user-visible or docs only and no release notes are required.
(x) Release notes are required, please propose a release note for me.
( ) Release notes are required, with the following suggested text:

# Section
* Add Amazon S3 connection TTL configuration to reduce HTTP server errors in high-load scenarios. ([{issue}`#16005`](https://github.com/trinodb/trino/issues/16005))

@cla-bot cla-bot bot added the cla-signed label Feb 7, 2023
@hackeryang hackeryang changed the title Set s3 connection ttl to reduce http server exception Add S3 connection ttl configuration to reduce http server exception Feb 7, 2023
@hackeryang hackeryang force-pushed the make_s3_connection_ttl_configurable branch from 10610a2 to 497966e Compare February 7, 2023 09:19
@hackeryang hackeryang requested a review from sopel39 February 7, 2023 09:32
@hackeryang
Copy link
Copy Markdown
Member Author

hackeryang commented Feb 7, 2023

Hi @sopel39 @alexjo2144 , can you please take a look when you have time, thanks~

@hackeryang hackeryang requested a review from alexjo2144 February 7, 2023 09:34
@sopel39 sopel39 requested review from findepi and removed request for sopel39 February 7, 2023 10:33
@sopel39
Copy link
Copy Markdown
Member

sopel39 commented Feb 7, 2023

@findepi please take over this one

@findepi
Copy link
Copy Markdown
Member

findepi commented Feb 7, 2023

@hackeryang can you please fix the build errors?
You can use https://gist.github.com/findepi/04c96f0f60dcc95329f569bb0c44a0cd#run-static-code-analysis to run the checkstyle & error-prone locally

@hackeryang hackeryang force-pushed the make_s3_connection_ttl_configurable branch from 497966e to 0c08967 Compare February 7, 2023 12:44
@hackeryang
Copy link
Copy Markdown
Member Author

@hackeryang can you please fix the build errors? You can use https://gist.github.com/findepi/04c96f0f60dcc95329f569bb0c44a0cd#run-static-code-analysis to run the checkstyle & error-prone locally

@findepi Sure, i have fixed the build error, sorry for the capitalization mistake. Please review again when you have time, thank you~

@findepi findepi requested review from findinpath and removed request for findepi February 7, 2023 16:36
@hackeryang
Copy link
Copy Markdown
Member Author

Hello @findinpath , can you please cc when you have time, thank you

@hackeryang hackeryang self-assigned this Feb 8, 2023
@hackeryang hackeryang force-pushed the make_s3_connection_ttl_configurable branch 5 times, most recently from 0610143 to cc0a718 Compare February 9, 2023 05:27
@findinpath findinpath added the needs-docs This pull request requires changes to the documentation label Feb 9, 2023
@hackeryang hackeryang force-pushed the make_s3_connection_ttl_configurable branch 2 times, most recently from 9d3d564 to d2140f6 Compare February 10, 2023 11:04
@github-actions github-actions bot added the docs label Feb 10, 2023
@hackeryang hackeryang force-pushed the make_s3_connection_ttl_configurable branch from d2140f6 to f09eabd Compare February 13, 2023 02:14
Copy link
Copy Markdown
Contributor

@findinpath findinpath left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM % comments

Copy link
Copy Markdown
Contributor

@krvikash krvikash left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@findinpath
Copy link
Copy Markdown
Contributor

@losipiuk CPTAL ?

@findinpath findinpath requested a review from losipiuk February 13, 2023 10:07
@hackeryang hackeryang force-pushed the make_s3_connection_ttl_configurable branch from f09eabd to 36c68b7 Compare February 13, 2023 14:20
@hackeryang hackeryang requested a review from ebyhr February 14, 2023 06:20
@hackeryang hackeryang force-pushed the make_s3_connection_ttl_configurable branch from 36c68b7 to 7cc9140 Compare February 14, 2023 13:15
@hackeryang hackeryang requested a review from ebyhr February 14, 2023 13:22
When a certain HTTP connection has been idle for a period of time, S3 will close
the connection, but the SDK later tries to reuse the connection, then the client
unable to execute HTTP request.
@hackeryang hackeryang force-pushed the make_s3_connection_ttl_configurable branch from 7cc9140 to 1db4892 Compare February 15, 2023 01:54
@ebyhr ebyhr merged commit f8acfe9 into trinodb:master Feb 15, 2023
@ebyhr ebyhr mentioned this pull request Feb 15, 2023
@hackeryang hackeryang deleted the make_s3_connection_ttl_configurable branch February 15, 2023 08:46
@github-actions github-actions bot added this to the 407 milestone Feb 15, 2023
@steveloughran
Copy link
Copy Markdown

  1. how effective has this change been in making the exception go away?
  2. has anyone experimented with retry/recovery on the exception being raised?

we've had reports of this coming through the S3A connector and as well as adding a TTL, I'm thinking of classifying the the "no response" exception as a connectivity failure which we can treat as idempotent and retry on, always.

@hackeryang
Copy link
Copy Markdown
Member Author

hackeryang commented Aug 11, 2023

  1. how effective has this change been in making the exception go away?
  2. has anyone experimented with retry/recovery on the exception being raised?

we've had reports of this coming through the S3A connector and as well as adding a TTL, I'm thinking of classifying the the "no response" exception as a connectivity failure which we can treat as idempotent and retry on, always.

Hello, good question~
1、It has been for several months after we added the parameter to one of our production cluster with 35 nodes, the timeout error has gone.
2、May be the 'Fault Tolerant Execution' mode can meet your needs: https://github.com/trinodb/trino/wiki/Fault-Tolerant-Execution
https://trino.io/docs/current/admin/fault-tolerant-execution.html
And also you can set this parameter in your config.properties: exchange.data-integrity-verification=RETRY, here is the relevant pr: #3438

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cla-signed docs needs-docs This pull request requires changes to the documentation performance

Development

Successfully merging this pull request may close these issues.

Reduce 'The target server failed to respond' errors in Amazon S3 by configuring connection ttl

8 participants