Skip to content

Increase the max number of http connections in Azure native FS#22561

Merged
electrum merged 1 commit intotrinodb:masterfrom
nineinchnick:azure-native-fs-max-connections
Jul 2, 2024
Merged

Increase the max number of http connections in Azure native FS#22561
electrum merged 1 commit intotrinodb:masterfrom
nineinchnick:azure-native-fs-max-connections

Conversation

@nineinchnick
Copy link
Copy Markdown
Member

Description

By default, OkHttp only allows 5 concurrent HTTP connections per host, and 64 total concurrent connections. Increase these values to avoid a performance degradation compared to the legacy FS, when executing queries with a large number of splits, on nodes with large number of split runner threads.

Fixes #22548

Additional context and related issues

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
(x) Release notes are required, with the following suggested text:

# General
* Fix a performance regression in native FS for Azure, compared to legacy FS. ({issue}`issuenumber`)

By default, OkHttp only allows 5 concurrent HTTP connections per host,
and 64 total concurrent connections. Increase these values to avoid a
performance degradation compared to the legacy FS, when executing
queries with a large number of splits, on nodes with large number of
split runner threads.
@cla-bot cla-bot bot added the cla-signed label Jul 2, 2024
int maximumConnectionPoolSize = (poolSize != null && poolSize > 0) ? poolSize : 5;
Dispatcher dispatcher = new Dispatcher();
dispatcher.setMaxRequests(Runtime.getRuntime().availableProcessors() * 4);
dispatcher.setMaxRequestsPerHost(Runtime.getRuntime().availableProcessors() * 2);
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the same as the default for the number of worker threads: https://github.com/trinodb/trino/blob/master/core/trino-main/src/main/java/io/trino/execution/TaskManagerConfig.java#L62

I didn't want to get it from that config to avoid adding a dependency on trino-main. I set the max requests as twice as much, but I don't have any specific reason for it. I'm open for suggestions how to choose a better value.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this worth exposing it as a configuration property?

Also nit: extract Runtime.getRuntime().availableProcessors() to a variable, and also inline dispatcher below

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't inline it, as the setters don't return anything.

If we'll make it configurable, it should only be a fallback in case we get the default wrong. Maybe we should ask differently - in which cases the number of worker threads should be customized?

@nineinchnick
Copy link
Copy Markdown
Member Author

I tested this manually on a 16 node cluster executing q09 from TPCDS. The wall time went down from 60s to 9s, matching legacy FS.

@electrum electrum merged commit bd289dc into trinodb:master Jul 2, 2024
@github-actions github-actions bot added this to the 452 milestone Jul 2, 2024
@nineinchnick nineinchnick deleted the azure-native-fs-max-connections branch July 3, 2024 07:43
@findinpath
Copy link
Copy Markdown
Contributor

By default, OkHttp only allows 5 concurrent HTTP connections per host, and 64 total concurrent connections.

@nineinchnick does this finding apply for AWS S3 or GCS clients?

@nineinchnick
Copy link
Copy Markdown
Member Author

By default, OkHttp only allows 5 concurrent HTTP connections per host, and 64 total concurrent connections.

@nineinchnick does this finding apply for AWS S3 or GCS clients?

I don't think so. In both AWS and GCP, we're not injecting a custom HTTP client, and relying more on the default SDK configuration.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

Performance regression in native FS on Azure

4 participants