
Make query result HTTP compression configurable#5818

Merged
dain merged 2 commits intotrinodb:masterfrom
pettyjamesm:query-results-gzip-config
Nov 16, 2020


@pettyjamesm
Member

Adapted changes from prestodb/presto#15393

Before this change, query result JSON responses were generally compressed (assuming the response met the minimum size threshold and passed the user agent checks), so that behavior is still the default. However, disabling GZIP compression can significantly improve throughput of sending query results, especially over localhost links where the overhead of compressing the response and then uncompressing it again on the client side is never worth the bandwidth savings.

Clients are allowed to opt out of compression, but cannot request compression from a server that has disabled compressed query result responses. Both sides ultimately negotiate the result through their Accept-Encoding and Content-Encoding headers and the way the gzip compression middleware interprets them.
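
To illustrate the rule above, here is a minimal sketch of the decision, not the actual middleware code. The function names, the `HYPOTHETICAL_MIN_SIZE` threshold, and the simplified Accept-Encoding parsing are all assumptions made for illustration, and the user agent checks mentioned earlier are left out.

```python
# Illustrative sketch only: names and the threshold value are hypothetical,
# and this is not how the real gzip middleware is implemented.
HYPOTHETICAL_MIN_SIZE = 1024  # assumed minimum response size, in bytes


def accepts_gzip(accept_encoding):
    """Return True if the Accept-Encoding header value permits gzip."""
    if not accept_encoding:
        # No header: this sketch treats the client as not asking for gzip.
        return False
    wildcard_ok = False
    for item in accept_encoding.split(","):
        parts = [p.strip() for p in item.split(";")]
        coding = parts[0].lower()
        quality = 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        if coding == "gzip":
            # An explicit gzip entry wins over any wildcard entry.
            return quality > 0
        if coding == "*":
            wildcard_ok = quality > 0
    return wildcard_ok


def should_compress(server_compression_enabled, accept_encoding, body_size,
                    min_size=HYPOTHETICAL_MIN_SIZE):
    """Compress only if the server allows it AND the client accepts gzip."""
    if not server_compression_enabled:
        # The server setting is final: a client cannot force compression on.
        return False
    if body_size < min_size:
        return False
    return accepts_gzip(accept_encoding)
```

With this shape, a client can always turn compression off (by not advertising gzip), but a request advertising gzip still gets an uncompressed response when the server-level setting is disabled.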

For queries that are bound only by result processing throughput (e.g. SELECT * FROM <large table>), execution time can be reduced by 20-50% when submitted over a localhost connection with compression disabled.

@cla-bot cla-bot bot added the cla-signed label Nov 4, 2020
@pettyjamesm pettyjamesm requested review from dain and findepi November 4, 2020 15:38
Member

@dain dain left a comment


One comment on the description in the client options, but otherwise looks good.

@pettyjamesm pettyjamesm force-pushed the query-results-gzip-config branch from a5e8306 to 980e115 Compare November 16, 2020 14:56
@pettyjamesm pettyjamesm requested a review from dain November 16, 2020 15:28
By default, presto will GZIP query result JSON payloads sent to the
client. However, especially when the client is connected to the
coordinator over localhost, the added overhead of compressing the
response and then uncompressing it on the client is a losing
proposition.

For queries that are bound only by result processing throughput (e.g.
SELECT * FROM <large table>), execution time can be reduced by 20-50%
when submitted over a localhost connection with compression disabled.
Allows configuring HTTP response compression for the query results
endpoints at the server level, regardless of client configuration.
@pettyjamesm pettyjamesm force-pushed the query-results-gzip-config branch from 980e115 to 7a1b15a Compare November 16, 2020 20:51
@dain dain merged commit 50db495 into trinodb:master Nov 16, 2020
@martint martint added this to the 347 milestone Nov 16, 2020
@pettyjamesm pettyjamesm deleted the query-results-gzip-config branch November 16, 2020 23:39
@pettyjamesm pettyjamesm mentioned this pull request Nov 17, 2020
10 tasks