Add support for fetching Redshift query results using Redshift unload command #24117
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -64,6 +64,43 @@ documentation](https://docs.aws.amazon.com/redshift/latest/mgmt/jdbc20-configura | |
| ```{include} jdbc-authentication.fragment | ||
| ``` | ||
|
|
||
| ### UNLOAD configuration | ||
|
|
||
| This feature uses Amazon S3 to transfer data out of Redshift efficiently, | |
| instead of the default single-threaded, JDBC-based implementation. | ||
| The connector automatically issues the appropriate `UNLOAD` command | ||
| on Redshift to write the query output to the configured | ||
| S3 bucket as Parquet files. Trino then reads these Parquet files in parallel | ||
| from S3, which reduces the latency of reading from Redshift tables. Trino | ||
| removes the Parquet files when it finishes executing the query. It is still | ||
| recommended to define a custom lifecycle policy on the S3 bucket used for | ||
| unloading Redshift query results. | ||
| This feature is supported only when the Redshift cluster and the configured S3 | ||
| bucket are in the same AWS region. | ||
|
|
||
| The following table describes configuration properties for using | ||
|
||
| the `UNLOAD` command in the Redshift connector. `redshift.unload-location` must be set | ||
| to use `UNLOAD`. | ||
|
|
||
| :::{list-table} UNLOAD configuration properties | ||
| :widths: 30, 60 | ||
| :header-rows: 1 | ||
|
|
||
| * - Property name | ||
| - Description | ||
| * - `redshift.unload-location` | ||
| - A writable location in Amazon S3, used to temporarily store unloaded | ||
| Redshift query results. | ||
| * - `redshift.unload-iam-role` | ||
| - Optional. Fully specified ARN of the IAM role attached to the Redshift cluster. | ||
| The provided role is used in the `UNLOAD` command. The IAM role must have access to | ||
| the Redshift cluster and write access to the S3 bucket. The default IAM role attached to | ||
| the Redshift cluster is used when this property is not configured. | ||
| ::: | ||
|
|
||
| Additionally, define the [S3 configuration properties](/object-storage/file-system-s3) | ||
| required to read the Parquet files from the S3 bucket, with the exception of `fs.native-s3.enabled`. | ||
|
|
||
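For illustration only (not part of this diff), a minimal sketch of a catalog properties file that enables `UNLOAD` as described in the section above. The host, database, credentials, bucket name, account ID, role name, and region are hypothetical placeholders. The `s3.region` property is assumed from the linked S3 file system documentation, not from this change; substitute whichever S3 properties your deployment actually needs.

```properties
# Standard Redshift catalog settings (placeholder values).
connector.name=redshift
connection-url=jdbc:redshift://example.net:5439/database
connection-user=root
connection-password=secret

# Required to use UNLOAD: a writable S3 location for temporary Parquet files.
redshift.unload-location=s3://example-bucket/trino-unload/

# Optional: IAM role used in the UNLOAD command. When omitted, the default
# IAM role attached to the Redshift cluster is used.
redshift.unload-iam-role=arn:aws:iam::123456789012:role/example-unload-role

# S3 file system settings so Trino can read the unloaded Parquet files.
# The bucket must be in the same AWS region as the Redshift cluster.
# fs.native-s3.enabled is intentionally not set, per the note above.
s3.region=us-east-1
```

Leaving `redshift.unload-location` unset keeps the connector on the default JDBC-based read path.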
| ### Multiple Redshift databases or clusters | ||
|
|
||
| The Redshift connector can only access a single database within | ||
|
|
||