Open
Labels
proposal: Iceberg Improvement Proposal (spec/major changes/etc)
Description
Proposed Change
We have developed an open-source library that aims to accelerate read access to S3 (referred to below as AAL):
https://github.com/awslabs/analytics-accelerator-s3
It is currently merged in behind a feature flag that can be toggled with:
--conf "spark.sql.catalog.<CATALOG_NAME>.s3.analytics-accelerator.enabled=true"
This epic tracks the work needed to turn it on by default for all Iceberg and S3 customers.
If you are an Iceberg user and would like to test it, please let us know and we will be happy to schedule some time with you.
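For anyone scripting test runs, here is a minimal sketch of how the flag above could be assembled per catalog. The property name is taken from this issue. The helper names (`aal_conf_key`, `aal_conf_args`), the catalog name `my_catalog`, and the script name `job.py` are placeholders for illustration and are not part of Iceberg or AAL.

```python
import shlex
from typing import List

# Property suffix copied from the flag quoted in this issue.
AAL_SUFFIX = "s3.analytics-accelerator.enabled"


def aal_conf_key(catalog_name: str) -> str:
    """Return the Spark conf key that toggles AAL for one Iceberg catalog."""
    if not catalog_name or any(c.isspace() for c in catalog_name):
        raise ValueError("catalog name must be non-empty and contain no whitespace")
    return f"spark.sql.catalog.{catalog_name}.{AAL_SUFFIX}"


def aal_conf_args(catalog_name: str, enabled: bool = True) -> List[str]:
    """Return the two spark-submit arguments that set the toggle."""
    value = "true" if enabled else "false"
    return ["--conf", f"{aal_conf_key(catalog_name)}={value}"]


if __name__ == "__main__":
    # "my_catalog" and "job.py" are hypothetical placeholders.
    print(shlex.join(["spark-submit", *aal_conf_args("my_catalog"), "job.py"]))
```

The same key should also work when set programmatically, for example through `SparkSession.builder.config(aal_conf_key("my_catalog"), "true")` in PySpark, assuming the catalog is otherwise configured to use S3FileIO.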
Work needed
- Integrate default OFF: "AWS: Integrate S3 analytics accelerator library" (#12299)
- Add read vector support to Iceberg core: "core: Adding read vector to range readable interface and adding mappe…" (#13997)
- Add read vector support to S3FileIO: "feat: adding s3fileio vector reader" (#14352)
- Add read vector support to AAL: "AWS: swapping s3 aal to the sync client and adding aal vector path" (#14247)
- Add sync client support for AAL and default to it: "AWS: swapping s3 aal to the sync client and adding aal vector path" (#14247)
- Blocked: feature parity, including:
  - Support SSE-C
  - All retry logic in the default stream is compatible with AAL
  - All integration tests pass with AAL
- Customer testing and success stories (testing has started; data will be shared later)
- Default ON
- Iceberg release
Proposal document
https://docs.google.com/document/d/13shy0RWotwfWC_qQksb95PXdi-vSUCKQyDzjoExQEN0/edit?usp=sharing
Specifications
- Table
- REST
- View
- Puffin
- Encryption
- Other