Skip to content

Improving Retries based on Iceberg Integration Feedback#321

Merged
fuatbasik merged 5 commits intoawslabs:mainfrom
fuatbasik:fix-retries
Jul 24, 2025
Merged

Improving Retries based on Iceberg Integration Feedback#321
fuatbasik merged 5 commits intoawslabs:mainfrom
fuatbasik:fix-retries

Conversation

@fuatbasik
Copy link
Copy Markdown
Collaborator

Description of change

After trying to integrated RetryStrategy with S3FileIO, I realised we exposed too much details (i.e. return types of methods to retry) to customers. Also, it was not possible to pass Retry strategy down to Block. With this change I make

1/ Move Retry module to common to co-locate it with OpenStreamInformation. This way, we can pass RetryStrategy down to block.
2/ We move away from Generic types on IOSupplier, instead only support methods that return byte[] and throws IOException now (or voids through IORunnable). This way, consumers do not need to know about the return types.
3/Update relevant javadocs

Relevant issues

N/A

Does this contribution introduce any breaking changes to the existing APIs or behaviors?

No. Retries were already internal only.

Does this contribution introduce any new public APIs or behaviors?

Yes, with this change, we can now pass RetryStrategy to OpenStreamInformation that would instruct AAL to handle retries of certain failures up to a predefined number of times.

How was the contribution tested?

new unit tests are added. All existing tests, including gray failure tests are passing.

Does this contribution need a changelog entry?

  • I have updated the CHANGELOG or README if appropriate

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license and I agree to the terms of the Developer Certificate of Origin (DCO).

* @return result of the supplier
* @throws IOException if the operation fails after all retry attempts
*/
byte[] get(IOSupplier supplier) throws IOException;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In new PhysicalIO branch, we are using ObjectContent and void as return type of this method. With changing this method/interface non-generic, how can we create a strategy for the operations which are not expecting byte[]?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is now resolved by moving supplier to return a type T instead of byte []

@fuatbasik fuatbasik temporarily deployed to integration-tests July 24, 2025 12:53 — with GitHub Actions Inactive
ozkoca
ozkoca previously approved these changes Jul 24, 2025
*
* @return the Failsafe RetryPolicy for byte arrays
*/
dev.failsafe.RetryPolicy<Object> getDelegate();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't this be byte[]? Whenever I see Object as a type I get worried.

* @return this builder instance
*/
public RetryPolicyBuilder withMaxRetries(int maxRetries) {
delegateBuilder.withMaxRetries(maxRetries);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should there not be a precondition here and on the withDelay below

* Retry strategy implementation for seekable input stream operations. Uses Failsafe library to
* execute operations with configurable retry policies.
*/
public class SeekableInputStreamRetryStrategy implements RetryStrategy {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: something I should have picked up during the original but are we happy with this class name? should it not be DefaultRetryStrategyImpl to follow the current pattern like DefaultLogicalIOImpl

stubz151
stubz151 previously approved these changes Jul 24, 2025
@fuatbasik fuatbasik dismissed stale reviews from stubz151 and ozkoca via 374ef48 July 24, 2025 14:49
@fuatbasik fuatbasik temporarily deployed to integration-tests July 24, 2025 15:07 — with GitHub Actions Inactive
@fuatbasik fuatbasik merged commit 7184a45 into awslabs:main Jul 24, 2025
4 checks passed
ozkoca added a commit that referenced this pull request Jul 25, 2025
## Description of change
This PR rebases ant integrates the changes in PR
#321

#### Relevant issues
#286
#287
#288
#289
#294
#316
#320

#### Does this contribution introduce any breaking changes to the
existing APIs or behaviors?
No

#### Does this contribution introduce any new public APIs or behaviors?
No

#### How was the contribution tested?
Unit tests


---

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license and I agree to the terms of
the [Developer Certificate of Origin
(DCO)](https://developercertificate.org/).
@fuatbasik fuatbasik deleted the fix-retries branch July 25, 2025 10:32
@ozkoca ozkoca mentioned this pull request Jul 29, 2025
1 task
dongjoon-hyun added a commit to apache/spark that referenced this pull request Sep 3, 2025
### What changes were proposed in this pull request?

This PR aims to upgrade `analyticsaccelerator-s3` to 1.3.0.

### Why are the changes needed?

To use the latest improvements and bug fixes for Apache Spark 4.1.0.
- https://github.com/awslabs/analytics-accelerator-s3/releases/tag/v1.3.0
  - awslabs/analytics-accelerator-s3#345
  - awslabs/analytics-accelerator-s3#338
  - awslabs/analytics-accelerator-s3#321
  - awslabs/analytics-accelerator-s3#342

### Does this PR introduce _any_ user-facing change?

No, this was a new library which is not released yet.

### How was this patch tested?

Pass the CIs.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #52203 from dongjoon-hyun/SPARK-53460.

Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
huangxiaopingRD pushed a commit to huangxiaopingRD/spark that referenced this pull request Nov 25, 2025
### What changes were proposed in this pull request?

This PR aims to upgrade `analyticsaccelerator-s3` to 1.3.0.

### Why are the changes needed?

To use the latest improvements and bug fixes for Apache Spark 4.1.0.
- https://github.com/awslabs/analytics-accelerator-s3/releases/tag/v1.3.0
  - awslabs/analytics-accelerator-s3#345
  - awslabs/analytics-accelerator-s3#338
  - awslabs/analytics-accelerator-s3#321
  - awslabs/analytics-accelerator-s3#342

### Does this PR introduce _any_ user-facing change?

No, this was a new library which is not released yet.

### How was this patch tested?

Pass the CIs.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes apache#52203 from dongjoon-hyun/SPARK-53460.

Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants