feat: Support multi-threaded asynchronous data upload to object storage #264

Merged: frankobe merged 1 commit into bytedance:main from weixiuli:main-s3 on Mar 2, 2026

weixiuli (Contributor) commented Feb 25, 2026

What problem does this PR solve?

This PR cherry-picks a Velox PR into Bolt: facebookincubator/velox#14472.

Type of Change

  • 🐛 Bug fix (non-breaking change which fixes an issue)
  • ✨ New feature (non-breaking change which adds functionality)
  • 🚀 Performance improvement (optimization)
  • ⚠️ Breaking change (fix or feature that would cause existing functionality to change)
  • 🔨 Refactoring (no logic changes)
  • 🔧 Build/CI or Infrastructure changes
  • 📝 Documentation only

Description

Describe your changes in detail.
For complex logic, explain the "Why" and "How".

Performance Impact

  • No Impact: This change does not affect the critical path (e.g., build system, doc, error handling).

  • Positive Impact: I have run benchmarks.

    Benchmark results:
    ============================================================================
    [...]/benchmark/S3AsyncUploadBenchmark.cpp     relative  time/iter   iters/s
    ============================================================================
    sync_upload_4M                                             48.82ms     20.48
    async_upload_4M                                 108.32%    45.07ms     22.19
    sync_upload_8M                                             79.81ms     12.53
    async_upload_8M                                 104.41%    76.44ms     13.08
    sync_upload_16M                                           138.74ms      7.21
    async_upload_16M                                117.11%   118.47ms      8.44
    sync_upload_32M                                           260.98ms      3.83
    async_upload_32M                                170.03%   153.49ms      6.52
    sync_upload_64M                                           509.86ms      1.96
    async_upload_64M                                247.14%   206.30ms      4.85
    sync_upload_128M                                          998.46ms      1.00
    async_upload_128M                               284.82%   350.56ms      2.85
    sync_upload_256M                                             2.00s   499.29m
    async_upload_256M                               316.53%   632.76ms      1.58
    sync_upload_512M                                             4.01s   249.50m
    async_upload_512M                               339.35%      1.18s   846.68m
    sync_upload_1024M                                            7.84s   127.62m
    async_upload_1024M                              342.67%      2.29s   437.32m
    sync_upload_2048M                                           16.40s    60.99m
    async_upload_2048M                              332.71%      4.93s   202.93m
    
  • Negative Impact: Explained below (e.g., trade-off for correctness).
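
For reading the benchmark table above: the `relative` column appears to be the baseline (sync) time divided by the candidate (async) time, expressed as a percentage, so values above 100% mean the async path is faster. The helper below is a minimal sketch of that arithmetic; `relativePercent` is a hypothetical name used only for illustration and is not part of the PR.

```cpp
#include <cassert>

// Hypothetical helper (not from the PR): relative speed of a candidate run
// versus its baseline, in percent. Values above 100 mean the candidate is faster.
double relativePercent(double baselineMs, double candidateMs) {
  assert(candidateMs > 0.0);  // a zero or negative time would be meaningless
  return baselineMs / candidateMs * 100.0;
}
```

For example, `relativePercent(48.82, 45.07)` reproduces the 4M row to within the rounding of the displayed times, and `relativePercent(509.86, 206.30)` does the same for the 64M row.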

Release Note

We ran this change on Spark + Gluten (incubator-gluten) + Velox with hive.s3.uploadPartAsync set to true in our environment, and average write performance improved by 85%.

Before this PR: Total Time Across All Tasks: 64.3 h
[Image: async I/O write, before optimization, elapsed time (1)]

After this PR: Total Time Across All Tasks: 32.4 h
[Image: async I/O write, after optimization, elapsed time (1)]
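
The general idea behind uploading parts asynchronously can be sketched as follows. This is not the code from this PR or from the Velox change; it is a minimal illustration that assumes the payload is split into fixed-size parts and that each part can be uploaded independently. `UploadPartFn` and `uploadPartsAsync` are hypothetical names, and the callback stands in for whatever request the storage client actually issues.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <future>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for one part-upload request. It receives the
// 1-based part number and the part's bytes, and returns a tag for that part.
using UploadPartFn =
    std::function<std::string(int partNumber, const std::string& data)>;

// Splits `payload` into parts of at most `partSize` bytes, starts one
// asynchronous upload per part, then collects the tags in part order.
std::vector<std::string> uploadPartsAsync(
    const std::string& payload,
    std::size_t partSize,
    const UploadPartFn& uploadPart) {
  assert(partSize > 0);
  std::vector<std::future<std::string>> pending;
  int partNumber = 1;
  for (std::size_t offset = 0; offset < payload.size();
       offset += partSize, ++partNumber) {
    std::string part = payload.substr(offset, partSize);
    // Each part runs on its own thread; the caller keeps slicing meanwhile.
    pending.push_back(std::async(
        std::launch::async,
        [&uploadPart, partNumber, part = std::move(part)] {
          return uploadPart(partNumber, part);
        }));
  }
  std::vector<std::string> tags;
  tags.reserve(pending.size());
  for (auto& f : pending) {
    tags.push_back(f.get());  // blocks until that part is done; rethrows errors
  }
  return tags;
}
```

The sketch starts one thread per part for brevity. A production implementation would more likely bound concurrency with a shared thread pool and cap the number of in-flight parts to limit memory use.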

CLAassistant commented Feb 25, 2026

CLA assistant check
All committers have signed the CLA.

@frankobe frankobe requested a review from mcheng615 February 26, 2026 07:36
frankobe (Collaborator) commented:

@weixiuli Thanks for the contribution. Would you mind signing the CLA by clicking the "Contributor License Agreement" link in #264 (comment)?

@frankobe frankobe added this pull request to the merge queue Mar 2, 2026
Merged via the queue into bytedance:main with commit 232a951 Mar 2, 2026
11 of 12 checks passed
guhaiyan0221 pushed a commit to guhaiyan0221/bolt that referenced this pull request Mar 4, 2026