fix: unload csv file too large. #15758

Merged on Jun 7, 2024 (3 commits)

@youngsofun (Member) commented Jun 7, 2024

I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/

Summary

The event() of the original LimitFileSizeProcessor has 2 problems:

  1. It pulls from upstream before consuming the inner buffer, which may lead to OOM. Fix:
    1. The if-else chain now checks its conditions in reverse order.
    2. Do not take input if the buffer already holds enough data.
  2. It flushes all buffers at once when the input finishes, which leads to a very large file. Fix:
    1. Add a flushing state.
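
To make the two fixes concrete, here is a minimal, self-contained sketch of the intended control flow. It is not the actual Databend code: the `Event` enum, the `LimitFileSize` struct, and the `simulate` helper are hypothetical names made up for illustration, and chunks are modeled as plain byte counts. It only shows the ordering: emit from the buffer before pulling more input, and after the input finishes, keep emitting one bounded file per call instead of one huge file.

```rust
use std::collections::VecDeque;

/// Simplified stand-in for a pipeline event (hypothetical, not Databend's type).
#[derive(Debug, PartialEq)]
enum Event {
    NeedData,         // ask upstream for one more chunk
    Emit(Vec<usize>), // write one file made of these chunk sizes
    Finished,
}

/// Hypothetical model of a processor that limits output file size.
struct LimitFileSize {
    threshold: usize,
    buffered: VecDeque<usize>, // sizes of the buffered chunks
    buffered_bytes: usize,
    input_finished: bool,
}

impl LimitFileSize {
    fn new(threshold: usize) -> Self {
        Self { threshold, buffered: VecDeque::new(), buffered_bytes: 0, input_finished: false }
    }

    fn push(&mut self, chunk: usize) {
        self.buffered_bytes += chunk;
        self.buffered.push_back(chunk);
    }

    fn finish_input(&mut self) {
        self.input_finished = true;
    }

    /// Pop chunks until the batch reaches the threshold or the buffer is empty.
    /// Always takes at least one chunk when the buffer is non-empty.
    fn take_batch(&mut self) -> Vec<usize> {
        let mut batch = Vec::new();
        let mut size = 0;
        while let Some(chunk) = self.buffered.pop_front() {
            size += chunk;
            self.buffered_bytes -= chunk;
            batch.push(chunk);
            if size >= self.threshold {
                break;
            }
        }
        batch
    }

    fn event(&mut self) -> Event {
        if self.buffered_bytes >= self.threshold {
            // Enough data is buffered: emit first, do not pull more input.
            Event::Emit(self.take_batch())
        } else if self.input_finished {
            // Flushing phase: drain what is left, one bounded file per call.
            if self.buffered.is_empty() {
                Event::Finished
            } else {
                Event::Emit(self.take_batch())
            }
        } else {
            Event::NeedData
        }
    }
}

/// Drive the model with the given chunk sizes and return the size of each emitted file.
fn simulate(chunks: &[usize], threshold: usize) -> Vec<usize> {
    let mut processor = LimitFileSize::new(threshold);
    let mut upstream = chunks.iter().copied();
    let mut files: Vec<usize> = Vec::new();
    loop {
        match processor.event() {
            Event::NeedData => match upstream.next() {
                Some(chunk) => processor.push(chunk),
                None => processor.finish_input(),
            },
            Event::Emit(batch) => {
                let size: usize = batch.iter().sum();
                files.push(size);
            }
            Event::Finished => return files,
        }
    }
}
```

For example, `simulate(&[4, 4, 4, 4, 4], 10)` yields files of 12 and 8 bytes rather than a single 20-byte file, because the buffer is drained as soon as it crosses the threshold and the remainder is flushed separately once the input ends.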

Tests

  • Unit Test
  • Logic Test
  • Benchmark Test
  • No Test - Explain why

Type of change

  • Bug Fix (non-breaking change which fixes an issue)
  • New Feature (non-breaking change which adds functionality)
  • Breaking Change (fix or feature that could cause existing functionality not to work as expected)
  • Documentation Update
  • Refactoring
  • Performance Improvement
  • Other (please describe):

@github-actions github-actions bot added the pr-bugfix this PR patches a bug in codebase label Jun 7, 2024
@youngsofun youngsofun force-pushed the fix_unload branch 2 times, most recently from fd73333 to 7966ec3 Compare June 7, 2024 12:56
@youngsofun youngsofun requested a review from BohuTANG June 7, 2024 12:57
@dantengsky dantengsky added this pull request to the merge queue Jun 7, 2024
@wubx wubx removed this pull request from the merge queue due to a manual request Jun 7, 2024
@wubx wubx added this pull request to the merge queue Jun 7, 2024
@wubx wubx removed this pull request from the merge queue due to a manual request Jun 7, 2024
@wubx wubx added this pull request to the merge queue Jun 7, 2024
@BohuTANG BohuTANG removed this pull request from the merge queue due to a manual request Jun 7, 2024
@BohuTANG BohuTANG merged commit 11460ad into databendlabs:main Jun 7, 2024
71 checks passed