[WIP][HUDI-5023] Avoiding using BoundedInMemoryExecutor on the hot-path
#6843
Conversation
(force-pushed from f01ec02 to c733497)
BIMQ (baseline): ~5120s. After chatting with @vinothchandar on this, I went to double-check how Spark actually iterates over the output of shuffle tasks, and it turns out it already implements a similar queuing pattern at that level -- see ShuffleBlockFetcherIterator.
As such, there's no point in implementing the same queueing mechanism at our level, given that Hudi will already be iterating over records that are cached in memory.
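The trade-off above can be sketched in plain Java. This is a hypothetical illustration (not Hudi's or Spark's actual API) contrasting the two patterns: a bounded-queue hand-off between a producer thread and the consumer, versus direct iteration over a source that is already buffered in memory. When the source is already materialized, the queue only adds a thread hop and an extra copy:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class QueueVsDirect {

    // Illustrative end-of-stream marker (assumes real records never equal it).
    private static final Integer POISON = Integer.MIN_VALUE;

    // Bounded-queue pattern: a producer thread shovels records into a small
    // blocking queue; the consumer drains it until the poison pill arrives.
    static List<Integer> viaBoundedQueue(Iterator<Integer> source) {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(16);
        Thread producer = new Thread(() -> {
            try {
                while (source.hasNext()) queue.put(source.next());
                queue.put(POISON);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        List<Integer> out = new ArrayList<>();
        try {
            for (Integer rec = queue.take(); !rec.equals(POISON); rec = queue.take()) {
                out.add(rec);
            }
            producer.join();
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
        return out;
    }

    // Direct pattern: just iterate -- no extra thread, no extra buffering.
    static List<Integer> direct(Iterator<Integer> source) {
        List<Integer> out = new ArrayList<>();
        source.forEachRemaining(out::add);
        return out;
    }

    public static void main(String[] args) {
        List<Integer> data = Arrays.asList(1, 2, 3, 4, 5);
        // Both paths yield the same records; only the mechanics differ.
        System.out.println(viaBoundedQueue(data.iterator()).equals(direct(data.iterator()))); // prints "true"
    }
}
```

If the upstream layer (like Spark's shuffle fetch) already buffers ahead, the second pattern is strictly cheaper.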
It turned out that my previous run was not actually using Disruptor at all: I had to put up #7188 to make sure Disruptor could be used when going through merging as well. After fixing that and actually using Disruptor, performance became even worse:
Hey @alexeykudinkin, thanks a lot for this PR! I'd like to discuss some details if you don't mind :) Based on our experience, performance differs between Disruptor, BIMQ, and no queue at all in different scenarios. For example, the test in PR #7174 (although it is only a local benchmark, it is reproducible evidence) shows that different queues do have different advantages. So do you think we should just remove the Executor design entirely, or should we implement different executors, including this one?
Also, this no-inner-queue approach is my favorite, because it needs no extra memory or CPU resources and is quite simple. Looking forward to your reply. Thanks!
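The "no inner queue" approach can be sketched as follows. This is a hypothetical illustration (class and method names are made up, not Hudi's actual `SimpleExecutor` API): the consumer pulls records straight off the source iterator and applies the transform inline, so there is no background thread, no bounded buffer, and no extra memory copy:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical no-queue executor: transforms and consumes records on the
// caller's thread, one at a time, as they come off the source iterator.
public class NoQueueExecutor<I, O> {
    private final Iterator<I> source;
    private final Function<I, O> transformer;

    public NoQueueExecutor(Iterator<I> source, Function<I, O> transformer) {
        this.source = source;
        this.transformer = transformer;
    }

    // Drives the pipeline to completion; returns the number of records processed.
    public long execute(Consumer<O> consumer) {
        long count = 0;
        while (source.hasNext()) {
            consumer.accept(transformer.apply(source.next()));
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        List<String> out = new ArrayList<>();
        NoQueueExecutor<Integer, String> exec =
            new NoQueueExecutor<>(Arrays.asList(1, 2, 3).iterator(), i -> "rec-" + i);
        long n = exec.execute(out::add);
        System.out.println(n + " " + out); // prints "3 [rec-1, rec-2, rec-3]"
    }
}
```

The design choice here is that parallelism (if any) is left to the layer above, such as Spark's task scheduling, rather than re-created inside the write path.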
Hey, @zhangyue19921010! Yeah, this PR was put up purely for experimental purposes, even before we finalized the previous PR landing Disruptor (which considerably improved the API!), so I was primarily looking to put up a quick solution I could benchmark against and validate. In terms of taking this forward, I have the following thoughts:
Our goal would be to fit in such P.S. What actually surprised me was that I was NOT able to reproduce the 20% performance increase for Disruptor on our benchmarks that you folks were seeing in your environment. The first issue was due to #7188, but even after addressing it I still don't see an improvement.
Hey @alexeykudinkin! Thanks for your response!
Totally agree with you! Let's make it happen. I will tune #7174, the SimpleExecutor PR, to simplify the APIs even further and remove what we don't need.
Would you mind sharing more info about your test? For example, record count, CPU/memory resources, and maybe the schema if you want. From our experience, we have two kinds of Spark Streaming ingestion jobs.
More details on our performance test: an insert/bulk_insert benchmark between BIMQ (baseline) and Disruptor with the same Kafka input, resources, and configs. BIMQ took 7.9 min to finish writing parquet files; Disruptor took 5.6 min. In terms of Case 1 write performance, Disruptor improved by about 29%, from 7.9 min to 5.6 min.
Closing in favor of #7174


Change Logs
BoundedInMemoryExecutor adds considerable overhead on the hot path of writing records, with very little practical benefit. This PR removes it from the hot path of the few flows where we can bypass BIME completely.
Impact
Describe any public API or user-facing feature change or any performance impact.
Risk level: none | low | medium | high
Choose one. If medium or high, explain what verification was done to mitigate the risks.
Documentation Update
Describe any necessary documentation update if there is any new feature, config, or user-facing change
Provide the ticket number here and follow the instructions to make changes to the website.
Contributor's checklist