[SPARK-12948][SQL] Consider reducing size of broadcasts in OrcRelation #10861
Conversation
The idea here is to let users share the broadcast of the conf across multiple hadoopRDD calls (e.g. when unioning many HadoopRDDs together)? If so, this issue has come up a number of times in the past and may be worth a holistic design review because I think there are some hacks in Spark SQL to address this problem there and it would be nice to have a unified solution for this.
Can you add more description to explain how this patch reduces the size of broadcasts? The change isn't obvious to me at first glance, so one or two sentences of description would help me and other reviewers who aren't as familiar with this corner of the code.
Use case: a user maps a partitioned dataset (e.g. the TPC-DS dataset at 200 GB scale) and runs a query in spark-shell. When this is executed, OrcRelation creates a Configuration object for every partition (ref: OrcRelation.execute()). In the case of TPC-DS, it generates 1826 partitions. This info is broadcast in DAGScheduler#submitMissingTasks(); as part of this, the configurations created for the 1826 partitions are also streamed through (i.e. embedded in HadoopMapPartitionsWithSplitRDD --> f() --> wrappedConf). Each of these configurations takes around 251 KB per partition. Please refer to the profiler snapshot attached in the JIRA (mem_snap_shot). This causes quite a bit of delay in the overall job runtime. The patch reuses the already-broadcast conf from SparkContext; the fillObject() function is executed later for every partition and internally sets up any additional config details. This drastically reduces the amount of payload that is broadcast and helps reduce the overall job runtime.
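To make the payload difference concrete outside Spark, here is a minimal, hypothetical sketch (all names are invented; a plain `HashMap` stands in for a Hadoop `Configuration`) that compares serializing one shared conf against serializing one copy per partition, which is the effect the patch targets:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.HashMap;

public class BroadcastSizeSketch {
    // Hypothetical stand-in for a Hadoop Configuration: a serializable key/value map.
    static HashMap<String, String> makeConf() {
        HashMap<String, String> conf = new HashMap<>();
        for (int i = 0; i < 200; i++) {
            conf.put("hadoop.prop." + i, "x".repeat(50));
        }
        return conf;
    }

    // Size of an object's Java-serialized form, a rough proxy for task payload size.
    static int serializedSize(Serializable obj) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    public static void main(String[] args) {
        int numPartitions = 100; // the TPC-DS run discussed above produced 1826 partitions
        // Before the patch: each partition effectively carries its own copy of the conf.
        ArrayList<HashMap<String, String>> perPartition = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) {
            perPartition.add(makeConf());
        }
        // After the patch: every partition references one shared (broadcast) conf.
        HashMap<String, String> shared = makeConf();
        System.out.println("per-partition payload: " + serializedSize(perPartition) + " bytes");
        System.out.println("shared payload:        " + serializedSize(shared) + " bytes");
    }
}
```

In this toy model the per-partition payload grows linearly with the partition count while the shared payload stays constant; that linear growth (roughly 251 KB × 1826 partitions in the profiled run) is the overhead the patch removes.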
@JoshRosen - Please let me know if my latest comment on the use case addresses your question. Could you provide more details/pointers on this?
Test build #57670 has finished for PR 10861 at commit
Hi @rajeshbalamohan, I think this should at least be brought to a mergeable state; the conflicts and style issues should be resolved. Would you be able to update this?
We are closing it due to inactivity. Please do reopen if you want to push it forward. Thanks!
## What changes were proposed in this pull request?

This PR proposes to close stale PRs, mostly the same instances as apache#18017. I believe the author of apache#14807 removed his account.

Closes apache#7075
Closes apache#8927
Closes apache#9202
Closes apache#9366
Closes apache#10861
Closes apache#11420
Closes apache#12356
Closes apache#13028
Closes apache#13506
Closes apache#14191
Closes apache#14198
Closes apache#14330
Closes apache#14807
Closes apache#15839
Closes apache#16225
Closes apache#16685
Closes apache#16692
Closes apache#16995
Closes apache#17181
Closes apache#17211
Closes apache#17235
Closes apache#17237
Closes apache#17248
Closes apache#17341
Closes apache#17708
Closes apache#17716
Closes apache#17721
Closes apache#17937

Added:
Closes apache#14739
Closes apache#17139
Closes apache#17445
Closes apache#18042
Closes apache#18359

Added:
Closes apache#16450
Closes apache#16525
Closes apache#17738

Added:
Closes apache#16458
Closes apache#16508
Closes apache#17714

Added:
Closes apache#17830
Closes apache#14742

## How was this patch tested?

N/A

Author: hyukjinkwon <[email protected]>

Closes apache#18417 from HyukjinKwon/close-stale-pr.
The size of the broadcasted data in OrcRelation was significantly higher when running a query with a large number of partitions (e.g. TPC-DS), and this has an impact on job runtime; the effect becomes more evident as the number of partitions/splits grows. A profiler snapshot is attached in SPARK-12948 (https://issues.apache.org/jira/secure/attachment/12783513/SPARK-12948_cpuProf.png).