-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-9013][SQL] generate MutableProjection directly instead of return a function #7373
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
I think MutableProject will create a |
|
@davies that is correct. The goal was to be able to amortize the cost of getting the code and then create a new copy with its own row multiple times. Though looking at the change, it doesn't seem like we actually use that functionality in practice. (and given the use of references, I'm not sure that more than one copy can actually be used in a thread safe way anymore anyway). |
|
Test build #37136 has finished for PR 7373 at commit
|
|
I think we will do codegen at executor side for every partition, so multiple tasks won't share one codegened |
|
@cloud-fan How about local mode? We don't always create the projection inside |
|
@davies which "local" do you mean? If you mean cc @rxin, an unrelate question here, I saw we set up task context even for local execution, and set partition id to 0, is there really a case that we don't have task context? |
|
A new question here, now we support non-deterministic/stateful expressions in codegen, so all generated class is not thread safe, not only cc @marmbrus |
|
cc @yhuai |
|
Test build #41924 has finished for PR 7373 at commit
|
|
@cloud-fan can you rebase? |
| val exprs = orderSpec.map(_.child) | ||
| val projection = newMutableProjection(exprs, child.output) | ||
| (orderSpec, projection(), projection()) | ||
| val buildProjection = () => newMutableProjection(exprs, child.output) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the only place that we want to build the same projection twice.
|
LGTM |
|
Test build #56328 has finished for PR 7373 at commit
|
|
Merging in master. Thanks. |
MutableProjectionis not thread-safe and we won't use it in multiple threads. I think the reason that we return() => MutableProjectionis not about thread safety, but to save the costs of generating code when we need same but individual mutable projections.However, I only found one place that use this feature, and comparing to the troubles it brings, I think we should generate
MutableProjectiondirectly instead of return a function.