[SPARK-11954][SQL] Encoder for JavaBeans #9937
|
Test build #46608 has finished for PR 9937 at commit
|
could this take Map[String, Expression]? zipping always seems awkward when we can avoid it.
Also, I would not make it take a NewInstance as it would be nice to use a Literal sometimes for object reuse.
It seems we cannot use a Map here, as the fields: Seq[Expression] needs resolution and transformation.
We could probably just extend this logic to recurse into maps similar to how we handle traversables.
|
I think Java beans are different from POJOs. Beans only care about fields that have getters and setters, while a POJO should care about all the fields. |
|
A POJO is not rigorously defined, and I'm not sure how to detect it and re-construct it... |
|
Use reflection to find all the public and private fields? |
|
Perhaps we should start with Java Beans. They are well defined, we have existing reflection code, and I've never heard a Java user complain that this was not sufficient (this is what we support for creating Java data frames). |
|
Sounds good (supporting only beans for now). |
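To make the bean vs. POJO distinction above concrete, here is a minimal sketch using the JDK's `java.beans.Introspector` next to plain field reflection. The `Person` class and the helper names `beanProperties` and `declaredFields` are hypothetical, made up for illustration; this is not the code from the PR.

```java
import java.beans.BeanInfo;
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

class BeanVsFields {

    // Hypothetical example class: "name" has a getter and a setter, "cache" has neither.
    public static class Person {
        private String name;
        private int cache;

        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
    }

    // Bean view: only properties that expose both a getter and a setter.
    static List<String> beanProperties(Class<?> cls) {
        try {
            // Stop at Object so the "class" property from getClass() is not reported.
            BeanInfo info = Introspector.getBeanInfo(cls, Object.class);
            List<String> names = new ArrayList<>();
            for (PropertyDescriptor pd : info.getPropertyDescriptors()) {
                if (pd.getReadMethod() != null && pd.getWriteMethod() != null) {
                    names.add(pd.getName());
                }
            }
            return names;
        } catch (IntrospectionException e) {
            throw new IllegalStateException(e);
        }
    }

    // POJO-style view: every field declared on the class, public or private.
    static List<String> declaredFields(Class<?> cls) {
        List<String> names = new ArrayList<>();
        for (Field f : cls.getDeclaredFields()) {
            if (!f.isSynthetic()) {
                names.add(f.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("bean properties: " + beanProperties(Person.class));
        System.out.println("declared fields: " + declaredFields(Person.class));
    }
}
```

Under these assumptions the bean view reports only `name`, while the field view also includes `cache`, which is the gap the comments above are discussing.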
I came up with this by intuition and verified it by experiment. I cannot find any documentation for it, so it is very hacky...
|
Test build #46691 has finished for PR 9937 at commit
|
This is not right: we may have the attribute loopVar in MapObjects.
I depend on the name to skip the attribute loopVar in MapObjects; we need a better way...
cc @marmbrus
Hmm, this does feel kind of hacky, but I'm not sure if it's really a problem in practice. This assertion is only to catch our own bugs, not user mistakes.
|
Test build #46693 has finished for PR 9937 at commit
|
I moved the schemaFor code to object ScalaReflection, because getJavaBeanFields needs mirror.runtimeClass.
|
Test build #46763 has finished for PR 9937 at commit
|
|
Thanks for working on this. I'm not sure that this is the right tactic though. I don't think that we want to make java bean the default fall back in our scala reflection code path. Instead I think that we only want to create a java bean encoder when the user explicitly asks for it and thus this should be a separate code path. The only common code should be Also, instead of hacking this into scala's reflection, I think we should use the builtin java support for introspecting on beans. I would probably start with the code in JavaTypeInference and use that to construct expressions. |
|
@marmbrus do you mean we should write java version of |
The problem is that, for a Java map, if we get the keys and values via keySet and values, we cannot guarantee they have the same iteration order (which is different from a Scala map). A possible solution is to create a new MapObjects that can iterate a map directly.
cc @marmbrus
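For illustration only, the idea of iterating a Java map directly can be sketched in plain Java with a single pass over `entrySet()`, so that each key stays paired with its own value. The class `MapEntryPairs` and its `split` helper are hypothetical names; this is not the proposed MapObjects change, just the underlying idea.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MapEntryPairs {

    // Walks the map once via entrySet(), so keys.get(i) and values.get(i)
    // always come from the same entry, whatever order the map iterates in.
    static <K, V> void split(Map<K, V> map, List<K> keys, List<V> values) {
        for (Map.Entry<K, V> entry : map.entrySet()) {
            keys.add(entry.getKey());
            values.add(entry.getValue());
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> ages = new HashMap<>();
        ages.put("alice", 30);
        ages.put("bob", 25);
        ages.put("carol", 41);

        List<String> keys = new ArrayList<>();
        List<Integer> values = new ArrayList<>();
        split(ages, keys, values);

        // Each printed pair matches the original mapping.
        for (int i = 0; i < keys.size(); i++) {
            System.out.println(keys.get(i) + " -> " + values.get(i));
        }
    }
}
```

The design choice here is to avoid depending on two separate iterations lining up: pairing is established by the entry itself rather than by position in `keySet()` and `values()`.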
|
Test build #46871 has finished for PR 9937 at commit
|
|
Test build #46880 has finished for PR 9937 at commit
|
|
Test build #46882 has finished for PR 9937 at commit
|
|
Test build #46883 has finished for PR 9937 at commit
|
.view.force seems a little odd. toList?
we need to return a Map here, maybe toList.toMap?
oh... hmmm, what happens if we don't have this? is it lazy? I guess view.force is fine. It was just a little confusing when reading.
Also, can you add a test to TreeNodeSuite?
Yes, mapValues is lazy. I'll add a comment to explain it, and also add a test in TreeNodeSuite.
|
Some minor comments, but this is looking really good! |
|
Test build #46939 has finished for PR 9937 at commit
|
|
Thanks! I'm going to try and fix conflicts while merging to master and 1.6. |
create java version of `constructorFor` and `extractorFor` in `JavaTypeInference`
Author: Wenchen Fan <[email protected]>
This patch had conflicts when merged, resolved by
Committer: Michael Armbrust <[email protected]>
Closes #9937 from cloud-fan/pojo.
(cherry picked from commit fd95eea)
Signed-off-by: Michael Armbrust <[email protected]>