[SPARK-17029] make toJSON not go through rdd form but operate on dataset always #14615
robert3005 wants to merge 7 commits into apache:master
Can you add a classdoc explaining what boundGen (and the other ctor parameters) do?
Force-pushed from def3cf0 to a2e8ebf.
@rxin anything else? I added docs to the best of my understanding; let me know if you meant something else.
Force-pushed from a2e8ebf to 6f286c1.
Force-pushed from 6f286c1 to 48342c2.
This seems pretty reasonable, assuming test coverage already exists on that toJSON method. Jenkins, this is ok to test.
@rxin any chance you or someone else can take a look?
Force-pushed from 48342c2 to 0043190.
Jenkins, this is ok to test.
Test build #74755 has finished for PR 14615 at commit
@robert3005 looks like this has unit test failures on
Jenkins, this is ok to test.
It indeed does look like a flake.
Test build #74759 has finished for PR 14615 at commit
Test build #74828 has finished for PR 14615 at commit
ping? @rxin
Instead of importing it, we can explicitly pass the encoder.
    mapPartitions {
      // func
    } (sparkSession.implicits.newStringEncoder)
Force-pushed from 6250699 to 274be08.
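The suggestion above replaces an implicit import with an explicit argument in the second parameter list of mapPartitions. A minimal sketch of that pattern against the public Dataset API might look like the following; the object name, the `labels` helper, the `numbers` dataset, and the `row-` strings are illustrative assumptions and are not code from this PR.

```scala
import org.apache.spark.sql.{Dataset, SparkSession}

object ExplicitEncoderSketch {
  // Hypothetical helper: maps each partition to strings and passes the
  // String encoder explicitly instead of importing it into implicit scope.
  def labels(spark: SparkSession): Seq[String] = {
    val numbers: Dataset[java.lang.Long] = spark.range(3)
    val strings: Dataset[String] = numbers.mapPartitions {
      (iter: Iterator[java.lang.Long]) => iter.map(n => s"row-$n")
    }(spark.implicits.newStringEncoder)
    strings.collect().toSeq
  }

  def main(args: Array[String]): Unit = {
    // Local session used only for this sketch.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("explicit-encoder-sketch")
      .getOrCreate()
    try labels(spark).foreach(println)
    finally spark.stop()
  }
}
```

Passing the encoder explicitly keeps the dependency visible at the call site and avoids adding an import in the middle of a method body.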
thanks @gatorsmile, updated
Test build #76757 has finished for PR 14615 at commit
    }
    import sparkSession.implicits.newStringEncoder
    sparkSession.createDataset(rdd)
    } (sparkSession.implicits.newStringEncoder)

    case _ => false
    }

    if (containsRDD.isDefined) {
nit: assert(containsRDD.isEmpty, "xxx")
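As a rough illustration of the nit, the if (containsRDD.isDefined) { ... } check can collapse into a single assert with a message. The sketch below is plain Scala under stated assumptions: the `Node` types and the `firstRddNode` helper are hypothetical stand-ins, and the shape of the surrounding search is guessed from the `case _ => false` fragment in the diff.

```scala
object AssertNitSketch {
  // Hypothetical stand-ins for plan nodes; these are not Spark classes.
  sealed trait Node
  case object RddBacked extends Node
  case object Local extends Node

  // Assumed shape of the search: return the first RDD-backed node, if any.
  def firstRddNode(plan: Seq[Node]): Option[Node] =
    plan.find {
      case RddBacked => true
      case _ => false
    }

  def main(args: Array[String]): Unit = {
    val containsRDD = firstRddNode(Seq(Local, Local))
    // One-line form suggested in the review, with a descriptive message.
    assert(containsRDD.isEmpty, s"plan unexpectedly contains an RDD-backed node: $containsRDD")
  }
}
```

The message argument is evaluated only when the assertion fails, so an informative string costs nothing on the passing path.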
LGTM
Test build #76766 has finished for PR 14615 at commit
thanks, merging to master!
…set always

## What changes were proposed in this pull request?
Don't convert toRdd when doing toJSON

## How was this patch tested?
Existing unit tests

Author: Robert Kruszewski <robertk@palantir.com>

Closes apache#14615 from robert3005/robertk/correct-tojson.
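For context on the behaviour the existing tests cover, here is a hedged usage sketch of toJSON from the caller's side. It exercises only the public API, which returns a Dataset[String] with one JSON document per row, and does not show the internals changed by this PR; the object name, the `rowsAsJson` helper, and the sample data are assumptions made for illustration.

```scala
import org.apache.spark.sql.{Dataset, SparkSession}

object ToJsonSketch {
  // Hypothetical helper: builds a tiny DataFrame and renders each row as JSON.
  def rowsAsJson(spark: SparkSession): Seq[String] = {
    import spark.implicits._
    val people = Seq((1, "ann"), (2, "bob")).toDF("id", "name")
    val json: Dataset[String] = people.toJSON // e.g. {"id":1,"name":"ann"}
    json.collect().toSeq
  }

  def main(args: Array[String]): Unit = {
    // Local session used only for this sketch.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("to-json-sketch")
      .getOrCreate()
    try rowsAsJson(spark).foreach(println)
    finally spark.stop()
  }
}
```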