[SPARK-19646][CORE][STREAMING] binaryRecords replicates records in scala API #16974

srowen · 2017-02-17T09:43:21Z

What changes were proposed in this pull request?

Use `BytesWritable.copyBytes`, not `getBytes`, because `getBytes` returns the underlying array, which may be reused when repeated reads don't need a different size, as is the case with the binaryRecords APIs.
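To illustrate the aliasing problem described above, here is a minimal, self-contained sketch. It does not use Hadoop or Spark: `ReusableRecord`, `BinaryRecordsSketch`, and `readAll` are hypothetical names introduced only for this example, and `ReusableRecord` merely imitates a writable whose backing array is overwritten in place when the next record fits. It is not the code changed in this patch.

```scala
// Hypothetical stand-in for a writable that reuses its backing array,
// in the way this PR describes for BytesWritable. Not the Hadoop class.
final class ReusableRecord(capacity: Int) {
  private val buffer = new Array[Byte](capacity)
  private var length = 0

  // Overwrites the backing array in place, as a record reader may do
  // when the next record fits in the existing buffer.
  def set(data: Array[Byte]): Unit = {
    require(data.length <= capacity, "record larger than buffer")
    System.arraycopy(data, 0, buffer, 0, data.length)
    length = data.length
  }

  // Analogous to getBytes: exposes the shared backing array.
  def getBytes: Array[Byte] = buffer

  // Analogous to copyBytes: returns a fresh array with only the valid bytes.
  def copyBytes: Array[Byte] = java.util.Arrays.copyOf(buffer, length)
}

object BinaryRecordsSketch {
  // Reads every record through one reused ReusableRecord and collects
  // whatever `extract` returns for each of them.
  def readAll(records: Seq[Array[Byte]], recordLength: Int)(
      extract: ReusableRecord => Array[Byte]): Seq[List[Byte]] = {
    val reused = new ReusableRecord(recordLength)
    val collected = records.toVector.map { r =>
      reused.set(r)
      extract(reused)
    }
    // Contents are inspected only after all records have been read.
    collected.map(_.toList)
  }
}

val input = Seq(Array[Byte](1, 1), Array[Byte](2, 2), Array[Byte](3, 3))
val aliased = BinaryRecordsSketch.readAll(input, 2)(_.getBytes)
val copied = BinaryRecordsSketch.readAll(input, 2)(_.copyBytes)
```

In this sketch, `aliased` ends up holding three copies of the last record because every element refers to the same array, while `copied` keeps the three distinct records.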

How was this patch tested?

Existing tests

hvanhovell · 2017-02-17T10:19:09Z

LGTM - pending jenkins

SparkQA · 2017-02-17T12:15:44Z

Test build #73047 has finished for PR 16974 at commit 457e735.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

hvanhovell · 2017-02-17T12:51:51Z

@srowen should we add a regression test? It seems weird that we didn't catch this in tests.

srowen · 2017-02-17T14:40:13Z

Agreed. I fixed the tests to actually test this, and generally cleaned up the relevant test code.

SparkQA · 2017-02-17T16:51:19Z

Test build #73056 has finished for PR 16974 at commit 219bbf6.

This patch fails PySpark unit tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2017-02-17T19:39:47Z

Test build #3578 has finished for PR 16974 at commit 219bbf6.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

…ala API

## What changes were proposed in this pull request?

Use `BytesWritable.copyBytes`, not `getBytes`, because `getBytes` returns the underlying array, which may be reused when repeated reads don't need a different size, as is the case with binaryRecords APIs

## How was this patch tested?

Existing tests

Author: Sean Owen <[email protected]>

Closes #16974 from srowen/SPARK-19646.

(cherry picked from commit d0ecca6)
Signed-off-by: Sean Owen <[email protected]>

srowen · 2017-02-20T17:20:42Z

Merged to master/2.1/2.0 as it's a reasonably important bug

srowen · 2017-02-20T17:21:38Z

Oops, pick was clean into 2.1 but it actually resulted in an error: https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-branch-2.1-test-sbt-hadoop-2.7/374/consoleFull

Fixing now ...

Use BytesWritable.copyBytes, not getBytes, because getBytes returns the underlying array, which may be reused when repeated reads don't need a different size, as is the case with binaryRecords APIs

457e735

uncleGen approved these changes Feb 17, 2017

Add tests for fix, and improve surrounding test code

219bbf6

asfgit closed this in d0ecca6 Feb 20, 2017

srowen deleted the SPARK-19646 branch February 20, 2017 17:23
