Skip to content

Conversation

@srowen
Copy link
Member

@srowen srowen commented Feb 17, 2017

What changes were proposed in this pull request?

Use BytesWritable.copyBytes, not getBytes, because getBytes returns the underlying array, which may be reused when repeated reads don't need a different size, as is the case with binaryRecords APIs

How was this patch tested?

Existing tests

…he underlying array, which may be reused when repeated reads don't need a different size, as is the case with binaryRecords APIs
@hvanhovell
Copy link
Contributor

LGTM - pending jenkins

@SparkQA
Copy link

SparkQA commented Feb 17, 2017

Test build #73047 has finished for PR 16974 at commit 457e735.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@hvanhovell
Copy link
Contributor

@srowen should we add a regression test? It seems weird that we didn't catch this in tests.

@srowen
Copy link
Member Author

srowen commented Feb 17, 2017

Agreed, I fixed the tests to actually fix this, and generally cleaned up the relevant test code

@SparkQA
Copy link

SparkQA commented Feb 17, 2017

Test build #73056 has finished for PR 16974 at commit 219bbf6.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Feb 17, 2017

Test build #3578 has finished for PR 16974 at commit 219bbf6.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

asfgit pushed a commit that referenced this pull request Feb 20, 2017
…ala API

## What changes were proposed in this pull request?

Use `BytesWritable.copyBytes`, not `getBytes`, because `getBytes` returns the underlying array, which may be reused when repeated reads don't need a different size, as is the case with binaryRecords APIs

## How was this patch tested?

Existing tests

Author: Sean Owen <[email protected]>

Closes #16974 from srowen/SPARK-19646.

(cherry picked from commit d0ecca6)
Signed-off-by: Sean Owen <[email protected]>
@asfgit asfgit closed this in d0ecca6 Feb 20, 2017
asfgit pushed a commit that referenced this pull request Feb 20, 2017
…ala API

Use `BytesWritable.copyBytes`, not `getBytes`, because `getBytes` returns the underlying array, which may be reused when repeated reads don't need a different size, as is the case with binaryRecords APIs

Existing tests

Author: Sean Owen <[email protected]>

Closes #16974 from srowen/SPARK-19646.

(cherry picked from commit d0ecca6)
Signed-off-by: Sean Owen <[email protected]>
@srowen
Copy link
Member Author

srowen commented Feb 20, 2017

Merged to master/2.1/2.0 as it's a reasonably important bug

@srowen
Copy link
Member Author

srowen commented Feb 20, 2017

Oops, pick was clean into 2.1 but it actually resulted in an error: https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-branch-2.1-test-sbt-hadoop-2.7/374/consoleFull

Fixing now ...

@srowen srowen deleted the SPARK-19646 branch February 20, 2017 17:23
Yunni pushed a commit to Yunni/spark that referenced this pull request Feb 27, 2017
…ala API

## What changes were proposed in this pull request?

Use `BytesWritable.copyBytes`, not `getBytes`, because `getBytes` returns the underlying array, which may be reused when repeated reads don't need a different size, as is the case with binaryRecords APIs

## How was this patch tested?

Existing tests

Author: Sean Owen <[email protected]>

Closes apache#16974 from srowen/SPARK-19646.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants