Skip to content

Conversation

@ash211
Copy link
Contributor

@ash211 ash211 commented Aug 10, 2017

What changes were proposed in this pull request?

Fix the race condition when serializing TaskDescriptions and adding jars by keeping the set of jars and files for a TaskSet constant across the lifetime of the TaskSet. Otherwise TaskDescription serialization can produce an invalid serialization when new file/jars are added concurrently as the TaskDescription is serialized.

How was this patch tested?

Additional unit test ensures jars/files contained in the TaskDescription remain constant throughout the lifetime of the TaskSet.

@SparkQA
Copy link

SparkQA commented Aug 11, 2017

Test build #80512 has finished for PR 18913 at commit b06425f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Contributor

@jiangxb1987 jiangxb1987 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, cc @cloud-fan

/**
* Adds a JAR dependency for all tasks to be executed on this `SparkContext` in the future.
*
* If a jar is added during execution, it will not be available until the next TaskSet starts.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we document the behavior?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This additional sentence is probably overkill, since I'm sure there are many other subtleties about these methods that aren't spelled out in the Javadoc.

I'm happy to remove and keep these docs simple if that's what a committer would prefer before merging.

@asfgit asfgit closed this in 6847e93 Aug 14, 2017
asfgit pushed a commit that referenced this pull request Aug 14, 2017
…ons and adding jars

## What changes were proposed in this pull request?

Fix the race condition when serializing TaskDescriptions and adding jars by keeping the set of jars and files for a TaskSet constant across the lifetime of the TaskSet.  Otherwise TaskDescription serialization can produce an invalid serialization when new file/jars are added concurrently as the TaskDescription is serialized.

## How was this patch tested?

Additional unit test ensures jars/files contained in the TaskDescription remain constant throughout the lifetime of the TaskSet.

Author: Andrew Ash <[email protected]>

Closes #18913 from ash211/SPARK-21563.

(cherry picked from commit 6847e93)
Signed-off-by: Wenchen Fan <[email protected]>
@cloud-fan
Copy link
Contributor

LGTM, merging to master/2.2!

MatthewRBruce pushed a commit to Shopify/spark that referenced this pull request Jul 31, 2018
…ons and adding jars

## What changes were proposed in this pull request?

Fix the race condition when serializing TaskDescriptions and adding jars by keeping the set of jars and files for a TaskSet constant across the lifetime of the TaskSet.  Otherwise TaskDescription serialization can produce an invalid serialization when new file/jars are added concurrently as the TaskDescription is serialized.

## How was this patch tested?

Additional unit test ensures jars/files contained in the TaskDescription remain constant throughout the lifetime of the TaskSet.

Author: Andrew Ash <[email protected]>

Closes apache#18913 from ash211/SPARK-21563.

(cherry picked from commit 6847e93)
Signed-off-by: Wenchen Fan <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants