@kiszk
Member

@kiszk kiszk commented May 6, 2018

What changes were proposed in this pull request?

When multiple clients resolve artifacts via the --packages parameter at the same time, they can hit a race condition, because each of them tries to modify the same dummy org.apache.spark-spark-submit-parent-default.xml file created in the default Ivy cache directory.
This PR encodes a UUID in the dummy module descriptor so that each client operates on its own resolution file in the Ivy cache directory. It also changes when and which resolution files are cleaned up, so that they do not accumulate in the default Ivy cache directory.
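
The naming idea can be sketched as follows. This is an illustration only, not the code in this patch: it assumes the resolution file name is built as `<organisation>-<module>-<conf>.xml`, which is what the `org.apache.spark-spark-submit-parent-default.xml` name suggests, and `ResolutionNames`, `uniqueModuleName`, and `resolutionFileName` are hypothetical names.

```scala
import java.util.UUID

object ResolutionNames {
  // Organisation of the dummy module, taken from the file name quoted above.
  val organisation = "org.apache.spark"

  // Hypothetical helper: append a random UUID so every client gets its own module name.
  def uniqueModuleName(): String =
    s"spark-submit-parent-${UUID.randomUUID()}"

  // Assumed naming scheme for the resolution file: <organisation>-<module>-<conf>.xml
  def resolutionFileName(module: String, conf: String = "default"): String =
    s"${organisation}-${module}-${conf}.xml"
}
```

Two clients that each call `uniqueModuleName()` get different file names, so neither one overwrites the other's resolution file.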

This PR is a successor of #18801 and closes it. Much of the code was ported from #18801, where a lot of effort was invested, so I think this PR should be credited to @Victsm.

How was this patch tested?

Added unit test coverage to SparkSubmitUtilsSuite.

@SparkQA

SparkQA commented May 6, 2018

Test build #90273 has finished for PR 21251 at commit 949ec1d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@kiszk
Member Author

kiszk commented May 6, 2018

cc @jiangxb1987 @vanzin

@kiszk
Member Author

kiszk commented May 7, 2018

cc @gatorsmile

Contributor

@vanzin vanzin left a comment

Just minor things.

ivySettings: IvySettings,
ivyConfName: String): Unit = {
val currentResolutionFiles = Seq[File](
new File(ivySettings.getDefaultCache,

nit: you could move the `new File(ivySettings.getDefaultCache, ...)` call into the foreach loop instead.
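
A sketch of what that could look like, assuming the `Seq` holds plain file names and the `File` is constructed inside the loop. `ClearResolutionFiles`, `clear`, and the `cacheDir` parameter (standing in for `ivySettings.getDefaultCache`) are hypothetical, chosen so the example runs without Ivy on the classpath.

```scala
import java.io.File

object ClearResolutionFiles {
  // Hypothetical stand-in for the method under review.
  // `cacheDir` plays the role of ivySettings.getDefaultCache.
  def clear(cacheDir: File, fileNames: Seq[String]): Unit =
    fileNames.foreach { name =>
      // The File is built once here instead of in every element of the Seq.
      new File(cacheDir, name).delete()
    }
}
```

Files in the cache directory that are not listed in `fileNames` are left untouched.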

isTest = true)
val r = """.*org.apache.spark-spark-submit-parent-.*""".r
assert(ivySettings.getDefaultCache.listFiles.map(_.getName)
.forall { case n @ r() => false case _ => true },

nit: I think using `!<list of files>.exists(r.findFirstIn(_).isDefined)` would be slightly clearer than your version.
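
For comparison, a small sketch that puts both forms side by side. `NoResolutionFilesCheck`, `original`, and `suggested` are hypothetical names; the regex is the one from the test above.

```scala
object NoResolutionFilesCheck {
  // Same pattern as in the test under review.
  val r = """.*org.apache.spark-spark-submit-parent-.*""".r

  // Original form: pattern-match every name and require that none matches.
  def original(names: Seq[String]): Boolean =
    names.forall {
      case n @ r() => false
      case _ => true
    }

  // Suggested form: true when no name contains a match for the pattern.
  def suggested(names: Seq[String]): Boolean =
    !names.exists(r.findFirstIn(_).isDefined)
}
```

Because the pattern starts and ends with `.*`, the whole-string match used by `r()` and the substring search used by `findFirstIn` should agree for ordinary file names.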

"1.0"))

/**
* clear ivy resolution from current launch. The resolution file is usually at

nit: start comment with capital letter.

@SparkQA

SparkQA commented May 9, 2018

Test build #90396 has finished for PR 21251 at commit 9a6377b.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@kiszk
Member Author

kiszk commented May 9, 2018

retest this please

@SparkQA

SparkQA commented May 9, 2018

Test build #90405 has finished for PR 21251 at commit 9a6377b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@vanzin
Contributor

vanzin commented May 10, 2018

LGTM. Merging to master / 2.3.

asfgit pushed a commit that referenced this pull request May 10, 2018
… artifacts at the same time

Author: Kazuaki Ishizaki <[email protected]>

Closes #21251 from kiszk/SPARK-10878.

(cherry picked from commit d3c426a)
Signed-off-by: Marcelo Vanzin <[email protected]>
@asfgit asfgit closed this in d3c426a May 10, 2018
robert3005 pushed a commit to palantir/spark that referenced this pull request Jun 24, 2018
… artifacts at the same time

Author: Kazuaki Ishizaki <[email protected]>

Closes apache#21251 from kiszk/SPARK-10878.
otterc pushed a commit to linkedin/spark that referenced this pull request Mar 22, 2023
… artifacts at the same time

Author: Kazuaki Ishizaki <[email protected]>

Closes apache#21251 from kiszk/SPARK-10878.

(cherry picked from commit d3c426a)

RB=1313943
G=superfriends-reviewers
R=fli,mshen,yezhou,edlu
A=fli