Skip to content

Conversation

@foxik
Copy link
Contributor

@foxik foxik commented Feb 25, 2015

As documented in createDirectory, the result of createDirectory is not registered for automatic removal. Currently there are 4 directories left in /tmp after just running pyspark.

@AmplabJenkins
Copy link

Can one of the admins verify this patch?

@srowen
Copy link
Member

srowen commented Feb 25, 2015

ok to test

@SparkQA
Copy link

SparkQA commented Feb 25, 2015

Test build #27946 has started for PR 4759 at commit 56feaca.

  • This patch merges cleanly.

@srowen
Copy link
Member

srowen commented Feb 25, 2015

I'm pretty sure this is the right move but CC @vanzin as this is closely related to changes he's making as we speak.

@foxik
Copy link
Contributor Author

foxik commented Feb 25, 2015

The getOrCreateLocalRootDirs will work with both changes.

Nevertheless, in the #4747 patch I am seeing more Utils.createDirectory (this time in core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala).

Maybe a better way would be to modify Utils.createDirectory to automatically register directory deletion (or, if it is used somewhere to create a permanent directory, we can have both Utils.createPermanentDirectory and Utils.createTemporaryDirectory).

@srowen
Copy link
Member

srowen commented Feb 25, 2015

@foxik I think we can stick to your current fix in this PR. You have a good point about the related change which I'll comment on there.

@SparkQA
Copy link

SparkQA commented Feb 25, 2015

Test build #27946 has finished for PR 4759 at commit 56feaca.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/27946/
Test PASSed.

@vanzin
Copy link
Contributor

vanzin commented Feb 25, 2015

Ok, I think this might work. But I'd call just createTempDir instead of createDirectory + registerForWhatever.

I was worried that this might cause shuffle data to be deleted when executors are killed in dynamic allocation mode. But the code path being changed is not used by executors in Yarn mode (nor in Standalone mode after my change), so it all should be fine.

@srowen
Copy link
Member

srowen commented Feb 25, 2015

Hm, I was going to say that createTempDir will do something more, and create a subdirectory. But now looking at the code, isn't the point to create a sub-directory of the user-supplied directory? It doesn't do that. You know this pretty well @vanzin , is that OK? if so, given SPARK-5801, I think we want to avoid the extra directory? In which case this change is OK.

@vanzin
Copy link
Contributor

vanzin commented Feb 25, 2015

@srowen not sure I understand; createTempDir takes a root just like createDirectory.

@srowen
Copy link
Member

srowen commented Feb 25, 2015

@vanzin Nevermind, both of these end up in the same place with createDirectory. I agree with @vanzin that this could be simplified by just calling createTempDir. Then I think it's good to go.

to automatically remove the directory on shutdown.
@foxik
Copy link
Contributor Author

foxik commented Feb 25, 2015

You are right, I just modified the patch to use createTempDir.

@SparkQA
Copy link

SparkQA commented Feb 25, 2015

Test build #27955 has started for PR 4759 at commit 280450d.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Feb 25, 2015

Test build #27955 has finished for PR 4759 at commit 280450d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/27955/
Test PASSed.

@srowen
Copy link
Member

srowen commented Feb 25, 2015

LGTM. This is a clean simple fix that we three have reviewed a fair bit, so let's do it.

@asfgit asfgit closed this in a777c65 Feb 25, 2015
@foxik foxik deleted the remove-tmp-dirs branch March 3, 2015 07:26
mingyukim pushed a commit to palantir/spark that referenced this pull request Jul 10, 2015
…Dirs for automatic deletion.

As documented in createDirectory, the result of createDirectory is not registered for automatic removal. Currently there are 4 directories left in `/tmp` after just running `pyspark`.

Author: Milan Straka <[email protected]>

Closes apache#4759 from foxik/remove-tmp-dirs and squashes the following commits:

280450d [Milan Straka] Use createTempDir in getOrCreateLocalRootDirs...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants