slurm_scheduler, dir_workspace: add isolated workspaces for Slurm #416
Codecov Report

| | main | #416 | +/- |
|---|---|---|---|
| Coverage | 94.20% | 94.28% | +0.08% |
| Files | 66 | 67 | +1 |
| Lines | 3690 | 3761 | +71 |
| Hits | 3476 | 3546 | +70 |
| Misses | 214 | 215 | +1 |
Force-pushed: f91cfbf to 9b60e81
@d4l3k has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@d4l3k has updated the pull request. You must reimport the pull request before landing.

@d4l3k has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Summary: This adds a new `DirWorkspace` which will copy the current workspace to the directory for code isolation purposes and integrates it in with the `slurm` scheduler via the `job_dir` runopt. The job dir must not exist and will be created. The CWD will be located in that job_dir.

* `.torchxignore` is used for excluding files from the workspace
* `.torchxslurmjobdirs` is used to track where job directories and thus logs are located

Pull Request resolved: #416

Test Plan: Slurm integ tests + unit tests

Reviewed By: kiukchung

Differential Revision: D34801126

Pulled By: d4l3k

fbshipit-source-id: 7423897d4a372f524230d08bc681493c112ce383
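To illustrate the copy behavior described in the summary above, here is a minimal sketch of an isolated-workspace copy. It is not the `DirWorkspace` implementation from this PR: the helper names (`load_ignore_patterns`, `is_ignored`, `copy_workspace`) are hypothetical, and treating `.torchxignore` as a list of `fnmatch`-style patterns with `#` comments is an assumption. The sketch only mirrors the stated contract that the job dir must not exist and will be created.

```python
import fnmatch
import os
import shutil
from typing import List


def load_ignore_patterns(workspace: str, ignore_file: str = ".torchxignore") -> List[str]:
    """Read ignore patterns, one per line (assumed format; blank lines and # comments skipped)."""
    path = os.path.join(workspace, ignore_file)
    if not os.path.exists(path):
        return []
    with open(path) as f:
        lines = [line.strip() for line in f]
    return [line for line in lines if line and not line.startswith("#")]


def is_ignored(rel_path: str, patterns: List[str]) -> bool:
    """Return True if the workspace-relative path matches any ignore pattern."""
    return any(fnmatch.fnmatch(rel_path, pattern) for pattern in patterns)


def copy_workspace(workspace: str, job_dir: str) -> None:
    """Copy workspace into a brand-new job_dir, skipping ignored paths."""
    patterns = load_ignore_patterns(workspace)
    # os.makedirs without exist_ok raises FileExistsError if job_dir already exists,
    # which enforces "the job dir must not exist and will be created".
    os.makedirs(job_dir)
    for root, dirs, files in os.walk(workspace):
        rel_root = os.path.relpath(root, workspace)
        # Prune ignored directories in place so os.walk does not descend into them.
        dirs[:] = [
            d for d in dirs
            if not is_ignored(os.path.normpath(os.path.join(rel_root, d)), patterns)
        ]
        for name in files:
            rel_path = os.path.normpath(os.path.join(rel_root, name))
            if is_ignored(rel_path, patterns):
                continue
            dst = os.path.join(job_dir, rel_path)
            os.makedirs(os.path.dirname(dst), exist_ok=True)
            shutil.copy2(os.path.join(root, name), dst)
```

In this sketch the job dir should live outside the workspace being copied; otherwise the walk would pick up the files it has just written.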
This pull request was exported from Phabricator. Differential Revision: D34801126
This adds a new `DirWorkspace` which will copy the current workspace to the directory for code isolation purposes and integrates it in with the `slurm` scheduler via the `job_dir` runopt. The job dir must not exist and will be created. The CWD will be located in that job_dir.

* `.torchxignore` is used for excluding files from the workspace
* `.torchxslurmjobdirs` is used to track where job directories and thus logs are located

Test plan:
Slurm integ tests + unit tests
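The description says `.torchxslurmjobdirs` tracks where job directories (and therefore logs) are located. One way such tracking could work is sketched below. The `job_id = job_dir` line format and the function names `record_job_dir` and `lookup_job_dir` are assumptions made for illustration, not details confirmed by this PR.

```python
import os
from typing import Optional

# File name taken from the PR description; the line format below is assumed.
TRACKING_FILE = ".torchxslurmjobdirs"


def record_job_dir(job_id: str, job_dir: str, tracking_file: str = TRACKING_FILE) -> None:
    """Append a 'job_id = absolute_job_dir' line to the tracking file."""
    with open(tracking_file, "a") as f:
        f.write(f"{job_id} = {os.path.abspath(job_dir)}\n")


def lookup_job_dir(job_id: str, tracking_file: str = TRACKING_FILE) -> Optional[str]:
    """Return the recorded directory for job_id, or None if it was never recorded."""
    if not os.path.exists(tracking_file):
        return None
    with open(tracking_file) as f:
        for line in f:
            key, sep, value = line.partition(" = ")
            if sep and key.strip() == job_id:
                return value.strip()
    return None
```

A log reader could call `lookup_job_dir` with a Slurm job ID to find the directory to search for that job's output files.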