This repository has been archived by the owner on Sep 18, 2024. It is now read-only.

Fix env folder #3472

Merged
Commits (38 total; changes shown from 37 commits)
dcd2ffd
Merge pull request #251 from microsoft/master
SparkSnail May 29, 2020
3b8b6fb
Merge pull request #252 from microsoft/master
SparkSnail Jun 7, 2020
916e444
Merge pull request #253 from microsoft/master
SparkSnail Jun 15, 2020
caeffb8
Merge pull request #254 from microsoft/master
SparkSnail Jun 17, 2020
57c300e
Merge pull request #255 from microsoft/master
SparkSnail Jun 28, 2020
65660e6
Merge pull request #257 from microsoft/master
SparkSnail Jun 30, 2020
9376d6a
Merge pull request #258 from microsoft/master
SparkSnail Jul 1, 2020
5fef3cf
Merge pull request #259 from microsoft/master
SparkSnail Jul 3, 2020
5544ae8
Merge pull request #261 from microsoft/master
SparkSnail Jul 10, 2020
f9fdfee
Merge pull request #262 from microsoft/master
SparkSnail Jul 16, 2020
aa64fe6
Merge pull request #263 from microsoft/master
SparkSnail Jul 27, 2020
c6a5f8c
Merge pull request #264 from microsoft/master
SparkSnail Jul 31, 2020
68abe2f
Merge pull request #265 from microsoft/master
SparkSnail Aug 4, 2020
14e9619
Merge pull request #266 from microsoft/master
SparkSnail Aug 13, 2020
f69e206
Merge pull request #267 from microsoft/master
SparkSnail Aug 13, 2020
12ef0aa
Merge pull request #270 from microsoft/master
SparkSnail Sep 10, 2020
ddcf229
Merge pull request #271 from microsoft/master
SparkSnail Sep 15, 2020
c4f6e66
Merge pull request #272 from microsoft/master
SparkSnail Sep 21, 2020
88f8c1b
Merge pull request #273 from microsoft/master
SparkSnail Sep 22, 2020
7eb15f8
Merge pull request #274 from microsoft/master
SparkSnail Oct 27, 2020
f73367f
Merge pull request #275 from microsoft/master
SparkSnail Nov 16, 2020
765bc33
Merge pull request #276 from microsoft/master
SparkSnail Nov 29, 2020
cff51cc
Merge pull request #277 from microsoft/master
SparkSnail Dec 2, 2020
4232fea
Merge pull request #278 from microsoft/master
SparkSnail Dec 8, 2020
cb9efcc
Merge pull request #279 from microsoft/master
SparkSnail Dec 11, 2020
ee71f16
Merge pull request #280 from microsoft/master
SparkSnail Dec 14, 2020
c3921ed
Merge pull request #281 from microsoft/master
SparkSnail Dec 24, 2020
561f1ad
Merge pull request #284 from microsoft/master
SparkSnail Jan 22, 2021
daf028a
Merge pull request #285 from microsoft/master
SparkSnail Feb 5, 2021
9a8a4a3
Merge pull request #286 from microsoft/master
SparkSnail Feb 8, 2021
22a38dd
Merge pull request #287 from microsoft/master
SparkSnail Feb 23, 2021
645e1a6
Merge pull request #288 from microsoft/master
SparkSnail Feb 24, 2021
f41c25d
Merge pull request #289 from microsoft/master
SparkSnail Feb 25, 2021
9fb5ff9
Merge pull request #290 from microsoft/master
SparkSnail Mar 4, 2021
e3fab14
Merge pull request #291 from microsoft/master
SparkSnail Mar 23, 2021
19f95f0
init
SparkSnail Mar 23, 2021
93a0345
update
SparkSnail Mar 23, 2021
942b196
revert change
SparkSnail Mar 29, 2021
2 changes: 1 addition & 1 deletion nni/tools/trial_tool/trial.py
@@ -47,7 +47,7 @@ def run(self):

         nni_log(LogType.Info, "%s: start to run trial" % self.name)

-        trial_working_dir = os.path.realpath(os.path.join(os.curdir, "..", "..", "trials", self.id))
+        trial_working_dir = os.path.realpath(os.path.join(os.curdir, "trials", self.id))
Contributor

One concern: users who want to find their outputs will need to know the trial's runner. Do we have a place to store this mapping?

Contributor Author

Yes, the trials' envId and platform information has already been added to nnictl trial ls and the WebUI.

Contributor

Yes, I found envId, but there seems to be no place that stores the runnerId, which I need in order to build a path like expId/envs/runnerId/trials/trialId. I am also worried about whether checkpoints and outputs are easy to find this way. Before, I knew they were under expId/trials/trialId, but now I need a way to get the runnerId.

Contributor

Oh, I see that we save envName but not envId; I made a mistake in my comment above.
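
To make the path discussion above concrete, here is a minimal sketch of how the old and new trial_working_dir expressions resolve. It assumes the runner's working directory is whatever directory the launch command changes into; exp_root, ENV_ID, and TRIAL_ID are hypothetical placeholders, and posixpath.normpath stands in for os.path.realpath so the example does not touch the filesystem.

```python
import posixpath


def old_trial_dir(runner_cwd, trial_id):
    # Old expression: climb two levels from the runner's cwd, then enter trials/<id>.
    return posixpath.normpath(posixpath.join(runner_cwd, "..", "..", "trials", trial_id))


def new_trial_dir(runner_cwd, trial_id):
    # New expression: trials/<id> directly under the runner's cwd.
    return posixpath.normpath(posixpath.join(runner_cwd, "trials", trial_id))


# Hypothetical layout, for illustration only.
exp_root = "/tmp/nni-experiments/EXP_ID"
env_dir = posixpath.join(exp_root, "envs", "ENV_ID")

# Runner started in its environment folder, old expression.
print(old_trial_dir(env_dir, "TRIAL_ID"))
# Runner started in the experiment root, new expression.
print(new_trial_dir(exp_root, "TRIAL_ID"))
# Runner still started in its environment folder, new expression.
print(new_trial_dir(env_dir, "TRIAL_ID"))
```

Under these assumptions the first two calls both resolve to expId/trials/trialId. The third call yields the expId/envs/.../trials/trialId layout raised in the review, which only arises if the runner is still started inside its environment folder.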

         self.trial_output_dir = os.path.join(trial_working_dir, trial_output_path_name)
         trial_code_dir = os.path.join(trial_working_dir, "code")
         trial_nnioutput_dir = os.path.join(trial_working_dir, "nnioutput")
@@ -34,6 +34,7 @@ export class RemoteEnvironmentService extends EnvironmentService {
     private readonly log: Logger;
     private sshConnectionPromises: any[];
     private experimentRootDir: string;
+    private remoteExperimentRootDir: string = "";
     private experimentId: string;

     constructor() {
@@ -249,16 +250,14 @@ export class RemoteEnvironmentService extends EnvironmentService {
         this.environmentExecutorManagerMap.set(environment.id, executorManager);
         const executor = await this.getExecutor(environment.id);
         if (environment.useSharedStorage) {
-            const environmentRoot = component.get<SharedStorageService>(SharedStorageService).remoteWorkingRoot;
-            environment.runnerWorkingFolder = executor.joinPath(environmentRoot, 'envs', environment.id)
+            this.remoteExperimentRootDir = component.get<SharedStorageService>(SharedStorageService).remoteWorkingRoot;
             const remoteMountCommand = component.get<SharedStorageService>(SharedStorageService).remoteMountCommand;
             await executor.executeScript(remoteMountCommand, false, false);
         } else {
-            environment.runnerWorkingFolder =
-                executor.joinPath(executor.getRemoteExperimentRootDir(getExperimentId()),
-                    'envs', environment.id)
+            this.remoteExperimentRootDir = executor.getRemoteExperimentRootDir(getExperimentId());
         }
-        environment.command = `cd ${environment.runnerWorkingFolder} && \
+        environment.runnerWorkingFolder = executor.joinPath(this.remoteExperimentRootDir, 'envs', environment.id);
+        environment.command = `cd ${this.remoteExperimentRootDir} && \
             ${environment.command} --job_pid_file ${environment.runnerWorkingFolder}/pid \
             1>${environment.runnerWorkingFolder}/trialrunner_stdout 2>${environment.runnerWorkingFolder}/trialrunner_stderr \
             && echo $? \`date +%s%3N\` >${environment.runnerWorkingFolder}/code`;
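
The launch line assembled in this hunk can be sketched in Python as follows. This is an illustration under assumptions, not the service's implementation: build_runner_command and parse_code_file are hypothetical helpers, the runner command and paths are placeholders, and the whitespace differs from what the TypeScript template literal produces. The sketch shows the new behaviour: cd into the experiment root while the pid, stdout, stderr, and code files stay under envs/<envId>.

```python
import posixpath


def build_runner_command(experiment_root, env_id, runner_command):
    # Hypothetical helper mirroring the template in the diff:
    # cd into the experiment root, keep bookkeeping files under envs/<env_id>.
    working_folder = posixpath.join(experiment_root, "envs", env_id)
    return (
        f"cd {experiment_root} && "
        f"{runner_command} --job_pid_file {working_folder}/pid "
        f"1>{working_folder}/trialrunner_stdout 2>{working_folder}/trialrunner_stderr "
        f"&& echo $? `date +%s%3N` >{working_folder}/code"
    )


def parse_code_file(text):
    # The trailing echo writes "<exit status> <epoch milliseconds>" into the code file.
    status, millis = text.split()
    return int(status), int(millis)


# Placeholder values, for illustration only.
command = build_runner_command("/remote/EXP_ID", "ENV_ID", "RUNNER_CMD")
print(command)
print(parse_code_file("0 1616480000000\n"))
```

Keeping the bookkeeping files under the environment folder means each environment still has its own pid and log files even though every runner now starts from the shared experiment root.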
@@ -278,7 +277,7 @@ export class RemoteEnvironmentService extends EnvironmentService {
         await fs.promises.writeFile(path.join(environmentLocalTempFolder, executor.getScriptName("run")),
             environment.command, { encoding: 'utf8' });
         // Copy files in codeDir to remote working directory
-        await executor.copyDirectoryToRemote(environmentLocalTempFolder, environment.runnerWorkingFolder);
+        await executor.copyDirectoryToRemote(environmentLocalTempFolder, this.remoteExperimentRootDir);
         // Execute command in remote machine, set isInteractive=true to run script in conda environment
         executor.executeScript(executor.joinPath(environment.runnerWorkingFolder,
             executor.getScriptName("run")), true, true);