
Fix benchmarking scripts #1005

Merged: 10 commits into nerfstudio-project:main on Nov 28, 2022
Conversation

Contributor

@devernay devernay commented Nov 22, 2022

- add shebang
- add `-s` option to launch a single job per GPU
- add `-v` option to use tensorboard instead of wandb
- set nerfacto options according to #806 (comment)
- use a single timestamp for all training jobs
- last GPU was ignored (now fixed)
- kill all subprocesses when the script is terminated
- print the eval script command line
@devernay devernay marked this pull request as ready for review November 22, 2022 20:39
@@ -71,6 +73,7 @@ The flags used in the benchmarking script are defined as follows:
- `-m`: config name (e.g. `instant-ngp`). This should be the same as what was passed to `-c` in the train script.
- `-o`: base output directory where all of the benchmarks are stored (e.g. `outputs/`). Corresponds to `--output-dir` in the base `Config` for training.
- `-t`: timestamp of the benchmark, also used as its identifier (e.g. `2022-08-10_172517`).
- `-s`: launch a single job per GPU.
- `-g`: specifies the GPUs to use; if not specified (no `-g` flag), the script will automatically search for available GPUs.
Contributor
Is this flag still used?

Contributor Author
That's a flag I added. It launches one job per GPU, then waits for the first one to finish before relaunching on that GPU.
In the previous version, all jobs were launched in parallel at script startup (and Ctrl-C didn't kill them). That works fine if you have several GPUs and lots of GPU memory, but not if you have a single 16 GB GPU.
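The one-job-per-GPU behavior described above can be sketched roughly like this. It is a minimal illustration of the scheduling loop under stated assumptions: the GPU list, scene list, and `run_job` command are placeholders, not names from the actual script:

```shell
#!/usr/bin/env bash
# Hedged sketch of the -s behavior: launch at most one job per GPU and
# relaunch on a GPU as soon as its job finishes. All names are placeholders.

GPUS=(0 1)                        # GPUs to schedule on
SCENES=(chair drums ficus lego)   # one job per scene
declare -A busy                   # gpu id -> pid of the job running there

run_job() {                       # stand-in for a real training command
  echo "training $2 on GPU $1"
  sleep 0.2
}

for scene in "${SCENES[@]}"; do
  while true; do
    for gpu in "${GPUS[@]}"; do
      pid=${busy[$gpu]}
      # A GPU is free if it never had a job or its job has exited.
      if [[ -z $pid ]] || ! kill -0 "$pid" 2>/dev/null; then
        run_job "$gpu" "$scene" &
        busy[$gpu]=$!
        continue 3                # move on to the next scene
      fi
    done
    wait -n                       # all GPUs busy: wait for any job to exit
  done
done
wait                              # let the last jobs finish
```

`wait -n` (bash 4.3+) blocks until any one background job exits, which is what lets the loop relaunch on a freed GPU instead of waiting for every job.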

Contributor
Ah, sorry. I was trying to highlight the `-g` flag.

Contributor Author
Well, this flag was simply ignored in the previous version of the scripts: they just took the list of remaining arguments as the list of GPUs, so I kept that behavior and removed the unused flag. I'll adjust this doc.

Contributor

@tancik tancik left a comment
LGTM!

@tancik tancik merged commit 8fc7950 into nerfstudio-project:main Nov 28, 2022
tancik pushed a commit to dozeri83/nerfstudio that referenced this pull request Jan 20, 2023
* launch_eval_blender.sh: fix script

- add shebang
- add -s option to launch a single job per GPU
- last GPU was ignored
- kill all subprocesses when script is terminated

* launch_train_blender.sh: fix script

- add -s option to launch a single job per GPU
- add -v option to use tensorboard instead of wandb
- set nerfacto options according to nerfstudio-project#806 (comment)
- use a single timestamp for all training jobs
- last GPU was ignored
- kill all subprocesses when script is terminated
- print the eval script command-line

* launch_eval_blender.sh: fix script

* Update benchmarking.md

* launch_train_blender.sh: add -s to eval command-line

* Update launch_train_blender.sh

* Update benchmarking.md
chris838 pushed a commit to chris838/nerfstudio that referenced this pull request Apr 22, 2023
(same commits as above)
Development

Successfully merging this pull request may close these issues.

nerfacto overfitting on blender scenes?