Shift GPU Memory Computation to End of Benchmarking Script #30
scripts/benchmarks/benchmark.py
Outdated
gpu_logs = pd.read_csv(gpu_log_filename, skipinitialspace=True)
peak_nvidia_mem_by_device_id, device_name = get_peak_mem_usage_by_device_id(gpu_logs)
experiment_stats[tag].update({
    RESULT_FIELD_RESERVED_GPU_MEM: peak_nvidia_mem_by_device_id.mean(),
Needs a comment explaining what we are taking the mean over.
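To illustrate the kind of comment being asked for, here is a minimal sketch, not the repository's actual `get_peak_mem_usage_by_device_id`. It assumes a hypothetical nvidia-smi style CSV log with `index` and `memory.used [MiB]` columns (values logged without units) and uses only the standard library in place of pandas. The point is that the peak is taken over time per device, and the mean is then taken across device IDs.

```python
import csv
import io
from statistics import mean

# Hypothetical log contents: one row per (sample time, device).
# The column names are an assumption for illustration only.
SAMPLE_LOG = """index, name, memory.used [MiB]
0, GPU-A, 1000
1, GPU-A, 3000
0, GPU-A, 5000
1, GPU-A, 2000
"""


def peak_mem_by_device(log_text):
    """Return {device_id: peak memory in MiB}, with the max taken over all samples."""
    reader = csv.DictReader(io.StringIO(log_text), skipinitialspace=True)
    peaks = {}
    for row in reader:
        device_id = int(row["index"])
        used_mib = float(row["memory.used [MiB]"])
        peaks[device_id] = max(peaks.get(device_id, 0.0), used_mib)
    return peaks


peaks = peak_mem_by_device(SAMPLE_LOG)

# The mean is taken across device IDs (one peak value per GPU),
# not across time samples.
mean_peak = mean(peaks.values())
```

A one-line comment like the one above the `mean` call is probably all the reviewer is asking for.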
scripts/benchmarks/benchmark.py
Outdated
except FileNotFoundError:
    pass

if script_args['log_nvidia_smi'] is True:
Don't need `is True`.
scripts/benchmarks/benchmark.py
Outdated
    RESULT_FIELD_DEVICE_NAME: device_name,
})

if script_args['log_memory_hf'] is True and tag in experiment_stats.keys():
See above (`is True` is not needed).
scripts/benchmarks/benchmark.py
Outdated
    k: v for k, v in experiment_stats[tag].items()
    if any([prefix in k for prefix in memory_metrics_prefixes])
}
if len(memory_metrics.keys())>0:
Please lint the file with `tox -e lint`.
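As a rough sketch of what the review comments above point toward, the conditions can rely on truthiness and membership instead of `is True`, `.keys()` and `len(...) > 0`. The sample values below are invented for illustration, and the exact changes a linter would demand depend on the project's lint configuration. Note that dropping `is True` changes behaviour for truthy non-boolean values, which is presumably acceptable for a boolean flag.

```python
# Invented sample data; only the variable names come from the reviewed snippet.
script_args = {"log_memory_hf": True}
experiment_stats = {
    "exp_0": {"mem_peak": 10, "train_runtime": 3.2, "mem_alloc": 7},
}
memory_metrics_prefixes = ["mem_"]
tag = "exp_0"

memory_metrics = {}
# Truthiness check instead of `is True`; `in` on a dict already tests its keys.
if script_args["log_memory_hf"] and tag in experiment_stats:
    memory_metrics = {
        k: v
        for k, v in experiment_stats[tag].items()
        # A generator expression avoids building a throwaway list inside any().
        if any(prefix in k for prefix in memory_metrics_prefixes)
    }

# An empty dict is falsy, so no len(...) > 0 comparison is needed.
has_memory_metrics = bool(memory_metrics)
```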
Description
This PR shifts all GPU memory computation from the end of each experiment to the end of the benchmarking script. This avoids the need to rerun experiments: the raw values are saved, and the aggregated values are computed at the end, across all experiments, in gather_report.
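The idea can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the PR's actual code: `record_raw`, `aggregate_at_end` and the dictionary keys are hypothetical names, and the real aggregation lives in gather_report.

```python
from statistics import mean


def record_raw(store, tag, device_peaks_mib):
    """During an experiment: save raw per-device values only, no aggregation."""
    store[tag] = {"raw_peak_mem_mib_by_device": dict(device_peaks_mib)}


def aggregate_at_end(store):
    """After all experiments: derive aggregated values from the saved raw values."""
    report = {}
    for tag, stats in store.items():
        raw = stats.get("raw_peak_mem_mib_by_device")
        if not raw:
            # Nothing was logged for this experiment, so skip it.
            continue
        report[tag] = {
            # Mean across devices of each device's peak memory.
            "mean_peak_mem_mib": mean(raw.values()),
            "num_devices": len(raw),
        }
    return report


raw_store = {}
record_raw(raw_store, "exp_0", {0: 4000, 1: 6000})
record_raw(raw_store, "exp_1", {0: 1000})
report = aggregate_at_end(raw_store)
```

Because the raw values are kept, a different aggregate (for example the maximum across devices) could be computed later from the same saved data without rerunning any experiment.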