self-document the benchmark's setup #29

Merged 3 commits on Feb 19, 2024
stas00 (Collaborator) commented Feb 17, 2024

This PR extends the report with a header that records important details about the benchmark setup, so that the results are self-documenting and reproducible.

It also adds an optional --notes CLI argument for any additional notes the experimenter wants included in the report.

Now it should be much easier to reproduce the report.

Here is an example:

$ python mm_flops.py -m 1 -n 2048 -k 2048 --output_file=mm_m_range.txt --notes="numa_balancing=0"

Benchmark started on 2024-02-17 02:20:50

** Command line:
/env/lib/conda/py39-pt23/bin/python mm_flops.py -m 1 -n 2048 -k 2048 --output_file=mm_m_range.txt --notes=numa_balancing=0

** Platform:
Linux example 5.15.0-1004- #4-Ubuntu SMP Fri Jan 26 21:18:52 UTC 2024 x86_64 x86_64

** Critical component versions:
torch=2.3.0.dev20240216+cu121, cuda=12.1, nccl=(2, 19, 3)

** Additional notes:
numa_balancing=0

--------------------------------------------------------------------------------


Elapsed time for 1x2048x2048: 0.000
Throughput (in TFLOP/s) for 1x2048x2048: 1.502
--------------------------------------------------------------------------------

You can later add other component versions that you think are important.
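For anyone who wants to extend the header, here is a minimal sketch of how a header like the one above could be assembled. It is not the code merged in this PR: `build_header` and `component_versions` are hypothetical names, the section order simply mirrors the example output, and torch is treated as optional so the sketch also runs on a machine without it.

```python
import argparse
import datetime
import platform
import sys


def component_versions():
    """Collect versions of critical components (torch is optional in this sketch)."""
    try:
        import torch
    except ImportError:
        return "torch=not installed"
    parts = [f"torch={torch.__version__}", f"cuda={torch.version.cuda}"]
    try:
        # Returns a tuple such as (2, 19, 3) on CUDA builds with NCCL support.
        parts.append(f"nccl={torch.cuda.nccl.version()}")
    except Exception:
        parts.append("nccl=unavailable")
    return ", ".join(parts)


def build_header(argv, notes=None):
    """Build a self-documenting report header as a single string."""
    lines = [
        f"Benchmark started on {datetime.datetime.now():%Y-%m-%d %H:%M:%S}",
        "",
        "** Command line:",
        " ".join([sys.executable] + list(argv)),
        "",
        "** Platform:",
        " ".join(platform.uname()),
        "",
        "** Critical component versions:",
        component_versions(),
        "",
    ]
    # The notes section is only emitted when the experimenter supplied notes.
    if notes:
        lines += ["** Additional notes:", notes, ""]
    lines.append("-" * 80)
    return "\n".join(lines)


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--notes", type=str, default=None,
                        help="free-form notes to include in the report header")
    # parse_known_args lets the benchmark's own flags pass through untouched.
    args, _ = parser.parse_known_args()
    print(build_header(sys.argv, notes=args.notes))
```

Adding another component version would then only mean appending one more entry in `component_versions`.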

@stas00 stas00 marked this pull request as ready for review February 17, 2024 02:23
@Quentin-Anthony Quentin-Anthony merged commit 939fa3c into main Feb 19, 2024
1 check passed
@stas00 stas00 deleted the self-document branch February 20, 2024 01:47
stas00 (Collaborator, Author) commented Feb 20, 2024

Hmm, we should probably also try to detect nvidia-smi or rocm-smi and dump their output as well, right? It would be useful to have that in the log file, to know exactly what hardware the file was generated on, including the driver versions.

Anything else?
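A rough sketch of what that detection might look like, assuming the tools are simply looked up on `PATH` and run without arguments. `dump_gpu_tool_output` is a hypothetical helper, not something in the repo. The default `nvidia-smi` output includes the driver version; `rocm-smi` may need additional flags to report it, which this sketch does not attempt.

```python
import shutil
import subprocess


def dump_gpu_tool_output(tools=("nvidia-smi", "rocm-smi"), timeout=30):
    """Return (tool_name, stdout) for the first tool that runs, or None."""
    for tool in tools:
        path = shutil.which(tool)  # None if the tool is not on PATH
        if path is None:
            continue
        try:
            result = subprocess.run(
                [path],
                capture_output=True,
                text=True,
                timeout=timeout,
                check=False,
            )
        except (OSError, subprocess.TimeoutExpired):
            # Tool exists but could not be run or hung; try the next one.
            continue
        if result.returncode == 0:
            return tool, result.stdout
    return None


if __name__ == "__main__":
    found = dump_gpu_tool_output()
    if found is None:
        print("** GPU tool output: neither nvidia-smi nor rocm-smi was found")
    else:
        name, output = found
        print(f"** {name} output:\n{output}")
```

Returning `None` instead of raising keeps the benchmark usable on machines where neither tool is installed.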
