
Reformat benchmark metrics #42

Merged: 7 commits merged into main from consolidate_benchmark_results on Apr 19, 2024

Conversation

@yeandy (Contributor) commented Apr 18, 2024

As part of the effort to automate benchmarks in XLML, the benchmark_serving.py script will be used to benchmark MaxText JetModels. The benchmark results, however, need to be in a slightly different format:

{
  "metrics": {"step_time": 100},
  "dimension": {"framework": "jax", ...}
}

This PR refactors the output into that shape, casts the metric values, and returns the result for later use.
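
For concreteness, a minimal sketch of this kind of reformatting; the helper name to_xlml_format, the float cast, and the example fields are illustrative assumptions, not the PR's exact code:

from typing import Any


def to_xlml_format(
    raw_metrics: dict[str, Any], dimensions: dict[str, str]
) -> dict[str, Any]:
  """Splits benchmark output into XLML-style metrics and dimensions.

  Values are cast to float so that downstream JSON Lines processing
  can assume string keys and float values.
  """
  metrics = {key: float(value) for key, value in raw_metrics.items()}
  return {"metrics": metrics, "dimension": dimensions}


# to_xlml_format({"step_time": 100}, {"framework": "jax"}) returns
# {"metrics": {"step_time": 100.0}, "dimension": {"framework": "jax"}}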

@yeandy (Contributor, Author) commented Apr 18, 2024

@JoeZijunZhou Is the key request_inthroughput a typo? Should it be request_throughput?

@JoeZijunZhou (Collaborator) replied:

> @JoeZijunZhou Is the key request_inthroughput a typo? Should it be request_throughput?

Yes, please change it to request_throughput, thanks!

@JoeZijunZhou (Collaborator) left a comment:

LGTM! Shall we also update the README in /benchmarks to give an example of running benchmark with eval? Also resolve the python checks before merge: https://github.com/google/JetStream/actions/runs/8742895378/job/23992198244?pr=42. Thanks!

@@ -735,6 +748,12 @@ def main(args: argparse.Namespace):
       default="/tmp/request-outputs.json",
       help="File path to store request outputs",
   )
+  parser.add_argument(
+      "--run-eval",
A Collaborator commented on this diff:

Can you add this change to the README?

A Collaborator commented:

The pylint check failed; could you fix the check error?

@yeandy (Author) replied:

Done.

@vipannalla (Collaborator) commented Apr 18, 2024

> LGTM! Shall we also update the README in /benchmarks to give an example of running benchmark with eval? Also resolve the python checks before merge: https://github.com/google/JetStream/actions/runs/8742895378/job/23992198244?pr=42. Thanks!

Agreed, we need to update our documentation.

@vipannalla (Collaborator) left a comment:

This is a much-needed change; it makes it easy to run perf and eval in one command.

  metrics_json = {**metrics_json, **benchmark_result}
  if args.run_eval:
    eval_json = eval_accuracy(output)
    metrics_json = {**metrics_json, **eval_json}
A Collaborator commented on this change:

It may be worth keeping metrics separated into "perf" and "eval" categories. What is the interface to XLML storage -- are you flattening all metrics into one table?

@yeandy (Author) replied:

The metrics JSON is a single-level JSON. The JSON Lines processing assumes that each key is a string and each value is a float (not another JSON).

It populates normalized BigQuery tables. I'll show you what I mean via chat.

Also, see example in this test https://github.com/GoogleCloudPlatform/ml-auto-solutions/blob/master/xlml/utils/metric_test.py#L155
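
To make that constraint concrete, a hypothetical flattened record; the metric names are illustrative, not the script's real output:

# Single-level dict: string keys, float values, with eval metrics
# merged into the same flat namespace as the perf metrics.
metrics_json = {
    "request_throughput": 12.5,
    "mean_ttft_ms": 42.0,
    "rouge1": 0.31,
}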

@yeandy merged commit 1b1d93f into main on Apr 19, 2024
3 checks passed
@yeandy deleted the consolidate_benchmark_results branch on April 19, 2024 at 15:22
jwyang-google pushed a commit that referenced this pull request May 6, 2024