Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Publish results of JMH benchmark runs (Java SDK) to InfluxDB (#22238). #23041

Merged
merged 5 commits into from
Sep 14, 2022

Conversation

mosche
Copy link
Member

@mosche mosche commented Sep 6, 2022

Adds a custom main wrapper around the JMH runner that supports publishing JMH benchmark results to InfluxDB.

Schema

The wrapper writes an aggregated InfluxDB datapoint for each benchmark to measurement
{INFLUXDB_BASE_MEASUREMENT}_{mode}. Typically this is java_jmh_thrpt.

The timestamp of the datapoint corresponds to the start time of the respective benchmark.

Individual timeseries are discriminated using the following tags including tags corresponding to additional benchmark parameters in case of parameterized benchmarks:

  • benchmark (string): Fully qualified name of the benchmark
  • scoreUnit (string): JMH score unit
  • optionally, additional parameters in case of a parameterized benchmark (string)

The following fields are captured for each benchmark:

  • score (float): JMH score
  • scoreMean (float): Mean score of all iterations
  • scoreMedian (float): Median score of all iterations
  • scoreError (float): Mean error of the score
  • sampleCount (integer): Number of score samples
  • durationMs (integer): Total benchmark duration (including warmups)

Configuration

If settings can be inferred from the environment, benchmark results will be published to InfluxDB. Otherwise this will just delegate to the default JMH Main class.

Use the following environment variables to configure the publisher:

  • INFLUXDB_HOST: InfluxDB host
  • INFLUXDB_DATABASE: InfluxDB database
  • INFLUXDB_USER: InfluxDB user
  • INFLUXDB_USER_PASSWORD: InfluxDB user password
  • INFLUXDB_BASE_MEASUREMENT: Prefix for measurement name, the benchmark mode will be appended to this

Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:

  • Choose reviewer(s) and mention them in a comment (R: @username).
  • Mention the appropriate issue in your description (for example: addresses #123), if applicable. This will automatically add a link to the pull request in the issue. If you would like the issue to automatically close on merging the pull request, comment fixes #<ISSUE NUMBER> instead.
  • Update CHANGES.md with noteworthy changes.
  • If this contribution is large, please file an Apache Individual Contributor License Agreement.

See the Contributor Guide for more tips on how to make review process smoother.

To check the build health, please visit https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md

GitHub Actions Tests Status (on master branch)

Build python source distribution and wheels
Python tests
Java tests
Go tests

See CI.md for more information about GitHub Actions CI.

@mosche
Copy link
Member Author

mosche commented Sep 6, 2022

R: @lukecwik
The final piece missing here is the infrastructure side where (how) to run the benchmarks so that there's minimal side effects / noise. Who could help there?

@github-actions
Copy link
Contributor

github-actions bot commented Sep 6, 2022

Stopping reviewer notifications for this pull request: review requested by someone other than the bot, ceding control

@codecov
Copy link

codecov bot commented Sep 6, 2022

Codecov Report

Merging #23041 (d0c6c91) into master (3c91e7b) will decrease coverage by 0.04%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master   #23041      +/-   ##
==========================================
- Coverage   73.68%   73.63%   -0.05%     
==========================================
  Files         714      716       +2     
  Lines       95220    95301      +81     
==========================================
+ Hits        70161    70174      +13     
- Misses      23762    23830      +68     
  Partials     1297     1297              
Flag Coverage Δ
python 83.41% <ø> (-0.09%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
sdks/python/apache_beam/internal/pickler.py 92.00% <0.00%> (-3.46%) ⬇️
sdks/python/apache_beam/io/source_test_utils.py 88.01% <0.00%> (-1.39%) ⬇️
sdks/python/apache_beam/io/localfilesystem.py 90.97% <0.00%> (-0.76%) ⬇️
...eam/runners/portability/fn_api_runner/execution.py 92.44% <0.00%> (-0.65%) ⬇️
sdks/python/apache_beam/runners/direct/executor.py 96.46% <0.00%> (-0.55%) ⬇️
sdks/python/apache_beam/runners/common.py 88.59% <0.00%> (-0.13%) ⬇️
...thon/apache_beam/ml/inference/pytorch_inference.py 0.00% <0.00%> (ø)
...hon/apache_beam/runners/worker/bundle_processor.py 93.67% <0.00%> (ø)
...beam/runners/portability/expansion_service_main.py 0.00% <0.00%> (ø)
...am/examples/inference/pytorch_language_modeling.py 0.00% <0.00%> (ø)
... and 9 more

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@mosche
Copy link
Member Author

mosche commented Sep 12, 2022

Run Java PreCommit

@lukecwik
Copy link
Member

R: @lukecwik The final piece missing here is the infrastructure side where (how) to run the benchmarks so that there's minimal side effects / noise. Who could help there?

We would need to tag a Jenkins machine to be only used for benchmark runs and configure it to run at most one job at a time. I would reach out on the dev@ mailing list to see if there are any volunteers to help set that up.

@@ -1449,7 +1454,8 @@ class BeamModulePlugin implements Plugin<Project> {
// Note that these tests will fail on JVMs that JMH doesn't support.
def jmhTest = project.tasks.register("jmhTest", JavaExec) {
dependsOn project.classes
mainClass = "org.openjdk.jmh.Main"
// Note: this will delegate to the default JMH runner
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Note: this will delegate to the default JMH runner
// Note: this will wrap the default JMH runner publishing results to InfluxDB

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For jmhTest the code just delegates to the default JMH runner, we don't want to publish single shot times. I'll updated to comment accordingly

@@ -37,6 +37,7 @@ dependencies {
implementation library.java.http_client
implementation library.java.http_core
implementation library.java.slf4j_api
provided library.java.jmh_core
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand the motivation to use provided over implementation?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Motivation was to not include jmh-core on every classpath that depends on test-utils. Happy to change to implementation if you prefer

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good.

if (host != null) {
builder.withHost(host); // default to localhost otherwise
}
return builder.withDatabase(database).withMeasurement(measurement).get();
Copy link
Member

@lukecwik lukecwik Sep 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the purpose of specifying the measurement here (since since each datapoint does it)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thx for catching, fixed 👍

Moritz Mack and others added 2 commits September 13, 2022 10:43
@mosche mosche merged commit 9d9db56 into apache:master Sep 14, 2022
@mosche mosche deleted the 22238-publish-jmh-results branch September 14, 2022 08:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants