[ML] Add CI build timing analytics and Gradle build cache for Java ITs #2907
Merged
98af62a [ML] Add daily build timing analysis step to snapshot pipeline (edsavage)
8d604e6 [ML] Add Gradle build cache for Java integration tests (edsavage)
af7e190 [ML] Fix GCS authentication for Gradle build cache in CI (edsavage)
dac1ef2 Merge remote-tracking branch 'upstream/main' into pr-2907 (edsavage)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,26 @@ | ||
| #!/bin/bash | ||
| # Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one | ||
| # or more contributor license agreements. Licensed under the Elastic License | ||
| # 2.0 and the following additional limitation. Functionality enabled by the | ||
| # files subject to the Elastic License 2.0 may only be used in production when | ||
| # invoked by an Elasticsearch process with a license key installed that permits | ||
| # use of machine learning features. You may not use this file except in | ||
| # compliance with the Elastic License 2.0 and the foregoing additional | ||
| # limitation. | ||
| | ||
| cat <<EOL | ||
| steps: | ||
|   - label: "Analyse build timings :chart_with_upwards_trend:" | ||
|     key: "analyze_build_timings" | ||
|     command: | ||
|       - "python3 .buildkite/scripts/steps/analyze_build_timings.py" | ||
|     depends_on: | ||
|       - "build_test_linux-aarch64-RelWithDebInfo" | ||
|       - "build_test_linux-x86_64-RelWithDebInfo" | ||
|       - "build_test_macos-aarch64-RelWithDebInfo" | ||
|       - "build_test_Windows-x86_64-RelWithDebInfo" | ||
|     allow_dependency_failure: true | ||
|     soft_fail: true | ||
|     agents: | ||
|       image: "python:3-slim" | ||
| EOL | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,185 @@ | ||
| #!/usr/bin/env python3 | ||
| # | ||
| # Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one | ||
| # or more contributor license agreements. Licensed under the Elastic License | ||
| # 2.0 and the following additional limitation. Functionality enabled by the | ||
| # files subject to the Elastic License 2.0 may only be used in production when | ||
| # invoked by an Elasticsearch process with a license key installed that permits | ||
| # use of machine learning features. You may not use this file except in | ||
| # compliance with the Elastic License 2.0 and the foregoing additional | ||
| # limitation. | ||
| | ||
| """ | ||
| Analyse build+test timings for the current snapshot build and compare | ||
| against recent history. Produces a Buildkite annotation with a summary | ||
| table and flags any regressions. | ||
| """ | ||
| | ||
| import json | ||
| import math | ||
| import os | ||
| import subprocess | ||
| import sys | ||
| import urllib.request | ||
| import urllib.error | ||
| | ||
| PIPELINE_SLUG = "ml-cpp-snapshot-builds" | ||
| ORG_SLUG = "elastic" | ||
| API_BASE = f"https://api.buildkite.com/v2/organizations/{ORG_SLUG}/pipelines/{PIPELINE_SLUG}" | ||
| HISTORY_COUNT = 14 | ||
| | ||
| PLATFORM_MAP = { | ||
|     "Windows": "windows_x86_64", | ||
|     "MacOS": "macos_aarch64", | ||
|     "linux-x86_64": "linux_x86_64", | ||
|     "linux-aarch64": "linux_aarch64", | ||
| } | ||
| | ||
| | ||
| def api_get(path, token): | ||
|     url = f"{API_BASE}{path}" | ||
|     req = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"}) | ||
|     try: | ||
|         with urllib.request.urlopen(req, timeout=30) as resp: | ||
|             return json.loads(resp.read()) | ||
|     except urllib.error.HTTPError as e: | ||
|         print(f"API error {e.code} for {url}: {e.read().decode()}", file=sys.stderr) | ||
|         sys.exit(1) | ||
| | ||
| | ||
| def extract_timings(build_data): | ||
|     """Extract per-platform build+test timings from a build's jobs.""" | ||
|     timings = {} | ||
|     for job in build_data.get("jobs", []): | ||
|         name = job.get("name") or "" | ||
|         if "Build & test" not in name: | ||
|             continue | ||
|         if "debug" in name.lower(): | ||
|             continue | ||
|         started = job.get("started_at") | ||
|         finished = job.get("finished_at") | ||
|         if not started or not finished: | ||
|             continue | ||
| | ||
|         for pattern, key in PLATFORM_MAP.items(): | ||
|             if pattern in name: | ||
|                 from datetime import datetime, timezone | ||
|                 fmt = "%Y-%m-%dT%H:%M:%S.%fZ" | ||
|                 t_start = datetime.strptime(started, fmt).replace(tzinfo=timezone.utc) | ||
|                 t_end = datetime.strptime(finished, fmt).replace(tzinfo=timezone.utc) | ||
|                 mins = (t_end - t_start).total_seconds() / 60.0 | ||
|                 timings[key] = round(mins, 1) | ||
|                 break | ||
|     return timings | ||
| | ||
| | ||
| def mean_stddev(values): | ||
|     if not values: | ||
|         return 0.0, 0.0 | ||
|     n = len(values) | ||
|     m = sum(values) / n | ||
|     if n < 2: | ||
|         return m, 0.0 | ||
|     variance = sum((x - m) ** 2 for x in values) / (n - 1) | ||
|     return m, math.sqrt(variance) | ||
| | ||
| | ||
| def annotate(markdown, style="info"): | ||
|     """Create a Buildkite annotation.""" | ||
|     cmd = ["buildkite-agent", "annotate", "--style", style, "--context", "build-timings"] | ||
|     proc = subprocess.run(cmd, input=markdown.encode(), capture_output=True) | ||
|     if proc.returncode != 0: | ||
|         print(f"buildkite-agent annotate failed: {proc.stderr.decode()}", file=sys.stderr) | ||
| | ||
| | ||
| def main(): | ||
|     token = os.environ.get("BUILDKITE_API_READ_TOKEN", "") | ||
|     if not token: | ||
|         print("BUILDKITE_API_READ_TOKEN not set, skipping timing analysis", file=sys.stderr) | ||
|         sys.exit(0) | ||
| | ||
|     build_number = os.environ.get("BUILDKITE_BUILD_NUMBER", "") | ||
|     branch = os.environ.get("BUILDKITE_BRANCH", "main") | ||
| | ||
|     # Fetch current build | ||
|     current = api_get(f"/builds/{build_number}", token) | ||
|     current_timings = extract_timings(current) | ||
|     current_date = current.get("created_at", "")[:10] | ||
| | ||
|     if not current_timings: | ||
|         print("No build+test timings found for current build") | ||
|         sys.exit(0) | ||
| | ||
|     # Fetch historical builds for the same branch | ||
|     history_data = api_get( | ||
|         f"/builds?branch={branch}&state=passed&per_page={HISTORY_COUNT + 1}", token | ||
|     ) | ||
| | ||
|     # Exclude the current build from history | ||
|     history_builds = [ | ||
|         b for b in history_data if str(b.get("number")) != str(build_number) | ||
|     ][:HISTORY_COUNT] | ||
| | ||
|     # Collect historical timings per platform | ||
|     history = {key: [] for key in PLATFORM_MAP.values()} | ||
|     for build in history_builds: | ||
|         full_build = api_get(f"/builds/{build['number']}", token) | ||
|         timings = extract_timings(full_build) | ||
|         for key, val in timings.items(): | ||
|             history[key].append(val) | ||
| | ||
|     # Build the summary table | ||
|     platforms = ["linux_x86_64", "linux_aarch64", "macos_aarch64", "windows_x86_64"] | ||
|     platform_labels = { | ||
|         "linux_x86_64": "Linux x86_64", | ||
|         "linux_aarch64": "Linux aarch64", | ||
|         "macos_aarch64": "macOS aarch64", | ||
|         "windows_x86_64": "Windows x86_64", | ||
|     } | ||
| | ||
|     lines = [] | ||
|     lines.append(f"### Build Timing Analysis — {current_date} (build #{build_number})") | ||
|     lines.append("") | ||
|     lines.append("| Platform | Current (min) | Avg (min) | Std Dev | Delta | Status |") | ||
|     lines.append("|----------|:------------:|:---------:|:-------:|:-----:|:------:|") | ||
| | ||
|     has_regression = False | ||
|     for plat in platforms: | ||
|         cur = current_timings.get(plat) | ||
|         hist = history.get(plat, []) | ||
|         avg, sd = mean_stddev(hist) | ||
| | ||
|         if cur is None: | ||
|             lines.append(f"| {platform_labels[plat]} | — | {avg:.1f} | {sd:.1f} | — | — |") | ||
|             continue | ||
| | ||
|         delta = cur - avg | ||
|         delta_pct = (delta / avg * 100) if avg > 0 else 0 | ||
|         sign = "+" if delta >= 0 else "" | ||
| | ||
|         if avg > 0 and sd > 0 and cur > avg + 2 * sd: | ||
|             status = ":rotating_light: Regression" | ||
|             has_regression = True | ||
|         elif avg > 0 and cur < avg - sd: | ||
|             status = ":rocket: Faster" | ||
|         else: | ||
|             status = ":white_check_mark: Normal" | ||
| | ||
|         lines.append( | ||
|             f"| {platform_labels[plat]} | **{cur:.1f}** | {avg:.1f} | {sd:.1f} " | ||
|             f"| {sign}{delta:.1f} ({sign}{delta_pct:.0f}%) | {status} |" | ||
|         ) | ||
| | ||
|     n_hist = len(history_builds) | ||
|     lines.append("") | ||
|     lines.append(f"_Compared against {n_hist} recent `{branch}` builds._") | ||
| | ||
|     markdown = "\n".join(lines) | ||
|     print(markdown) | ||
| | ||
|     style = "warning" if has_regression else "info" | ||
|     annotate(markdown, style) | ||
| | ||
| | ||
| if __name__ == "__main__": | ||
|     main() |
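
The status rule in `main()` can be tried in isolation. The following is a minimal sketch, not part of the PR: `classify` is a hypothetical helper name and the durations are invented for illustration. It reuses the script's sample standard deviation and the same thresholds, so a timing more than two standard deviations above the historical mean counts as a regression and one more than one standard deviation below it counts as faster.

```python
import math


def mean_stddev(values):
    """Return (mean, sample standard deviation); the deviation is 0.0 for fewer than two values."""
    if not values:
        return 0.0, 0.0
    n = len(values)
    m = sum(values) / n
    if n < 2:
        return m, 0.0
    variance = sum((x - m) ** 2 for x in values) / (n - 1)
    return m, math.sqrt(variance)


def classify(current, history):
    """Hypothetical helper mirroring the status thresholds used in main()."""
    avg, sd = mean_stddev(history)
    if avg > 0 and sd > 0 and current > avg + 2 * sd:
        return "regression"
    if avg > 0 and current < avg - sd:
        return "faster"
    return "normal"


# Illustrative durations in minutes (invented, not real CI data).
history = [40.0, 42.0, 41.0, 39.0, 43.0]

for minutes in (50.0, 38.0, 42.0):
    print(minutes, classify(minutes, history))
```

Because the regression branch requires a non-zero standard deviation, a platform with fewer than two historical timings can never be flagged as a regression under this rule, whatever the current duration is.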
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,17 @@ | ||
| /* | ||
| * Gradle init script to enable the local build cache for ES integration test | ||
| * builds. Injected via --init-script so that we don't need to modify the | ||
| * cloned Elasticsearch repository. | ||
| * | ||
| * The local build cache stores task outputs keyed on their inputs. When the | ||
| * cache directory is persisted between CI runs (e.g. via GCS), subsequent | ||
| * builds with the same ES commit get near-instant compilation. | ||
| */ | ||
| | ||
| settingsEvaluated { settings -> | ||
|     settings.buildCache { | ||
|         local { | ||
|             enabled = true | ||
|         } | ||
|     } | ||
| } |
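
For orientation, here is a rough sketch of how a CI step might assemble a Gradle invocation that loads this init script. It is an assumption-laden illustration rather than the command this PR uses: `gradle_command`, the file name `build-cache.init.gradle` and the task name `integTest` are all hypothetical. The `--init-script` and `--build-cache` options are standard Gradle command-line flags. The second is included on the assumption that the target repository has not already enabled caching through `org.gradle.caching=true`, since the init script configures the local cache but may not by itself switch the build cache on.

```python
def gradle_command(init_script, tasks, gradlew="./gradlew"):
    """Return an argv list that loads the init script and enables the build cache."""
    if not tasks:
        raise ValueError("at least one Gradle task is required")
    return [gradlew, "--init-script", init_script, "--build-cache", *tasks]


# Hypothetical file and task names, for illustration only.
cmd = gradle_command("build-cache.init.gradle", ["integTest"])
print(" ".join(cmd))
```

Returning a list rather than a shell string means the result can be passed straight to `subprocess.run` without quoting concerns.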
What will happen to these timings? Should someone look at them and make any decisions?
The timings will appear as an annotation on PRs, so generally the PR author would look at them, and discuss in our normal channels if anything raises concern.