chore(smp upgrade) - Upgrade SMP to latest release #20713
/ci-run-regression
Datadog Report
Branch report: ✅ 0 Failed, 7 Passed, 0 Skipped, 25.42s Total Time
…een lading and target
/ci-run-regression
/ci-run-regression
Regression Detector Results
Run ID: 5ca95b9b-f8b0-42ae-a6f7-9a46d8fa7e61
Metrics dashboard
Baseline: 52759aa
Performance changes are noted in the perf column of each table:
Significant changes in experiment optimization goals
Confidence level: 90.00%

| perf | experiment | goal | Δ mean % | Δ mean % CI | links |
|---|---|---|---|---|---|
| ➖ | syslog_humio_logs | ingress throughput | +1.64 | [+1.49, +1.80] | |
| ➖ | syslog_splunk_hec_logs | ingress throughput | +0.52 | [+0.40, +0.64] | |
| ➖ | http_elasticsearch | ingress throughput | +0.15 | [-0.01, +0.31] | |
| ➖ | splunk_hec_route_s3 | ingress throughput | +0.14 | [-0.23, +0.51] | |
| ➖ | http_to_http_noack | ingress throughput | +0.08 | [+0.02, +0.14] | |
| ➖ | http_to_http_json | ingress throughput | +0.04 | [-0.00, +0.09] | |
| ➖ | splunk_hec_indexer_ack_blackhole | ingress throughput | +0.01 | [-0.07, +0.09] | |
| ➖ | splunk_hec_to_splunk_hec_logs_noack | ingress throughput | -0.00 | [-0.10, +0.09] | |
| ➖ | splunk_hec_to_splunk_hec_logs_acks | ingress throughput | -0.00 | [-0.13, +0.12] | |
| ➖ | otlp_http_to_blackhole | ingress throughput | -0.06 | [-0.22, +0.09] | |
| ➖ | datadog_agent_remap_datadog_logs | ingress throughput | -0.10 | [-0.31, +0.12] | |
| ➖ | datadog_agent_remap_blackhole_acks | ingress throughput | -0.18 | [-0.31, -0.06] | |
| ➖ | otlp_grpc_to_blackhole | ingress throughput | -0.23 | [-0.35, -0.11] | |
| ➖ | syslog_log2metric_tag_cardinality_limit_blackhole | ingress throughput | -0.23 | [-0.34, -0.12] | |
| ➖ | socket_to_socket_blackhole | ingress throughput | -0.32 | [-0.38, -0.25] | |
| ➖ | http_to_s3 | ingress throughput | -0.37 | [-0.65, -0.10] | |
| ➖ | syslog_log2metric_splunk_hec_metrics | ingress throughput | -0.93 | [-1.02, -0.84] | |
| ➖ | syslog_regex_logs2metric_ddmetrics | ingress throughput | -0.94 | [-1.12, -0.75] | |
| ➖ | datadog_agent_remap_datadog_logs_acks | ingress throughput | -1.49 | [-1.67, -1.30] | |
| ➖ | datadog_agent_remap_blackhole | ingress throughput | -1.50 | [-1.63, -1.37] | |
| ➖ | fluent_elasticsearch | ingress throughput | -1.52 | [-2.00, -1.04] | |
| ➖ | http_text_to_http_json | ingress throughput | -2.12 | [-2.25, -1.99] | |
| ➖ | syslog_loki | ingress throughput | -2.13 | [-2.23, -2.04] | |
| ➖ | http_to_http_acks | ingress throughput | -2.33 | [-3.63, -1.03] | |
| ➖ | syslog_log2metric_humio_metrics | ingress throughput | -3.51 | [-3.64, -3.37] | |
| ❌ | file_to_blackhole | egress throughput | -10.19 | [-16.73, -3.65] | |
Explanation
A regression test is an A/B test of target performance in a repeatable rig, where "performance" is measured as "comparison variant minus baseline variant" for an optimization goal (e.g., ingress throughput). Due to intrinsic variability in measuring that goal, we can only estimate its mean value for each experiment; we report uncertainty in that value as a 90.00% confidence interval denoted "Δ mean % CI".
For each experiment, we classify a change in performance as a "regression" -- a change worth investigating further -- if all of the following criteria are true:
- Its estimated |Δ mean %| ≥ 5.00%, indicating the change is big enough to merit a closer look.
- Its 90.00% confidence interval "Δ mean % CI" does not contain zero, indicating that if our statistical model is accurate, there is at least a 90.00% chance there is a difference in performance between baseline and comparison variants.
- Its configuration does not mark it "erratic".
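The three criteria above can be expressed as a small predicate. This is a minimal sketch, not the detector's actual implementation: the `ExperimentResult` type, its field names, and the `effect_size_tolerance` parameter are hypothetical names chosen for illustration, and the values in the usage lines are taken from the tables in this thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentResult:
    """One row of the report table (field names are illustrative)."""
    name: str
    delta_mean_pct: float   # the "Δ mean %" column
    ci_low: float           # lower bound of "Δ mean % CI"
    ci_high: float          # upper bound of "Δ mean % CI"
    erratic: bool = False   # assumed to come from the experiment's configuration


def is_regression(result: ExperimentResult, effect_size_tolerance: float = 5.0) -> bool:
    """Apply the three criteria from the Explanation section."""
    # 1. The estimated change is big enough to merit a closer look.
    big_enough = abs(result.delta_mean_pct) >= effect_size_tolerance
    # 2. The confidence interval does not contain zero.
    ci_excludes_zero = result.ci_low > 0.0 or result.ci_high < 0.0
    # 3. The experiment is not marked "erratic".
    return big_enough and ci_excludes_zero and not result.erratic


if __name__ == "__main__":
    first_run = ExperimentResult("file_to_blackhole", -10.19, -16.73, -3.65)
    second_run = ExperimentResult("file_to_blackhole", -5.12, -12.09, 1.85)
    print(first_run.name, is_regression(first_run))
    print(second_run.name, is_regression(second_run))
```

Under these assumptions the first run's `file_to_blackhole` row satisfies all three criteria, while the second run's row does not because its interval spans zero.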
| seed: [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, | ||
| 59, 61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131] | ||
| addr: "localhost:8282" | ||
| addr: "0.0.0.0:8282" |
I'm somewhat surprised that 0.0.0.0 works here; lading is connecting to this and I didn't know Linux supported connections to that address. (I'd have expected 127.0.0.1 here.) No change needed, just curious.
It is possible that this specific one wasn't needed; I just took a "big hammer" approach and made everything 0.0.0.0.
I'm also a bit surprised by this. Did you find this change necessary? I could see it causing confusion in the future. I'd also expect localhost here.
I'll try with localhost in a followup PR to see.
The hostname localhost is broken inside target containers (never investigated; it's a longstanding bug that'd be really nice to have fixed.)
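For context on the address discussion above: a listener bound to 0.0.0.0 accepts connections on every local IPv4 interface, including loopback, so a client can reach it at 127.0.0.1 without relying on resolution of the `localhost` hostname. The sketch below illustrates only that general socket behavior. The `roundtrip` helper and the OS-assigned port are assumptions for illustration and are not taken from the SMP or lading configuration.

```python
import socket


def roundtrip(bind_host: str = "0.0.0.0",
              connect_host: str = "127.0.0.1",
              payload: bytes = b"ping") -> bytes:
    """Bind a TCP listener on bind_host, connect via connect_host, echo one payload."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        server.bind((bind_host, 0))   # port 0: let the OS pick a free port
        server.listen(1)
        port = server.getsockname()[1]
        # The kernel completes the handshake against the listen backlog,
        # so connecting before accept() does not deadlock.
        with socket.create_connection((connect_host, port), timeout=5) as client:
            conn, _peer = server.accept()
            with conn:
                client.sendall(payload)
                return conn.recv(len(payload))


if __name__ == "__main__":
    print(roundtrip())
```

Whether a client can also connect to 0.0.0.0 itself, as the config in this PR does, depends on the platform's network stack, which is why the sketch connects through 127.0.0.1 instead.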
| # Additional variables for per-experiment links: | ||
| # - `experiment`: the name of the experiment | ||
| report: | ||
| metrics_dashboard: "https://app.datadoghq.com/dashboard/ykh-ua8-vcu/SMP-Regression-Detector-Metrics?fromUser=true&refresh_mode=paused&tpl_var_run-id%5B0%5D={{ job_id }}&view=spans&from_ts={{ start_time_ms }}&to_ts={{ end_time_ms }}&live=false" |
Would you mind also adding a logs link?
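The `{{ job_id }}`-style placeholders in `metrics_dashboard` are filled in per run. How SMP renders them is not shown in this PR, so the following is only a sketch of the general idea using a regex substitution. The `render` helper, the example URL, and the values passed in are all hypothetical.

```python
import re

# Matches placeholders of the form {{ name }} with optional inner whitespace.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template: str, variables: dict) -> str:
    """Replace each {{ name }} placeholder with its value from variables."""
    def substitute(match: re.Match) -> str:
        key = match.group(1)
        if key not in variables:
            # Fail loudly rather than emit a half-rendered link.
            raise KeyError(f"no value for placeholder {key!r}")
        return str(variables[key])

    return PLACEHOLDER.sub(substitute, template)


if __name__ == "__main__":
    # Hypothetical template and values, shaped like the config entry above.
    template = "https://example.com/dash?run={{ job_id }}&from_ts={{ start_time_ms }}&to_ts={{ end_time_ms }}"
    print(render(template, {"job_id": "example-run", "start_time_ms": 1000, "end_time_ms": 2000}))
```

A logs link, as requested in the review comment, would presumably reuse the same variables with a different base URL; that is an assumption rather than something this PR shows.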
jszwedko left a comment
I left a few nits/questions, but overall 👍
| lading-version: ${{ steps.experimental-meta.outputs.LADING_VERSION }} | ||
| steps: | ||
| - uses: actions/checkout@v3 | ||
| - uses: actions/checkout@v4 |
We actually withheld this version bump in #18490 but I think it's ok to do for this workflow.
|
|
||
|
|
Nit: extra whitespace 😄
Regression Detector Results
Run ID: d19fc108-249e-44bd-940e-5d7232856a2a
Metrics dashboard
Baseline: 52759aa
Performance changes are noted in the perf column of each table:
No significant changes in experiment optimization goals
Confidence level: 90.00%
There were no significant changes in experiment optimization goals at this confidence level and effect size tolerance.

| perf | experiment | goal | Δ mean % | Δ mean % CI | links |
|---|---|---|---|---|---|
| ➖ | syslog_log2metric_splunk_hec_metrics | ingress throughput | +1.17 | [+1.06, +1.28] | |
| ➖ | fluent_elasticsearch | ingress throughput | +1.13 | [+0.63, +1.62] | |
| ➖ | splunk_hec_route_s3 | ingress throughput | +1.01 | [+0.61, +1.42] | |
| ➖ | http_text_to_http_json | ingress throughput | +0.95 | [+0.83, +1.07] | |
| ➖ | syslog_regex_logs2metric_ddmetrics | ingress throughput | +0.54 | [+0.38, +0.71] | |
| ➖ | http_to_http_json | ingress throughput | +0.11 | [+0.05, +0.18] | |
| ➖ | http_to_http_noack | ingress throughput | +0.09 | [+0.03, +0.16] | |
| ➖ | syslog_loki | ingress throughput | +0.06 | [-0.02, +0.13] | |
| ➖ | splunk_hec_to_splunk_hec_logs_noack | ingress throughput | +0.00 | [-0.10, +0.10] | |
| ➖ | splunk_hec_indexer_ack_blackhole | ingress throughput | -0.00 | [-0.08, +0.08] | |
| ➖ | splunk_hec_to_splunk_hec_logs_acks | ingress throughput | -0.00 | [-0.13, +0.12] | |
| ➖ | http_to_s3 | ingress throughput | -0.01 | [-0.28, +0.26] | |
| ➖ | otlp_grpc_to_blackhole | ingress throughput | -0.10 | [-0.21, +0.01] | |
| ➖ | otlp_http_to_blackhole | ingress throughput | -0.13 | [-0.25, +0.00] | |
| ➖ | http_to_http_acks | ingress throughput | -0.13 | [-1.44, +1.17] | |
| ➖ | datadog_agent_remap_blackhole_acks | ingress throughput | -0.34 | [-0.49, -0.18] | |
| ➖ | datadog_agent_remap_datadog_logs | ingress throughput | -0.38 | [-0.61, -0.15] | |
| ➖ | socket_to_socket_blackhole | ingress throughput | -0.85 | [-0.89, -0.80] | |
| ➖ | syslog_splunk_hec_logs | ingress throughput | -0.88 | [-1.04, -0.73] | |
| ➖ | datadog_agent_remap_datadog_logs_acks | ingress throughput | -1.00 | [-1.20, -0.81] | |
| ➖ | datadog_agent_remap_blackhole | ingress throughput | -1.08 | [-1.21, -0.96] | |
| ➖ | syslog_log2metric_humio_metrics | ingress throughput | -1.18 | [-1.34, -1.03] | |
| ➖ | syslog_humio_logs | ingress throughput | -1.79 | [-1.89, -1.68] | |
| ➖ | syslog_log2metric_tag_cardinality_limit_blackhole | ingress throughput | -1.91 | [-2.03, -1.79] | |
| ➖ | http_elasticsearch | ingress throughput | -2.75 | [-2.94, -2.56] | |
| ➖ | file_to_blackhole | egress throughput | -5.12 | [-12.09, +1.85] | |
* Update smp job submission and associated versions
* Adds experiment.yamls and config.yaml
* Remove VECTOR_REQUIRE_HEALTHY to better accomodate startup races between lading and target
* format
* Removes unused dotfile-wannabe
* Attempt to run off PR
* Eval should_run always
* Get PR number correctly for PRs
* New merge-base cmd
* New git ref for merge-base cmd
* Only checkout PR if its not a PR
* Full ref to master
* Show me all valid git refs
* Fetch full git repo
* Skip merge-equeue metadata step for pull request
* normalize to 0.0.0.0 instead of localhost
* Normalize to 0.0.0.0
* Revert changes to manually run this on PR
* Removes workload checks
* Removes trailing empty whitespace lines

---------

Co-authored-by: George Hahn <george.hahn@datadoghq.com>
No description provided.