feat(component spec validation tests): add partial support for component_discarded_events_total metrics #16935

Closed
davidhuie-dd wants to merge 1 commit into dh/error-metrics-validation from dh/errors-dropped

Conversation

@davidhuie-dd
Contributor

#16842

This introduces a test function that verifies the total of the component_discarded_events_total metric based on a new test event type: TestEvent::Interrupted.

I don't introduce any test harness support for this metric here, since that will be an involved process for each component. We'll have to bring up the full integration test harness, then interrupt event transmission midway through in order to trigger these errors. I imagine we will only do this for our most important integrations, given the high cost of implementing support for this metric.
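
The kind of check described above might look roughly like the sketch below. This is not the code from the PR: the enum here is a simplified stand-in, the `Passthrough` variant, the `event: String` payload, and both function names are hypothetical, and it assumes each interrupted event counts as exactly one discarded event.

```rust
// Simplified, hypothetical stand-in for the test event type. Only the
// `Interrupted` variant and its `interrupted` field appear in the PR discussion.
#[allow(dead_code)]
#[derive(Debug, Clone)]
enum TestEvent {
    Passthrough(String),
    Interrupted { interrupted: bool, event: String },
}

// Assumption: every event flagged as interrupted contributes exactly one
// count to component_discarded_events_total.
fn expected_discarded_total(events: &[TestEvent]) -> u64 {
    events
        .iter()
        .filter(|e| matches!(e, TestEvent::Interrupted { interrupted: true, .. }))
        .count() as u64
}

// Compare the observed metric total with the total implied by the input events.
fn validate_discarded_events_total(events: &[TestEvent], observed: u64) -> Result<(), String> {
    let expected = expected_discarded_total(events);
    if observed == expected {
        Ok(())
    } else {
        Err(format!(
            "component_discarded_events_total: expected {expected}, observed {observed}"
        ))
    }
}

fn main() {
    let events = vec![
        TestEvent::Passthrough("a".to_string()),
        TestEvent::Interrupted { interrupted: true, event: "b".to_string() },
        TestEvent::Interrupted { interrupted: false, event: "c".to_string() },
    ];
    // Only one event was actually interrupted, so a total of 1 passes.
    assert!(validate_discarded_events_total(&events, 1).is_ok());
    assert!(validate_discarded_events_total(&events, 0).is_err());
}
```

The hard part, as noted above, is not this comparison but getting a component to actually drop events mid-transmission so that the observed total is meaningful.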

@davidhuie-dd requested a review from a team on March 23, 2023, 21:08
@github-actions

Regression Detector Results

Run ID: 27df08c6-2c2d-4399-94bd-ff28fd12814a
Baseline: cbbec4b
Comparison: 819428f
Total vector CPUs: 7

Explanation

A regression test is an integrated performance test for vector in a repeatable rig, with varying configuration for vector. What follows is a statistical summary of a brief vector run for each configuration across the SHAs given above. The goal of these tests is to determine quickly whether vector performance has changed, and to what degree, because of a pull request.

The table below, if present, lists those experiments that have experienced a statistically significant change in mean optimization goal performance between the baseline and comparison SHAs with 90.00% confidence, OR that have been detected as newly erratic. Negative values mean that the baseline is faster; positive values mean that the comparison is faster. Results that do not exhibit more than a ±5.00% change in their mean optimization goal are discarded. An experiment is erratic if its coefficient of variation is greater than 0.1. The abbreviated table will be omitted if no interesting change is observed.
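
As a rough illustration of the thresholds in this explanation, here is a minimal sketch. It is not the regression detector's actual implementation: the function names are hypothetical, the coefficient of variation is taken as stdev divided by mean, and confidence is treated as a plain fraction. The sample figures are the baseline mean (3.4MiB/CPU-s) and stdev (397.42KiB/CPU-s) from the syslog_regex_logs2metric_ddmetrics row further down.

```rust
// Thresholds quoted in the explanation text.
const CONFIDENCE_THRESHOLD: f64 = 0.90;
const DELTA_MEAN_PCT_THRESHOLD: f64 = 5.0;
const ERRATIC_COV_THRESHOLD: f64 = 0.1;

// Coefficient of variation: standard deviation relative to the mean.
fn coefficient_of_variation(mean: f64, stdev: f64) -> f64 {
    stdev / mean
}

// A series is treated as erratic when its CoV exceeds 0.1.
fn is_erratic(cov: f64) -> bool {
    cov > ERRATIC_COV_THRESHOLD
}

// A change is "interesting" when it is both confident enough and large enough.
fn is_interesting(delta_mean_pct: f64, confidence: f64) -> bool {
    confidence >= CONFIDENCE_THRESHOLD && delta_mean_pct.abs() >= DELTA_MEAN_PCT_THRESHOLD
}

fn main() {
    // Convert the mean to KiB so both figures share a unit.
    let cov = coefficient_of_variation(3.4 * 1024.0, 397.42);
    println!("baseline CoV = {cov:.4}");
    // The baseline CoV is above 0.1, so this series counts as erratic.
    assert!(is_erratic(cov));
    // A 2.00% change at 100% confidence is below the 5% cut-off.
    assert!(!is_interesting(2.00, 1.0));
}
```

The computed value lands close to the reported baseline CoV of 0.114086; the small gap is presumably due to rounding in the displayed mean.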

No interesting changes in experiment optimization goals with confidence ≥ 90.00% and |Δ mean %| ≥ 5.00%.

Fine details of change detection per experiment.
| experiment | goal | Δ mean | Δ mean % | confidence | baseline mean | baseline stdev | baseline stderr | baseline outlier % | baseline CoV | comparison mean | comparison stdev | comparison stderr | comparison outlier % | comparison CoV | erratic | declared erratic |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| syslog_regex_logs2metric_ddmetrics | ingress throughput | 69.52KiB/CPU-s | 2.00 | 100.00% | 3.4MiB/CPU-s | 397.42KiB/CPU-s | 4.89KiB/CPU-s | 0.0 | 0.114086 | 3.47MiB/CPU-s | 310.32KiB/CPU-s | 3.82KiB/CPU-s | 0.0 | 0.087339 | True | True |
| datadog_agent_remap_datadog_logs | ingress throughput | 635.86KiB/CPU-s | 1.94 | 100.00% | 32.07MiB/CPU-s | 1.19MiB/CPU-s | 15.05KiB/CPU-s | 0.0 | 0.037228 | 32.69MiB/CPU-s | 912.11KiB/CPU-s | 11.22KiB/CPU-s | 0.0 | 0.027243 | False | False |
| syslog_log2metric_splunk_hec_metrics | ingress throughput | 169.02KiB/CPU-s | 1.88 | 100.00% | 8.79MiB/CPU-s | 418.2KiB/CPU-s | 5.15KiB/CPU-s | 0.0 | 0.046472 | 8.95MiB/CPU-s | 422.8KiB/CPU-s | 5.2KiB/CPU-s | 0.0 | 0.046117 | False | False |
| syslog_humio_logs | ingress throughput | 164.34KiB/CPU-s | 1.82 | 100.00% | 8.81MiB/CPU-s | 218.92KiB/CPU-s | 2.69KiB/CPU-s | 0.0 | 0.024263 | 8.97MiB/CPU-s | 212.24KiB/CPU-s | 2.61KiB/CPU-s | 0.0 | 0.023102 | False | False |
| syslog_loki | ingress throughput | 101.45KiB/CPU-s | 1.21 | 100.00% | 8.19MiB/CPU-s | 345.64KiB/CPU-s | 4.25KiB/CPU-s | 0.0 | 0.041214 | 8.29MiB/CPU-s | 302.53KiB/CPU-s | 3.72KiB/CPU-s | 0.0 | 0.035643 | False | False |
| socket_to_socket_blackhole | ingress throughput | 129.6KiB/CPU-s | 0.97 | 100.00% | 13.06MiB/CPU-s | 399.54KiB/CPU-s | 4.92KiB/CPU-s | 0.0 | 0.029862 | 13.19MiB/CPU-s | 421.32KiB/CPU-s | 5.18KiB/CPU-s | 0.0 | 0.031188 | False | False |
| otlp_grpc_to_blackhole | ingress throughput | 9.35KiB/CPU-s | 0.91 | 100.00% | 1.0MiB/CPU-s | 51.26KiB/CPU-s | 645.98B/CPU-s | 0.0 | 0.04996 | 1.01MiB/CPU-s | 47.73KiB/CPU-s | 601.48B/CPU-s | 0.0 | 0.046102 | False | False |
| syslog_splunk_hec_logs | ingress throughput | 71.74KiB/CPU-s | 0.81 | 100.00% | 8.69MiB/CPU-s | 271.49KiB/CPU-s | 3.34KiB/CPU-s | 0.0 | 0.030506 | 8.76MiB/CPU-s | 300.13KiB/CPU-s | 3.69KiB/CPU-s | 0.0 | 0.033455 | False | False |
| file_to_blackhole | egress throughput | 45.73KiB/CPU-s | 0.69 | 20.39% | 6.47MiB/CPU-s | 4.4MiB/CPU-s | 125.54KiB/CPU-s | 0.0 | 0.679184 | 6.52MiB/CPU-s | 4.21MiB/CPU-s | 124.75KiB/CPU-s | 2.432886 | 0.645331 | True | True |
| http_to_http_acks | ingress throughput | 36.05KiB/CPU-s | 0.68 | 53.88% | 5.18MiB/CPU-s | 2.76MiB/CPU-s | 34.74KiB/CPU-s | 0.0 | 0.531945 | 5.22MiB/CPU-s | 2.73MiB/CPU-s | 34.46KiB/CPU-s | 0.0 | 0.524028 | True | False |
| otlp_http_to_blackhole | ingress throughput | 8.12KiB/CPU-s | 0.53 | 99.98% | 1.5MiB/CPU-s | 126.21KiB/CPU-s | 1.55KiB/CPU-s | 0.0 | 0.081922 | 1.51MiB/CPU-s | 125.52KiB/CPU-s | 1.54KiB/CPU-s | 0.0 | 0.081045 | False | False |
| syslog_log2metric_humio_metrics | ingress throughput | 9.74KiB/CPU-s | 0.16 | 89.91% | 5.96MiB/CPU-s | 331.41KiB/CPU-s | 4.08KiB/CPU-s | 0.0 | 0.054309 | 5.97MiB/CPU-s | 350.55KiB/CPU-s | 4.31KiB/CPU-s | 0.0 | 0.057354 | False | False |
| enterprise_http_to_http | ingress throughput | 9.16KiB/CPU-s | 0.07 | 97.61% | 13.62MiB/CPU-s | 292.08KiB/CPU-s | 3.59KiB/CPU-s | 0.0 | 0.020949 | 13.62MiB/CPU-s | 152.26KiB/CPU-s | 1.87KiB/CPU-s | 0.0 | 0.010913 | False | False |
| splunk_hec_indexer_ack_blackhole | ingress throughput | -316.26B/CPU-s | -0.00 | 5.43% | 13.62MiB/CPU-s | 259.93KiB/CPU-s | 3.2KiB/CPU-s | 0.0 | 0.018642 | 13.62MiB/CPU-s | 261.2KiB/CPU-s | 3.21KiB/CPU-s | 0.0 | 0.018733 | False | False |
| splunk_hec_to_splunk_hec_logs_acks | ingress throughput | -382.08B/CPU-s | -0.00 | 4.58% | 13.61MiB/CPU-s | 379.65KiB/CPU-s | 4.67KiB/CPU-s | 0.0 | 0.02723 | 13.61MiB/CPU-s | 367.58KiB/CPU-s | 4.52KiB/CPU-s | 0.0 | 0.026366 | False | False |
| fluent_elasticsearch | ingress throughput | -67.81B/CPU-s | -0.00 | 10.05% | 45.41MiB/CPU-s | 30.44KiB/CPU-s | 379.24B/CPU-s | 0.0 | 0.000654 | 45.41MiB/CPU-s | 30.52KiB/CPU-s | 380.29B/CPU-s | 0.0 | 0.000656 | False | False |
| splunk_hec_to_splunk_hec_logs_noack | ingress throughput | -2.75KiB/CPU-s | -0.02 | 46.54% | 13.62MiB/CPU-s | 243.42KiB/CPU-s | 2.99KiB/CPU-s | 0.0 | 0.017455 | 13.62MiB/CPU-s | 265.78KiB/CPU-s | 3.27KiB/CPU-s | 0.0 | 0.019062 | False | False |
| http_to_http_noack | ingress throughput | -4.05KiB/CPU-s | -0.03 | 48.85% | 13.61MiB/CPU-s | 333.78KiB/CPU-s | 4.11KiB/CPU-s | 0.0 | 0.023945 | 13.61MiB/CPU-s | 373.71KiB/CPU-s | 4.6KiB/CPU-s | 0.0 | 0.026817 | False | False |
| http_to_http_json | ingress throughput | -10.02KiB/CPU-s | -0.07 | 92.73% | 13.57MiB/CPU-s | 300.6KiB/CPU-s | 3.7KiB/CPU-s | 0.0 | 0.021627 | 13.56MiB/CPU-s | 339.91KiB/CPU-s | 4.18KiB/CPU-s | 0.0 | 0.024472 | False | False |
| http_text_to_http_json | ingress throughput | -77.52KiB/CPU-s | -0.30 | 100.00% | 25.05MiB/CPU-s | 686.98KiB/CPU-s | 8.45KiB/CPU-s | 0.0 | 0.026775 | 24.98MiB/CPU-s | 705.37KiB/CPU-s | 8.68KiB/CPU-s | 0.0 | 0.027575 | False | False |
| datadog_agent_remap_blackhole | ingress throughput | -379.51KiB/CPU-s | -1.19 | 100.00% | 31.24MiB/CPU-s | 1.16MiB/CPU-s | 14.66KiB/CPU-s | 0.0 | 0.037245 | 30.87MiB/CPU-s | 1.16MiB/CPU-s | 14.56KiB/CPU-s | 0.0 | 0.037424 | False | False |
| datadog_agent_remap_datadog_logs_acks | ingress throughput | -634.7KiB/CPU-s | -1.90 | 100.00% | 32.63MiB/CPU-s | 1006.37KiB/CPU-s | 12.38KiB/CPU-s | 0.0 | 0.030115 | 32.01MiB/CPU-s | 1.25MiB/CPU-s | 15.7KiB/CPU-s | 0.0 | 0.038925 | False | False |
| datadog_agent_remap_blackhole_acks | ingress throughput | -602.03KiB/CPU-s | -1.94 | 100.00% | 30.32MiB/CPU-s | 1.34MiB/CPU-s | 16.84KiB/CPU-s | 0.0 | 0.044069 | 29.73MiB/CPU-s | 1.84MiB/CPU-s | 23.18KiB/CPU-s | 0.0 | 0.061874 | False | False |
| splunk_hec_route_s3 | ingress throughput | -253.74KiB/CPU-s | -2.13 | 100.00% | 11.65MiB/CPU-s | 576.53KiB/CPU-s | 7.09KiB/CPU-s | 0.0 | 0.048339 | 11.4MiB/CPU-s | 700.57KiB/CPU-s | 8.62KiB/CPU-s | 0.0 | 0.060016 | False | False |

Contributor

@neuronull neuronull left a comment

LGTM

.expect("should not fail to encode input event");
}
TestEvent::Interrupted {
interrupted: _,
Contributor

Can these two cases be combined, since they are identical?
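
For reference, combining identical match arms in Rust is usually done with an or-pattern, provided both patterns bind the same names with the same types. The sketch below is only illustrative: the excerpt does not show which two arms are meant, so the `Modified` variant, the `event: String` payload, and the `encode` helper are all hypothetical stand-ins rather than the PR's real types.

```rust
// Hypothetical, simplified enum. Only `Interrupted { interrupted, .. }` is
// visible in the snippet under review; everything else is an assumption.
#[allow(dead_code)]
#[derive(Debug)]
enum TestEvent {
    Modified { modified: bool, event: String },
    Interrupted { interrupted: bool, event: String },
}

// Stand-in for whatever encoding the real arms perform.
fn encode(event: &str) -> Vec<u8> {
    event.as_bytes().to_vec()
}

fn encode_input(test_event: &TestEvent) -> Vec<u8> {
    match test_event {
        // Both variants bind `event` with the same type, so a single arm
        // with an or-pattern can replace two identical arms.
        TestEvent::Modified { event, .. } | TestEvent::Interrupted { event, .. } => encode(event),
    }
}

fn main() {
    let modified = TestEvent::Modified { modified: true, event: "x".to_string() };
    let interrupted = TestEvent::Interrupted { interrupted: true, event: "x".to_string() };
    // Both variants go through the same arm and produce the same bytes.
    assert_eq!(encode_input(&modified), encode_input(&interrupted));
}
```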

@neuronull requested a review from tobz on March 24, 2023, 15:53
@jszwedko added the meta: blocked and domain: observability labels on Mar 29, 2023
@bits-bot

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.


David Huie does not seem to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

@neuronull
Contributor

The code this PR touches has since changed on master through my own changes to the area, so the PR is no longer suitable to merge as is, and the logic it introduces is already covered.

@neuronull closed this on Oct 6, 2023

Labels

domain: observability (Anything related to monitoring/observing Vector)
meta: blocked (Anything that is blocked to the point where it cannot be worked on.)

4 participants