fix(config): Avoid parsing configuration files without interpolating secrets by jszwedko · Pull Request #20985 · vectordotdev/vector

jszwedko · 2024-08-01T22:08:52Z

Closes: #20974

Reverts #17759. That PR deserialized the configuration twice in order to set the global log_schema. This PR solves the same issue differently: it defers fetching values from the log_schema until after deserialization. Previously, some config structs would attempt to fetch during deserialization, but the log_schema may not have been deserialized by that point.

Attempting to pull from log_schema during deserialization is an easy bug to reintroduce, but I don't see an easy way to prohibit it.
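
The ordering problem described above can be illustrated with a minimal sketch. It does not use Vector's real types: `LogSchema`, `EagerSourceConfig`, and `DeferredSourceConfig` are hypothetical stand-ins, the schema is passed explicitly rather than read from a global, and `deserialize` only simulates what a real deserializer would do.

```rust
/// Hypothetical stand-in for the global log schema (not Vector's real type).
#[derive(Clone, Debug)]
struct LogSchema {
    host_key: String,
}

impl Default for LogSchema {
    fn default() -> Self {
        // Built-in default, in effect until the user's `log_schema` is applied.
        Self { host_key: "host".to_string() }
    }
}

/// Eager variant: the fallback is resolved while the struct is being built,
/// so it captures whatever the schema happens to be at that moment.
struct EagerSourceConfig {
    host_key: String,
}

impl EagerSourceConfig {
    fn deserialize(user_value: Option<&str>, schema_so_far: &LogSchema) -> Self {
        let host_key = user_value
            .map(str::to_owned)
            .unwrap_or_else(|| schema_so_far.host_key.clone());
        Self { host_key }
    }
}

/// Deferred variant: deserialization only records whether the user set a value.
struct DeferredSourceConfig {
    host_key: Option<String>,
}

impl DeferredSourceConfig {
    fn deserialize(user_value: Option<&str>) -> Self {
        Self { host_key: user_value.map(str::to_owned) }
    }

    /// Called after the whole configuration, including the schema, is loaded.
    fn resolved_host_key(&self, schema: &LogSchema) -> String {
        self.host_key
            .clone()
            .unwrap_or_else(|| schema.host_key.clone())
    }
}
```

If the user's schema sets `host_key` to `hostname` but the component is built before that schema is applied, the eager variant keeps the built-in `host`, while the deferred variant picks up `hostname` when it is resolved later.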

I think it is easiest to review the first two commits independently.


datadog-vectordotdev · 2024-08-01T22:29:49Z

Datadog Report

Branch report: jszwedko/fix-schema-defaults
Commit report: c7315f5
Test service: vector

✅ 0 Failed, 25 Passed, 0 Skipped, 25.53s Total Time

bruceg

I have one suggestion for a helper function but otherwise this LGTM.

bruceg · 2024-08-06T18:57:06Z

src/sources/file.rs

+        let host_key = self
+            .host_key
+            .clone()
+            .unwrap_or(log_schema().host_key().cloned().into())


This code sequence being repeated 8(?) times makes me think there should be a helper function for it.
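
One possible shape for such a helper, as a sketch only: the name `configured_or_schema` and the simplified `String` keys are assumptions made for illustration. The fields in the diff use Vector's own path types, and any helper actually adopted may look different.

```rust
/// Hypothetical, simplified schema; the real code uses path types, not strings.
struct LogSchema {
    host_key: String,
    timestamp_key: String,
}

/// Hypothetical helper: returns the explicitly configured value if present,
/// otherwise computes the fallback from the schema. Taking a closure keeps the
/// schema lookup lazy, so it only runs when no value was configured.
fn configured_or_schema<T: Clone>(
    configured: &Option<T>,
    schema_default: impl FnOnce() -> T,
) -> T {
    match configured {
        Some(value) => value.clone(),
        None => schema_default(),
    }
}
```

Under these assumptions, each repeated call site would shrink to a single call such as `configured_or_schema(&self.host_key, || schema.host_key.clone())`.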


neko-dd · 2024-08-06T21:30:30Z

website/cue/reference/components/sources/base/docker_logs.cue

 			"""
 		required: false
-		type: string: default: "host"
+		type: string: {}


Is there no default value anymore?

It'll default to log_schema.host_key, as mentioned in the option description. The field itself just doesn't have an explicit default.

(Or rather, it defaults to "null", which causes it to fall back to the value of log_schema.host_key.)

github-actions · 2024-08-09T22:51:28Z

Regression Detector Results

Run ID: 7a90d66a-eb2d-4c9f-bc4c-eb0175c79f0f Metrics dashboard

Baseline: 270bdc5
Comparison: 93e423f

Performance changes are noted in the perf column of each table:

✅ = significantly better comparison variant performance
❌ = significantly worse comparison variant performance
➖ = no significant change in performance

Significant changes in experiment optimization goals

Confidence level: 90.00%
Effect size tolerance: |Δ mean %| ≥ 5.00%

perf	experiment	goal	Δ mean %	Δ mean % CI	links
✅	syslog_log2metric_humio_metrics	ingress throughput	+6.16	[+5.99, +6.34]

Experiments ignored for regressions

Regressions in experiments with settings containing erratic: true are ignored.

perf	experiment	goal	Δ mean %	Δ mean % CI	links
❌	file_to_blackhole	egress throughput	-20.86	[-26.97, -14.75]

Fine details of change detection per experiment

perf	experiment	goal	Δ mean %	Δ mean % CI
✅	syslog_log2metric_humio_metrics	ingress throughput	+6.16	[+5.99, +6.34]
➖	otlp_http_to_blackhole	ingress throughput	+3.77	[+3.64, +3.90]
➖	datadog_agent_remap_blackhole_acks	ingress throughput	+2.94	[+2.81, +3.08]
➖	otlp_grpc_to_blackhole	ingress throughput	+1.70	[+1.59, +1.82]
➖	http_elasticsearch	ingress throughput	+1.64	[+1.48, +1.81]
➖	syslog_humio_logs	ingress throughput	+1.34	[+1.19, +1.48]
➖	datadog_agent_remap_datadog_logs	ingress throughput	+1.11	[+0.89, +1.34]
➖	socket_to_socket_blackhole	ingress throughput	+1.07	[+0.97, +1.16]
➖	datadog_agent_remap_blackhole	ingress throughput	+0.71	[+0.61, +0.81]
➖	http_to_http_acks	ingress throughput	+0.71	[-0.61, +2.03]
➖	syslog_splunk_hec_logs	ingress throughput	+0.70	[+0.61, +0.78]
➖	http_to_s3	ingress throughput	+0.59	[+0.32, +0.86]
➖	fluent_elasticsearch	ingress throughput	+0.56	[+0.06, +1.05]
➖	http_to_http_noack	ingress throughput	+0.09	[+0.02, +0.15]
➖	http_to_http_json	ingress throughput	+0.04	[-0.01, +0.09]
➖	splunk_hec_indexer_ack_blackhole	ingress throughput	+0.01	[-0.07, +0.09]
➖	splunk_hec_to_splunk_hec_logs_acks	ingress throughput	+0.00	[-0.10, +0.11]
➖	splunk_hec_to_splunk_hec_logs_noack	ingress throughput	-0.00	[-0.10, +0.10]
➖	syslog_loki	ingress throughput	-0.13	[-0.21, -0.05]
➖	datadog_agent_remap_datadog_logs_acks	ingress throughput	-0.20	[-0.40, +0.01]
➖	syslog_log2metric_tag_cardinality_limit_blackhole	ingress throughput	-0.37	[-0.50, -0.24]
➖	syslog_regex_logs2metric_ddmetrics	ingress throughput	-1.34	[-1.52, -1.16]
➖	splunk_hec_route_s3	ingress throughput	-1.43	[-1.73, -1.12]
➖	syslog_log2metric_splunk_hec_metrics	ingress throughput	-2.09	[-2.22, -1.97]
➖	http_text_to_http_json	ingress throughput	-4.71	[-4.90, -4.52]
❌	file_to_blackhole	egress throughput	-20.86	[-26.97, -14.75]

Explanation

A regression test is an A/B test of target performance in a repeatable rig, where "performance" is measured as "comparison variant minus baseline variant" for an optimization goal (e.g., ingress throughput). Due to intrinsic variability in measuring that goal, we can only estimate its mean value for each experiment; we report uncertainty in that value as a 90.00% confidence interval denoted "Δ mean % CI".

For each experiment, we decide whether a change in performance is a "regression" -- a change worth investigating further -- if all of the following criteria are true:

Its estimated |Δ mean %| ≥ 5.00%, indicating the change is big enough to merit a closer look.
Its 90.00% confidence interval "Δ mean % CI" does not contain zero, indicating that if our statistical model is accurate, there is at least a 90.00% chance there is a difference in performance between baseline and comparison variants.
Its configuration does not mark it "erratic".
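
The three criteria above can be written as a small predicate. This is an illustrative sketch, not the regression detector's actual code: the `Experiment` struct and `is_flagged` function are hypothetical names, and only the 5.00% tolerance and the confidence-interval check come from the report text.

```rust
/// One experiment row, using the columns reported in the tables above.
/// Hypothetical type, introduced only for this illustration.
struct Experiment {
    delta_mean_pct: f64,
    ci_low: f64,
    ci_high: f64,
    erratic: bool,
}

/// Effect size tolerance stated in the report: |Δ mean %| ≥ 5.00.
const EFFECT_SIZE_TOLERANCE_PCT: f64 = 5.0;

/// Returns true when all three criteria hold for the experiment.
fn is_flagged(e: &Experiment) -> bool {
    // 1. The estimated change is big enough to merit a closer look.
    let big_enough = e.delta_mean_pct.abs() >= EFFECT_SIZE_TOLERANCE_PCT;
    // 2. The confidence interval lies entirely on one side of zero.
    let ci_excludes_zero = e.ci_low > 0.0 || e.ci_high < 0.0;
    // 3. The experiment is not marked erratic.
    big_enough && ci_excludes_zero && !e.erratic
}
```

Applied to the rows above, `syslog_log2metric_humio_metrics` (+6.16, [+5.99, +6.34]) satisfies all three checks, `http_text_to_http_json` (-4.71) falls below the tolerance, and `file_to_blackhole` is excluded because it is marked erratic. The predicate ignores direction; the sign of Δ mean % is what separates the ✅ and ❌ markers.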


jszwedko added 3 commits August 1, 2024 14:23

fix(config): Avoid parsing configuration files without secrets

b4c8f75

Which caused #20974 Reverts: #17759 Signed-off-by: Jesse Szwedko <jesse.szwedko@datadoghq.com>

Update configuration structs to default to log_schema fields after de…

e912cc4

…serialization Signed-off-by: Jesse Szwedko <jesse.szwedko@datadoghq.com>

Add changelog entry

7ce61ab

Signed-off-by: Jesse Szwedko <jesse.szwedko@datadoghq.com>

jszwedko requested a review from a team as a code owner August 1, 2024 22:08

github-actions bot added the "domain: sources" label Aug 1, 2024

jszwedko mentioned this pull request Aug 1, 2024

Panic on secrets value in clickhouse sink endpoint #20974

Closed

bruceg changed the title ~~fx(config): Avoid parsing configuration files without interpolating secrets~~ fix(config): Avoid parsing configuration files without interpolating secrets Aug 6, 2024

bruceg approved these changes Aug 6, 2024


jszwedko added 2 commits August 6, 2024 12:00

Merge remote-tracking branch 'origin/master' into jszwedko/fix-schema…

07f40d9

…-defaults

Regenerate docs

cba07d0

Signed-off-by: Jesse Szwedko <jesse.szwedko@datadoghq.com>

jszwedko requested a review from a team August 6, 2024 19:33

jszwedko requested a review from a team as a code owner August 6, 2024 19:34

github-actions bot added the "domain: external docs" label Aug 6, 2024

neko-dd reviewed Aug 6, 2024


jszwedko added this pull request to the merge queue Aug 9, 2024

Merged via the queue into master with commit 93e423f Aug 9, 2024

jszwedko deleted the jszwedko/fix-schema-defaults branch August 9, 2024 23:15

byronwolfman mentioned this pull request Jul 31, 2025

Vector panics during configuration loading if a secret is used for a configuration option that has additional validation (e.g. URIs) #23481

Closed

jszwedko merged 5 commits into master from jszwedko/fix-schema-defaults
