Skip to content

[receiver/vcenter] Fix nil pointer dereference in recordVMStats#47005

Merged
andrzej-stencel merged 7 commits into
open-telemetry:mainfrom
singhvibhanshu:fix/46977
Apr 13, 2026
Merged

[receiver/vcenter] Fix nil pointer dereference in recordVMStats#47005
andrzej-stencel merged 7 commits into
open-telemetry:mainfrom
singhvibhanshu:fix/46977

Conversation

@singhvibhanshu

Copy link
Copy Markdown
Member

Fixes #46977

Signed-off-by: singhvibhanshu <singhvibhanshu@hotmail.com>
@singhvibhanshu singhvibhanshu requested a review from a team as a code owner March 18, 2026 17:40

@axw axw left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@singhvibhanshu is it possible to add a unit test for this, to avoid regressions?

@singhvibhanshu

singhvibhanshu commented Mar 19, 2026

Copy link
Copy Markdown
Member Author

yes @axw, I can surely add a unit test this shortly.
rn i have college, i will add the test in the class itself.
thanks!

@snowch

snowch commented Mar 19, 2026

Copy link
Copy Markdown

@singhvibhanshu there were a few other issues I ran into:

index 03e5b2e..57ea880 100644
--- a/receiver/vcenterreceiver/client.go
+++ b/receiver/vcenterreceiver/client.go
@@ -328,7 +328,23 @@ func (vc *vcenterClient) PerfMetricsQuery(
 	vc.pm.Sort = true
 	sample, err := vc.pm.SampleByName(ctx, spec, names, objs)
 	if err != nil {
-		return nil, err
+		// Batch query failed -- fall back to querying objects one at a time.
+		// An orphaned VM can cause the entire batch SOAP call to fail.
+		resultsByRef := map[string]*performance.EntityMetric{}
+		for _, obj := range objs {
+			s, sErr := vc.pm.SampleByName(ctx, spec, names, []vt.ManagedObjectReference{obj})
+			if sErr != nil {
+				continue
+			}
+			r, rErr := vc.pm.ToMetricSeries(ctx, s)
+			if rErr != nil {
+				continue
+			}
+			for i := range r {
+				resultsByRef[r[i].Entity.Value] = &r[i]
+			}
+		}
+		return &perfMetricsQueryResult{resultsByRef: resultsByRef}, nil
 	}
 	result, err := vc.pm.ToMetricSeries(ctx, sample)
 	if err != nil {
diff --git a/receiver/vcenterreceiver/metrics.go b/receiver/vcenterreceiver/metrics.go
index 78b7f5f..bd3dfb4 100644
--- a/receiver/vcenterreceiver/metrics.go
+++ b/receiver/vcenterreceiver/metrics.go
@@ -258,6 +258,10 @@ func (v *vcenterMetricScraper) recordVMStats(
 	vm *mo.VirtualMachine,
 	hs *mo.HostSystem,
 ) {
+	// Guard against VMs with incomplete data.
+	if vm.Config == nil || vm.Summary.Storage == nil {
+		return
+	}
 	diskUsed := vm.Summary.Storage.Committed
 	diskFree := vm.Summary.Storage.Uncommitted
 
diff --git a/receiver/vcenterreceiver/processors.go b/receiver/vcenterreceiver/processors.go
index aed6608..02d0cb2 100644
--- a/receiver/vcenterreceiver/processors.go
+++ b/receiver/vcenterreceiver/processors.go
@@ -289,6 +289,13 @@ func (v *vcenterMetricScraper) buildVMMetrics(
 		return crRef, groupInfo, fmt.Errorf("no collected ComputeResource for VM [%s]'s ComputeResource ref: %s", vm.Name, crRef)
 	}
 
+	// Guard against VMs with incomplete data -- powered-off, suspended, or
+	// template VMs may have nil Config, nil Summary.Storage, or nil
+	// Summary.Runtime fields from the vSphere API.
+	if vm.Config == nil || vm.Summary.Storage == nil {
+		return crRef, groupInfo, nil // powered-off/template VM — skip silently
+	}
+
 	// Get related VM host info
 	hsRef := vm.Summary.Runtime.Host
 	if hsRef == nil {

@singhvibhanshu

Copy link
Copy Markdown
Member Author

@snowch while writing a unit test for regression locally, I found that solely checking vm.Config == nil and vm.Summary.Storage == nil still leaves us exposed to panics. If vm itself is nil, or if the HostSystem (hs) / hs.Summary.Hardware comes back nil later in the pipeline, the collector still crashes.

I am still looking in it and will update the PR soon.

Signed-off-by: singhvibhanshu <singhvibhanshu@hotmail.com>
@singhvibhanshu

singhvibhanshu commented Mar 28, 2026

Copy link
Copy Markdown
Member Author

First of all apologies for responding so late.

@axw,
I have added the tests to avoid regression.

  • added tests in metrics_test.go to ennsure incomplete host data donot panic.

@snowch,
I have also addressed the issue which you have ran into.

  • previously, PerfMetricsQuery uses SampleByName to call all objects, and if batch failed, the fn returned immediately. I changed it, now when batch call fails, it fall back to querying objects one by one. If one object fails, it skip only that object and continue collecting metrics for the rest.
  • added fallback test in client_test.go for batch perf query failure.

Comment thread receiver/vcenterreceiver/client.go Outdated
Signed-off-by: singhvibhanshu <singhvibhanshu@hotmail.com>
@singhvibhanshu

Copy link
Copy Markdown
Member Author

@axw,
I have dropped the PerfMetricQuery fallback logic from here and will be opening a separate follow-up PR related to that change shortly.

@andrzej-stencel andrzej-stencel merged commit c19f0ed into open-telemetry:main Apr 13, 2026
191 checks passed
@otelbot

otelbot Bot commented Apr 13, 2026

Copy link
Copy Markdown
Contributor

Thank you for your contribution @singhvibhanshu! 🎉 We would like to hear from you about your experience contributing to OpenTelemetry by taking a few minutes to fill out this survey. If you are getting started contributing, you can also join the CNCF Slack channel #opentelemetry-new-contributors to ask for guidance and get help.

AndrewCharlesHay pushed a commit to AndrewCharlesHay/opentelemetry-collector-contrib that referenced this pull request Apr 23, 2026
…-telemetry#47005)

Fixes open-telemetry#46977

---------

Signed-off-by: singhvibhanshu <singhvibhanshu@hotmail.com>
gracewehner pushed a commit to gracewehner/opentelemetry-collector-contrib that referenced this pull request Apr 29, 2026
…-telemetry#47005)

Fixes open-telemetry#46977

---------

Signed-off-by: singhvibhanshu <singhvibhanshu@hotmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[receiver/vcenter] nil pointer dereference panic in recordVMStats (metrics.go:261)

6 participants