Contributor

@kolluria kolluria commented Jun 25, 2025

What this PR does / why we need it:
This PR adds profiling endpoints to the vsphere-csi-controller and vsphere-syncer containers. These endpoints will be useful for collecting performance samples, especially when running in scaled environments, to help identify bottlenecks.

  • vsphere-csi-controller: profiling server on port 9500
  • vsphere-syncer: profiling server on port 9501

In scaled environments, performance bottlenecks are harder to detect. Enabling profiling allows us to gather heap, CPU, and goroutine profiles during live workloads and improve debugging and optimization workflows.

Testing done:
Verified that the containers are up and running with the patch -

vsphere-csi-controller-6fd74487db-7m946   7/7     Running   8 (5m39s ago)   16h
vsphere-csi-controller-6fd74487db-cfzl2   7/7     Running   2 (6m42s ago)   16h
vsphere-csi-controller-6fd74487db-jt27f   7/7     Running   8 (5m18s ago)   16h
vsphere-csi-webhook-854fbc5cb5-dq85j      1/1     Running   0               7d4h
vsphere-csi-webhook-854fbc5cb5-lnv7l      1/1     Running   0               3d19h
vsphere-csi-webhook-854fbc5cb5-nzfl4      1/1     Running   0               7d4h

Verified that we're able to collect samples from the endpoints -

curl http://localhost:9501/debug/pprof/heap?seconds=10 > heap.pprof
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  3041    0  3041    0     0    297      0 --:--:--  0:00:10 --:--:--   755
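
The curl call above fetches the heap profile; the CPU and goroutine profiles mentioned in the description live under the same `/debug/pprof/` prefix in the standard `net/http/pprof` package. As a small illustration, the sketch below builds those URLs. `profileURL` is a hypothetical helper, not part of the driver, and the host and port are taken from the curl example.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// profileURL (hypothetical helper) builds a net/http/pprof URL.
// kind is a pprof endpoint name: "heap", "goroutine", or "profile" (CPU).
// A seconds value <= 0 omits the query parameter.
func profileURL(hostPort, kind string, seconds int) string {
	u := url.URL{Scheme: "http", Host: hostPort, Path: "/debug/pprof/" + kind}
	if seconds > 0 {
		q := url.Values{}
		q.Set("seconds", strconv.Itoa(seconds))
		u.RawQuery = q.Encode()
	}
	return u.String()
}

func main() {
	// Print one URL per profile type named in the PR description.
	for _, kind := range []string{"heap", "profile", "goroutine"} {
		fmt.Println(profileURL("localhost:9501", kind, 10))
	}
}
```

Each printed URL can be passed to curl in the same way as the heap example.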

Verified that the samples can be analysed using pprof -

pprof heap.pprof
File: vsphere-syncer
Build ID: e30f45bbc054ff3ab986af5d86589b35db32fdb6
Type: inuse_space
Time: 2025-06-26 15:59:03 IST
Duration: 10.08s, Total samples = 3.48MB
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top10
Showing nodes accounting for 2028.13kB, 56.90% of 3564.65kB total
Showing top 10 nodes out of 34
      flat  flat%   sum%        cum   cum%
 1770.33kB 49.66% 49.66%  1770.33kB 49.66%  io.ReadAll
 -768.26kB 21.55% 28.11%  -768.26kB 21.55%  go.uber.org/zap/zapcore.newCounters (inline)
     514kB 14.42% 42.53%      514kB 14.42%  bufio.NewWriterSize
  512.06kB 14.37% 56.90%   512.06kB 14.37%  internal/profile.init.func2
         0     0% 56.90%  -768.26kB 21.55%  go.uber.org/zap.(*Logger).WithOptions
         0     0% 56.90%  -768.26kB 21.55%  go.uber.org/zap.Config.Build
         0     0% 56.90%  -768.26kB 21.55%  go.uber.org/zap.Config.buildOptions.WrapCore.func5
         0     0% 56.90%  -768.26kB 21.55%  go.uber.org/zap.Config.buildOptions.func1
         0     0% 56.90%  -768.26kB 21.55%  go.uber.org/zap.New
         0     0% 56.90%  -768.26kB 21.55%  go.uber.org/zap.optionFunc.apply

Special notes for your reviewer:
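The profiling endpoints are served over plain HTTP with no authentication by default, so anyone who can reach ports 9500 and 9501 can collect profiles. Please consider this before enabling them in production.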

Release note:

Expose pprof profiling endpoints on vsphere-csi-controller (port 9500) and vsphere-syncer (port 9501).

@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress and cncf-cla: yes labels Jun 25, 2025
@k8s-ci-robot
Contributor

Hi @kolluria. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test and size/S labels Jun 25, 2025
@kolluria kolluria force-pushed the enable-profile-server branch from 792a266 to 10b29cc Compare June 25, 2025 17:34
@k8s-ci-robot k8s-ci-robot added the size/M label and removed the size/S label Jun 25, 2025
@kolluria kolluria marked this pull request as ready for review June 26, 2025 11:04
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress label Jun 26, 2025
@kolluria kolluria force-pushed the enable-profile-server branch from 10b29cc to 69ad3c7 Compare June 26, 2025 11:09
@divyenpatel
Member

/ok-to-test

@k8s-ci-robot k8s-ci-robot added the ok-to-test label and removed the needs-ok-to-test label Jun 26, 2025
Member

@divyenpatel divyenpatel left a comment

/approve

@k8s-ci-robot k8s-ci-robot added the approved label Jun 26, 2025
Member

@divyenpatel divyenpatel left a comment

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm label Jun 26, 2025
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: divyenpatel, kolluria

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot merged commit 9efa6ed into kubernetes-sigs:master Jun 26, 2025
12 checks passed
@kolluria kolluria deleted the enable-profile-server branch October 4, 2025 09:42
Labels

approved, cncf-cla: yes, lgtm, ok-to-test, size/M
