Skip to content

Add support for dataset,runtime,webhook pprof#985

Merged
cheyang merged 1 commit intofluid-cloudnative:masterfrom
wangyanghack:master
Jan 6, 2022
Merged

Add support for dataset,runtime,webhook pprof#985
cheyang merged 1 commit intofluid-cloudnative:masterfrom
wangyanghack:master

Conversation

@wangyanghack
Copy link
Contributor

Ⅰ. Describe what this PR does

add instant live visualization of statistics for performance analysis

Ⅱ. Does this pull request fix one issue?

fixes #780

Ⅲ. List the added test cases (unit test/integration test) if any, please explain if no tests are needed.

Ⅳ. Describe how to verify it

Ⅴ. Special notes for reviews

I cannot decide on nodeport number. please give me some advice. thanks.

@cheyang
Copy link
Collaborator

cheyang commented Aug 24, 2021

Thanks very much for your great contributions. If you'd like to add documentation, it will be great!

@wangyanghack
Copy link
Contributor Author

Thanks very much for your great contributions. If you'd like to add documentation, it will be great!
Thanks very much for your response. I will add documentation recently.
And Can you give me some advice to decide which number nodeport use or configurable? thanks

@cheyang
Copy link
Collaborator

cheyang commented Aug 26, 2021

And Can you give me some advice to decide which number nodeport use or configurable? thanks

I think you can make nodeport configurable and the default port could be 50000.

Could you rebase you code so that it can fix the CI. Thanks.

@wangyanghack wangyanghack force-pushed the master branch 4 times, most recently from 3db5999 to 8d9bac0 Compare August 29, 2021 14:50
@codecov
Copy link

codecov bot commented Aug 29, 2021

Codecov Report

Merging #985 (7fdf4e0) into master (ab54f1e) will increase coverage by 0.01%.
The diff coverage is 79.16%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #985      +/-   ##
==========================================
+ Coverage   70.48%   70.50%   +0.01%     
==========================================
  Files         221      222       +1     
  Lines       12122    12146      +24     
==========================================
+ Hits         8544     8563      +19     
- Misses       2659     2662       +3     
- Partials      919      921       +2     
Impacted Files Coverage Δ
pkg/utils/kubeclient/exec.go 25.97% <ø> (ø)
pkg/utils/pprof.go 79.16% <79.16%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ab54f1e...7fdf4e0. Read the comment docs.

@wangyanghack
Copy link
Contributor Author

wangyanghack commented Aug 30, 2021

I think you can make nodeport configurable and the default port could be 50000.
Could you rebase you code so that it can fix the CI. Thanks.

I have fixed nodeport and passed the CI.

@cheyang cheyang requested a review from xieydd September 1, 2021 09:35
Copy link
Member

@yangyuliufeng yangyuliufeng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have tested this pr and found no problems.
I suggest changing the naming of some files such as the svc name.
Also, I'm worried that it's unsafe to expose program indicators through HTTP.

apiVersion: v1
kind: Service
metadata:
name: fluid-pod-alluxioruntime-controller
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not suitable to be 'fluid-pod-alluxioruntime-controller', how about 'fluid-alluxioruntime-controller-ststic'

Comment on lines +49 to +51
statistics:
enabled: false
nodePort: 50002
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's better to put it in runtime.controllerStatistics, not three pieces.
我觉得统一写一个runtime.controllerStatistics比较好,不要分三块

@wangyanghack
Copy link
Contributor Author

wangyanghack commented Sep 13, 2021

Also, I'm worried that it's unsafe to expose program indicators through HTTP.

@yangyuliufeng @cheyang I cannot decide where tls crt and key should be.
My plan is :

  1. execute a certificate.sh to generate tls crt and key
  2. execute kubectl create secret tls tls-secret --cert=path/to/tls.cert --key=path/to/tls.key to generate a secret
  3. pod of fluid volumemount this secret to use tls crt and key

But my plan has a problem: it cannot execute certificate.sh and create secret automatically when helm install.
Please give me some advice. Thanks very much.

@@ -0,0 +1,16 @@
{{ if .Values.dataset.controller.statistics.enabled -}}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe the service is not necessary. I think we can connect to ip address of pod directly.

"net/http"
"os"

"github.com/arl/statsviz"
Copy link
Collaborator

@cheyang cheyang Sep 14, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about using "net/http/pprof" ? Kubernetes is also using this. And we can avoid importing new package.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks very much for your good advice. I will use pprof to implement it.

@cheyang
Copy link
Collaborator

cheyang commented Sep 14, 2021

Also, I'm worried that it's unsafe to expose program indicators through HTTP.

@yangyuliufeng @cheyang I cannot decide where tls crt and key should be.
My plan is :

  1. execute a certificate.sh to generate tls crt and key
  2. execute kubectl create secret tls tls-secret --cert=path/to/tls.cert --key=path/to/tls.key to generate a secret
  3. pod of fluid volumemount this secret to use tls crt and key

But my plan has a problem: it cannot execute certificate.sh and create secret automatically when helm install.
Please give me some advice. Thanks very much.

Thank you for your contribution! My suggestion is that if we can leverage the implementation in Kubernetes. Btw, is it enabled by default? https://github.com/kubernetes/kubernetes/blob/ea0764452222146c47ec826977f49d7001b0ea8c/staging/src/k8s.io/apiserver/pkg/server/routes/profiling.go#L30

@cheyang
Copy link
Collaborator

cheyang commented Sep 14, 2021

I cannot decide on nodeport number. please give me some advice. thanks.

I think service to expose is not necessary for debugging.

@cheyang
Copy link
Collaborator

cheyang commented Oct 3, 2021

@wangyanghack could you refer to https://github.com/banzaicloud/logging-operator/blob/master/main.go#L122-L129? thanks in advance!

@wangyanghack wangyanghack force-pushed the master branch 2 times, most recently from e6315e1 to 209a813 Compare October 5, 2021 02:24
@wangyanghack
Copy link
Contributor Author

wangyanghack commented Oct 5, 2021

@wangyanghack could you refer to https://github.com/banzaicloud/logging-operator/blob/master/main.go#L122-L129? thanks in advance!

Thank you very much for your careful guidance and patience.
I have followed your advice to add pprof. Please review it and give me more advice.

@wangyanghack
Copy link
Contributor Author

wangyanghack commented Nov 8, 2021

@cheyang please take a moment to review my code. I will apologize in advance if it is not still satisfied with you again. Thanks a lot.

@cheyang
Copy link
Collaborator

cheyang commented Dec 6, 2021

@wangyanghack Please fix the conflict, thanks.

@wangyanghack wangyanghack reopened this Dec 18, 2021
@wangyanghack wangyanghack reopened this Dec 29, 2021
@wangyanghack wangyanghack changed the title add instant live visualization of statistics for performance analysis Add support for dataset,runtime,webhook pprof Dec 29, 2021
@wangyanghack wangyanghack force-pushed the master branch 4 times, most recently from 733278d to 368bdaf Compare December 29, 2021 12:38
@wangyanghack wangyanghack reopened this Dec 29, 2021
}))

if pprofAddr != "" {
glog.Infof("Enabling pprof with address %s", pprofAddr)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use the same log system in the controller. I suggest not to use glog here rahter than github.com/go-logr/logr.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

}
}))

if pprofAddr != "" {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about to create a common function to launch pprof? This piece of code has been repeated several times. Just a suggestion, WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, I will create a common function now

@@ -0,0 +1,47 @@
package utils
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add license header. Thanks.

Addr: pprofAddr,
Handler: mux,
}
glog.Infof("Starting pprof HTTP server at %s", pprofServer.Addr)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use logr instead of glog. Thanks.

Copy link
Member

@xieydd xieydd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wangyanghack We can add test for new func to improve the diff hit of codecov, thanks.

@wangyanghack wangyanghack force-pushed the master branch 2 times, most recently from acc6f84 to 6e33863 Compare January 5, 2022 12:46
Signed-off-by: 汪洋 <1264017613@qq.com>
@cheyang cheyang merged commit 97a099c into fluid-cloudnative:master Jan 6, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[FEATURES] Instant live visualization of Fluid components statistics (GC, MemStats, etc.).

4 participants