Add health check on API server by merenbach · Pull Request #522 · argoproj/argo-cd

merenbach · 2018-08-20T18:49:44Z

Closes #520. Leaving out repo server health checks since we may need to do a SPIKE to determine scope.

Visit 127.0.0.1:8080/healthz (or any production /healthz endpoint) to check the health of the API server. The endpoint returns the text ok and a 200 status code if all is well:

$ curl -i 127.0.0.1:8080/healthz
HTTP/1.1 200 OK
Date: Mon, 20 Aug 2018 22:22:46 GMT
Content-Length: 3
Content-Type: text/plain; charset=utf-8

ok

...and a 503 status code otherwise:

$ curl -i 127.0.0.1:8080/healthz
HTTP/1.1 503 Service Unavailable
Date: Mon, 20 Aug 2018 22:26:51 GMT
Content-Length: 0
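
For scripted checks, the two responses above can be told apart by status code alone. The following is a minimal sketch, not part of this PR: the helper names `healthz_verdict` and `check_healthz` are hypothetical, and the default URL is simply the local address used in the examples above.

```shell
#!/bin/sh
# Hypothetical helper: map an HTTP status code returned by /healthz to a verdict.
healthz_verdict() {
    case "$1" in
        200) echo "healthy" ;;
        503) echo "unhealthy" ;;
        000) echo "unreachable" ;;  # curl reports 000 when it received no HTTP response
        *)   echo "unexpected" ;;
    esac
}

# Hypothetical wrapper: ask curl for the status code only and classify it.
# The default URL is an assumption taken from the local example above.
check_healthz() {
    url="${1:-http://127.0.0.1:8080/healthz}"
    code=$(curl -s -o /dev/null -w '%{http_code}' "$url")
    healthz_verdict "$code"
}
```

Calling `check_healthz` with no arguments would query the local server; passing a URL targets another deployment.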

jessesuen · 2018-08-21T23:04:50Z

manifests/components/04d_argocd-server-deployment.yaml

A 3-second interval is a bit too aggressive. Let's raise this to 30 seconds. Also, it doesn't make sense for the liveness probe's initialDelaySeconds to be less than the readiness probe's delay. Let's start liveness at 30.

jessesuen · 2018-08-21T23:32:53Z

manifests/components/04d_argocd-server-deployment.yaml

We can be more aggressive for readiness than liveness since it only happens during server startup. Plus, our server comes up very quickly. We should use the same healthz endpoint to verify the pod can talk to k8s before claiming ready:

httpGet:
  path: /healthz
  port: 8080
initialDelaySeconds: 2
periodSeconds: 1
failureThreshold: 30

jessesuen · 2018-08-21T23:39:43Z

@merenbach did you test this end-to-end? How does this handle the case where the API server is served over HTTPS vs. HTTP?

merenbach · 2018-08-22T17:43:50Z

@jessesuen this is now tested with Docker images locally. The probes are in place and come up as intended in minikube when the argocd-server deployment is created and the argocd-server service starts. I believe it's working fine with HTTPS, per the following output based on a service created with an HTTPS-only manifest:

NAME                 TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)         AGE
argocd-repo-server   ClusterIP   10.110.146.142   <none>        8081/TCP        18m
argocd-server        NodePort    10.111.16.67     <none>        443:31797/TCP   5s
WDHL169c4b885:argo-cd amerenbach$ kubectl proxy -n argocd argocd-server
Starting to serve on 127.0.0.1:8001

I'm able to visit in a browser and the proxy seems to be handling this fine. Please let me know if this is what we were looking for.

jessesuen · 2018-08-22T19:54:10Z

manifests/components/04d_argocd-server-deployment.yaml

+            port: 8080
+          initialDelaySeconds: 2
+          periodSeconds: 1
+          failureThreshold: 30


My understanding of readiness was wrong. Readiness is not just checked during server startup; it applies throughout the lifetime of the pod, so we cannot be so aggressive. Let's remove liveness entirely and simply have readiness with the following settings:

readinessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 3
  periodSeconds: 30
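
To make the difference between the two proposals concrete, here is a rough arithmetic sketch. It is an approximation only: it assumes the first probe fires at about initialDelaySeconds, ignores timeoutSeconds and any kubelet jitter, and uses the Kubernetes default failureThreshold of 3 when none is set. The function names are hypothetical.

```shell
#!/bin/sh
# Approximate seconds after container start at which a probe that fails from
# the outset reaches its failure threshold (probe N fires at delay + (N-1)*period).
probe_fail_after() {
    initial_delay=$1
    period=$2
    threshold=${3:-3}   # Kubernetes default failureThreshold is 3
    echo $(( initial_delay + (threshold - 1) * period ))
}

# Number of probe requests per hour for a given periodSeconds.
probes_per_hour() {
    echo $(( 3600 / $1 ))
}

probe_fail_after 2 1 30   # earlier proposal: prints 31
probe_fail_after 3 30     # settings above, default threshold: prints 63
probes_per_hour 1         # prints 3600
probes_per_hour 30        # prints 120
```

Because readiness is probed for the whole lifetime of the pod, a 1-second period means roughly 3600 health checks per hour, against 120 with the 30-second period.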

jessesuen · 2018-08-22T22:11:30Z

> per the following output based on a service created with an HTTPS-only manifest:

Can you paste the kubectl get pod argocd-server output with this in place? The readiness column is the most relevant.
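
A small sketch of how the READY column could be checked in a script. The helper name `all_containers_ready` is hypothetical. The namespace and label selector in the usage comment are the ones used elsewhere in this thread, and that command is shown only as a comment because it needs a live cluster.

```shell
#!/bin/sh
# Hypothetical helper: succeed when a READY value such as "2/2" shows that
# every container in the pod is ready.
all_containers_ready() {
    ready=${1%/*}   # text before the slash
    total=${1#*/}   # text after the slash
    [ "$total" -gt 0 ] && [ "$ready" -eq "$total" ]
}

# Possible usage against a cluster (not executed here):
#   ready=$(kubectl -n argocd get pods -l app=argocd-server --no-headers | awk '{print $2}')
#   all_containers_ready "$ready" && echo "pod is ready"
```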

merenbach · 2018-08-22T22:39:01Z

@jessesuen Here's a Bourne shell script I threw together to run an e2e test on this feature:

#! /bin/sh -x

build_containers() {
	docker pull argoproj/argocd-ui

	USERNAME=$1
	for CONTAINER in argocd-application-controller argocd-repo-server argocd-server argocd-ui
	do
		docker tag "argoproj/${CONTAINER}" "${USERNAME}/${CONTAINER}"
		docker push "${USERNAME}/${CONTAINER}"
	done
}

apply_manifest() {
	cat manifests/components/*.yaml | sed "s/argoproj\/\([^:]*\):.*$/andrewdm\/\1:latest/g" | tee | kubectl -n argocd apply -f -
}

fullstatus() {
	kubectl -n argocd get pods
	kubectl -n argocd get deployments
	kubectl -n argocd get services
}

## Note the following manifest changes made by hand before this script is run:
# $ git diff
# diff --git a/manifests/components/04e_argocd-server-service.yaml b/manifests/components/04e_argocd-server-service.yaml
# index 3df6cbc..9b2daa9 100644
# --- a/manifests/components/04e_argocd-server-service.yaml
# +++ b/manifests/components/04e_argocd-server-service.yaml
# @@ -4,11 +4,12 @@ kind: Service
#  metadata:
#    name: argocd-server
#  spec:
# +  type: NodePort
#    ports:
# -  - name: http
# -    protocol: TCP
# -    port: 80
# -    targetPort: 8080
# +  # - name: http
# +  #   protocol: TCP
# +  #   port: 80
# +  #   targetPort: 8080
#    - name: https
#      protocol: TCP
#      port: 443

fullstatus

build_containers andrewdm
apply_manifest
kubectl -n argocd rollout status deployment/argocd-server
kubectl -n argocd get pods -l app=argocd-server
# can get pod name with: kubectl -n argocd get pods --no-headers -l app=argocd-server -o jsonpath='{.items[0].metadata.name}'

fullstatus

kubectl proxy -n argocd argocd-server &
sleep 5

curl -w "\n" -s 127.0.0.1:8001/healthz && echo 'Reached health endpoint successfully' || echo 'An error occurred'

pkill -f 'kubectl proxy'

Here's the output:

+ fullstatus
+ kubectl -n argocd get pods
No resources found.
+ kubectl -n argocd get deployments
No resources found.
+ kubectl -n argocd get services
No resources found.
+ build_containers andrewdm
+ docker pull argoproj/argocd-ui
Using default tag: latest
latest: Pulling from argoproj/argocd-ui
Digest: sha256:fdf1dae7a1d7a233788ac463fd6422dfa6a4d1b9b7c91c12397708b5a79c246a
Status: Image is up to date for argoproj/argocd-ui:latest
+ USERNAME=andrewdm
+ for CONTAINER in argocd-application-controller argocd-repo-server argocd-server argocd-ui
+ docker tag argoproj/argocd-application-controller andrewdm/argocd-application-controller
+ docker push andrewdm/argocd-application-controller
The push refers to repository [docker.io/andrewdm/argocd-application-controller]
166f2e176f83: Layer already exists 
ded1db22cea2: Layer already exists 
55b09646bfa3: Layer already exists 
7367f69251d8: Layer already exists 
8568818b1f7f: Layer already exists 
latest: digest: sha256:cc8a86b1ad5755c3de7593892b2bf4bbc970ec8f6b1265ec4146d2fdf8b95301 size: 1377
+ for CONTAINER in argocd-application-controller argocd-repo-server argocd-server argocd-ui
+ docker tag argoproj/argocd-repo-server andrewdm/argocd-repo-server
+ docker push andrewdm/argocd-repo-server
The push refers to repository [docker.io/andrewdm/argocd-repo-server]
166f2e176f83: Layer already exists 
ded1db22cea2: Layer already exists 
55b09646bfa3: Layer already exists 
7367f69251d8: Layer already exists 
8568818b1f7f: Layer already exists 
latest: digest: sha256:cc8a86b1ad5755c3de7593892b2bf4bbc970ec8f6b1265ec4146d2fdf8b95301 size: 1377
+ for CONTAINER in argocd-application-controller argocd-repo-server argocd-server argocd-ui
+ docker tag argoproj/argocd-server andrewdm/argocd-server
+ docker push andrewdm/argocd-server
The push refers to repository [docker.io/andrewdm/argocd-server]
166f2e176f83: Layer already exists 
ded1db22cea2: Layer already exists 
55b09646bfa3: Layer already exists 
7367f69251d8: Layer already exists 
8568818b1f7f: Layer already exists 
latest: digest: sha256:cc8a86b1ad5755c3de7593892b2bf4bbc970ec8f6b1265ec4146d2fdf8b95301 size: 1377
+ for CONTAINER in argocd-application-controller argocd-repo-server argocd-server argocd-ui
+ docker tag argoproj/argocd-ui andrewdm/argocd-ui
+ docker push andrewdm/argocd-ui
The push refers to repository [docker.io/andrewdm/argocd-ui]
51953d49ea9a: Layer already exists 
717b092b8c86: Layer already exists 
latest: digest: sha256:fdf1dae7a1d7a233788ac463fd6422dfa6a4d1b9b7c91c12397708b5a79c246a size: 740
+ apply_manifest
+ sed 's/argoproj\/\([^:]*\):.*$/andrewdm\/\1:latest/g'
+ tee
+ kubectl -n argocd apply -f -
+ cat manifests/components/01a_application-crd.yaml manifests/components/01b_appproject-crd.yaml manifests/components/02a_argocd-cm.yaml manifests/components/02b_argocd-secret.yaml manifests/components/02c_argocd-rbac-cm.yaml manifests/components/03a_application-controller-sa.yaml manifests/components/03b_application-controller-role.yaml manifests/components/03c_application-controller-rolebinding.yaml manifests/components/03d_application-controller-deployment.yaml manifests/components/04a_argocd-server-sa.yaml manifests/components/04b_argocd-server-role.yaml manifests/components/04c_argocd-server-rolebinding.yaml manifests/components/04d_argocd-server-deployment.yaml manifests/components/04e_argocd-server-service.yaml manifests/components/05a_argocd-repo-server-deployment.yaml manifests/components/05b_argocd-repo-server-service.yaml
customresourcedefinition.apiextensions.k8s.io/applications.argoproj.io configured
customresourcedefinition.apiextensions.k8s.io/appprojects.argoproj.io configured
configmap/argocd-cm configured
secret/argocd-secret unchanged
configmap/argocd-rbac-cm unchanged
serviceaccount/application-controller unchanged
role.rbac.authorization.k8s.io/application-controller-role unchanged
rolebinding.rbac.authorization.k8s.io/application-controller-role-binding unchanged
deployment.apps/application-controller created
serviceaccount/argocd-server unchanged
role.rbac.authorization.k8s.io/argocd-server-role unchanged
rolebinding.rbac.authorization.k8s.io/argocd-server-role-binding unchanged
deployment.apps/argocd-server created
service/argocd-server created
deployment.apps/argocd-repo-server created
service/argocd-repo-server created
+ kubectl -n argocd rollout status deployment/argocd-server
Waiting for deployment "argocd-server" rollout to finish: 0 of 1 updated replicas are available...
deployment "argocd-server" successfully rolled out
+ kubectl -n argocd get pods -l app=argocd-server
NAME                             READY     STATUS    RESTARTS   AGE
argocd-server-8578f88df7-7p66r   2/2       Running   0          9s
+ fullstatus
+ kubectl -n argocd get pods
NAME                                      READY     STATUS    RESTARTS   AGE
application-controller-7dcdddf77f-zp49l   1/1       Running   0          10s
argocd-repo-server-74f5fcdf7-89wgs        1/1       Running   0          10s
argocd-server-8578f88df7-7p66r            2/2       Running   0          10s
+ kubectl -n argocd get deployments
NAME                     DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
application-controller   1         1         1            1           10s
argocd-repo-server       1         1         1            1           10s
argocd-server            1         1         1            1           10s
+ kubectl -n argocd get services
NAME                 TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)         AGE
argocd-repo-server   ClusterIP   10.99.168.16    <none>        8081/TCP        10s
argocd-server        NodePort    10.102.208.29   <none>        443:32508/TCP   10s
+ sleep 5
+ kubectl proxy -n argocd argocd-server
Starting to serve on 127.0.0.1:8001
+ curl -w '\n' -s 127.0.0.1:8001/healthz
ok
+ echo 'Reached health endpoint successfully'
Reached health endpoint successfully
+ pkill -f 'kubectl proxy'
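
One caveat when reading the trace above: as far as I know, `kubectl proxy` exposes the Kubernetes API server on 127.0.0.1:8001, and that server has its own /healthz endpoint, so the `ok` may not have come from argocd-server. A request can be routed to a service through the proxy by using the API server's service-proxy path. The sketch below only builds that URL. The function name is hypothetical, and the `https` port name is taken from the service manifest diff quoted in the script.

```shell
#!/bin/sh
# Hypothetical helper: build the URL that reaches a service endpoint through
# a local `kubectl proxy` listening on 127.0.0.1:8001.
service_proxy_url() {
    namespace=$1
    scheme=$2      # "http" or "https", as served by the target service
    service=$3
    port=$4        # service port name
    path=$5
    echo "http://127.0.0.1:8001/api/v1/namespaces/${namespace}/services/${scheme}:${service}:${port}/proxy${path}"
}

service_proxy_url argocd https argocd-server https /healthz
# prints http://127.0.0.1:8001/api/v1/namespaces/argocd/services/https:argocd-server:https/proxy/healthz
```

Passing the printed URL to `curl -s -f` would make the check fail on a 503 as well as on a connection error.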

merenbach · 2018-08-22T22:43:06Z

@jessesuen also adding a pod describe:

$ kubectl -n argocd describe pods -l app=argocd-server
Name:           argocd-server-8578f88df7-7p66r
Namespace:      argocd
Node:           minikube/10.0.2.15
Start Time:     Wed, 22 Aug 2018 15:38:08 -0700
Labels:         app=argocd-server
                pod-template-hash=4134944893
Annotations:    <none>
Status:         Running
IP:             172.17.0.5
Controlled By:  ReplicaSet/argocd-server-8578f88df7
Init Containers:
  copyutil:
    Container ID:  docker://8500f5a9a9288983f318a175072e28bb90dbd7789e053a2e18d03f63e2fc28f3
    Image:         andrewdm/argocd-server:latest
    Image ID:      docker-pullable://andrewdm/argocd-application-controller@sha256:cc8a86b1ad5755c3de7593892b2bf4bbc970ec8f6b1265ec4146d2fdf8b95301
    Port:          <none>
    Host Port:     <none>
    Command:
      cp
      /argocd-util
      /shared
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Wed, 22 Aug 2018 15:38:12 -0700
      Finished:     Wed, 22 Aug 2018 15:38:12 -0700
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /shared from static-files (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from argocd-server-token-dx8tw (ro)
  ui:
    Container ID:  docker://60642e037245165caf2ea04162b60227d8b52006510a6ea411151f9bef7aa774
    Image:         andrewdm/argocd-ui:latest
    Image ID:      docker-pullable://andrewdm/argocd-ui@sha256:fdf1dae7a1d7a233788ac463fd6422dfa6a4d1b9b7c91c12397708b5a79c246a
    Port:          <none>
    Host Port:     <none>
    Command:
      cp
      -r
      /app
      /shared
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Wed, 22 Aug 2018 15:38:14 -0700
      Finished:     Wed, 22 Aug 2018 15:38:15 -0700
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /shared from static-files (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from argocd-server-token-dx8tw (ro)
Containers:
  argocd-server:
    Container ID:  docker://1a112368f84e7fd5b34350e0acbe9fe397e4ee5f662e52afbc783427f3628f25
    Image:         andrewdm/argocd-server:latest
    Image ID:      docker-pullable://andrewdm/argocd-application-controller@sha256:cc8a86b1ad5755c3de7593892b2bf4bbc970ec8f6b1265ec4146d2fdf8b95301
    Port:          <none>
    Host Port:     <none>
    Command:
      /argocd-server
      --staticassets
      /shared/app
      --repo-server
      argocd-repo-server:8081
    State:          Running
      Started:      Wed, 22 Aug 2018 15:38:17 -0700
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /shared from static-files (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from argocd-server-token-dx8tw (ro)
  dex:
    Container ID:  docker://ea38e5bc8c96437e252589087b4f9f747479ff726344fd108883447940c1180e
    Image:         quay.io/coreos/dex:v2.10.0
    Image ID:      docker-pullable://quay.io/coreos/dex@sha256:218f898d8f0cbbb190c76404bb13d599ac64c64384a999472e2278ed4e34496f
    Port:          <none>
    Host Port:     <none>
    Command:
      /shared/argocd-util
      rundex
    State:          Running
      Started:      Wed, 22 Aug 2018 15:38:17 -0700
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /shared from static-files (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from argocd-server-token-dx8tw (ro)
Conditions:
  Type           Status
  Initialized    True 
  Ready          True 
  PodScheduled   True 
Volumes:
  static-files:
    Type:    EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:  
  argocd-server-token-dx8tw:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  argocd-server-token-dx8tw
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type    Reason                 Age   From               Message
  ----    ------                 ----  ----               -------
  Normal  Scheduled              4m    default-scheduler  Successfully assigned argocd-server-8578f88df7-7p66r to minikube
  Normal  SuccessfulMountVolume  4m    kubelet, minikube  MountVolume.SetUp succeeded for volume "static-files"
  Normal  SuccessfulMountVolume  4m    kubelet, minikube  MountVolume.SetUp succeeded for volume "argocd-server-token-dx8tw"
  Normal  Pulling                4m    kubelet, minikube  pulling image "andrewdm/argocd-server:latest"
  Normal  Pulled                 4m    kubelet, minikube  Successfully pulled image "andrewdm/argocd-server:latest"
  Normal  Created                4m    kubelet, minikube  Created container
  Normal  Started                4m    kubelet, minikube  Started container
  Normal  Pulling                4m    kubelet, minikube  pulling image "andrewdm/argocd-ui:latest"
  Normal  Pulled                 4m    kubelet, minikube  Successfully pulled image "andrewdm/argocd-ui:latest"
  Normal  Created                4m    kubelet, minikube  Created container
  Normal  Started                4m    kubelet, minikube  Started container
  Normal  Pulling                4m    kubelet, minikube  pulling image "andrewdm/argocd-server:latest"
  Normal  Pulled                 4m    kubelet, minikube  Successfully pulled image "andrewdm/argocd-server:latest"
  Normal  Created                4m    kubelet, minikube  Created container
  Normal  Started                4m    kubelet, minikube  Started container
  Normal  Pulled                 4m    kubelet, minikube  Container image "quay.io/coreos/dex:v2.10.0" already present on machine
  Normal  Created                4m    kubelet, minikube  Created container
  Normal  Started                4m    kubelet, minikube  Started container

merenbach requested a review from jessesuen August 20, 2018 23:12

merenbach changed the title ~~[WIP] Add health check on API server~~ Add health check on API server Aug 20, 2018

jessesuen requested changes Aug 21, 2018

Andrew Merenbach added 25 commits August 21, 2018 16:37

Add app health endpoints

bfa2819

Update generated files

e4fbadb

Revert "Update generated files"

643b23a

This reverts commit 40f490797645ed0f30d05785748e3919dea31b7f.

Revert "Add app health endpoints"

7e72c4d

This reverts commit 650688dd2ee4a533e29b7df69e0bbb2436eead6b.

Add dedicated health endpoint

e56de4f

Update generated files

30b797d

Slim down basic server

5c6ac41

Update generated files

949df94

Update health server creation

64a9f7d

Fix import, endpoint casing

4152322

Flesh out basic health check

b0c61f4

Add additional endpoints, fix check, thanks @jessesuen

c9ceed7

Fix errors

390b58d

Update generated files

3a9d672

Simplify health check, update endpoint

2995ee5

Update generated files

0c6593e

Factor out health check code

4918f21

Update generated files

fc83987

Rm health endpoint

7c07edf

Add healthz utility

c3d10be

Log error instead of printing it

64c2e61

Update comment

d92dfb6

Add liveness, readiness probes to manifest for API server

359993e

Add health check test

58e7ead

Tweak timeouts, endpoints in probes, thanks @jessesuen

1f336e4

jessesuen reviewed Aug 22, 2018

Tweak probes, thanks @jessesuen

ad1d988

jessesuen approved these changes Aug 23, 2018

merenbach merged commit 8d9e4fa into argoproj:master Aug 23, 2018

merenbach added a commit that referenced this pull request Aug 23, 2018

Revert "Add health check on API server (#522)"

6c10801

This reverts commit 8d9e4fa.

This was referenced Aug 23, 2018

Revert "Add health check on API server" #530

Closed

Revert readiness probe for now #531

Closed

mukulikak added this to the 0.8 milestone Aug 27, 2018

merenbach deleted the 520-add-health-probes branch October 24, 2018 21:08

leoluz mentioned this pull request Mar 13, 2025

chore: incorporate gitops-engine lib preserving history #22341

Closed

leoluz mentioned this pull request Sep 23, 2025

chore: gitops-engine migration #24710

Merged

merenbach merged 26 commits into argoproj:master from merenbach:520-add-health-probes
