Commit 8cae922 (parent 190eb90)

Refactor doc, add state diagram for cluster, fix typo, style and grammar

File tree: 8 files changed, +199 −240 lines

docs/architecture.md (55 additions & 38 deletions)
# Design & Architecture

**Spark-Kubernetes-Operator** (Operator) acts as a control plane to manage the complete
deployment lifecycle of Spark applications and clusters. The Operator can be installed on
Kubernetes cluster(s) using Helm. In most production environments it is typically deployed in a
designated namespace and controls Spark workloads in one or more managed namespaces.
Spark Operator enables users to describe Spark applications or clusters as
[Custom Resources](https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/).

The Operator continuously tracks events related to the Spark custom resources in its
reconciliation loops:

For SparkApplications:

* User submits a SparkApplication custom resource (CR) using kubectl / API
* Operator launches the driver and observes its status
* Operator observes driver-spawned resources (e.g. executors) and records their status until the app terminates
* Operator releases all Spark-app-owned resources back to the cluster
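As a concrete illustration of the first step, a SparkApplication CR might look roughly like the sketch below. The `apiVersion` value and every field under `spec` are illustrative assumptions, not the operator's actual CRD schema; consult the project's CRD reference for the real field names.

```yaml
# Illustrative sketch only -- apiVersion and all spec fields are assumptions,
# not the actual SparkApplication CRD schema.
apiVersion: spark.apache.org/v1alpha1
kind: SparkApplication
metadata:
  name: spark-pi
  namespace: spark-apps   # a namespace managed by the Operator
spec:
  mainClass: org.apache.spark.examples.SparkPi   # hypothetical field
```

Applied with `kubectl apply -f spark-pi.yaml`, after which the reconciliation loop above drives the cluster toward this desired state.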

For SparkClusters:

* User submits a SparkCluster custom resource (CR) using kubectl / API
* Operator launches the master and worker(s) based on the CR spec and observes their status
* Operator releases all Spark-cluster-owned resources back to the cluster upon failure
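Similarly, a SparkCluster CR could be sketched as below; again, the `apiVersion` and the `spec` field are assumptions rather than the real schema.

```yaml
# Illustrative sketch only -- field names are assumptions, not the actual
# SparkCluster CRD schema.
apiVersion: spark.apache.org/v1alpha1
kind: SparkCluster
metadata:
  name: spark-standalone
spec:
  workerInstances: 3   # hypothetical field: desired number of worker pods
```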

The Operator is built with the [Java Operator SDK](https://javaoperatorsdk.io/), which it uses for
launching Spark deployments and submitting jobs under the hood. It also uses the
[fabric8](https://fabric8.io/) client to interact with the Kubernetes API Server.

## Application State Transition

[<img src="resources/application_state_machine.png">](resources/application_state_machine.png)

* Spark applications are expected to run from submitted to succeeded before releasing resources.
* User may configure the app CR to time out if it cannot reach a healthy state within a given
  threshold. The timeout can be configured for different lifecycle stages, e.g. while the driver
  is starting and while executor pods are being requested. To update the default thresholds,
  configure `.spec.applicationTolerations.applicationTimeoutConfig` for the application.
* K8s resources created for an application are deleted as the final stage of the application
  lifecycle by default. This ensures that resource quota is released for completed applications.
* It is also possible to retain the created k8s resources for debugging or auditing purposes. To
  do so, set `.spec.applicationTolerations.resourceRetainPolicy` to `OnFailure` to retain
  resources upon application failure, or to `Always` to retain resources regardless of the
  application's final state.
  - This controls the behavior of the k8s resources created by the Operator for the application,
    including the driver pod, config map, service, and PVC (if enabled). It does not apply to
    resources created by the driver (for example, executor pods). Users may configure SparkConf
    to include `spark.kubernetes.executor.deleteOnTermination` for executor retention; please
    refer to the [Spark docs](https://spark.apache.org/docs/latest/running-on-kubernetes.html)
    for details.
  - The created k8s resources have an `ownerReference` to their related `SparkApplication` custom
    resource, so that they can be garbage collected when the `SparkApplication` is deleted.
  - Please be advised that k8s resources are not retained if the application is configured to
    restart. This avoids unexpected growth in resource quota usage and resource conflicts among
    multiple attempts.
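Putting the two settings above together, the relevant CR fragment might look as follows. The paths `.spec.applicationTolerations.resourceRetainPolicy` (with values `OnFailure` / `Always`) and `.spec.applicationTolerations.applicationTimeoutConfig` come from this document; the timeout sub-field names shown are hypothetical placeholders.

```yaml
spec:
  applicationTolerations:
    # Retain created k8s resources when the application fails (or use Always)
    resourceRetainPolicy: OnFailure
    applicationTimeoutConfig:
      # Sub-field names below are illustrative assumptions
      driverStartTimeoutMillis: 300000
      executorStartTimeoutMillis: 300000
```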

## Cluster State Transition

[<img src="resources/cluster_state_machine.png">](resources/cluster_state_machine.png)

* Spark clusters are expected to keep running after they are submitted.
* Similar to Spark applications, k8s resources created for a cluster are deleted as the final
  stage of the cluster lifecycle by default.

docs/configuration.md (89 additions & 4 deletions)
Spark Operator supports different ways to configure the behavior:

To enable hot properties loading, update the **helm chart values file** with

```yaml
operatorConfiguration:
  spark-operator.properties: |+
    spark.operator.dynamic.config.enabled=true
    # ... all other config overrides ...
  dynamicConfig:
    create: true
```

## Metrics

Spark Operator, following [Apache Spark](https://spark.apache.org/docs/latest/monitoring.html#metrics),
has a configurable metrics system based on the
[Dropwizard Metrics Library](https://metrics.dropwizard.io/4.2.25/). Note that Spark Operator does
not have a Spark UI, so MetricsServlet and PrometheusServlet from the
`org.apache.spark.metrics.sink` package are not supported. If you are interested in exporting
metrics to Prometheus, please take a look at the section
[Forward Metrics to Prometheus](#forward-metrics-to-prometheus) below.

### JVM Metrics

Spark Operator collects JVM metrics via
[Codahale JVM Metrics](https://javadoc.io/doc/com.codahale.metrics/metrics-jvm/latest/index.html):

- BufferPoolMetricSet
- FileDescriptorRatioGauge
- GarbageCollectorMetricSet
- MemoryUsageGaugeSet
- ThreadStatesGaugeSet


### Kubernetes Client Metrics

| Metrics Name                                                    | Type      | Description                                                                                                        |
|-----------------------------------------------------------------|-----------|--------------------------------------------------------------------------------------------------------------------|
| kubernetes.client.http.request                                  | Meter     | Tracks the rate of HTTP requests sent to the Kubernetes API Server                                                  |
| kubernetes.client.http.response                                 | Meter     | Tracks the rate of HTTP responses received from the Kubernetes API Server                                           |
| kubernetes.client.http.response.failed                          | Meter     | Tracks the rate of HTTP requests that received no response from the Kubernetes API Server                           |
| kubernetes.client.http.response.latency.nanos                   | Histogram | Measures the statistical distribution of HTTP response latency from the Kubernetes API Server                       |
| kubernetes.client.http.response.`<ResponseCode>`                | Meter     | Tracks the rate of HTTP responses by response code from the Kubernetes API Server                                   |
| kubernetes.client.http.request.`<RequestMethod>`                | Meter     | Tracks the rate of HTTP requests by request method to the Kubernetes API Server                                     |
| kubernetes.client.http.response.1xx                             | Meter     | Tracks the rate of HTTP 1xx (informational) responses received from the Kubernetes API Server                       |
| kubernetes.client.http.response.2xx                             | Meter     | Tracks the rate of HTTP 2xx (success) responses received from the Kubernetes API Server                             |
| kubernetes.client.http.response.3xx                             | Meter     | Tracks the rate of HTTP 3xx (redirection) responses received from the Kubernetes API Server                         |
| kubernetes.client.http.response.4xx                             | Meter     | Tracks the rate of HTTP 4xx (client error) responses received from the Kubernetes API Server                        |
| kubernetes.client.http.response.5xx                             | Meter     | Tracks the rate of HTTP 5xx (server error) responses received from the Kubernetes API Server                        |
| kubernetes.client.`<ResourceName>`.`<Method>`                   | Meter     | Tracks the rate of HTTP requests for a combination of one Kubernetes resource and one HTTP method                   |
| kubernetes.client.`<NamespaceName>`.`<ResourceName>`.`<Method>` | Meter     | Tracks the rate of HTTP requests for a combination of one namespace-scoped Kubernetes resource and one HTTP method  |
### Forward Metrics to Prometheus

In this section, we will show you how to forward Spark Operator metrics
to [Prometheus](https://prometheus.io).

* Modify the metrics properties section in the file
  `build-tools/helm/spark-kubernetes-operator/values.yaml`:

```properties
metrics.properties: |+
  spark.metrics.conf.operator.sink.prometheus.class=org.apache.spark.kubernetes.operator.metrics.sink.PrometheusPullModelSink
```

* Install Spark Operator

```bash
helm install spark-kubernetes-operator -f build-tools/helm/spark-kubernetes-operator/values.yaml build-tools/helm/spark-kubernetes-operator/
```

* Install Prometheus via Helm Chart

```bash
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm install prometheus prometheus-community/prometheus
```

* Find and Annotate Spark Operator Pods

```bash
kubectl get pods -l app.kubernetes.io/name=spark-kubernetes-operator
NAME                                         READY   STATUS    RESTARTS   AGE
spark-kubernetes-operator-598cb5d569-bvvd2   1/1     Running   0          24m

kubectl annotate pods spark-kubernetes-operator-598cb5d569-bvvd2 prometheus.io/scrape=true
kubectl annotate pods spark-kubernetes-operator-598cb5d569-bvvd2 prometheus.io/path=/prometheus
kubectl annotate pods spark-kubernetes-operator-598cb5d569-bvvd2 prometheus.io/port=19090
```
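The three `kubectl annotate` commands above correspond to the following pod metadata. If the Helm chart exposes pod annotation values (an assumption about the chart, not a documented feature here), setting them there avoids re-annotating pods after each restart.

```yaml
metadata:
  annotations:
    prometheus.io/scrape: "true"       # opt the operator pod into scraping
    prometheus.io/path: "/prometheus"  # metrics endpoint path
    prometheus.io/port: "19090"        # metrics port
```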

* Check Metrics via Prometheus UI

```bash
kubectl get pods | grep "prometheus-server"
prometheus-server-654bc74fc9-8hgkb   2/2   Running   0   59m

kubectl port-forward --address 0.0.0.0 pod/prometheus-server-654bc74fc9-8hgkb 8080:9090
```

Open your browser at `localhost:8080`. Click on the Status → Targets tab; you should be able to
find the target as shown below.

[<img src="resources/prometheus.png">](resources/prometheus.png)

docs/metrics_logging.md (0 additions & 109 deletions)

This file was deleted.
