Add documentation for release v1.5.0 (#181)
Signed-off-by: saintube <[email protected]>
Co-authored-by: shenxin <[email protected]>
saintube and shenxin authored Jun 19, 2024
1 parent 02c1203 commit f8d81e7
Showing 110 changed files with 20,906 additions and 86 deletions.
287 changes: 287 additions & 0 deletions blog/2024-06-18-release/index.md


4 changes: 2 additions & 2 deletions docs/designs/descheduler-framework.md
@@ -10,7 +10,7 @@ The existing [descheduler](https://github.com/kubernetes-sigs/descheduler) in th

We also noticed that the K8s descheduler community found these problems and proposed corresponding solutions such as [#753 Descheduler framework Proposal](https://github.com/kubernetes-sigs/descheduler/issues/753) and [PoC #781](https://github.com/kubernetes-sigs/descheduler/pull/781). The K8s descheduler community tries to implement a descheduler framework similar to the K8s scheduling framework. This coincides with our thinking.

On the whole, these solutions solve most of our problems, but we also noticed that the related implementations have not been merged into the main branch. Having reviewed these implementations and discussions, we believe this is the right direction. Considering that Koordinator has clear milestones for descheduler-related features, we will implement Koordinator's own descheduler independently of the upstream community. We try to use some of the designs in the [#753 PR](https://github.com/kubernetes-sigs/descheduler/issues/753) proposed by the community, and we will follow Koordinator's compatibility principle with K8s to maintain compatibility with the upstream community descheduler in the implementation. Such an independent implementation can also drive the evolution of the upstream community's work on the descheduler framework. And when the upstream community has new changes or switches to an architecture that Koordinator deems appropriate, Koordinator will follow up promptly and actively.

### Goals

@@ -26,7 +26,7 @@ On the whole, these solutions solved most of our problems, but we also noticed t

#### Descheduler profile

The current descheduler configuration is too simple to support disabling or enabling plugins or custom plugin configurations. [PR #587](https://github.com/kubernetes-sigs/descheduler/pull/587) introduces descheduler profiles with a v1alpha2 API version. We will use this proposal as Koordinator Descheduler's configuration API.

- The descheduler profile API lets users specify which extension points are enabled or disabled, alongside plugin configuration, including the ability to configure multiple descheduling profiles.
- The descheduling framework configuration can be converted into an internal representation.
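As a concrete sketch of the profile shape described above, assuming a v1alpha2-style API (the group/version, plugin names, and arguments here are illustrative assumptions, not the authoritative Koordinator API):

```shell
# Write an illustrative v1alpha2-style descheduler profile to a file.
# All names below are assumptions for illustration only.
cat <<'EOF' > /tmp/koord-descheduler-config.yaml
apiVersion: descheduler/v1alpha2
kind: DeschedulerConfiguration
profiles:
  - name: koord-descheduler
    plugins:
      deschedule:            # per-extension-point enable/disable lists
        enabled:
          - name: RemovePodsViolatingNodeAffinity
    pluginConfig:            # per-plugin arguments
      - name: RemovePodsViolatingNodeAffinity
        args:
          nodeAffinityType:
            - requiredDuringSchedulingIgnoredDuringExecution
EOF
echo "profile written to /tmp/koord-descheduler-config.yaml"
```

The framework would convert such a profile into its internal representation, with each extension-point list controlling which plugins run.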
2 changes: 1 addition & 1 deletion docs/designs/enhanced-scheduler-extension.md
@@ -22,7 +22,7 @@ Although Kubernetes Scheduler provides the scheduling framework to help develope

#### Story 1

Koordinator allows users to reserve resources with the `Reservation` CRD. We expect Reservation CRD objects to be scheduled like Pods. In this way, the native scheduling capabilities of Kubernetes and other extended scheduling capabilities can be reused. This requires a mechanism to disguise a Reservation object as a Pod, and to extend some scheduling framework extension points to support updating the Reservation status.
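A minimal sketch of such a Reservation object (field names follow our reading of the `scheduling.koordinator.sh/v1alpha1` API; treat the exact fields as assumptions and check the CRD reference before use):

```shell
# Write a minimal illustrative Reservation manifest; the reservation is
# scheduled like the pod described in spec.template, and pods matching
# spec.owners may allocate the reserved resources.
cat <<'EOF' > /tmp/reservation-demo.yaml
apiVersion: scheduling.koordinator.sh/v1alpha1
kind: Reservation
metadata:
  name: reservation-demo
spec:
  template:                  # scheduled as if it were this pod
    spec:
      containers:
        - name: main
          resources:
            requests:
              cpu: "1"
              memory: 1Gi
  owners:                    # who may consume the reserved resources
    - labelSelector:
        matchLabels:
          app: demo
EOF
# kubectl apply -f /tmp/reservation-demo.yaml   # on a cluster with Koordinator installed
```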

#### Story 2

74 changes: 60 additions & 14 deletions docs/installation.md
@@ -2,15 +2,14 @@

Koordinator requires **Kubernetes version >= 1.18**.

Koordinator may collect metrics from the kubelet read-only port (disabled by default, in which case the secure port is used).
You can get more info from [here](https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet/).

For the best experience, Koordinator recommends **Linux kernel 4.19** or higher.

## Install with helm

Koordinator can be simply installed with helm v3.5+, which is a simple command-line tool that you can get from [here](https://github.com/helm/helm/releases).

```bash
# First, add the koordinator charts repository if you haven't done so.
@@ -20,7 +19,7 @@ $ helm repo add koordinator-sh https://koordinator-sh.github.io/charts/
$ helm repo update

# Install the latest version.
$ helm install koordinator koordinator-sh/koordinator --version 1.5.0
```

## Upgrade with helm
@@ -33,7 +32,7 @@ $ helm repo add koordinator-sh https://koordinator-sh.github.io/charts/
$ helm repo update

# Upgrade the latest version.
$ helm upgrade koordinator koordinator-sh/koordinator --version 1.5.0 [--force]
```

Note that:
@@ -42,7 +41,7 @@ Note that:
to make sure that you have understood the breaking changes in the new version.
2. If you want to drop the chart parameters you configured for the old release or set some new parameters,
it is recommended to add `--reset-values` flag in `helm upgrade` command.
Otherwise, you should use the `--reuse-values` flag to reuse the last release's values.

## Optional: download charts manually

@@ -57,12 +56,14 @@ $ helm install/upgrade koordinator /PATH/TO/CHART
### Prerequisite

- Containerd >= 1.7.0 with NRI enabled. Please make sure NRI is enabled in containerd; if not, please refer to [Enable NRI in Containerd](https://github.com/containerd/containerd/blob/main/docs/NRI.md)
- Koordinator >= 1.3

### Configurations

NRI mode resource management is *enabled* by default. You can use it without any modification of the koordlet config. You can also disable it by setting `enable-nri-runtime-hook=false` in the koordlet start args. If the prerequisites are not met, that is fine; you can still use all other features as expected.

For more details, please refer to [NRI Mode Resource Management](https://github.com/koordinator-sh/koordinator/blob/main/docs/proposals/20230608-nri-mode-resource-management.md).
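For reference, a sketch of the containerd config fragment that enables NRI (the section and key names assume the stock containerd 1.7 config layout; verify against the containerd NRI docs linked above):

```shell
# Write an illustrative containerd NRI config fragment; merge it into
# /etc/containerd/config.toml and restart containerd to enable NRI.
cat <<'EOF' > /tmp/containerd-nri-snippet.toml
[plugins."io.containerd.nri.v1.nri"]
  disable = false                        # NRI ships disabled by default
  plugin_path = "/opt/nri/plugins"
  socket_path = "/var/run/nri/nri.sock"
EOF
echo "snippet written to /tmp/containerd-nri-snippet.toml"
```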

## Options

Note that installing this chart directly means it will use the default template values for Koordinator.
@@ -82,7 +83,7 @@ The following table lists the configurable parameters of the chart and their def
| `manager.log.level` | Log level that koord-manager printed | `4` |
| `manager.replicas` | Replicas of koord-manager deployment | `2` |
| `manager.image.repository` | Repository for koord-manager image | `koordinatorsh/koord-manager` |
| `manager.image.tag` | Tag for koord-manager image | `v1.5.0` |
| `manager.resources.limits.cpu` | CPU resource limit of koord-manager container | `1000m` |
| `manager.resources.limits.memory` | Memory resource limit of koord-manager container | `1Gi` |
| `manager.resources.requests.cpu` | CPU resource request of koord-manager container | `500m` |
@@ -97,7 +98,7 @@ The following table lists the configurable parameters of the chart and their def
| `scheduler.log.level` | Log level that koord-scheduler printed | `4` |
| `scheduler.replicas` | Replicas of koord-scheduler deployment | `2` |
| `scheduler.image.repository` | Repository for koord-scheduler image | `koordinatorsh/koord-scheduler` |
| `scheduler.image.tag` | Tag for koord-scheduler image | `v1.5.0` |
| `scheduler.resources.limits.cpu` | CPU resource limit of koord-scheduler container | `1000m` |
| `scheduler.resources.limits.memory` | Memory resource limit of koord-scheduler container | `1Gi` |
| `scheduler.resources.requests.cpu` | CPU resource request of koord-scheduler container | `500m` |
@@ -109,7 +110,7 @@ The following table lists the configurable parameters of the chart and their def
| `scheduler.hostNetwork` | Whether koord-scheduler pod should run with hostnetwork | `false` |
| `koordlet.log.level` | Log level that koordlet printed | `4` |
| `koordlet.image.repository` | Repository for koordlet image | `koordinatorsh/koordlet` |
| `koordlet.image.tag` | Tag for koordlet image | `v1.5.0` |
| `koordlet.resources.limits.cpu` | CPU resource limit of koordlet container | `500m` |
| `koordlet.resources.limits.memory` | Memory resource limit of koordlet container | `256Mi` |
| `koordlet.resources.requests.cpu` | CPU resource request of koordlet container | `0` |
@@ -131,7 +132,6 @@ Feature-gate controls some influential features in Koordinator:
| `PodMutatingWebhook` | Whether to open a mutating webhook for Pod **create** | `true` | Don't inject koordinator.sh/qosClass or koordinator.sh/priority, don't replace Koordinator extended resources, and so on |
| `PodValidatingWebhook` | Whether to open a validating webhook for Pod **create/update** | `true` | It is possible to create some Pods that do not conform to the Koordinator specification, causing some unpredictable problems |


If you want to configure the feature-gate, just set the parameter when installing or upgrading. For example:

```bash
@@ -140,6 +140,17 @@ $ helm install koordinator https://... --set featureGates="PodMutatingWebhook=tr

If you want to enable all feature-gates, set the parameter as `featureGates=AllAlpha=true`.

### Optional: install or upgrade specific CRDs

If you want to skip specific CRDs during installation or upgrade, you can set the parameter `crds.<crdPluralName>` to `false` and install or upgrade them manually.

```bash
# skip installing a specific CRD
$ helm install koordinator https://... --set crds.managed=true,crds.noderesourcetopologies=false
# only upgrade specific CRDs
$ helm upgrade koordinator https://... --set crds.managed=false,crds.recommendations=true,crds.clustercolocationprofiles=true,crds.elasticquotaprofiles=true,crds.elasticquotas=true,crds.devices=true,crds.podgroups=true
```

### Optional: the local image for China

If you are in China and have problems pulling images from the official DockerHub, you can use the registry hosted on Alibaba Cloud:
@@ -154,7 +165,42 @@ $ helm install koordinator https://... --set imageRepositoryHost=registry.cn-bei

When using a custom CNI (such as Weave or Calico) on EKS, the webhook cannot be reached by default. This happens because the control plane cannot be configured to run on a custom CNI on EKS, so the CNIs differ between control plane and worker nodes.

To address this, the webhook can be run in the host network so that it can be reached, by setting `--set manager.hostNetwork=true` when using helm install or upgrade.

### Installation parameters for Alibaba Cloud ACK

To install or upgrade Koordinator on Alibaba Cloud ACK, you need to skip some CRDs because the ACK cluster already provides them and they cannot be taken over by Helm.

For example, you may get the error below:

```bash
$ helm install koordinator koordinator-sh/koordinator --version 1.5.0
Error: INSTALLATION FAILED: rendered manifests contain a resource that already exists. Unable to continue with install: CustomResourceDefinition "reservations.scheduling.koordinator.sh" in namespace "" exists and cannot be imported into the current release: invalid ownership metadata; label validation error: missing key "app.kubernetes.io/managed-by": must be set to "Helm"; annotation validation error: missing key "meta.helm.sh/release-name": must be set to "koordinator"; annotation validation error: missing key "meta.helm.sh/release-namespace": must be set to "default"
```

To resolve the conflict error, you can install or upgrade with the conflicting CRDs disabled. Please refer to [Optional: install or upgrade specific CRDs](#optional-install-or-upgrade-specific-crds).

1. (Optional) Check the Koordinator-related CRDs already deployed by ACK.

```bash
# for the latest CRDs, please refer to https://github.com/koordinator-sh/koordinator/blob/main/charts/koordinator/crds/crds.yaml
$ kubectl get crd | grep "nodemetrics\|noderesourcetopologies\|elasticquotas\|podgroup\|reservations"
elasticquotas.scheduling.sigs.k8s.io 1970-01-01T00:00:00Z
nodemetrics.slo.koordinator.sh 1970-01-01T00:00:00Z
noderesourcetopologies.topology.node.k8s.io 1970-01-01T00:00:00Z
podgroups.scheduling.sigs.k8s.io 1970-01-01T00:00:00Z
reservations.scheduling.koordinator.sh 1970-01-01T00:00:00Z
```

2. Install or upgrade Koordinator without the deployed CRDs.

```bash
# install without the deployed CRDs
$ helm install koordinator https://... --set crds.managed=true,crds.nodemetrics=false,crds.noderesourcetopologies=false,crds.elasticquotas=false,crds.podgroups=false,crds.reservations=false

# upgrade without the deployed CRDs
$ helm upgrade koordinator https://... --set crds.managed=true,crds.nodemetrics=false,crds.noderesourcetopologies=false,crds.elasticquotas=false,crds.podgroups=false,crds.reservations=false
```

## Uninstall

4 changes: 0 additions & 4 deletions docs/introduction.md
@@ -11,7 +11,6 @@ Welcome to Koordinator!

Koordinator is a QoS-based scheduling system for efficient orchestration of microservices, AI, and big data workloads on Kubernetes. It aims to improve the runtime efficiency and reliability of both latency-sensitive workloads and batch jobs, simplify the complexity of resource-related configuration tuning, and increase pod deployment density to improve resource utilization.


## Key Features

Koordinator enhances the Kubernetes user experience in workload management by providing the following:
@@ -22,7 +21,6 @@ Koordinator enhances the kubernetes user experiences in the workload management
- Flexible job scheduling mechanism to support workloads in specific areas, e.g., big data, AI, audio and video.
- A set of tools for monitoring, troubleshooting and operations.


## Koordinator vs. Other Concepts

### Koordinator QoS vs Kubernetes QoS
@@ -44,5 +42,3 @@ Here are some recommended next steps:

- Start to [install Koordinator](./installation).
- Learn Koordinator's [Overview](architecture/overview).

