Scale to Zero

Infrastructure to scale any resource to and from zero

Design:

Resource - A generic name for the component being scaled by this mechanism. It usually consists of one or more pods (scaled when needed) wrapped by one or more Deployments, DaemonSets or ReplicaSets, plus the k8s services used to route incoming requests.

Resource-scaler - An implementation of the ResourceScaler interface defined in scaler-types. It is loaded into the autoscaler and the DLX as a Go plugin, and both components use it to perform actions on the specific resource.
For example, when the autoscaler decides to scale a resource to zero, it calls the resource-scaler's SetScale function, which knows how to scale that specific resource to zero.
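To make the SetScale idea concrete, here is a minimal sketch of what a resource-scaler might look like. The `Resource` type, the single-method interface and the in-memory `memoryScaler` are assumptions made for illustration only; the real interface is defined in scaler-types and may have a different shape.

```go
package main

import (
	"errors"
	"fmt"
)

// Resource is a hypothetical stand-in for the resource type in scaler-types.
type Resource struct{ Name string }

// ResourceScaler is an assumed, simplified shape of the interface.
// The authoritative definition lives in scaler-types.
type ResourceScaler interface {
	SetScale(resources []Resource, scale int) error
}

// memoryScaler is a toy implementation that records replica counts in
// memory instead of talking to Kubernetes.
type memoryScaler struct {
	replicas map[string]int // resource name -> replica count
}

func newMemoryScaler() *memoryScaler {
	return &memoryScaler{replicas: map[string]int{}}
}

// SetScale sets every given resource to the requested replica count.
func (s *memoryScaler) SetScale(resources []Resource, scale int) error {
	if scale < 0 {
		return errors.New("scale must not be negative")
	}
	for _, r := range resources {
		s.replicas[r.Name] = scale
	}
	return nil
}

func main() {
	// The caller only depends on the interface, not on the concrete type.
	var scaler ResourceScaler = newMemoryScaler()
	err := scaler.SetScale([]Resource{{Name: "my-function"}}, 0)
	fmt.Println("scale to zero error:", err)
}
```

Because callers depend only on the interface, the same autoscaler and DLX code can drive any resource for which an implementation exists.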

The autoscaler - Responsible for periodically checking whether any resources should be scaled to zero. It does this by querying the custom metrics API. Once it decides a resource should be scaled to zero, it uses the internal resource-scaler module to do so. The resource-scaler first routes all incoming traffic to the DLX (in K8s terms, by changing a service selector) and only then scales the resource to zero.
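The flow above can be sketched roughly as follows. The idleness rule used here (every sample in the window is zero) is an assumption, since the actual decision criteria are not described in this document, and the `service` struct is a hypothetical in-memory model rather than a real Kubernetes object.

```go
package main

import "fmt"

// service is a toy model of a k8s service plus the workload behind it.
type service struct {
	selector string // which pods currently receive traffic
	replicas int
}

// isIdle reports whether every sample in the window is zero.
// An empty window is treated as "unknown" and therefore not idle.
// This rule is an assumption made for the sketch.
func isIdle(samples []float64) bool {
	if len(samples) == 0 {
		return false
	}
	for _, v := range samples {
		if v != 0 {
			return false
		}
	}
	return true
}

// scaleToZero applies the order described above: reroute traffic to the
// DLX first, then remove the pods. It returns the steps it performed.
func scaleToZero(svc *service) []string {
	svc.selector = "dlx"
	svc.replicas = 0
	return []string{"route traffic to dlx", "scale replicas to 0"}
}

func main() {
	svc := &service{selector: "my-function", replicas: 2}
	window := []float64{0, 0, 0} // hypothetical request-rate samples
	if isIdle(window) {
		for _, step := range scaleToZero(svc) {
			fmt.Println(step)
		}
	}
}
```

Rerouting before scaling down matters: requests that arrive while the pods are terminating already land on the DLX instead of hitting a service with no endpoints.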

The DLX - Responsible for receiving and buffering requests for scaled-to-zero resources. Upon receiving an incoming request, it creates a buffer for the messages and tells the resource-scaler to scale the service back from zero. The resource-scaler scales the resource back up and then routes the traffic back to the service (by modifying the k8s service selector).
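The buffering behaviour described above can be illustrated with a small sketch. The `buffer` type and its `Add` and `Flush` methods are hypothetical names, and requests are modelled as plain strings; this is not how the DLX is actually implemented.

```go
package main

import (
	"fmt"
	"sync"
)

// buffer holds requests for one scaled-to-zero resource.
type buffer struct {
	mu      sync.Mutex
	pending []string
}

// Add stores a request and reports whether it is the first one buffered,
// which is the moment a scale-from-zero should be triggered.
func (b *buffer) Add(req string) (triggerScaleUp bool) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.pending = append(b.pending, req)
	return len(b.pending) == 1
}

// Flush returns the buffered requests in arrival order and empties the
// buffer. It is meant to be called once the resource is running again.
func (b *buffer) Flush() []string {
	b.mu.Lock()
	defer b.mu.Unlock()
	out := b.pending
	b.pending = nil
	return out
}

func main() {
	b := &buffer{}
	for _, req := range []string{"req-1", "req-2", "req-3"} {
		if b.Add(req) {
			fmt.Println("first request buffered, asking resource-scaler to scale up")
		}
	}
	// Assume the resource-scaler has reported that the resource is ready.
	for _, req := range b.Flush() {
		fmt.Println("forwarding", req)
	}
}
```

Triggering the scale-up only on the first buffered request avoids asking the resource-scaler to scale the same resource repeatedly while it is still starting.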

Prerequisites

Custom metrics API implementation:

The autoscaler makes decisions based on data queried from the Kubernetes custom metrics API. Several tools implement this API. We internally use Prometheus with the Prometheus-Adapter, but you can use whichever you want. You can find some recommended implementations here

Getting Started

The infrastructure is designed to be generic, flexible and extensible, so that it can serve any resource you wish to scale to/from zero. All you have to do is implement the specific resource-scaler for your resource. The interface between your resource-scaler and the scale-to-zero infrastructure's components is defined in scaler-types.

Note: Incompatibility between this scaler's vendor dir and your resource-scaler's vendor dir may break things, so we suggest keeping your resource-scaler in its own repo.

Examples:

Nuclio functions resource-scaler
Iguazio's app service resource-scaler

Installing

Go plugins are the magic that glues the resource-scaler and this infrastructure's components together.
First, you'll need to build the resource-scaler as a Go plugin, for example:

GOOS=linux GOARCH=amd64 go build -buildmode=plugin -a -installsuffix cgo -ldflags="-s -w" -o ./plugin.so ./resourcescaler.go

The autoscaler/dlx looks for the plugin at the path ./plugins/*.so (relative to the execution directory), so you should move the binary artifact of the build command (the plugin.so file) to the plugins directory. It is much easier to do everything using Dockerfiles. Here are some great examples:
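For readers unfamiliar with Go plugins, the lookup described above could be sketched with the standard library's `plugin` and `path/filepath` packages. The helper names and the exported symbol name `New` are assumptions made for illustration; the symbol that the autoscaler and DLX actually look up is not stated in this document.

```go
package main

import (
	"fmt"
	"path/filepath"
	"plugin"
)

// findPlugins lists the shared objects in dir, mirroring ./plugins/*.so.
func findPlugins(dir string) ([]string, error) {
	return filepath.Glob(filepath.Join(dir, "*.so"))
}

// loadSymbol opens a plugin file and looks up one exported symbol.
// The symbol name is supplied by the caller; "New" below is hypothetical.
func loadSymbol(path, name string) (plugin.Symbol, error) {
	p, err := plugin.Open(path)
	if err != nil {
		return nil, fmt.Errorf("open %s: %w", path, err)
	}
	return p.Lookup(name)
}

func main() {
	paths, err := findPlugins("./plugins")
	if err != nil {
		fmt.Println("bad glob pattern:", err)
		return
	}
	if len(paths) == 0 {
		fmt.Println("no plugins found in ./plugins")
		return
	}
	sym, err := loadSymbol(paths[0], "New")
	if err != nil {
		fmt.Println("load failed:", err)
		return
	}
	fmt.Printf("loaded symbol of type %T from %s\n", sym, paths[0])
}
```

Go plugins must be built with the same Go toolchain version and the same versions of shared dependencies as the host binary, which is one reason building both in a single Dockerfile is convenient.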

Nuclio function Autoscaler dockerfile
Nuclio function DLX dockerfile
Iguazio's app service Autoscaler dockerfile
Iguazio's app service DLX dockerfile

You can install the components using the scaler helm chart:
$ helm install --name my-release v3io-stable/scaler

Versioning

We use SemVer for versioning. For the versions available, see the releases on this repository.
