Skip to content
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
83 changes: 61 additions & 22 deletions content/en/docs/reference/features/tablet-throttler.md
Original file line number Diff line number Diff line change
Expand Up @@ -4,11 +4,11 @@ weight: 21
aliases: ['/docs/user-guides/tablet-throttler/','/docs/reference/tablet-throttler/']
---

VTTablet runs a cooperative throttling service. This service probes the shard's MySQL topology and observes replication lag on servers. This throttler is derived from GitHub's [freno](https://github.com/github/freno).
VTTablet runs a cooperative throttling service. This service probes the shard's MySQL topology and observes health, measure by replication lag, on servers. This throttler is derived from GitHub's [freno](https://github.com/github/freno).

_Note: the Vitess documentation is transitioning from the term "Master" (with regard to MySQL replication) to "Primary". this document reflects this transition._

## Why throttler: maintaining low replication lag
## Why throttler: maintaining shard health via low replication lag

Vitess uses MySQL with asynchronous or semi-synchronous replication. In these modes, each shard has a primary instance that applies changes and logs them to the binary log. The replicas for that shard will get binary log entries from the primary, potentially acknowledge them (if semi-synchronous replication is enabled), and apply them. A running replica normally applies the entries as soon as possible, unless it is stopped or configured to delay. However, if the replica is busy, then it may not have the resources to apply events in a timely fashion, and can therefore start lagging. For example, if the replica is serving traffic, it may lack the necessary disk I/O or CPU to avoid lagging behind the primary.

Expand All @@ -24,19 +24,26 @@ Some common database operations include mass writes to the database, including t
- Purging of old data
- Purging of tables as part of safe table `DROP` operation

Other operations include mass reads from the database:

- An ETL reading content of entire tables
- VReplication scanning an entire keyspace data and binary logs

These operations can easily incur replication lag. However, these operations are typically not time-limited. It is possible to rate-limit them to reduce database load.

This is where a throttler becomes useful. A throttler can detect when replication lag is low, a cluster is healthy, and operations can proceed. It can also detect when replication lag is high and advise applications to hold the next operation.

Applications are expected to break down their tasks into small sub-tasks. For example, instead of deleting `1,000,000` rows, an application should only delete `50` at a time. Between these sub-tasks, the application should check in with the throttler.

The throttler is only intended for use with operations such as the above mass write cases. It should not be used for ongoing, normal OLTP queries.
The throttler is only intended for use with operations such as the above mass write/read cases. It should not be used for ongoing, normal OLTP queries.

## Throttler overview

Each `vttablet` runs an internal throttler service, and provides API endpoints to the throttler. Only the primary throttler is doing actual work at any given time. The throttlers on the replicas are mostly dormant, and wait for their turn to become "leaders," such as when the tablet transitions into `MASTER` (primary) type.
Each `vttablet` runs an internal throttler service, and provides API endpoints to the throttler. Each tablet, including the primary, measures its own "self" health, discussed later.

### Cluster health:

The primary tablet's throttler continuously does the following things:
In addition, the primary tablet is responsible for the overall health of the cluster/shard:

- The throttler confirms it is still the primary tablet for its shard.
- Every `10sec`, the throttler uses the topology server to refresh the shard's tablets list.
Expand All @@ -60,6 +67,12 @@ The user may override the heartbeat interval by sending `-heartbeat_interval` fl

Thus, the aggregated interval can be off, by default, by some `500ms`. This makes it inaccurate for evaluations that require high resolution lag evaluation. This resolution is sufficient for throttling purposes.

### Self health

Each tablet runs a local health check against its backend database, again in the form of evaluating replication lag from `_vt.heartbeat`. Intervals are identical to the cluster health interval illustrated above.

### Response codes

The throttler allows clients and applications to `check` for throttle advice. The check is an `HTTP` request, `HEAD` method, or `GET` method. Throttler returns one of the following HTTP response codes as an answer:

- `200` (OK): The application may write to the data store. This is the desired response.
Expand All @@ -78,27 +91,27 @@ When the throttler sees no relevant replicas in the shard, it allows writes by r

## Configuration


- The throttler is currently disabled by default. Use the `vttablet` option `-enable-lag-throttler` to enable the throttler.
When the throttler is disabled, it still serves `/throttler/check` API and responds with `HTTP 200 OK` to all requests.
- The throttler is currently **disabled** by default. Use the `vttablet` option `-enable-lag-throttler` to enable the throttler.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we keep the flags consistent and always use underscores? we have enable-lag-throttler but throttle_threshold.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right! So this is all over the place, we have dashes and underscored spread across many flags. I'm just wondering how do we go about this with existing flags? Let me see if there's something in golang that treats both as the same, as it it's an uppercase/lowercase thing.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noting that this is not a documentation issue -- the flag is named -enable-lag-throttler and we should normalize the flag name in code.

Copy link
Copy Markdown
Contributor Author

@shlomi-noach shlomi-noach Jan 28, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A bit of digging brought up this. So what I'll do is for some variables support both dash-based and underscore-based names.

When the throttler is disabled, it still serves `/throttler/check` and `/throttler/check-self` API endpoints, and responds with `HTTP 200 OK` to all requests.
When the throttler is enabled, it implicitly also runs heartbeat injections.
- Use the `vttablet` flag `-throttle_threshold` to set a lag threshold value. The default threshold is `1sec` and is set upon tablet startup. For example, to set a half-second lag threshold, use the flag `-throttle_threshold=0.5s`.



- To set the tablet types that the throttler queries for lag, use the `vttablet` flag `-throttle_tablet_types="replica,rdonly"`. The default tablet type is `replica`; this type is always implicitly included in the tablet types list. You may add any other tablet type. Any type not specified is ignored by the throttler.

## API & usage

Applicaitons use the API `/throttler/check`.
Applications use these API endpoints:

- `/throttler/check`, for apps that wish to write mass amounts of data to a shard, and wish to maintain the overall health of the shard.
- `/throttler/check-self`, for apps that wish to perform some operation (e.g. a massive _read_) on a specific tablet and only wish to maintain the health of that tablet.

- Applications may indicate their identity via `?app=<name>` parameter.
- Applications may also declare themselves to be _low priority_ via `?p=low` param. Managed online schema migrations (`gh-ost`, `pt-online-schema-change`) do so, as does the table purge process.

Examples:

- `gh-ost` uses this throttler endpoint: `/throttler/check?app=gh-ost&p=low`
- A data backfill application may use this parameter: `/throttler/check?app=backfill` (using _normal_ priority)
- A data backfill application will identify as such, and use _normal_ priority: `/throttler/check?app=my_backfill` (priority not indicated in URL therefore assumed to be _normal_)
- An app reading a massive amount of data directly from a replica tablet will use `/throttler/check-self?app=my_data_reader`

A `HEAD` request is sufficient. A `GET` request also provides a `JSON` output. For example:

Expand All @@ -112,7 +125,7 @@ Tablet also provides `/throttler/status` endpoint. This is useful for monitoring

**Example: Healthy primary tablet**

The following command gets throttler status on a tablet hosted on `tablet1`, serving on port `15100`.
The following command gets throttler status on a primary tablet hosted on `tablet1`, serving on port `15100`.

```shell
$ curl -s http://tablet1:15100/throttler/status | jq .
Expand All @@ -128,22 +141,37 @@ This API call returns the following JSON object:
"IsOpen": true,
"IsDormant": false,
"AggregatedMetrics": {
"mysql/local": {
"Value": 0.193576
"mysql/self": {
"Value": 0.749837
},
"mysql/shard": {
"Value": 0.749887
}
},
"MetricsHealth": {}
"MetricsHealth": {
"mysql/self": {
"LastHealthyAt": "2021-01-24T19:03:19.141933727+02:00",
"SecondsSinceLastHealthy": 0
},
"mysql/shard": {
"LastHealthyAt": "2021-01-24T19:03:19.141974429+02:00",
"SecondsSinceLastHealthy": 0
}
}
}

```

The primary tablet serves two types of metrics:

- `mysql/shard`: an aggregated lag on relevant replicas in this shard. This is the metric to check when writing massive amounts of data to this server.
- `mysql/self`: the health of the specific primary MySQL server backed by this tablet.

`"IsLeader": true` indicates this tablet is active, is the `primary`, and is running probes.
`"IsDormant": false,` means that an application has recently issued a `check`, and the throttler is probing for lag at high frequency.

**Example: replica tablet**

The following command gets throttler status on a tablet hosted on `tablet2`, serving on port `15100`.
The following command gets throttler status on a replica tablet hosted on `tablet2`, serving on port `15100`.

```shell
$ curl -s http://tablet2:15100/throttler/status | jq .
Expand All @@ -157,12 +185,23 @@ This API call returns the following JSON object:
"Shard": "80-c0",
"IsLeader": false,
"IsOpen": true,
"IsDormant": true,
"AggregatedMetrics": {},
"MetricsHealth": {}
"IsDormant": false,
"AggregatedMetrics": {
"mysql/self": {
"Value": 0.346409
}
},
"MetricsHealth": {
"mysql/self": {
"LastHealthyAt": "2021-01-24T19:04:25.038290475+02:00",
"SecondsSinceLastHealthy": 0
}
}
}
```

The replica tablet only presents `mysql/self` metric (measurement of its own backend MySQL's lag). It does not serve checks for the shard in general.


## Resources

Expand Down