This repository has been archived by the owner on Dec 1, 2018. It is now read-only.

Proposal: Introduce Oldtimer #1124

Merged (1 commit, Apr 29, 2016)
228 changes: 228 additions & 0 deletions docs/proposals/old-timer.md
# Heapster Oldtimer

## Overview

Prior to the Heapster refactor, the Heapster model presented aggregations of
metrics over certain time periods (the last hour and day). Post-refactor, the
concern of presenting an interface for historical metrics was to be split into
a separate Heapster component: Oldtimer.
Reviewer comment:

Will this be built into Heapster? Or maybe it'll be running in a separate Pod/Replication Controller/Service?

Contributor Author:

So, you'd probably want to run it separately from Heapster so that you could scale it, but we might also want to provide an easy option to run it as part of Heapster (some sort of "allinone" installation for quickly getting up and running).

Contributor @mwielgus (Apr 19, 2016):

I would suggest including it in the default deployment. I think that for 99% of deployments a single instance will be enough, and it should not consume much memory.
Anyone who needs to perform large-scale data analysis will do it some other way (not via this very basic API).

Contributor Author:

Sounds good to me.

Reviewer comment:

LGTM


Oldtimer will run as part of the main Heapster executable, and will present
Contributor @mwielgus (Apr 21, 2016):

Oldtimer should rather run as a separate executable/container in the main Heapster pod.

Contributor Author:

ah, ok, I think I misunderstood you above.

Contributor Author:

That would effectively mean that it would have to be on a different port. Why not just allow it to run in the same executable, if you're going to be running it as part of the same pod anyway, so that https://heapster.kube-system/api/v1/model is for the model, while https://heapster.kube-system/api/v1/historical is for Oldtimer?

common interfaces for retrieving historical metrics over longer periods of time
than the Heapster model, and will allow fetching aggregations of metrics (e.g.
averages, 95th percentile, etc.) over different periods of time. It will do
this by querying the sink to which Heapster stores metrics.

Note: even though we are retrieving metrics, this document refers to the
metrics storage locations as "sinks" to be consistent with the rest
of Heapster.

## Motivation

There are two major motivations for exposing historical metrics information:

1. Using aggregated historical data to make size-related decisions
(for example, idling requires looking for traffic over a long time period)

2. Providing a common interface for users to view historical metrics

Before the Heapster refactoring (see the
[Heapster Long Term Vision Proposal](https://github.com/kubernetes/heapster/blob/master/docs/proposals/vision.md)),
Heapster supported querying metrics aggregated over certain extended time
periods (the last hour and day) via the Heapster model.

However, since the Heapster model is stored in-memory, and not persisted to
disk, this historical data would be "lost" whenever Heapster was restarted.
This made it unreliable for use by system components which need a historical
view.

Since we already persist metrics into a sink, it does not make sense for
Heapster to also persist long-term metrics to disk itself. Instead, we can
just query the sink directly.

## API

Oldtimer will present an API somewhat similar to the normal Heapster model.
Reviewer comment:

Can you highlight the differences? Is this the same API, but with different prefix and bucketing options? Or something else?

Contributor Author:

Sure, I'll add in a brief overview of the differences.

The structure of the URLs is designed to mirror those exposed by the model API.
When used simply to retrieve historical data points, Oldtimer will return the
same types as the model API. When used to retrieve aggregations, Oldtimer
will return the special data types detailed in the "Return Types" section.

### Paths

`/api/v1/historical{prefix}/metrics/`: Returns a list of all available
metrics.

`/api/v1/historical{prefix}/metrics/{metric-name}?start=X&end=Y`: Returns a set
of (Timestamp, Value) pairs for the requested {prefix}-level metric, over the
given time range.

`/api/v1/historical{prefix}/metrics-aggregated/{aggregations}/{metric-name}?start=X&end=Y&bucket=B`:
Returns the requested {prefix}-level metric, aggregated with the given
aggregations over the requested time period (potentially split into several
buckets of duration `B`). `{aggregations}` may be a comma-separated
list of aggregations to retrieve multiple at once.

Where `{prefix}` is normally either empty (cluster-level),
`/namespaces/{namespace}` (namespace-level),
`/namespaces/{namespace}/pods/{pod-name}` (pod-level),
`/namespaces/{namespace}/pod-list/{pod-list}` (multi-pod-level), or
`/namespaces/{namespace}/pods/{pod-name}/containers/{container-name}`
(container-level).

Additionally, since pod names are not temporally unique (i.e. it is possible to
delete a pod, and then create a new, completely different pod with the same
name), `{prefix}` may also be `/pod-id/{pod-id}` (pod-level metrics),
`/pod-id-list/{pod-id-list}` (multi-pod-level), or
`/pod-id/{pod-id}/containers/{container-name}` (container-level metrics).
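
For illustration, assembling a `{prefix}` from its components might look like the following sketch; the `prefixFor` helper is hypothetical and covers only the namespace/pod/container forms:

```go
package main

import "fmt"

// prefixFor assembles the {prefix} portion of a historical-metrics URL from
// the namespace/pod/container names. Helper name and shape are illustrative,
// not part of the proposal; empty components are simply omitted, so an
// all-empty call yields the empty cluster-level prefix.
func prefixFor(namespace, pod, container string) string {
	p := ""
	if namespace != "" {
		p += "/namespaces/" + namespace
	}
	if pod != "" {
		p += "/pods/" + pod
	}
	if container != "" {
		p += "/containers/" + container
	}
	return p
}

func main() {
	fmt.Println("/api/v1/historical" + prefixFor("somens", "somepod", "") + "/metrics/cpu/usage")
	// /api/v1/historical/namespaces/somens/pods/somepod/metrics/cpu/usage
}
```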

In addition, when `{prefix}` is not empty, there will be a URL of the form:
Reviewer comment:

Out of curiosity, what is this for?

Contributor Author:

Listing the available items keeps this roughly in line with the Heapster model API. In general, I think both APIs should follow the same conventions.

Reviewer comment:

All right, sounds good.

`/api/v1/historical{prefix-without-final-element}`, which allows fetching the
list of available nodes/namespaces/pods/containers.

Note that queries by pod name will return metrics from the latest pod with the
given name. This may require an extra round-trip to the database in some
cases, in order to determine which pod id that actually is. For this reason,
if a component knows the pod ids for which it is querying, using those is
preferred to using the pod names. The pod-name-based API is retained for the
sake of easy queries and to match the model API.

### Parameter Types

The `start` and `end` parameters are defined the same way as for the model:
each should be a timestamp formatted according to RFC 3339, if no start time is
specified, it defaults to zero in Unix epoch time, and if no end time is
specified, all data after the start time will be considered.
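
These defaulting rules can be sketched in Go; `parseTimeRange` is a hypothetical helper, and representing "no end" as the current time is an implementation choice rather than something the proposal mandates:

```go
package main

import (
	"fmt"
	"time"
)

// parseTimeRange applies the defaulting rules described above: a missing
// start is the Unix epoch, and a missing end means "no upper bound", which
// this sketch represents as time.Now.
func parseTimeRange(start, end string) (time.Time, time.Time, error) {
	s := time.Unix(0, 0).UTC()
	if start != "" {
		var err error
		if s, err = time.Parse(time.RFC3339, start); err != nil {
			return time.Time{}, time.Time{}, err
		}
	}
	e := time.Now().UTC()
	if end != "" {
		var err error
		if e, err = time.Parse(time.RFC3339, end); err != nil {
			return time.Time{}, time.Time{}, err
		}
	}
	return s, e, nil
}

func main() {
	s, _, err := parseTimeRange("", "2016-03-20T10:57:37-04:00")
	fmt.Println(s, err) // 1970-01-01 00:00:00 +0000 UTC <nil>
}
```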

The `bucket` (bucket duration) parameter is a number followed by any of the
following suffixes:

- `ms`: milliseconds
- `s`: seconds
- `m`: minutes
- `h`: hours
- `d`: days

### Return Types

For requests which simply fetch data points or list available objects, the
return format will be the same as that used in the Heapster model API.

In the case of aggregations, a different set of types is used: each bucket is
represented by a `MetricAggregationBucket`, which contains the timestamp for
that bucket (the start of the bucket), the count of entries in that bucket (if
requested) as an unsigned integer, and each of the other requested
aggregations in the form of a `MetricValue` (which holds either an unsigned
int or a float).

All buckets for a particular metric are grouped together in a
`MetricAggregationResult`, which also holds the bucket size (duration) for the
buckets. If multiple pods are requested, the result will be returned as a
`MetricAggregationResultList`, similarly to the `MetricResultList` for the
model API.

```go
type MetricValue struct {
	IntValue   *uint64
	FloatValue *float64
}

type MetricAggregationBucket struct {
	Timestamp time.Time
	Count     *uint64

	Average     *MetricValue
	Maximum     *MetricValue
	Minimum     *MetricValue
	Median      *MetricValue
	Percentiles map[uint64]MetricValue
}

type MetricAggregationResult struct {
	Buckets    []MetricAggregationBucket
	BucketSize time.Duration
}

type MetricAggregationResultList struct {
	Items []MetricAggregationResult
}
```

### Aggregations

Several different aggregations will be supported. Aggregations should be
performed in the metrics sink. If more aggregations later become supported
across all metrics sinks, the list can be expanded.

- Average (arithmetic mean): `/metrics-aggregated/average`
- Maximum: `/metrics-aggregated/max`
- Minimum: `/metrics-aggregated/min`
- Percentile: `/metrics-aggregated/{number}-perc`
- Median: `/metrics-aggregated/median`
- Count: `/metrics-aggregated/count`

Note: to support all the existing sinks, the supported percentiles will be
limited to 50, 95, and 99. If additional percentile values later become
supported by the other sinks, this list may be expanded (see the Sink Support
section below).
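
Validation of the `{aggregations}` path segment against this list might look like the following sketch (the `validAggregations` helper and its error messages are made up for illustration):

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// validAggregations splits the comma-separated {aggregations} path segment
// and validates each entry, restricting percentiles to the 50/95/99 set
// that the proposal settles on.
func validAggregations(segment string) ([]string, error) {
	allowedPerc := map[int]bool{50: true, 95: true, 99: true}
	simple := map[string]bool{
		"average": true, "max": true, "min": true,
		"median": true, "count": true,
	}
	var aggs []string
	for _, a := range strings.Split(segment, ",") {
		switch {
		case simple[a]:
		case strings.HasSuffix(a, "-perc"):
			n, err := strconv.Atoi(strings.TrimSuffix(a, "-perc"))
			if err != nil || !allowedPerc[n] {
				return nil, fmt.Errorf("unsupported percentile: %s", a)
			}
		default:
			return nil, fmt.Errorf("unknown aggregation: %s", a)
		}
		aggs = append(aggs, a)
	}
	return aggs, nil
}

func main() {
	aggs, err := validAggregations("95-perc,average")
	fmt.Println(aggs, err) // [95-perc average] <nil>
}
```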

### Example

Suppose that one wanted to retrieve the 95th percentile of CPU usage for a
given pod over the past 30 days, in 1-hour intervals, along with the average
usage for each interval. Call the pod "somepod", in the namespace "somens".
To fetch the results, you'd perform:

```
GET /api/v1/historical/namespaces/somens/pods/somepod/metrics-aggregated/95-perc,average/cpu/usage?start=2016-03-20T10:57:37-04:00&bucket=1h
```

Which would then return:

```json
{
  "bucketSize": "3600000000000",
  "buckets": [
    {
      "timestamp": "2016-03-20T10:57:37-04:00",
      "average": "32",
      "percentiles": {
        "95": "27"
      }
    },
    ...
  ]
}
```
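
On the client side, the example payload can be decoded with types mirroring it. The numeric values arrive as JSON strings in the example, so this sketch keeps them as strings rather than guessing at the real wire types (`decodeResult` and the lowercase type names are hypothetical):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// bucket and result mirror the example payload above.
type bucket struct {
	Timestamp   string            `json:"timestamp"`
	Average     string            `json:"average"`
	Percentiles map[string]string `json:"percentiles"`
}

type result struct {
	BucketSize string   `json:"bucketSize"`
	Buckets    []bucket `json:"buckets"`
}

// decodeResult unmarshals an aggregation response body.
func decodeResult(payload []byte) (result, error) {
	var r result
	err := json.Unmarshal(payload, &r)
	return r, err
}

func main() {
	payload := []byte(`{"bucketSize":"3600000000000","buckets":[{"timestamp":"2016-03-20T10:57:37-04:00","average":"32","percentiles":{"95":"27"}}]}`)
	r, err := decodeResult(payload)
	if err != nil {
		panic(err)
	}
	fmt.Println(r.Buckets[0].Percentiles["95"]) // 27
}
```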

## Sink Support and Functionality

When Oldtimer receives a request, it will compose a query to the sink, send
the query to the sink, and then transform the results into the appropriate API
format. Note that Oldtimer is designed to retrieve information that was
originally written by Heapster itself: any information read by Oldtimer must
have been stored according to the Heapster storage schema.

Contributor comment:

Which sinks are you planning to support?

Contributor Author:

So, I'd definitely support Hawkular, and I'm also willing to write the initial support for other sinks that people think are important. I've tried to make sure all the features talked about in the proposal work across InfluxDB, Hawkular, GCM, and OpenTSDB (the main metrics sinks, AFAICT).

Contributor Author @DirectXMan12 (Apr 18, 2016):

The goal is to have a set of features that all the current metrics sinks support, and which any future metrics sinks seem likely to support (if you think I should trim off any features in favor of this goal, please let me know).

All computations, filtering, etc. should be performed in the sink; Oldtimer
should only compose queries. Ergo, the feature set of Oldtimer must represent
the lowest common denominator of the features supported by the sinks.
Oldtimer is meant to be an API for performing basic aggregations supported by
all of the sinks, not a general-purpose query tool.

At the time of writing of this proposal, the following sinks were considered:
Hawkular, InfluxDB, GCM, and OpenTSDB. However, the supported aggregations
are fairly basic, so any newly added sinks are likely to support the required
Oldtimer features.

## Scaling and Performance Considerations

Since Oldtimer itself does not store any data, it should have a fairly low
memory footprint. The current plan is to have Oldtimer run as part of the main
Heapster executable. However, in the future it may be advantageous to have the
ability to split Oldtimer out into a separate executable in order to scale it
independently of Heapster.

The metrics sinks themselves should already have clustering support, and thus
can be scaled if needed. Since Oldtimer queries the metrics sinks themselves,
response latency should depend mainly on how quickly the sinks can respond to
queries.