Proposal: Introduce Oldtimer

This is a proposal for Oldtimer, the Heapster historical metrics access component. Oldtimer was originally proposed in the vision statement, but was not previously specified in any detail.
kubernetes-retired · Apr 11, 2016 · 2cd3494
1 parent de510e4
Showing 1 changed file with 144 additions and 0 deletions.
diff --git a/docs/proposals/old-timer.md b/docs/proposals/old-timer.md
@@ -0,0 +1,144 @@
+# Heapster Oldtimer
+
+## Overview
+
+Prior to the Heapster refactor, the Heapster model presented aggregations of
+metrics over certain time periods (the last hour and day).  Post-refactor, the
+concern of presenting an interface for historical metrics was to be split into
+a separate Heapster component: Oldtimer.
+
+Oldtimer will present common interfaces for retrieving historical metrics over
+longer periods of time than the Heapster model, and will allow fetching
+aggregations of metrics (e.g. averages, 95th percentile, etc.) over different
+periods of time.  It will do this by querying the sink to which Heapster is
+storing metrics.
+
+Note: even though we are retrieving metrics, this document refers to the
+metrics storage locations as "sinks" to be consistent with the rest
+of Heapster.
+
+## Motivation
+
+There are two major motivations for exposing historical metrics information:
+
+1. Using aggregated historical data to make size-related decisions
+   (for example, idling requires looking for traffic over a long time period)
+
+2. Providing a common interface for users to view historical metrics
+
+Before the Heapster refactoring (see the "Heapster Long Term Vision" proposal),
+Heapster supported querying metrics aggregated over certain extended time
+periods (the last hour and day) via the Heapster model.
+
+However, since the Heapster model is stored in-memory, and not persisted to
+disk, this historical data would be "lost" whenever Heapster was restarted.
+This made it unreliable for use by system components which need a historical
+view.
+
+Since we already persist metrics into a sink, it does not make sense for
+Heapster to persist long-term metrics to disk itself.  Instead, we can
+just query the sink directly.
+
+## Design
+
+### API
+
+Oldtimer will present an API somewhat similar to the normal Heapster model.
+The URLs will take the forms:
+
+`/api/v1/old-timer{prefix}/metrics/`: Returns a list of all available metrics.
+
+`/api/v1/old-timer{prefix}/metrics/{metric-name}?start=X&end=Y`: Returns a set
+of (Timestamp, Value) pairs for the requested {prefix}-level metric, over the
+given time range.
+
+`/api/v1/old-timer{prefix}/metrics/{metric-name}/{aggregation-name}?start=X&end=Y&bucket=B`:
+Returns the requested {prefix}-level metric, aggregated with the given
+aggregation over the requested time period (potentially split into several
+buckets of duration `B`).  `{aggregation-name}` may be a comma-separated
+list of aggregations, to retrieve several at once.
+
+Where `{prefix}` is either empty (cluster-level), `/namespaces/{namespace}`
+(namespace-level), `/namespaces/{namespace}/pods/{pod-name}` (pod-level),
+`/namespaces/{namespace}/pod-list/{pod-list}` (multi-pod-level), or
+`/namespaces/{namespace}/pods/{pod-name}/containers/{container-name}`
+(container-level).
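As an illustration of how these URL forms compose, here is a minimal Go sketch. It assumes that `{prefix}` carries its own leading slash (as in the prefix definitions above), so it is appended directly to `/api/v1/old-timer`. The helpers `prefixFor`, `metricPath`, and `aggregationPath` are hypothetical names, `cpu-usage` is a placeholder metric name, and the multi-pod (`pod-list`) form is left out because its list syntax is not specified here.

```go
package main

import (
	"fmt"
	"strings"
)

// basePath is the API root named in this proposal.
const basePath = "/api/v1/old-timer"

// prefixFor builds the {prefix} part of the URL (hypothetical helper).
// An empty namespace selects cluster level; an empty pod selects namespace
// level; an empty container selects pod level.  Names are assumed URL-safe.
func prefixFor(namespace, pod, container string) string {
	if namespace == "" {
		return ""
	}
	prefix := "/namespaces/" + namespace
	if pod == "" {
		return prefix
	}
	prefix += "/pods/" + pod
	if container != "" {
		prefix += "/containers/" + container
	}
	return prefix
}

// metricPath is the path that returns raw (Timestamp, Value) pairs.
func metricPath(prefix, metric string) string {
	return basePath + prefix + "/metrics/" + metric
}

// aggregationPath is the path for one or more comma-separated aggregations.
func aggregationPath(prefix, metric string, aggregations ...string) string {
	return metricPath(prefix, metric) + "/" + strings.Join(aggregations, ",")
}

func main() {
	fmt.Println(metricPath(prefixFor("", "", ""), "cpu-usage"))
	fmt.Println(metricPath(prefixFor("default", "web-1", "nginx"), "cpu-usage"))
	fmt.Println(aggregationPath(prefixFor("default", "", ""), "cpu-usage", "average", "95-perc"))
}
```

The `start`, `end`, and `bucket` query parameters would be appended to these paths; they are omitted from the sketch.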
+
+In addition, when `{prefix}` is not empty, there will be a URL of the form
+`/api/v1/old-timer{prefix-without-final-element}`, which allows fetching the
+list of available nodes/namespaces/pods/containers.
+
+The `start` and `end` parameters are defined the same way as for the model.
+The `bucket` (bucket duration) parameter is a number followed by any of the
+following suffixes:
+
+- `ms`: milliseconds
+- `s`: seconds
+- `m`: minutes
+- `h`: hours
+- `d`: days
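The suffix list above differs from Go's `time.ParseDuration`, which has no `d` unit, so a small dedicated parser may be needed. The following is a sketch only: `parseBucket` is a hypothetical name, and it assumes the number is a non-negative integer, which the proposal does not state explicitly.

```go
package main

import (
	"fmt"
	"math"
	"strconv"
	"strings"
	"time"
)

// bucketUnits lists the supported suffixes.  "ms" comes first so that it is
// matched before the shorter "s" suffix.
var bucketUnits = []struct {
	suffix string
	unit   time.Duration
}{
	{"ms", time.Millisecond},
	{"s", time.Second},
	{"m", time.Minute},
	{"h", time.Hour},
	{"d", 24 * time.Hour},
}

// parseBucket converts a value such as "30s" or "2d" into a time.Duration.
// Assumption: the numeric part is a non-negative integer.
func parseBucket(s string) (time.Duration, error) {
	for _, u := range bucketUnits {
		if !strings.HasSuffix(s, u.suffix) {
			continue
		}
		n, err := strconv.ParseUint(strings.TrimSuffix(s, u.suffix), 10, 64)
		if err != nil {
			return 0, fmt.Errorf("bucket %q: invalid number: %v", s, err)
		}
		// Reject values that would overflow time.Duration (an int64 of nanoseconds).
		if n > uint64(math.MaxInt64/int64(u.unit)) {
			return 0, fmt.Errorf("bucket %q: duration too large", s)
		}
		return time.Duration(n) * u.unit, nil
	}
	return 0, fmt.Errorf("bucket %q: unknown suffix", s)
}

func main() {
	for _, in := range []string{"500ms", "90s", "15m", "6h", "2d", "10w"} {
		d, err := parseBucket(in)
		fmt.Println(in, d, err)
	}
}
```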
+
+### Functionality
+
+When Oldtimer receives a request at one of the given URLs, it will compose a
+query to the configured metrics sink, execute that query, and return the
+results.  The return format for normal requests will be the same as that
+returned by the Heapster model.
+
+In the case of aggregations, the normal `MetricResult` and `MetricResultList`
+are wrapped in order to differentiate between different aggregations.  Each
+metric point represents one bucket (if no buckets are requested, only one point
+is returned).  The timestamp in the case of aggregations is the timestamp of
+the start of that bucket.
+
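+```go
+// MetricAggregationResult wraps one MetricResult per aggregation of a single
+// metric.  Each point in a MetricResult represents one bucket.
+type MetricAggregationResult struct {
+    Average     *MetricResult
+    Maximum     *MetricResult
+    Minimum     *MetricResult
+    Median      *MetricResult
+    Count       *MetricResult
+    // Percentiles is keyed by the requested percentile number.
+    Percentiles map[uint64]MetricResult
+}
+
+// MetricListAggregationResult is the MetricResultList counterpart of
+// MetricAggregationResult.
+type MetricListAggregationResult struct {
+    Average     *MetricResultList
+    Maximum     *MetricResultList
+    Minimum     *MetricResultList
+    Median      *MetricResultList
+    Count       *MetricResultList
+    // Percentiles is keyed by the requested percentile number.
+    Percentiles map[uint64]MetricResultList
+}
+```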
+
+### Aggregations
+
+Several different aggregations will be supported.  Aggregations should be
+performed in the metrics sink.  If more aggregations later become supported
+across all metrics sinks, the list can be expanded (and the API version
+should probably be bumped, since the supported aggregations should be part of
+the API).
+
+- Average (arithmetic mean): `/{metric-name}/average`
+- Maximum: `/{metric-name}/max`
+- Minimum: `/{metric-name}/min`
+- Percentile: `/{metric-name}/{number}-perc`
+- Median: `/{metric-name}/median`
+- Count: `/{metric-name}/count`
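A possible way to parse the `{aggregation-name}` path element, including the comma-separated form and the `{number}-perc` pattern, is sketched below. The `aggregation` type and `parseAggregations` function are hypothetical, and the sketch assumes the percentile number is an unsigned integer, matching the `map[uint64]` key used for percentiles.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// aggregation is a hypothetical parsed form of one aggregation name.
type aggregation struct {
	Name       string // "average", "max", "min", "median", "count", or "percentile"
	Percentile uint64 // set only when Name is "percentile"
}

// parseAggregations parses a comma-separated list such as "average,95-perc".
func parseAggregations(s string) ([]aggregation, error) {
	var out []aggregation
	for _, name := range strings.Split(s, ",") {
		switch name {
		case "average", "max", "min", "median", "count":
			out = append(out, aggregation{Name: name})
		default:
			if !strings.HasSuffix(name, "-perc") {
				return nil, fmt.Errorf("unknown aggregation %q", name)
			}
			p, err := strconv.ParseUint(strings.TrimSuffix(name, "-perc"), 10, 64)
			if err != nil {
				return nil, fmt.Errorf("invalid percentile in %q: %v", name, err)
			}
			out = append(out, aggregation{Name: "percentile", Percentile: p})
		}
	}
	return out, nil
}

func main() {
	aggs, err := parseAggregations("average,max,95-perc")
	fmt.Println(aggs, err)
	_, err = parseAggregations("mode")
	fmt.Println(err)
}
```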
+
+## Scaling and Performance Considerations
+
+Since Oldtimer itself does not store any data, it should be fairly easy to
+deploy multiple replicas of Oldtimer.  The metrics sinks themselves should
+already have clustering support, and thus can be scaled as well.  Since
+Oldtimer queries the metrics sinks themselves, response latency should
+depend mainly on how quickly the sinks can respond to queries.
+
+## Open Questions
+
+- Does the choice of percentiles need to be limited?  InfluxDB and Hawkular
+appear to support arbitrary percentile values in queries, while GCM v3 appears
+to support 99, 95, 50, 5, and OpenTSDB appears to support 50, 75, 90, 95, 99,
+and 999 (meaning the common values would be 50, 95, and 99).
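The "common values" in this question are the intersection of the percentile sets each restricted sink appears to support. The short Go sketch below reproduces that reasoning from the values quoted above; sinks that accept arbitrary percentiles (InfluxDB, Hawkular) place no constraint and are left out. `commonPercentiles` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"sort"
)

// commonPercentiles returns the sorted values present in every given set.
func commonPercentiles(sets ...[]uint64) []uint64 {
	if len(sets) == 0 {
		return nil
	}
	common := map[uint64]bool{}
	for _, p := range sets[0] {
		common[p] = true
	}
	for _, set := range sets[1:] {
		next := map[uint64]bool{}
		for _, p := range set {
			if common[p] {
				next[p] = true
			}
		}
		common = next
	}
	out := make([]uint64, 0, len(common))
	for p := range common {
		out = append(out, p)
	}
	sort.Slice(out, func(i, j int) bool { return out[i] < out[j] })
	return out
}

func main() {
	// Values as quoted in the open question above.
	gcm := []uint64{99, 95, 50, 5}
	openTSDB := []uint64{50, 75, 90, 95, 99, 999}
	fmt.Println(commonPercentiles(gcm, openTSDB)) // [50 95 99]
}
```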
+
+