This repository has been archived by the owner on Dec 1, 2018. It is now read-only.

Heapster long term vision #769

Merged · 1 commit · Feb 2, 2016

Conversation

@mwielgus (Contributor) commented Dec 8, 2015

No description provided.

@piosz (Contributor) commented Dec 8, 2015

with 1h resolution), keeps them in memory and exposes via Heapster API. This API is mainly
used by Horizontal Pod Autoscaler which asks for the most recent performance related
metrics to adjust the number of pods to the incoming traffic. The API is also used by KubeDash
(which unfortunately didn’t get enough traction and will be replaced) and will be used
Contributor: This statement is vague. We as kube developers haven't put enough energy into making KubeDash primetime-ready. Why throw away a good piece of software when no alternative exists?

Contributor: Agree. I would just remove the bit about KubeDash.

Author: OK, I can rephrase this sentence.

Author: It was not my decision to start yet another UI, but this is the situation right now. Kubernetes Dashboard is actively developed (https://github.com/kubernetes/dashboard/graphs/contributors) by Google and Fujitsu, has a working prototype, and will probably be delivered for 1.2 to some extent. On the other hand, there is KubeDash, which has been "stable" since early October. It has some very specific requirements, like the 1-day-long CPU usage average, which may or may not be relevant once Kubernetes Dashboard becomes the default/main Kubernetes user interface.

piosz assigned fgrzadkowski and unassigned piosz on Dec 8, 2015

* [UC1] Read metrics from nodes and write them to an external storage.
* [UC2] Expose metrics from the last 2-3 minutes (for HPA and GKE)
* [UC3] Read Events from the API server and write them to a permanent storage
Contributor: What is the reason that event storage is part of heapster rather than a separate tool?

Contributor: Ah I see the Eventer below.

Author: Yep, right now metrics and events are combined into one tool but we are planning to split them.

Contributor: The original plan was to combine events and metrics data and build more interesting signals for end users. I guess even if we split heapster into separate binaries, if we want to build such models, we will have to combine the data somewhere else.

Author: AFAIK there are no immediate plans for any event/metrics combining. Once we decide to do it we can revisit this item and decide what is best:

* having a separate component
* having a sink that also listens to Kubernetes events
* gluing the two binaries back together (unlikely)

If you want to discuss this now please schedule a VC.

Comment: Events are too heavy IMO; I would vote not to try to amalgamate, but leave that to back-end systems to ETL and learn from operational data.
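
Since the thread above settles on a standalone Eventer for UC3, here is a minimal sketch of the shape such a component could take: watch Events from the API server and forward them to a pluggable sink. This is illustrative only and is written against the modern client-go API (which postdates this discussion); `EventSink` and `watchEvents` are hypothetical names, not Heapster's actual types.

```go
package main

import (
	"context"
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// EventSink is a hypothetical interface an eventer would write to; concrete
// implementations could target InfluxDB, Kafka, ElasticSearch, and so on.
type EventSink interface {
	ExportEvents(events []*corev1.Event) error
}

// watchEvents streams cluster Events from the API server into the sink,
// which is the UC3 loop discussed above.
func watchEvents(ctx context.Context, client kubernetes.Interface, sink EventSink) error {
	w, err := client.CoreV1().Events(metav1.NamespaceAll).Watch(ctx, metav1.ListOptions{})
	if err != nil {
		return fmt.Errorf("starting event watch: %w", err)
	}
	defer w.Stop()

	for update := range w.ResultChan() {
		if ev, ok := update.Object.(*corev1.Event); ok {
			if err := sink.ExportEvents([]*corev1.Event{ev}); err != nil {
				return err
			}
		}
	}
	return nil
}
```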

* [UC1] Read metrics from nodes and write them to an external storage.
* [UC2] Expose metrics from the last 2-3 minutes (for HPA and GKE)
* [UC3] Read Events from the API server and write them to a permanent storage
* [UC4] Do some long-term (hours, days) metrics analysis to get stats (average, 95 percentile)
Contributor: The "max" value is also important here, because it lets you see if there was any activity at all (not particularly useful for things like CPU and memory, but for net, or certain custom metrics like hits per second, it could be used to determine if the pod was useful)

Contributor: Currently, max, avg and 95%ile are made available for the last minute, hour and day.

Author: Yep, max will be there too.
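
For concreteness, here is a minimal sketch of the window aggregation being discussed: max, average, and 95th percentile over one window of samples (e.g. the last minute, hour, or day). Names and structure are illustrative, not Heapster's actual aggregator.

```go
package stats

import (
	"math"
	"sort"
)

// WindowStats holds the aggregates discussed above for one metric window.
type WindowStats struct {
	Max, Avg, P95 float64
}

// Aggregate computes max, average, and the 95th percentile of samples
// (e.g. per-minute CPU usage values over the last hour or day).
func Aggregate(samples []float64) WindowStats {
	if len(samples) == 0 {
		return WindowStats{}
	}
	sorted := append([]float64(nil), samples...) // copy so the caller's slice stays unsorted
	sort.Float64s(sorted)

	sum := 0.0
	for _, s := range sorted {
		sum += s
	}
	// Nearest-rank percentile: index ceil(0.95*n) - 1.
	rank := int(math.Ceil(0.95*float64(len(sorted)))) - 1
	return WindowStats{
		Max: sorted[len(sorted)-1],
		Avg: sum / float64(len(sorted)),
		P95: sorted[rank],
	}
}
```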

@piosz (Contributor) commented Dec 10, 2015

cc @bryk

etc. present in the system.

There is also a HeapsterGKE API dedicated for GKE through which it’s possible to get a full
dump of all metrics (spanning last minute or two).
Contributor: Last two minutes as of now, to be specific.
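
Because the dump only spans the last two minutes, a consumer has to poll at least that often or it will drop samples. A minimal sketch of such a poller follows; the /api/v1/metric-export path is an assumption for illustration, not confirmed by this thread.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"time"
)

// pollDump fetches the full metrics dump on a fixed interval. Keeping the
// interval well under the two-minute window avoids gaps between dumps.
func pollDump(baseURL string, interval time.Duration) {
	for range time.Tick(interval) {
		resp, err := http.Get(baseURL + "/api/v1/metric-export") // assumed path
		if err != nil {
			fmt.Println("fetch failed:", err)
			continue
		}
		body, err := io.ReadAll(resp.Body)
		resp.Body.Close()
		if err != nil {
			fmt.Println("read failed:", err)
			continue
		}
		fmt.Printf("fetched %d bytes of metrics\n", len(body))
	}
}
```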

@bryk commented Dec 14, 2015

CC @joeatwork

@k8s-bot commented Dec 14, 2015

Can one of the admins verify that this patch is reasonable to test? (reply "ok to test", or if you trust the user, reply "add to whitelist")

If this message is too spammy, please complain to ixdy.

@ncdc commented Dec 17, 2015

cc @kubernetes/rh-cluster-infra @jeremyeder @timothysc @smarterclayton @mwringe

@ncdc commented Dec 17, 2015

cc @kubernetes/rh-scalability

with 1h resolution), keeps them in memory and exposes via Heapster API. This API is mainly
used by Horizontal Pod Autoscaler which asks for the most recent performance related
metrics to adjust the number of pods to the incoming traffic. The API is also used by KubeDash
and will be used by the new UI (which will replace KubeDash) as well.
Contributor: Can you add a link to an issue that tracks the replacement? We should place that issue in the kubedash repo as well to make it clear for existing kubedash users.
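
As background on the HPA usage mentioned in the excerpt: the autoscaler's core rule is proportional, scaling the replica count by the ratio of observed to target utilization computed from the recent metrics it reads from Heapster. A simplified sketch of that rule (the real controller adds bounds, tolerances, and cooldowns):

```go
package main

import (
	"fmt"
	"math"
)

// desiredReplicas applies the proportional scaling rule: grow or shrink the
// replica count by the ratio of observed to target utilization, rounding up.
func desiredReplicas(current int, observed, target float64) int {
	if current <= 0 || target <= 0 {
		return current
	}
	return int(math.Ceil(float64(current) * observed / target))
}

func main() {
	// 4 replicas at 90% observed CPU against a 60% target -> ceil(4*1.5) = 6.
	fmt.Println(desiredReplicas(4, 0.90, 0.60))
}
```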


## Custom Metrics Status

Heapster is not a generic solution for gathering arbitrary number of arbitrary-formated custom
Contributor: Why not? What if custom metrics are required for the GKE pipeline in the near future?

Author: Because we cannot scale enough to claim support for an arbitrary number of custom metrics. 100+ metrics per pod, with 1000 nodes and 30 pods on each node, works out to 1000 × 30 × 100 = 3 million concurrently tracked metric series; that would probably require a sharded/clustered Heapster, for which we will likely not get "time budget" anytime soon.

Contributor: You'd likely get more mileage out of contributing to existing projects, e.g. Prometheus, if this became a requirement.

Contributor: If we assume that none of those custom metrics are cached, do we still expect a huge resource impact from proxying metrics over to a sink?
@jimmidyson: What you suggest is certainly an alternative. Aside from simple aggregation, I'm not suggesting any additional features. Prometheus is a monitoring system by itself, whereas heapster is only an aggregation agent.

@mwielgus mentioned this pull request on Dec 23, 2015
(with support for CoreOS Fleet and flat file node lists).

Metrics collected by Heapster can be written into multiple kinds of storage - Influxdb,
OpenTSDB, Google Cloud Monitoring, Hawkular, Kaflka, Riemann, ElasticSearch (some of them are

Comment: s/Kaflka/Kafka

Author: Done.
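
The multi-backend list above is what motivates Heapster's pluggable sink design: each backend implements one small interface, so adding a storage backend does not touch the collection pipeline. A simplified sketch of that abstraction (type names approximate Heapster's core package but are not its exact API):

```go
package core

// MetricsBatch is a simplified stand-in for the batch of scraped metrics
// Heapster pushes to each configured sink.
type MetricsBatch struct {
	TimestampUnix int64
	MetricSets    map[string]map[string]float64 // entity key -> metric name -> value
}

// DataSink is a sketch of the pluggable sink abstraction: InfluxDB, OpenTSDB,
// Google Cloud Monitoring, Hawkular, Kafka, Riemann, and ElasticSearch would
// each provide an implementation of this same small interface.
type DataSink interface {
	Name() string
	ExportData(batch *MetricsBatch)
	Stop()
}
```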

@timothysc commented

re: events, this overlaps a lot with work folks want to do to enable direct sharding of data to kafka.

kubernetes/kubernetes#19637

@mwielgus (Author) commented Feb 2, 2016

re: kafka, thanks for the heads up.

@mwielgus (Author) commented Feb 2, 2016

Merging the proposal as is. Most of the proposal is already implemented in the heapster-scalability branch. For the remaining items I'm happy to set up separate issues/documents or have a VC. Please let me know if you feel a strong need to discuss a particular case.

@k8s-bot commented Feb 2, 2016

Jenkins GCE e2e

Build/test passed for commit c0d82aa.

mwielgus added a commit that referenced this pull request Feb 2, 2016
Heapster long term vision
mwielgus merged commit 0e06b6b into kubernetes-retired:master on Feb 2, 2016