Fetches control group resource information #10402
tylersmalley merged 10 commits into elastic:master
Test failure looks legitimate. I'm guessing we need to wait for the promise to resolve on the previous status tests.
do we need to update this to match? elastic/elasticsearch@00a8b87
is it possible to check for the files once, and if they don't exist stop checking?
Now preventing lookups when we determine there is no cgroups information. 655f54b
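A minimal sketch of the "check once, then stop" idea discussed above. The names `createCgroupReader` and `readCgroupFile` are hypothetical and the file reader is passed in, so this is an illustration under those assumptions rather than the code in 655f54b.

```javascript
// Hypothetical sketch: stop looking for cgroup files after the first miss.
// `readFile` is injected so the idea can be exercised without real cgroup files.
function createCgroupReader(readFile) {
  let unavailable = false; // flips to true once we learn there is no cgroup info

  return async function readCgroupFile(path) {
    if (unavailable) {
      return null; // skip the lookup entirely on later calls
    }
    try {
      return await readFile(path, 'utf8');
    } catch (err) {
      if (err.code === 'ENOENT') {
        unavailable = true; // file is missing: remember that and stop checking
        return null;
      }
      throw err; // anything else is unexpected
    }
  };
}
```

In real use the injected reader could be Node's `fs.promises.readFile`. Keeping the flag in a closure means each reader instance decides independently whether cgroup data exists.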
tsullivan left a comment
Looks like this grabs the cgroup info from /sys/fs/cgroup. Unfortunately, I think this does need to be tested from inside a Docker container, as the ES team found there are inconsistencies in where Docker publishes the cgroup statistics: elastic/elasticsearch#22757
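For context on why the path matters: on cgroup v1, each line of /proc/self/cgroup has the form `<id>:<controllers>:<path>`, and that path is what gets appended under /sys/fs/cgroup. The sketch below shows one way to parse it. The helper name `parseProcSelfCgroup` and the sample contents are assumptions for illustration, not taken from the PR.

```javascript
// Hypothetical helper: map each cgroup v1 controller to its control group path.
// Each line of /proc/self/cgroup looks like "<id>:<controllers>:<path>".
function parseProcSelfCgroup(contents) {
  const groups = {};
  for (const line of contents.split('\n')) {
    const parts = line.trim().split(':');
    if (parts.length < 3) continue; // skip blank or malformed lines
    const controllers = parts[1].split(',');
    const path = parts.slice(2).join(':'); // the path itself may contain ':'
    for (const controller of controllers) {
      if (controller) groups[controller] = path;
    }
  }
  return groups;
}

// Made-up sample resembling what a process inside a container might see.
const sample = '4:cpu,cpuacct:/docker/abc123\n3:memory:/docker/abc123\n';
console.log(parseProcSelfCgroup(sample).cpu); // "/docker/abc123"
```

Inside a container the reported path (here `/docker/abc123`) may not exist under the mounted hierarchy, which is the inconsistency the linked Elasticsearch issue describes.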
1ae1a6a to ac70577
@tsullivan - I pushed a fix for the Docker cgroup path. Here is a Docker image on Docker Hub to help with testing: tylersmalley/kibana:6.0.0-alpha1-SNAPSHOT-1ae1a6a
You can limit the CPU to expose throttled metrics by passing
And here are snapshots of the PR:
When running, you can apply cgroup constraints to a process/group. To do so, run
I would recommend using the radix parameter of parseInt, because without it behavior could be unpredictable.
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/parseInt
radix parameter on these parseInts
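A short illustration of the radix concern, using made-up input strings rather than values from the PR:

```javascript
// Without a radix, parseInt infers the base from the string's prefix.
console.log(parseInt('0x10'));     // 16: the "0x" prefix is read as hexadecimal
console.log(parseInt('0x10', 10)); // 0: decimal parsing stops at the "x"

// cgroup files hold plain decimal integers, so passing 10 makes the intent explicit.
const quota = parseInt('-1\n', 10);
console.log(quota); // -1
```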
Thanks for providing the Docker images to test with - that was extremely helpful. I left just one comment about something that popped up in a few places. I'm surprised that radix for parseInt thing isn't part of our lint rules 😺
Rebasing should fix the test failures.
do you think there's any need for the path to be configurable? ie instead of a bool, cpu.cgroup.path = Joi.string()
These configuration options align with logstash/ES outlined here: elastic/logstash#6797 (comment)
👍 for consistency. it's not clear to me from the thread if override is meant to be a bool, the current (eventually deprecated) implementation in es looks like a string. if i'm missing something feel free to ignore.
You're right, good find. https://github.com/elastic/elasticsearch-docker/blob/master/build/elasticsearch/bin/es-docker#L37
Will update
how would you feel about logging this error instead of throwing? if something comes up we can still return non-cgroup stats
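A rough sketch of what "log instead of throw" could look like. `collectMetrics`, `readCgroups`, and `log` are hypothetical stand-ins, not the PR's actual functions.

```javascript
// Hypothetical sketch: cgroup failures are logged and the base metrics are still returned.
async function collectMetrics(baseMetrics, readCgroups, log) {
  try {
    const cgroup = await readCgroups();
    // Only attach the cgroup section when there is something to report.
    return cgroup ? { ...baseMetrics, cgroup } : baseMetrics;
  } catch (err) {
    log(`cgroup metrics unavailable: ${err.message}`);
    return baseMetrics; // fall back to the non-cgroup stats
  }
}
```

The trade-off is that a persistent cgroup problem shows up only in the logs, so the message should be specific enough to act on.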
Signed-off-by: Tyler Smalley <tyler.smalley@elastic.co>
This was addressed in elastic/elasticsearch#23219 Signed-off-by: Tyler Smalley <tyler.smalley@elastic.co>
Signed-off-by: Tyler Smalley <tyler.smalley@elastic.co>
Signed-off-by: Tyler Smalley <tyler.smalley@elastic.co>
Docker cgroups are mounted in the wrong place (i.e., inconsistently with /proc/self/cgroup). This commit adds an undocumented hack for working around this, for now. Signed-off-by: Tyler Smalley <tyler.smalley@elastic.co>
Signed-off-by: Tyler Smalley <tyler.smalley@elastic.co>
c61e7fd to 3d1caca
Signed-off-by: Tyler Smalley <tyler.smalley@elastic.co>
5812d7b to 71a60b1
Addressed feedback and pushed new builds for testing:
DockerHub: tylersmalley/kibana:6.0.0-alpha1-SNAPSHOT-5812d7b
Example:
jbudz left a comment
Nice. One inline question below. And can we add docs for the new configuration values?
Otherwise LGTM:
cgroup: {
  cpuacct: {
    control_group: "/",
    usage_nanos: 42023428539
  },
  cpu: {
    control_group: "/",
    cfs_period_micros: 100000,
    cfs_quota_micros: -1,
    stat: {
      number_of_elapsed_periods: 0,
      number_of_times_throttled: 0,
      time_throttled_nanos: 0
    }
  }
}
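To show where the `stat` values could come from: on cgroup v1 the cpu.stat file lists `nr_periods`, `nr_throttled`, and `throttled_time`, one per line. The sketch below maps those onto the field names in the output above. The mapping and the `parseCpuStat` helper are assumptions for illustration, not the PR's code.

```javascript
// Hypothetical sketch: turn the text of a cgroup v1 cpu.stat file into a `stat` object.
const CPU_STAT_FIELDS = new Map([
  ['nr_periods', 'number_of_elapsed_periods'],
  ['nr_throttled', 'number_of_times_throttled'],
  ['throttled_time', 'time_throttled_nanos'],
]);

function parseCpuStat(contents) {
  const stat = {};
  for (const line of contents.split('\n')) {
    const [name, value] = line.trim().split(/\s+/);
    const field = CPU_STAT_FIELDS.get(name);
    if (field) stat[field] = parseInt(value, 10); // explicit radix: values are decimal
  }
  return stat;
}

console.log(parseCpuStat('nr_periods 0\nnr_throttled 0\nthrottled_time 0\n'));
```

Unknown lines are ignored, so extra fields in the file would not break parsing.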
-export function getMetrics({ event, config }) {
+export async function getMetrics(event, config, server) {
   const port = config.get('server.port');
   const timestamp = new Date().toISOString();
do you think we should set timestamp after cgroup stats have been read? along the same lines, do you think we should only getMetrics once the previous getMetrics call is done so they don't queue up for frequent ops events?
Having it right at the ops event is probably best as it's actually involved in aggregations (request count, etc). Capturing cgroups is also a very quick operation in that it's reading most of the files in parallel and they aren't actually on disk. We should revisit when we add CPU percentage.
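The parallel reads mentioned above might look roughly like the following. `readAll` and its injectable `readFile` parameter are hypothetical, so treat this as a sketch of the pattern, not the PR's implementation.

```javascript
const fs = require('fs');

// Hypothetical sketch: read several cgroup files concurrently with Promise.all.
// `readFile` defaults to Node's promise-based reader and can be swapped for testing.
async function readAll(paths, readFile = fs.promises.readFile) {
  // All reads start immediately, and the await resolves once every one has finished.
  const contents = await Promise.all(paths.map((p) => readFile(p, 'utf8')));
  return Object.fromEntries(paths.map((p, i) => [p, contents[i].trim()]));
}
```

Because `Promise.all` rejects as soon as any read fails, a caller that wants partial results would need to handle errors per file instead.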
Signed-off-by: Tyler Smalley <tyler.smalley@elastic.co>
Signed-off-by: Tyler Smalley <tyler.smalley@elastic.co>
@jbudz documented configuration options - placed near existing status config.
@tsullivan - mind giving another look?
pickypg left a comment
I tested this successfully on Windows (no Cgroup data, but it didn't explode either). I just have some questions about the settings.
The minimum value is 100.
`status.allowAnonymous`:: *Default: false* If authentication is enabled, setting this to `true` allows
unauthenticated users to access the Kibana server status API and status page.
`cpu.cgroup.path.override`:: Override for cgroup cpu path
Could we show an example of this being overridden?
cpu: Joi.object({
  cgroup: Joi.object({
    path: Joi.object({
I don't feel like we normally specify "override" as the field to override, rather just cpu.cgroup.path would be set or unset (default). Specifically I'm looking at examples like server.basePath, plugins.paths, and path.data.
The name was discussed here for Logstash/ES. While I agree we could omit "override" it maintains consistency with the other projects.
Ah that's unfortunate, but I guess it's intentional to really show that you're going against defaults.
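As an illustration of what overriding the path could mean in practice, here is a sketch that builds the location of a stats file from either the detected control group or an override. Only the setting name `cpu.cgroup.path.override` comes from this thread. The `cpuStatPath` helper, the `/sys/fs/cgroup/cpu` mount point, and the sample paths are assumptions.

```javascript
// Hypothetical sketch of how an override could feed into the stats file path.
function cpuStatPath(controlGroup, override) {
  // An explicit override wins over the group detected from /proc/self/cgroup.
  const group = override !== undefined ? override : controlGroup;
  // Join segment by segment so "/" and "/docker/abc123" both yield a clean absolute path.
  return ['/sys/fs/cgroup/cpu', ...group.split('/').filter(Boolean), 'cpu.stat'].join('/');
}

console.log(cpuStatPath('/docker/abc123'));      // /sys/fs/cgroup/cpu/docker/abc123/cpu.stat
console.log(cpuStatPath('/docker/abc123', '/')); // /sys/fs/cgroup/cpu/cpu.stat
```

With an override of `/`, the lookup ignores the container-specific path, which matches the Docker workaround described earlier in the thread.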
Signed-off-by: Tyler Smalley <tyler.smalley@elastic.co>
Adds control group data to status API and kbnServer.metrics
5.x: b8bb212

These are made available to the status API (api/status) and kbnServer.metrics.