Skip to content

[Monitoring/Beats] Telemetry Data from Beats#18833

Merged
tsullivan merged 5 commits intoelastic:masterfrom
tsullivan:telem/add-beats
May 8, 2018
Merged

[Monitoring/Beats] Telemetry Data from Beats#18833
tsullivan merged 5 commits intoelastic:masterfrom
tsullivan:telem/add-beats

Conversation

@tsullivan
Copy link
Member

@tsullivan tsullivan commented May 4, 2018

This adds anonymous Beats statistics found in the Monitoring data to the telemetry payload.

To test:

  1. Collect some Beats Monitoring data
  2. x-pack$ node scripts/api_debug.js telemetry --host=https://<kibana_host>:5601 --auth=username:password | jq '.[].stack_stats | { beats: .beats }'
    
  3. Example output:
    {
      "beats": {
        "count": 2,
        "versions": {
          "7.0.0-alpha1": 2
        },
        "types": {
          "metricbeat": 1,
          "filebeat": 1
        },
        "outputs": {
          "elasticsearch": 2
        },
        "eventsPublished": 1020,
        "hosts": 1
      }
    }
    

NOTE: module stats are still in-progress on the Beats collection side

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NOTE: it's possible to filter apm-server out of the search, but I figure it's useful for telemetry

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably filter it out since APM Server isn't really supported by monitoring and it will almost certainly have different metrics. If we change our minds later, it's easier to remove the filter than to add it and adjust everything.

Copy link
Member Author

@tsullivan tsullivan May 7, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

6d010fb - I just added the filter to the query, not updated the mock data for the unit test

@elasticmachine
Copy link
Contributor

💚 Build Succeeded

Copy link
Member

@pickypg pickypg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code looks great, but I think we can optimize for the expensive scenarios.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably filter it out since APM Server isn't really supported by monitoring and it will almost certainly have different metrics. If we change our minds later, it's easier to remove the filter than to add it and adjust everything.


const results = await callCluster('search', params);

const hitsLength = get(results, 'hits.hits.length', -1);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When there is a UI involved, the easiest way to determine if you need to get the next page when you cannot trust the total size is to over-request. So in the request, request HITS_SIZE + 1. Then just process HITS_SIZE and, if the + 1 exists, then request the next page and repeat.

But in this case, there's no extra pain because either way we're stuck requesting that next page -- it may come back empty though. We should default this value to 0 and ignore the whole results payload in that case.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const results = await callCluster('search', params);

const hitsLength = get(results, 'hits.hits.length', -1);
resultSet.push(results);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should pre-process each payload before saving it to minimize the amount of data that we maintain in the heap during this process. Otherwise we will have every unique instance sitting in heap until the end.

Perhaps handleBeatsResultSet could be augmented to have the clusters object passed in so that it can just pickup where it left off while also only expecting a single result set instead of the paged array?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. Are you thinking we pass in results from a single query as well as the clusters object to handleBeatsResultSet, which mutates clusters to have new processed stat data? The object mutation wouldn't worry me since the object's lifecycle exists just in this lib file.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactly!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@elasticmachine
Copy link
Contributor

💚 Build Succeeded

Copy link
Member

@pickypg pickypg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@ruflin
Copy link
Contributor

ruflin commented May 8, 2018

The issue to track the progress on getting these metrics from Beats is here: elastic/beats#7027

@tsullivan
Copy link
Member Author

Thank you!

@tsullivan tsullivan merged commit 698dfa4 into elastic:master May 8, 2018
@tsullivan tsullivan deleted the telem/add-beats branch May 8, 2018 16:19
tsullivan added a commit to tsullivan/kibana that referenced this pull request May 8, 2018
* [Monitoring/Beats] Telemetry Data from Beats

* filter apm-server

* ignore results payload if hitsLength === 0

* process each payload as stats are saved to clusters object
@tsullivan
Copy link
Member Author

6.x/6.4: #18922

tsullivan added a commit that referenced this pull request May 9, 2018
* [Monitoring/Beats] Telemetry Data from Beats

* filter apm-server

* ignore results payload if hitsLength === 0

* process each payload as stats are saved to clusters object
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants