[Logstash] [Stack Monitoring] Add new metric to identify older version pipelines by justinkambic · Pull Request #34487 · elastic/kibana

justinkambic · 2019-04-03T20:46:02Z

EDIT - Adding testing instructions:

Testing

Assumptions: running this Kibana patch and a snapshot of ES 6.7.

1. Configure Logstash 5.6 to send monitoring data. Start it up with any pipeline, like ./logstash -e 'input { stdin { } } output { stdout { } }'
   - Before you run, execute ./bin/logstash-plugin install x-pack.
2. Load the Monitoring Overview page and wait for Logstash data to be visualized.
   - At this point you should see the first two boxes, Overview and Nodes, but no Pipelines box.
3. Configure Logstash 6.4+ (I used 6.7.0) to send monitoring data, and start it up like above.
   - Once this instance starts logging to a monitoring index, you should see the Pipelines box appear. It will display 1 pipeline even though there are two, for the reasons discussed in the comments below. It seems like we're all OK with this as a limitation of older versions of Logstash.

Summary

Resolves #24279.

As noted in the issue referenced above, older versions of Logstash (pre-6.4.0) will not display in Monitoring's Overview page unless a newer version of LS is also running. The reason this happens is also explained in comments on that issue.

The goal here is to add a second metric that will detect monitoring documents from older Logstash instances. Essentially, if there are any instances without a pipeline.id field, we would treat that as a case where we should ignore the special logic that runs to hide LS until the monitoring index has accrued a sufficient amount of data.

Checklist

Use ~~strikethroughs~~ to remove checklist items you don't feel are applicable to this PR.

This was checked for cross-browser compatibility, including a check against IE11
Any text added follows EUI's writing guidelines, uses sentence case text and includes i18n support
Documentation was added for features that require explanation or tutorials
Unit or functional tests were updated or added to match the most common scenarios
This was checked for keyboard-only and screenreader accessibility

For maintainers

This was checked for breaking API changes and was labeled appropriately
This includes a feature addition or change that requires a release note and was labeled appropriately

elasticmachine · 2019-04-03T20:46:05Z

Pinging @elastic/stack-monitoring

elasticmachine · 2019-04-03T21:35:57Z

💔 Build Failed

continuous-integration/kibana-ci/pull-request

elasticmachine · 2019-04-04T15:47:21Z

💔 Build Failed

continuous-integration/kibana-ci/pull-request

elasticmachine · 2019-04-04T17:21:53Z

💔 Build Failed

continuous-integration/kibana-ci/pull-request

chrisronline · 2019-04-05T01:51:15Z

Hey @justinkambic, Do you mind adding some testing steps here?

justinkambic · 2019-04-05T03:09:04Z

@chrisronline if you look at the test that failed CI you'll see that we're not counting things correctly using this method. The missing aggregation doesn't seem to catch things in the way I thought it would, likely because it's trying to look at logstash_stats.pipelines, which is an array field and seems to register as missing even if it's present.

I'm thinking instead it may be best to mimic the existing pipelines query utilizing nested and catch Logstashes that have a pipelines agg with 0 doc_count like below:

You can see the doc_count for the 5.6 instance is registering 0. Let me know if this sounds good to you.
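
A rough sketch of that idea (not the query from this PR), assuming a hypothetical per-node terms aggregation with a nested sub-aggregation on logstash_stats.pipelines. The agg names by_node and pipelines, the field logstash_stats.logstash.uuid, and the helper nodesWithoutPipelines are all made up for illustration:

```javascript
// Hypothetical aggregation body: one bucket per Logstash node, each with a
// nested sub-aggregation that counts that node's pipeline documents.
// The field name and the agg names (by_node, pipelines) are assumptions.
const aggs = {
  by_node: {
    terms: { field: 'logstash_stats.logstash.uuid' },
    aggs: {
      pipelines: {
        nested: { path: 'logstash_stats.pipelines' }
      }
    }
  }
};

// Return the keys of node buckets that reported no nested pipeline documents.
function nodesWithoutPipelines(aggregations) {
  const buckets = (aggregations.by_node && aggregations.by_node.buckets) || [];
  return buckets
    .filter(bucket => !bucket.pipelines || bucket.pipelines.doc_count === 0)
    .map(bucket => bucket.key);
}

// Example response shape: the second node stands in for a 5.6 instance.
const sampleAggregations = {
  by_node: {
    buckets: [
      { key: 'node-6.7', doc_count: 10, pipelines: { doc_count: 10 } },
      { key: 'node-5.6', doc_count: 10, pipelines: { doc_count: 0 } }
    ]
  }
};

console.log(nodesWithoutPipelines(sampleAggregations)); // [ 'node-5.6' ]
```

Only nodes whose nested pipelines count is 0 are returned, which is the signal described above for older instances.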

chrisronline · 2019-04-05T14:27:17Z

I'm honestly not sure the best way of detecting this. This seems okay to me, but maybe @ycombinator has some thoughts

ycombinator · 2019-04-05T15:59:23Z

Yes, I think you're running into elastic/elasticsearch#9571 with trying to get the missing agg working on a nested field.

I think what @justinkambic proposed above should work but I want to spend some more time playing with the aggs.

justinkambic · 2019-04-05T16:00:54Z

> I want to spend some more time playing with the aggs.

Please feel free, I'm open to better suggestions!

justinkambic · 2019-04-05T16:15:30Z

It's also worth noting that whatever solution we introduce, we haven't addressed the queue type counts so far. If I am correct we are safe in assuming any of these older pipelines are using in-memory queues, so we could safely count +1 mem queue per "missing" pipeline to keep the overview screen consistent with the pipeline count.
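
As a minimal sketch of that adjustment, assuming queue counts are kept in a plain object keyed by queue type (the memory and persisted keys and the helper name addMemoryQueues are hypothetical, not taken from the Kibana code):

```javascript
// Hypothetical helper: credit one in-memory queue per pipeline inferred from
// older nodes. The 'memory' key and the function name are assumptions.
function addMemoryQueues(queueTypeCounts, oldPipelineCount) {
  const counts = Object.assign({}, queueTypeCounts); // leave the input untouched
  if (oldPipelineCount > 0) {
    counts.memory = (counts.memory || 0) + oldPipelineCount;
  }
  return counts;
}

const adjustedQueues = addMemoryQueues({ memory: 1, persisted: 2 }, 1);
console.log(adjustedQueues); // { memory: 2, persisted: 2 }
```

This would keep the queue totals on the overview screen in step with the pipeline count, provided the in-memory assumption holds.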

elasticmachine · 2019-04-05T16:30:47Z

💔 Build Failed

continuous-integration/kibana-ci/pull-request

ycombinator · 2019-04-08T16:54:49Z

@justinkambic @chrisronline As promised, I played with the aggs a bit more and came up with the following solution. It involves extending the existing LogstashPipelineNodeCountMetric metric class with a new top-level agg and corresponding changes in the calculation method, without requiring any other code changes/additions:

diff --git a/x-pack/plugins/monitoring/server/lib/metrics/logstash/classes.js b/x-pack/plugins/monitoring/server/lib/metrics/logstash/classes.js
index dade736cd5..d1a278960e 100644
--- a/x-pack/plugins/monitoring/server/lib/metrics/logstash/classes.js
+++ b/x-pack/plugins/monitoring/server/lib/metrics/logstash/classes.js
@@ -376,6 +376,27 @@ export class LogstashPipelineNodeCountMetric extends LogstashMetric {
                 }
               }
             }
+          },
+          no_pipelines: {
+            filter: {
+              bool: {
+                must_not: {
+                  nested: {
+                    path: 'logstash_stats.pipelines',
+                    query: {
+                      match_all: {}
+                    }
+                  }
+                }
+              }
+            },
+            aggs: {
+              node_count: {
+                cardinality: {
+                  field: this.field
+                }
+              }
+            }
           }
         }
       }
@@ -395,6 +416,15 @@ export class LogstashPipelineNodeCountMetric extends LogstashMetric {
         );
       });
 
+      const DEFAULT_PIPELINE_ID = 'main'; // FIXME: Move to common constants
+      const oldNodesCount = _.get(bucket, 'no_pipelines.node_count.value', 0)
+      if (oldNodesCount > 0) {
+        if (!pipelineNodesCounts.hasOwnProperty(DEFAULT_PIPELINE_ID)) {
+          pipelineNodesCounts[DEFAULT_PIPELINE_ID] = 0
+        }
+        pipelineNodesCounts[DEFAULT_PIPELINE_ID] += oldNodesCount
+      }
+
       return pipelineNodesCounts;
     };
   }

Let me know what you think?
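
For reference, the calculation half of the patch can be sketched on its own without lodash. This is an illustration only: the helper name addOldNodes is hypothetical, and pipelineNodesCounts is assumed to have been computed already by the existing logic. Only the no_pipelines.node_count.value path comes from the diff above.

```javascript
const DEFAULT_PIPELINE_ID = 'main';

// Add nodes that report no pipelines to the default pipeline's node count.
// `pipelineNodesCounts` maps pipeline ID -> node count and is assumed to be
// computed already; only `no_pipelines.node_count.value` is from the patch.
function addOldNodes(pipelineNodesCounts, bucket) {
  const counts = Object.assign({}, pipelineNodesCounts);
  const noPipelines = (bucket && bucket.no_pipelines) || {};
  const oldNodesCount = (noPipelines.node_count && noPipelines.node_count.value) || 0;
  if (oldNodesCount > 0) {
    counts[DEFAULT_PIPELINE_ID] = (counts[DEFAULT_PIPELINE_ID] || 0) + oldNodesCount;
  }
  return counts;
}

// One newer node running pipeline "main2", one older node with no pipelines.
const sampleBucket = { no_pipelines: { node_count: { value: 1 } } };
console.log(addOldNodes({ main2: 1 }, sampleBucket)); // { main2: 1, main: 1 }
```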

chrisronline · 2019-04-08T17:39:32Z

Assuming that works (which I think it should), that looks much better!

justinkambic · 2019-04-08T17:55:55Z

> Assuming that works (which I think it should), that looks much better!

Agreed - testing locally right now.

justinkambic · 2019-04-08T18:07:35Z

@ycombinator I think this looks pretty good - would you want to put up a PR with your patch? You're welcome to push to this one if you prefer.

justinkambic · 2019-04-29T15:55:44Z

          </a>
          <a
-            ng-if="monitoringMain.instance"
+            ng-if="monitoringMain.instance && monitoringMain.pipelineCount > 0 && monitoringMain.shouldDisplayPipelineNav === true"


@ycombinator it's possible that the second condition in this ng-if is redundant now, but I did not test that.
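
One way a version-based check like this might look, as a sketch only. The function name supportsPipelineNav is hypothetical and the 6.4.0 threshold is taken from the version boundary mentioned in the summary; this is not the function added in this PR.

```javascript
// Hypothetical check: treat Logstash 6.4.0 and later as reporting
// per-pipeline data, so only those versions get the Pipelines nav.
// Not the PR's actual function; the threshold is an assumption.
function supportsPipelineNav(version) {
  if (typeof version !== 'string') {
    return false;
  }
  const [major, minor] = version.split('.').map(part => parseInt(part, 10));
  if (!Number.isInteger(major) || !Number.isInteger(minor)) {
    return false;
  }
  return major > 6 || (major === 6 && minor >= 4);
}

console.log(supportsPipelineNav('5.6.0')); // false
console.log(supportsPipelineNav('6.7.1')); // true
```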

elasticmachine · 2019-04-29T16:43:00Z

💔 Build Failed

continuous-integration/kibana-ci/pull-request

elasticmachine · 2019-04-29T19:10:57Z

💚 Build Succeeded

continuous-integration/kibana-ci/pull-request

ycombinator · 2019-04-29T23:57:56Z

I just re-tested this PR and I'm seeing a couple of issues when I run two Logstash nodes, one with v5.6.0 and the other with v6.7.1:

Navigating to the v6.7.1 Logstash node, then clicking the Pipelines tab shows me the node's pipelines but it takes away the Pipelines tab on that page, i.e. it only shows the Overview and Advanced tabs. Note that this is not true if you click on the Overview tab or the Advanced tab - you still see all three tabs in those cases.

I switched my Kibana check out to the 6.7 branch and this bug isn't happening there so it's something specific to this PR.

On my v5.6.0 node, I ran a pipeline like so: ./bin/logstash -e 'input { stdin {} } output { stdout {} }'. On my v6.7.1, I ran a pipeline like so: ./bin/logstash -e 'input { stdin {} } output { stdout {} }' --pipeline.id=main2 (note the --pipeline.id argument specifying a non-default pipeline ID). In this scenario, I see the Pipelines box on the Cluster Overview page. However, clicking on it throws a 500 error. Kibana server logs show this:

server   error  [23:39:39.287] [error][monitoring-ui] TypeError: Cannot read property 'data' of undefined
 at processedResponse.pipelines.forEach.pipeline (/Users/shaunak/development/github/kibana/x-pack/plugins/monitoring/server/lib/logstash/get_pipelines.js:69:66)
 at Array.forEach (<anonymous>)
 at processPipelinesAPIResponse (/Users/shaunak/development/github/kibana/x-pack/plugins/monitoring/server/lib/logstash/get_pipelines.js:63:31)
 at handler (/Users/shaunak/development/github/kibana/x-pack/plugins/monitoring/server/routes/api/v1/logstash/pipelines/cluster_pipelines.js:49:32)

justinkambic · 2019-04-30T14:05:21Z

I'll try to see if I can find some more time to dig into this but it will probably be on the back burner again for a bit. Thanks for testing so thoroughly! 💯

cachedout · 2019-06-18T10:46:51Z

Hi @justinkambic. I'm just doing some team organization today and I came across this PR and I wanted to know if you think you'll have time to get back to it in the coming weeks or if there's anything that @elastic/stack-monitoring can do to help move it along? Thanks!

elasticmachine · 2019-06-18T11:07:53Z

💔 Build Failed

continuous-integration/kibana-ci/pull-request

justinkambic · 2019-06-18T13:21:42Z

@cachedout I have been busy with 7.2 tasks and getting our 7.3 Uptime work rolling. I am planning to have some time to dedicate to it in the next week or two. If it goes beyond that span we should probably talk about a handoff.

cachedout · 2019-06-18T13:34:00Z

@justinkambic Cool. Thanks!

HuangJiaRen · 2019-06-21T02:12:28Z

Error: Index .kibana belongs to a version of Kibana that cannot be automatically migrated. Reset it or use the X-Pack upgrade assistant. (Kibana version 7.1.1)

ycombinator · 2019-06-21T02:25:40Z

Hi @HuangJiaRen, This pull request is not the appropriate place for your question. Please post your question at https://discuss.elastic.co/c/kibana.

cachedout · 2019-07-24T08:44:01Z

Hi again @justinkambic . I'm just checking back in on this. Do you think you'll be able to come back to this in the next few days or would you prefer to discuss a handoff?

justinkambic · 2019-08-12T14:16:15Z

Hey @cachedout - sorry for the delayed reply, was OOO for several weeks. Perhaps we should hand it off, I don't see my workload letting up any time soon and this has been on the back burner for quite some time.

cachedout · 2019-08-13T12:08:02Z

@igoristic Does this look like something you might be able to take on?

justinkambic · 2019-08-13T13:17:10Z

I'm happy to provide explanation of the approach we were using if it's helpful. IIRC this change was pretty close, but @ycombinator found a case where the changes weren't working like they should and I haven't been back to the work since then.

igoristic · 2019-08-19T17:33:11Z

@cachedout Just caught up on this issue now. This looks like something I can investigate. However, it might be irrelevant now since we no longer support 6.7 and earlier. I'll check to see if this is still a problem with 6.8.

elasticmachine · 2019-09-19T23:08:51Z

💔 Build Failed

continuous-integration/kibana-ci/pull-request

chrisronline · 2020-02-27T16:06:18Z

Closing this due to inactivity

Add new metric to identify older pipelines.

43c6be4

justinkambic added Team:Logstash Team:Monitoring Stack Monitoring team v6.7.2 labels Apr 3, 2019

justinkambic self-assigned this Apr 3, 2019

justinkambic mentioned this pull request Apr 3, 2019

[Logstash] [Stack Monitoring] Add new metric to check for pipelines missing a pipeline.id field #34435

Closed

7 tasks

Update test snapshot.

f7cb116

Clean up code, update test snapshot.

d786ee4

justinkambic added the review label Apr 4, 2019

justinkambic requested a review from chrisronline April 4, 2019 19:14

justinkambic marked this pull request as ready for review April 4, 2019 19:14

justinkambic removed the review label Apr 4, 2019

Update pipelines counting logic.

de699c6

Augment LogstashPipelineNodeCountMetric for older LS nodes

ec9fc39

ycombinator mentioned this pull request Apr 8, 2019

Augment LogstashPipelineNodeCountMetric for older LS nodes justinkambic/kibana#4

Merged

justinkambic added 2 commits April 29, 2019 09:43

Merge branch '6.7' into logstash_6.7-monitoring-add-pipeline-without-…

b53d64a

…id-metric

Add function to determine if nav should show based on Logstash versio…

89a1b5b

…n string.

justinkambic commented Apr 29, 2019

justinkambic added the v6.7.3 label Apr 29, 2019

justinkambic added 4 commits April 29, 2019 14:14

Autoformat node advanced fixture to make it easier to diff updated data.

05b4348

Add updated data to fixture to help test pass with API changes.

fd1eb91

Run prettier on existing node detail fixture to make diffing easier.

5cb651b

Add updated data to node detail fixture to help test pass with API ch…

7b01025

…anges.

peterschretlen added v6.8.0 and removed v6.7.3 labels May 1, 2019

justinkambic removed the v6.8.0 label May 2, 2019

spalger changed the base branch from 6.7 to 6.8 September 19, 2019 22:20

chrisronline closed this Feb 27, 2020
