Add the "Examining cluster metrics" section #16752
Changes from all commits
537db2d
e2f55ea
86f61b0
c2b29f9
f4ca1e7
5ffc634
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -13,6 +13,6 @@ Before application developers can monitor their applications, the human operator | |
|
|
||
| .Procedure | ||
|
|
||
| . In the {product-title} web console, navigate to *Operators* -> *OperatorHub* and install the Prometheus Operator in the namespace where your application is. | ||
| . In the {product-title} web console, navigate to the *Operators* -> *OperatorHub* page and install the Prometheus Operator in the namespace where your application is. | ||
|
|
||
| . Navigate to *Catalog* -> *Developer Catalog* and install Prometheus, Alertmanager, Prometheus Rule, and Service Monitor in the same namespace. | ||
| . Navigate to the *Catalog* -> *Developer Catalog* page and install Prometheus, Alertmanager, Prometheus Rule, and Service Monitor in the same namespace. | ||
|
Navigate to the Catalog -> Developer Catalog page. Reason: there is no link in the left navigation area that links to the /catalog page.
Contributor
Author
Implemented in #16978.
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -5,13 +5,13 @@ | |
| [id="contents-of-the-alerting-ui_{context}"] | ||
| = Contents of the Alerting UI | ||
|
|
||
| This section shows and explains the contents of the Alerting UI, a Web interface to the Alertmanager. | ||
| This section shows and explains the contents of the Alerting UI, a web interface to the Alertmanager. | ||
|
|
||
| The main three pages of the Alerting UI are the *Alerts*, the *Silences*, and the *YAML* pages. | ||
|
|
||
| The *Alerts* page is located in *Monitoring* -> *Alerts* of the {product-title} web console. | ||
| The *Alerts* page is accessible by clicking *Monitoring* -> *Alerts* in the {product-title} web console. | ||
|
The Alerts page is accessible by clicking Monitoring -> Alerting -> Alerts
Contributor
Author
Implemented in #16978.
||
|
|
||
| image::alerts-screen.png[] | ||
| image::monitoring-alerts-screen.png[] | ||
|
|
||
| . Filtering alerts by their names. | ||
| . Filtering the alerts by their states. To fire, some alerts need a certain condition to be true for the duration of a timeout. If a condition of an alert is currently true, but the timeout has not been reached, such an alert is in the *Pending* state. | ||
|
|
@@ -21,9 +21,9 @@ image::alerts-screen.png[] | |
| . Value of the Severity label of the alert. | ||
| . Actions you can do with the alert. | ||
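As a rough illustration of the *Pending* state described in the state filter item above, the following Python sketch classifies an alert from how long its condition has held. The function name `alert_state`, its parameters, and the `inactive` and `firing` labels are assumptions made for this example; this is not code from the console or from Alertmanager.

```python
from datetime import datetime, timedelta
from typing import Optional


def alert_state(condition_true_since: Optional[datetime],
                hold_for: timedelta,
                now: datetime) -> str:
    """Classify an alert from how long its condition has been true.

    condition_true_since: when the condition last became true, or None
    if the condition is currently false (hypothetical input for this sketch).
    hold_for: how long the condition must stay true before the alert fires.
    """
    if condition_true_since is None:
        return "inactive"
    if now - condition_true_since < hold_for:
        # Condition is true, but the timeout has not been reached yet.
        return "pending"
    return "firing"


if __name__ == "__main__":
    start = datetime(2019, 1, 1, 12, 0, 0)
    print(alert_state(start, timedelta(minutes=10), start + timedelta(minutes=3)))
```

The sketch treats an alert with a zero timeout as firing as soon as its condition is true, which matches the text's statement that only some alerts need the condition to hold for a duration.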
|
|
||
| The *Silences* page is located in *Monitoring* -> *Silences* of the {product-title} web console. | ||
| The *Silences* page is accessible by clicking *Monitoring* -> *Silences* in the {product-title} web console. | ||
|
The Silences page is accessible by clicking Monitoring -> Alerting -> Silences
Contributor
Author
Implemented in #16978.
||
|
|
||
| image::silences-screen.png[] | ||
| image::monitoring-silences-screen.png[] | ||
|
|
||
| . Creating a silence for an alert. | ||
| . Filtering silences by their name. | ||
|
|
@@ -33,9 +33,9 @@ image::silences-screen.png[] | |
| . Number of alerts that are being silenced by the silence. | ||
| . Actions you can do with a silence. | ||
|
|
||
| The *YAML* page is located in *Monitoring* -> *Alerting* -> *YAML* of the OpenShift Container Platform web console. | ||
| The *YAML* page is accessible by clicking *Monitoring* -> *Alerting* -> *YAML* in the {product-title} web console. | ||
|
|
||
| image::yaml-screen.png[] | ||
| image::monitoring-yaml-screen.png[] | ||
|
|
||
| . Upload a file with Alertmanager configuration. | ||
| . Examine and edit the current Alertmanager configuration. | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,35 @@ | ||
| // Module included in the following assemblies: | ||
| // | ||
| // * monitoring/cluster-monitoring/examining-cluster-metrics.adoc | ||
|
|
||
| [id="contents-of-the-metrics-ui_{context}"] | ||
| = Contents of the Metrics UI | ||
|
|
||
| This section shows and explains the contents of the Metrics UI, a web interface to Prometheus. | ||
|
|
||
| The *Metrics* page is accessible by clicking *Monitoring* -> *Metrics* in the {product-title} web console. | ||
|
|
||
| image::monitoring-metrics-screen.png[] | ||
|
|
||
| . Actions. | ||
| * Add query. | ||
| * Expand all query tables. | ||
|
If one result table is collapse, the menu is "Collapse all query tables", if all tables are collapsed, the menu is "Expand all query tables"
Contributor
Author
I changed it to "Expand or collapse", but didn't use "(Based on the result table is collapsed or expanded)"; it is too long and unnecessary.
||
| * Delete all queries. | ||
| . Hide the plot. | ||
| . The interactive plot. | ||
| . The catalog of available metrics. | ||
| . Add query. | ||
| . Run queries. | ||
| . Query forms. | ||
| . Expand or collapse the form. | ||
| . The query. | ||
| . Clear query. | ||
| . Disable query. | ||
| . Actions for a specific query. | ||
| * Disable query. | ||
|
If the query is disabled, the menu is "Enable query". It is the same for "Hide all series".
Contributor
Author
I changed it to "Enable or disable", but didn't use "(Based on the query is disabled or enabled)"; it is too long and unnecessary.
||
| * Hide all series of the query from the plot. | ||
| * Delete query. | ||
| . The metrics table for a query. | ||
| . Color assigned to the graph of the metric. Clicking the square shows or hides the metric's graph. | ||
|
|
||
| Additionally, there is a link to the old Prometheus interface next to the title of the page. | ||
|
Contributor
I'd say: You can access the old Prometheus interface by clicking the Prometheus UI link at the top of the page. Would someone know it's the old interface? Or is it just an alternate interface?
Contributor
Author
The perspective of this whole section is "what do we have in this interface"; that's why for all items I just call them out and say what they are for. Your suggestion is valid for a procedure module.
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,26 @@ | ||
| // Module included in the following assemblies: | ||
| // | ||
| // * monitoring/cluster-monitoring/examining-cluster-metrics.adoc | ||
|
|
||
| [id="exploring-the-visualized-metrics_{context}"] | ||
| = Exploring the visualized metrics | ||
|
|
||
| After running the queries, the metrics are displayed on the interactive plot. The X axis of the plot represents time. The Y axis represents the metric values. Each metric is shown as a colored graph. You can manipulate the plot and explore the metrics. | ||
|
Contributor
I'd probably italicize X and Y, as first usage. From contributing guidelines: https://github.com/openshift/openshift-docs/blob/enterprise-4.1/contributing_to_docs/doc_guidelines.adoc#quick-markup-reference
Contributor
Author
It's not a term though. It just refers to something on the plot and says what it represents.
||
|
|
||
| .Procedure | ||
|
|
||
| . Initially, all metrics from all enabled queries are shown on the plot. You can select which metrics are shown. | ||
| * To hide all metrics from a query, click {kebab} for the query and click *Hide all series*. | ||
| * To hide a specific metric, go to the query table and click the colored square near the metric name. | ||
| . To zoom into the plot and change the shown time range, do one of the following: | ||
| + | ||
| -- | ||
| * Visually select the time range by clicking and dragging on the plot horizontally. | ||
| * Use the menu in the upper left corner to select the time range. | ||
| -- | ||
| + | ||
| To reset the time range, click *Reset Zoom*. | ||
| . To display outputs of all queries at a specific point in time, hold the mouse cursor on the plot at that point. The query outputs will appear in a pop-up box. | ||
|
hold the mouse cursor on the plot at that point
Contributor
Author
It's enough to just be at the right point on the X axis (time); you can be anywhere on the Y axis (metric value). It's good if the users know that, which is why I used the wording "on the plot at that point [in time]". You don't need to put the cursor directly over the graph ("the line").
||
| . For more detailed information about metrics of a specific query, expand the table of that query using the drop-down button. Every metric is shown with its current value. | ||
| . To hide the plot, click *Hide Graph*. | ||
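The hover step above, together with the author's note that only the cursor's horizontal (time) position matters, can be sketched in Python as a nearest-sample lookup per series. The `Series` type and the `nearest_sample` and `values_at` helpers are hypothetical names for this illustration; the console's actual implementation is not shown in this pull request and may differ.

```python
from bisect import bisect_left
from typing import Dict, List, Tuple

# One series is a list of (timestamp, value) samples sorted by timestamp.
Series = List[Tuple[float, float]]


def nearest_sample(series: Series, t: float) -> Tuple[float, float]:
    """Return the sample whose timestamp is closest to t."""
    timestamps = [ts for ts, _ in series]
    i = bisect_left(timestamps, t)
    # Only the neighbours on either side of the insertion point can be nearest.
    candidates = series[max(i - 1, 0):i + 1]
    return min(candidates, key=lambda sample: abs(sample[0] - t))


def values_at(all_series: Dict[str, Series], t: float) -> Dict[str, float]:
    """Collect one value per non-empty series for the hovered time t."""
    return {name: nearest_sample(s, t)[1] for name, s in all_series.items() if s}


if __name__ == "__main__":
    plot = {
        "query_a": [(0.0, 1.0), (10.0, 2.0), (20.0, 3.0)],
        "query_b": [(0.0, 5.0), (20.0, 7.0)],
    }
    # The cursor's vertical position is never consulted, only the time.
    print(values_at(plot, 12.0))
```

Because the lookup takes only a time value, it reflects the point made in the review thread: the cursor does not need to sit on any particular line of the plot.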
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -9,7 +9,7 @@ You can find an alert and see information about it or its governing alerting rul | |
|
|
||
| .Procedure | ||
|
|
||
| . Open the {product-title} web console and navigate to *Monitoring* -> *Alerts*. | ||
| . Open the {product-title} web console and navigate to the *Monitoring* -> *Alerting* -> *Alerts* page. | ||
|
|
||
| . Optional: Filter the alerts by name using the *Filter alerts by name* field. | ||
|
Filter alerts by name
Contributor
Author
Implemented in #16978.
||
|
|
||
|
|
@@ -21,7 +21,7 @@ You can find an alert and see information about it or its governing alerting rul | |
| + | ||
| To see alert details, click on the name of the alert. This is the page with alert details: | ||
| + | ||
| image::alert-overview.png[] | ||
| image::monitoring-alert-overview.png[] | ||
| + | ||
| The page has the graph with timeseries of the alert. It also has information about the alert, including: | ||
| + | ||
|
|
@@ -32,7 +32,7 @@ The page has the graph with timeseries of the alert. It also has information abo | |
| + | ||
| To see alerting rule details, click the button in the last column and select *View Alerting Rule*. This is the page with alerting rule details: | ||
| + | ||
| image::alerting-rule-overview.png[] | ||
| image::monitoring-alerting-rule-overview.png[] | ||
| + | ||
| The page has information about the alerting rule, including: | ||
| + | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,30 @@ | ||
| // Module included in the following assemblies: | ||
| // | ||
| // * monitoring/cluster-monitoring/examining-cluster-metrics.adoc | ||
|
|
||
| [id="running-metrics-queries_{context}"] | ||
| = Running metrics queries | ||
|
|
||
| You begin working with metrics by entering one or several Prometheus Query Language (PromQL) queries. | ||
jboxman marked this conversation as resolved.
|
||
|
|
||
| .Procedure | ||
|
|
||
| . Open the {product-title} web console and navigate to the *Monitoring* -> *Metrics* page. | ||
|
|
||
| . In the query field, enter your PromQL query. | ||
| * To show all available metrics and PromQL functions, click *Insert Metric at Cursor*. | ||
| . To add more queries, click *Add Query*. | ||
| . To delete a query, click {kebab} for the query, then select *Delete query*. | ||
| . To keep a query without running it, click the *Disable query* button. | ||
| . Once you finish creating queries, click the *Run Queries* button. The metrics from the queries are visualized on the plot. If a query is invalid, the UI shows an error message. | ||
| + | ||
| [NOTE] | ||
| ==== | ||
| Queries that operate on large amounts of data might time out or overload the browser when drawing time series graphs. To avoid this, hide the graph and calibrate your query using only the metrics table. Then, after finding a feasible query, enable the plot to draw the graphs. | ||
| ==== | ||
| + | ||
| . Optional: The page URL now contains the queries you ran. To use this set of queries again in the future, save this URL. | ||
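The last step says that the page URL contains the queries you ran. A minimal Python sketch of how a set of PromQL queries could be stored in, and recovered from, a URL query string follows. The parameter names `query0`, `query1`, and so on, the base URL, and both helper functions are assumptions made for illustration; the module does not document the console's actual URL format.

```python
from typing import List
from urllib.parse import parse_qsl, urlencode, urlsplit


def build_metrics_url(base_url: str, queries: List[str]) -> str:
    """Append each query as a numbered, percent-encoded URL parameter.

    The query0, query1, ... naming is hypothetical.
    """
    params = [(f"query{i}", q) for i, q in enumerate(queries)]
    return f"{base_url}?{urlencode(params)}"


def queries_from_url(url: str) -> List[str]:
    """Recover the queries, in order, from a URL built by build_metrics_url."""
    pairs = parse_qsl(urlsplit(url).query)
    return [value for key, value in pairs if key.startswith("query")]


if __name__ == "__main__":
    saved = build_metrics_url(
        "https://console.example.com/metrics",  # placeholder address
        ["up", "sum(rate(http_requests_total[5m])) by (job)"],
    )
    print(saved)
    print(queries_from_url(saved))
```

Percent-encoding matters here because PromQL uses characters such as brackets, parentheses, and spaces that are not safe in a raw URL.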
|
|
||
| .Additional resources | ||
|
|
||
| See the link:https://prometheus.io/docs/prometheus/latest/querying/basics/[Prometheus Query Language documentation]. | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,17 @@ | ||
| [id="examining-cluster-metrics"] | ||
| = Examining cluster metrics | ||
| include::modules/common-attributes.adoc[] | ||
| :context: querying-metrics | ||
|
|
||
| toc::[] | ||
|
|
||
| {product-title} {product-version} provides a web interface to Prometheus, which enables you to run Prometheus Query Language (PromQL) queries and examine the metrics visualized on a plot. This functionality provides an extensive overview of the cluster state and enables you to troubleshoot problems. | ||
|
|
||
| include::modules/monitoring-contents-of-the-metrics-ui.adoc[leveloffset=+1] | ||
| include::modules/monitoring-running-metrics-queries.adoc[leveloffset=+1] | ||
| include::modules/monitoring-exploring-the-visualized-metrics.adoc[leveloffset=+1] | ||
|
|
||
| .Next steps | ||
|
|
||
| xref:../../monitoring/cluster-monitoring/prometheus-alertmanager-and-grafana.adoc#prometheus-alertmanager-and-grafana[Access the Prometheus, Alertmanager, and Grafana.] | ||
|
Contributor
Access the Prometheus, Alertmanager, and Grafana. interfaces? pages? Besides being grouped together, I'm not sure what the relationship is between these?
Contributor
Author
I was asked (twice) to leave "interfaces" out of this wording due to some technicality (frankly I don't remember what technicality).
||
|
|
||
Prometheus, Alerting UI, and Grafana web UIs
=>
Prometheus, Alerting, and Grafana web UIs
Implemented in #16978.