Changes interval on grafana dashboards to match scrape interval #1669

joshleecreates · 2024-07-14T20:19:34Z

Changes

Changes interval on the Grafana dashboard to match scrape interval, which fixes the broken visualizations for RED metrics.

(The changes to the axisBorderShow property appear to be from the updated version of Grafana)

julianocosta89 · 2024-07-15T09:16:19Z

hey @joshleecreates, I was still not able to see the RED metrics in Grafana.
All charts show No Data.

joshleecreates · 2024-07-15T14:28:53Z

> hey @joshleecreates, I was still not able to see the RED metrics in Grafana. All charts show No Data.

It worked once for me, but I am now seeing the same thing. I suspect that when it worked I was on the branch with the span filtering rules that reduce cardinality. I'll test again after that is merged.

joshleecreates · 2024-07-16T18:07:32Z

I'm still seeing issues with Prometheus with the cardinality fix merged:

prometheus  | ts=2024-07-16T15:58:30.576Z caller=manager.go:163 level=info component="rule manager" msg="Starting rule manager..."

prometheus  | ts=2024-07-16T16:21:51.834Z caller=write_handler.go:134 level=error component=web msg="Out of order sample from remote write" err="out of order sample" series="{__name__=\"target_info\", container_id=\"3e0bddb9d06cf94fb84a2d772930cc64db46193213180c29951a1ee887391103\", docker_cli_cobra_command_path=\"docker compose\", host_arch=\"aarch64\", host_name=\"3e0bddb9d06c\", job=\"quoteservice\", os_description=\"6.6.32-linuxkit\", os_name=\"Linux\", os_type=\"linux\", os_version=\"#1 SMP Thu Jun 13 14:13:01 UTC 2024\", process_command=\"public/index.php\", process_command_args=\"[\\\"public/index.php\\\"]\", process_executable_path=\"/usr/local/bin/php\", process_owner=\"www-data\", process_pid=\"7\", process_runtime_name=\"cli\", process_runtime_version=\"8.3.9\", service_version=\"1.0.0+no-version-set\", telemetry_distro_name=\"opentelemetry-php-instrumentation\", telemetry_distro_version=\"1.0.3\", telemetry_sdk_language=\"php\", telemetry_sdk_name=\"opentelemetry\", telemetry_sdk_version=\"1.0.8\"}" timestamp=1721146894925

There's a 23-minute gap between Prometheus starting and the error appearing. I checked at the beginning of that time frame and didn't see any span metrics (or Grafana errors).

I think the changes in this PR are necessary for Grafana but there is still something else going on with Prometheus.

puckpuck · 2024-07-19T04:31:22Z

That second issue is a separate one that comes from Prometheus, and it existed before this PR. #1622 is the related issue for it.

I pushed a couple of fixes to your branch: one for that Prometheus issue, and one to use a 2m interval instead of 1m. I guess that with a 1m metric, a 1m window is too short, since rate needs at least 2 samples to work.
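To illustrate the reasoning about needing two samples, here is a minimal Python sketch of a rate-style calculation over a lookback window. It is a simplified model, not Prometheus's actual `rate()` implementation (it ignores extrapolation and counter resets), and the function name `simple_rate` and the sample values are hypothetical.

```python
from typing import List, Optional, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, counter value)


def simple_rate(samples: List[Sample], now: float, window: float) -> Optional[float]:
    """Per-second increase across the samples inside (now - window, now].

    Returns None when fewer than two samples fall inside the window,
    which is the situation a dashboard panel would show as "No Data".
    """
    in_window = [(t, v) for t, v in samples if now - window < t <= now]
    if len(in_window) < 2:
        return None
    (t0, v0), (t1, v1) = in_window[0], in_window[-1]
    return (v1 - v0) / (t1 - t0)


# Hypothetical counter that receives one sample every 60 seconds.
samples = [(0.0, 10.0), (60.0, 16.0), (120.0, 25.0), (180.0, 31.0)]

# A 1m window evaluated at t=190s only contains the sample at t=180s.
print(simple_rate(samples, now=190.0, window=60.0))

# A 2m window contains the samples at t=120s and t=180s.
print(simple_rate(samples, now=190.0, window=120.0))
```

With one sample per minute, the 1m window usually catches a single sample and yields nothing, while the 2m window always holds two, which matches the move from 1m to 2m described above.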

puckpuck · 2024-07-19T13:18:04Z

I think that Prometheus change fixed out of order samples, but now we are getting out of order exemplars.

prometheus  | ts=2024-07-19T13:05:38.103Z caller=write_handler.go:175 level=warn component=web msg="Error on ingesting out-of-order exemplars" num_dropped=1

Some searching tells me this particular thing is not yet solved in Prometheus, and there is an open issue for it.
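For readers unfamiliar with the out-of-order sample error, the sketch below is a toy model of the idea behind tolerating late samples within a time window, as the commit title "use 30m for out of order samples" suggests. It is an assumption-laden illustration, not Prometheus code: `ToyStore` and `ingest` are hypothetical names, the model tracks a single newest timestamp, and it does not cover exemplars at all.

```python
from dataclasses import dataclass

MINUTE_MS = 60_000


@dataclass
class ToyStore:
    """Toy ingestion model: accepts late samples only inside a window."""

    window_ms: int       # how far behind the newest timestamp a sample may be
    newest_ms: int = -1  # newest timestamp accepted so far

    def ingest(self, ts_ms: int) -> bool:
        # In-order sample: always accepted, and it advances the newest time.
        if ts_ms > self.newest_ms:
            self.newest_ms = ts_ms
            return True
        # Late sample: accepted only if a window is set and it falls inside it.
        return self.window_ms > 0 and ts_ms >= self.newest_ms - self.window_ms


base = 1721146894925  # timestamp taken from the error log earlier in this thread

strict = ToyStore(window_ms=0)
print(strict.ingest(base))              # in order
print(strict.ingest(base - MINUTE_MS))  # one minute late, no window

tolerant = ToyStore(window_ms=30 * MINUTE_MS)
print(tolerant.ingest(base))                   # in order
print(tolerant.ingest(base - MINUTE_MS))       # one minute late, inside 30m
print(tolerant.ingest(base - 31 * MINUTE_MS))  # older than the 30m window
```

In this model the store without a window rejects the late sample, which corresponds to the "out of order sample" error, while the store with a 30-minute window accepts it.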

puckpuck · 2024-07-23T03:44:26Z

I've had this branch running for 4+ days, and Prometheus is stable, with the dashboards working as intended.

We still have an error in Prometheus for out of order exemplars, but that is a known Prometheus issue that we should track separately.

@julianocosta89 can you take another look to see if this works for you?

julianocosta89 · 2024-07-23T09:25:31Z

🥳 thanks @joshleecreates and @puckpuck!
It seems we have charts working again

joshleecreates · 2024-07-23T13:03:22Z

Thanks @puckpuck for wrapping this up!

…-telemetry#1669)

* Changes interval on grafana dashboards to match scrape interval
* fix out of order sample
* use 2m interval for spanmetrics
* use 30m for out of order samples

Co-authored-by: Juliano Costa
Co-authored-by: Pierre Tessier

Changes interval on grafana dashboards to match scrape interval

eea719a

joshleecreates requested a review from a team as a code owner July 14, 2024 20:19

github-actions bot added the helm-update-required Requires an update to the Helm chart when released label Jul 14, 2024

puckpuck linked an issue Jul 16, 2024 that may be closed by this pull request

Spanmetrics panel always returns no results #1623

Closed

Merge branch 'main' into fix-grafana-visualizations

ce23391

puckpuck added 2 commits July 19, 2024 00:29

fix out of order sample

4a82642

use 2m interval for spanmetrics

76f57f6

puckpuck linked an issue Jul 19, 2024 that may be closed by this pull request

Prometheus out of order sample from remote write #1622

Open

puckpuck added 3 commits July 19, 2024 00:34

Merge branch 'main' into fix-grafana-visualizations

88e08a5

use 30m for out of order samples

cd43731

Merge remote-tracking branch 'joshleecreates/fix-grafana-visualizations' into fix-grafana-visualizations

132afa8

puckpuck approved these changes Jul 23, 2024

julianocosta89 approved these changes Jul 23, 2024

Merge branch 'main' into fix-grafana-visualizations

4081f58

Merge branch 'main' into fix-grafana-visualizations

3ae5b4a

julianocosta89 merged commit d7a21a7 into open-telemetry:main Jul 23, 2024
28 checks passed

puckpuck mentioned this pull request Jul 25, 2024

[demo] - fix grafana demo dashboard open-telemetry/opentelemetry-helm-charts#1277

Merged
