Define what constitutes breaking changes for Metrics #2864

Closed
jsuereth opened this issue Oct 11, 2022 · 6 comments · Fixed by #3225
Labels: area:semantic-conventions, [label deprecated] triaged-accepted, spec:metrics

Comments

@jsuereth
Contributor

We should outline which changes to metric streams (name, unit, type, and attributes) constitute a breaking change.

More details to follow

jsuereth added the area:semantic-conventions and spec:metrics labels on Oct 11, 2022
jsuereth self-assigned this on Oct 11, 2022
@jsuereth
Contributor Author

We'd like to tackle this question by focusing on what use cases for metrics should be preserved between OTEL instrumentation releases and expected interactions with schema_url / Telemetry Schemas.

For the first cut, we should investigate breaking behavior of:

  • Alerts (and alert thresholds)
  • Dashboards (common usage, simple queries)
  • Analytics / "data lake"

@MrAlias
Contributor

MrAlias commented Oct 11, 2022

Does this plan to change the semantic convention stability guidelines?

@jsuereth
Contributor Author

Yes. Specifically, I want us to answer:

  • Whether we can/should rely on schema_url migrations for stability, or ask for a major version bump whenever a change would cause breakage for anyone not using schema_url (i.e., do we consider schema_url an "aid" or a requirement for backends?).
  • Whether we loosen the definition of adding an attribute for metrics. Specifically, adding an attribute that has the same value for all existing timeseries is actually a non-breaking change.
  • What discussion/guidance we need around things like "split", changing "unit" type, etc.
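
To make the second bullet concrete, here is a minimal sketch (not part of any OTel SDK) that counts timeseries by their distinct attribute sets. The attribute names `method`, `status`, `protocol`, and `host` are made up for illustration. Under that assumption, an attribute with one constant value leaves the series count unchanged, while an attribute that varies per series multiplies it.

```python
from typing import Dict, List, Set, Tuple

Attributes = Dict[str, str]
SeriesKey = Tuple[Tuple[str, str], ...]


def series_identities(points: List[Attributes]) -> Set[SeriesKey]:
    """Each distinct attribute combination identifies one timeseries."""
    return {tuple(sorted(attrs.items())) for attrs in points}


def increases_series_count(old: List[Attributes], new: List[Attributes]) -> bool:
    """True if the new attribute sets produce more timeseries than the old ones."""
    return len(series_identities(new)) > len(series_identities(old))


# Hypothetical attribute sets for one metric before the change.
before = [
    {"method": "GET", "status": "200"},
    {"method": "GET", "status": "500"},
    {"method": "POST", "status": "200"},
]

# Case 1: a new attribute with the same value on every existing series.
constant = [{**attrs, "protocol": "http"} for attrs in before]

# Case 2: a new attribute that takes two values per existing series.
split = [
    {**attrs, "host": host}
    for attrs in before
    for host in ("a.example", "b.example")
]

print(increases_series_count(before, constant))  # False
print(increases_series_count(before, split))     # True
```

In the second case, a query or alert written against the original three series would now see six, which is the kind of change the bullet is trying to separate from the harmless first case.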

rbailey7210 added the [label deprecated] triaged-accepted label on Oct 14, 2022
@jsuereth
Contributor Author

Here's an unfinished document that walks through the problem of stability and how I'd like to think about it. So far I have only had time to cover alerting instability for metrics, specifically when it is OK to add new attributes to a metric timeseries.

I welcome people's thoughts and opinions, and more scenarios to tackle in that document as we go forward.

@jsuereth
Contributor Author

Updated a section for Metrics for our next meeting, including three topic points:

  • Should we consider explicit bucket boundary changes on histograms breaking?
  • Should we consider moving between integer / floating point values breaking?
  • Should we consider moving between Attribute types where stringified values remain the same breaking?

I have my thoughts on each, but want to discuss in the WG.
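
As a rough illustration of the third question only: the sketch below compares attribute values by their stringified form. The stringification rule (lowercase `true`/`false` for booleans, `str()` for everything else) is an assumption made for this example, not something the specification defines here, and the thread does not settle whether such a change is breaking.

```python
from typing import Union

AttributeValue = Union[str, bool, int, float]


def stringified(value: AttributeValue) -> str:
    """Assumed stringification: booleans as 'true'/'false', otherwise str()."""
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)


def same_when_stringified(old: AttributeValue, new: AttributeValue) -> bool:
    """True if a type change would leave the stringified value untouched."""
    return stringified(old) == stringified(new)


print(same_when_stringified(200, "200"))    # True: int -> string keeps "200"
print(same_when_stringified(True, "true"))  # True under the assumed rule
print(same_when_stringified(200, 200.0))    # False: "200" vs "200.0"
```

A backend that stores attribute values as strings would not notice the first two changes, while one that keeps typed values would, which is why the question needs a WG decision.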

@jsuereth
Contributor Author

jsuereth commented Nov 1, 2022

From the WG + Spec SIG:

Should we consider moving between integer / floating point values breaking?

We don't enforce this in the specification today, and doing so would likely be breaking or cause a lot of churn/work. We'd like to refine what is considered breaking to something akin to:

  • For a given timeseries, the type of points (integer / floating point) should remain the same for its lifetime.
  • When merging timeseries, if different types of points are encountered (both integer and floating point) the resulting time series should be floating point.
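
The merge rule in the second bullet could look something like the sketch below. The `(timestamp, value)` point shape and the `merge_series` helper are hypothetical and exist only for this example; they are not an SDK or collector API.

```python
from typing import List, Tuple, Union

Number = Union[int, float]
Point = Tuple[int, Number]  # (timestamp, value); an assumed shape for this sketch


def merge_series(*series: List[Point]) -> List[Point]:
    """Merge timeseries by timestamp, promoting to float if point types are mixed."""
    points = sorted((p for s in series for p in s), key=lambda p: p[0])
    if any(isinstance(value, float) for _, value in points):
        # At least one floating point value was seen: the result is floating point.
        return [(ts, float(value)) for ts, value in points]
    # All points are integers: keep the integer type.
    return points


ints = [(1, 2), (3, 4)]
floats = [(2, 3.5)]

mixed = merge_series(ints, floats)
print([type(value).__name__ for _, value in mixed])  # ['float', 'float', 'float']

only_ints = merge_series(ints, [(5, 6)])
print([type(value).__name__ for _, value in only_ints])  # ['int', 'int', 'int']
```

Each input series keeps a single point type for its lifetime, matching the first bullet; only the merged output is promoted.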

trask moved this to Blockers for HTTP semconv stability in Semantic Conventions + Instrumentation Stability WG on Jan 31, 2023
jsuereth added a commit to jsuereth/opentelemetry-specification that referenced this issue Feb 17, 2023
…by semconv that can be included in stability.
jsuereth moved this from Blocker for HTTP semconv stability to In Progress in Semantic Conventions + Instrumentation Stability WG on Mar 20, 2023
jmacd added a commit that referenced this issue Apr 3, 2023
Fixes #2864
Fixes #2883

## Changes

- Explicitly define what is "enforced" by stability guarantees from
  Semantic conventions.
  - We enforce attribute key names + types, across resource, span, metric
    and log
  - We enforce span names
  - We enforce metric names, units
- Expand allowed changes to semconv to include metric attributes that do
  not increase timeseries count for a given metric.
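
For illustration, the enforced items for metrics could be checked with something like the sketch below. `MetricDefinition`, `breaking_changes`, and the metric and attribute names are hypothetical; this is one possible reading of the list above, not tooling from the specification repository.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class MetricDefinition:
    """Hypothetical description of one metric in a semantic convention."""
    name: str
    unit: str
    attribute_types: Dict[str, str] = field(default_factory=dict)  # key -> type name


def breaking_changes(old: MetricDefinition, new: MetricDefinition) -> List[str]:
    """List changes to the enforced items: metric name, unit, attribute keys and types."""
    problems: List[str] = []
    if old.name != new.name:
        problems.append(f"metric renamed: {old.name!r} -> {new.name!r}")
    if old.unit != new.unit:
        problems.append(f"unit changed: {old.unit!r} -> {new.unit!r}")
    for key, old_type in old.attribute_types.items():
        if key not in new.attribute_types:
            problems.append(f"attribute {key!r} removed or renamed")
        elif new.attribute_types[key] != old_type:
            problems.append(
                f"attribute {key!r} changed type: {old_type} -> {new.attribute_types[key]}"
            )
    # Newly added attributes are not flagged here: per the list above they are
    # allowed only when they do not increase the timeseries count, which
    # cannot be decided from the definition alone.
    return problems


old = MetricDefinition("example.request.duration", "s", {"method": "string"})
new = MetricDefinition(
    "example.request.duration", "ms", {"method": "string", "protocol": "string"}
)
print(breaking_changes(old, new))  # ["unit changed: 's' -> 'ms'"]
```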

## Context

Sig discussion thread/doc
[here](https://docs.google.com/document/d/1Nvcf1wio7nDUVcrXxVUN_f8MNmcs0OzVAZLvlth1lYY/edit?usp=sharing).

---------

Co-authored-by: Trask Stalnaker <[email protected]>
Co-authored-by: Johannes Tax <[email protected]>
Co-authored-by: Tigran Najaryan <[email protected]>
Co-authored-by: Patrice Chalin <[email protected]>
Co-authored-by: Sergey Kanzhelev <[email protected]>
Co-authored-by: Carlos Alberto Cortez <[email protected]>
Co-authored-by: Tyler Benson <[email protected]>
Co-authored-by: Joshua Carpeggiani <[email protected]>
Co-authored-by: Armin Ruech <[email protected]>
Co-authored-by: Yuri Shkuro <[email protected]>
Co-authored-by: Asaf Mesika <[email protected]>
Co-authored-by: Evan Mattson <[email protected]>
Co-authored-by: jack-berg <[email protected]>
Co-authored-by: Antoine Toulme <[email protected]>
Co-authored-by: Christian Neumüller <[email protected]>
Co-authored-by: Liudmila Molkova <[email protected]>
Co-authored-by: Reiley Yang <[email protected]>
Co-authored-by: Joshua MacDonald <[email protected]>
carlosalberto added a commit to carlosalberto/opentelemetry-specification that referenced this issue Oct 31, 2024
…try#3225)
