Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add transition plan for upcoming breaking changes to the unstable HTTP semantic conventions #3404

Closed
wants to merge 5 commits into from

Conversation

trask
Copy link
Member

@trask trask commented Apr 17, 2023

Fixes #3362

Discussed in HTTP semantic convention WG and this is preferred over #3381 because

  • major version bump provides a clearer signal to users
  • major version bump makes it clearer that users can continue to use the previous major version until their backend supports the new major version
  • new major versions tend to be much more slowly adopted in production than new minor versions, also giving more time to backends

Changes

Adds transition plan for upcoming breaking changes to the unstable HTTP semantic conventions.

@trask trask force-pushed the transition-plan-2 branch 2 times, most recently from 34d23d5 to 507d010 Compare April 17, 2023 16:46
@trask trask requested a review from a team April 18, 2023 20:40
> (or prior) to this new version SHOULD bump their major version
> even though the instrumentation has not been declared stable.
> They SHOULD NOT release a stable version of that new major version until
> at least August 1, 2023 (this is to give backends some time to support the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe we need to push this to Nov 1. Otherwise looks good.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have a reason for this? One of the things we discussed was adoption cycles for client-side-software of major version bumps. My expectation is that the major-version bump will cause an inherent delay in the "core" of otel usage moving to this new version. As such, we're actually giving backends a decent chunk of time here. Yes some early adopters may start using the latest version, but the bulk of instrumentation won't be moving for some time after this date.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you asking for a reason for having a specific date at all or are you questioning the need for 6 months? Let me try to address both.

Firstly, why we need a date at all.

I don't know how quickly some instrumentations will react. If they are agile enough they may cause problems in mere weeks after we merge the new conventions. IMO, it is imperative that we give well-defined time period for vendors to prepare. We can't just say it is a major version increase so it will take (an unspecified) time before users adopt it.

Secondly, why I think 3 months is not enough.

Quarterly planning/execution periods are quite typical. 3 months is just not enough notice for anything that happens on a quarterly cycle.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Secondly, why I think 3 months is not enough.

Quarterly planning/execution periods are quite typical. 3 months is just not enough notice for anything that happens on a quarterly cycle.

I think this is a valid point. Personally, I feel a combination of explicit grace period + major version bump + advance notice is already very generous. While I do have personal opinions, I suggest that we take a step back and have some framework for such kind of discussion.

For example, in Microsoft Azure folks follow 6 months semester planning cycle, @tigrannajaryan if we follow the thinking here it could leave us to "let's give 12 months". And if NASA is using 5 years cycle, would we say "let's give 10 years"? (https://www.zdnet.com/article/how-microsofts-azure-organization-makes-the-windows-sausage/ Feature priorities are decided sometimes in a six-month or a year-long boundary, he said. The biggest take-away: "It's not a tyranny of organizations anymore" when it comes to deciding on timing and feature sets.)

I would suggest the following framework to facilitate the discussion/decision here:

  • The community is not trying to optimize or wait for a single company/organization.
  • The goal is not to have a bar that is lower than every company's own bar, instead, there is a balance.
  • If certain vendor/community has difficulties, we can look at the individual case and make conscious adjustments/decision.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@reyang I explained where I get 6 months from. So far 3 months is a number that looks arbitrarily chosen to me. Show me the calculation :-)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's all arbitrary to the extent that some people would prefer a shorter time and some people would prefer a longer time 😅

@tigrannajaryan I'm going to bump it to Oct 1, which is 6 months after the underlying issue (#3362) for this PR was created. This will still give (a little) time for stable releases and adoption before year end.

>
> HTTP instrumentations updating from
> [v1.20.0 of this document](https://github.com/open-telemetry/opentelemetry-specification/blob/v1.20.0/specification/trace/semantic_conventions/http.md)
> (or prior) to this new version SHOULD bump their major version
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As brought up by @Aneurysm9, @dyladan and others:
This will not work for instrumentation libraries that are at 0.x today since bumping them to 1.x would also indicate that the library itself would be considered stable now as per semver. This might not be accurate if the only change applied to the library is switching it over to a new semantic convention version. The instrumentation and its configuration might still be unstable at this point.
For 0.x libraries I think it should be expected by users that breaking changes are happening, also when there's no major version bump.

Copy link
Member

@arminru arminru Apr 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense for instrumentation libraries to separately indicate which semantic convention version they use and comply with? E.g.

my.http.instrumentation version semantic convention schema
0.17 1.20.0
0.18 1.21.0
... ...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense for instrumentation libraries to separately indicate which semantic convention version they use and comply with?

Just to clarify: the libraries are supposed to indicate this in the telemetry by using schema_url parameter when obtaining the Tracer/Meter/Logger. I think if they in addition indicate this in the documentation that would be good.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to everything @arminru has said, to which I'd add that using 1.x-alpha or 1.x-RCy for an extended duration to indicate that generated telemetry had changed while the instrumentation's API remained less-than-solidified will be problematic for many dependency management systems that treat such pre-release versions as non-existent or not-latest. For instance, https://pkg.go.dev/go.opentelemetry.io/otel indicates that v1.14.0 is the latest release of the Go SDK even though v1.15.0-rc.2 has been available for over a month as we work through the metrics API stabilization.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will not work for instrumentation libraries that are at 0.x today since bumping them to 1.x would also indicate that the library itself would be considered stable now as per semver.

This is a good point. We don't want to force instrumentations to declare themselves stable just because they want to adopt the new conventions. @trask @reyang thoughts?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @tedsuo

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. If we are NOT going to force a major version update then we are back to my original request: we must prevent instrumentations from adopting the new conventions until a specific date so that all parties have time to prepare. Otherwise it is too easy for users to update to the next minor version and break stuff.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll update the transition plan to include two options:

  • either instrumentation can bump the major version (which will allow it to update to v1.21.0 right away, with no stable releases on the new major version until Oct 1)
  • or instrumentation should wait to update HTTP instrumentation to v1.21.0 until Oct 1

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM, thanks.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't want to force instrumentations to declare themselves stable just because they want to adopt the new conventions. @trask @reyang thoughts?

+1. If the intention is to communicate breaking changes, I think any of the following version change is reasonable:

  • 0.1-alpha1 to 0.2-alpha1
  • 1.0-beta2 to 1.0-rc1
  • 1.0-rc1 to 1.0-rc2

These are okay, although I don't see it as a must-have:

  • 1.0-beta1 to 1.1-beta2
  • 1.0-rc1 to 2.0-alpha1

What about 0.38.0? Should it just go to 0.39.0? Would be nice if there was some way to differentiate this from any other minor version changes because even though users should expect changes to 0.x software, they are still very likely to be surprised by such an impactful change. Especially true since there is no compile-time indication of this type of break.

@trask trask force-pushed the transition-plan-2 branch from 40833e3 to bb8ac7d Compare April 24, 2023 23:06
Copy link
Member

@tigrannajaryan tigrannajaryan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thank you for coming up with a good solution.

Comment on lines +31 to +32
> v1.21.0 of this document will introduce significant breaking changes to the (not yet stable)
> HTTP semantic conventions.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some of the breaking changes were already released in 1.20.0 and 1.19.0:

  • Rename net.app.protocol.(name|version) to net.protocol.(name|version) (#3272)
  • Replace http.flavor with net.protocol.(name|version) (#3272)
  • Remove http.status_code attribute from the http.server.active_requests metric. (#3366)
  • Rename http.user_agent to user_agent.original. (#3190)

We could extend the guidance here to cover this, at least for those libraries that did not already update in the meantime.

@github-actions
Copy link

github-actions bot commented May 3, 2023

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions github-actions bot added the Stale label May 3, 2023
@trask
Copy link
Member Author

trask commented May 3, 2023

Closing this in favor of #3443

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Defining transition period for pre-stability HTTP semconv breaking changes
10 participants