Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GA request for signalfx / splunk-otel-js #126

Closed
jtmal-signalfx opened this issue Sep 7, 2021 · 29 comments
Closed

GA request for signalfx / splunk-otel-js #126

jtmal-signalfx opened this issue Sep 7, 2021 · 29 comments
Assignees

Comments

@jtmal-signalfx
Copy link

Which GDI repository do you wish to GA?
https://github.com/signalfx/splunk-otel-js

Does the repository follow the latest tagged minor release in GDI specification?
Yes, except:

  • referenced OTel JS is still in beta

How long has the GDI repository been public?
Since around March 2021.

Is the repository known to be used today?
Yes, couple hundred downloads per week via NPM.

Is there a date by which this approval is needed?
No.

Additional context
Nope.

@jtmal-signalfx
Copy link
Author

And the latest release to check all the boxes in gdi-specification: https://github.com/signalfx/splunk-otel-js/releases/tag/v0.12.0

@pellared pellared self-assigned this Sep 21, 2021
@pellared
Copy link
Contributor

pellared commented Sep 22, 2021

Configuration

Reviewing: https://github.com/signalfx/splunk-otel-js/blob/326819403700d2d30ffde47df072d489c9109b28/docs/advanced-config.md

Specs: https://github.com/signalfx/gdi-specification/blob/v1.0.0/specification/configuration.md

⚠️ It would be good if the format stays very similar across different languages/ecosystems. See these for reference:

One or more configuration variables MAY be needed to properly configure GDI
repositories. Configuration of these variables MUST be supported by environment
variables and MAY be supported by additional methods. GDI repositories MUST adopt
stable and SHOULD adopt experimental configuration variables in the
OpenTelemetry
Specification

before proposing variables to the GDI specification. If a new configuration
variable is needed by a GDI repository it MUST be brought to the GDI specification
as a GitHub issue. The GDI specification maintainers SHOULD consider
introducing needed configuration variables to the OpenTelemetry repository before
approving Splunk-specific configuration variables.

❌ "Support" column missing to indicate if the configuration is stable or experimental.

SPLUNK_MAX_ATTR_LENGTH - should we not support OTEL_ATTRIBUTE_VALUE_LENGTH_LIMIT which is defined by the OTel specification? I think it is going to be supported in next OTel release: https://github.com/open-telemetry/opentelemetry-js/blob/feea5167c15c41f0aeedc60959e36c18315c7ede/packages/opentelemetry-core/src/utils/environment.ts . Also, notice that the latest version (not released) of GDI spec requires 12000 as the default value.

SPLUNK_LOGS_INJECTION - the configuration is not brought to the GDI specification as a GitHub issue. is it proposed to OpenTelemetry specification?

If a GDI repository requires an immediate configuration variable that is not
available in the OpenTelemetry specification and not available in the GDI
specification, the repository MAY introduce a repository-specific configuration
variable until the GDI specification maintainers make a decision. Any
repository-specific configuration variable defined SHOULD be prefixed with
SPLUNK_ and MUST NOT be marked as stable unless added to the GDI
specification. Upon resolution by the GDI specification, the GDI repository MUST
change the repository-specific configuration variable by the repository's next minor
release. This change MAY result in a breaking change so caution should be
exhibited when considering repository-specific configuration variables.

Splunk-specific configuration variables defined in the GDI specification MUST
be prefixed with SPLUNK_. If a Splunk-specific configuration variable is
declared as stable in the GDI specification and later the OpenTelemetry
specification declares a similar variable as stable, the GDI specification
MUST adopt the OpenTelemetry configuration variable and SHOULD mark the GDI
specification configuration variable as deprecated by the next minor release.
In addition to defining Splunk-specification configuration variables, the GDI
specification MAY require specific OpenTelemetry configuration variables be
supported. If it does, the GDI specification MAY require certain values be
supported including a specific default value.

Whenever a configuration variable changes its name, a stable GDI repository
(version >= 1.0) MUST support both old and new names until the next major
release is done. The old configuration variable MUST NOT be removed in a minor
release. GDI repositories that are not yet stable (version < 1.0) SHOULD follow
this rule, but they are not required to. When it is detected that a user uses
the old configuration variable a warning SHOULD be logged: the warning SHOULD
state that the old variable is deprecated, the new one should be used instead,
and that the old one will be removed in the next major release (not stable GDI
repositories MAY remove deprecated features in any future release).

⚠️ SPLUNK_LOGS_INJECTION is not defined in GDI specification

Installation and configuration MUST optimize for customer experience and
time-to-value. Installation and configuration MAY provide a mechanism for
advanced or custom settings. As an example, the default installation for
instrumentation libraries should provide everything needed to configure W3C and
B3 as well as OTLP and Jaeger Thrift even though the default configuration is
set to W3C and OTLP. Installing all of these dependencies by default may result
in a large package, but makes it easy for users to switch settings via
configuration. An advanced installation process can be provided where the user
chooses what to install limiting the configuration options.

⚠️ You may consider having similar Advanced configuration section in the README.md like Java and Python distros. A similar structure makes life easier for the reader.

Environment variables

Name (default value) Description
SPLUNK_ACCESS_TOKEN () Access token added to exported data. [1]
SPLUNK_TRACE_RESPONSE_HEADER_ENABLED (true) Whether Server-Timing header is added to HTTP responses. [2]
  • [1]: Not user required if another system performs the authentication. For
    example, instrumentation libraries SHOULD send data to a locally running
    agent. The agent MAY define the access token that is used. If the
    instrumentation library is configured to send data directly to a Splunk
    back-end then this variable MUST be defined. Even when sending to an agent,
    instrumentation libraries MAY define this option (to support
    access_token_passthrough).
    This environment variable MUST work for otlp and jaeger-thrift-splunk
    exporters.
  • [2]: If stitching of RUM spans and APM spans is desired then this parameter
    MUST be set to true.

⚠️ Regarding the SPLUNK_ACCESS_TOKEN description: "SignalFx API" - I think it would be better to replace with "Splunk cloud". Consider taking the description from Java or Python distribution docs.
⚠️ SPLUNK_TRACE_RESPONSE_HEADER_ENABLED while the description is OK. I prefer the one used in Java and Python distribution docs.

I will make a PR in GDI specification to use the description from the distributions.

In addition to Splunk-specific environment variables, several requirements
beyond the OpenTelemetry specification exist.

OpenTelemetry Environment Variable

  • OTEL_RESOURCE_ATTRIBUTES
    • User MUST define service.name
      resource attribute.
    • Distribution MUST log a warning when the service.name resource attribute is not set. The
      warning message MUST clearly describe how to set the attribute or link to relevant documentation.
      E.g.
      service.name attribute is not set, your service is unnamed and will be difficult to identify.
      set your service name using the OTEL_RESOURCE_ATTRIBUTES environment variable.
      E.g. `OTEL_RESOURCE_ATTRIBUTES="service.name=<YOUR_SERVICE_NAME_HERE>"`
      
    • User SHOULD define deployment.environment
      resource attribute.
    • User SHOULD define service.version
  • OTEL_PROPAGATORS
    • Distribution MUST default to "tracecontext,baggage"
    • Distribution MUST support and document how to switch to b3multi
  • Span Collection Limits
    • Distribution MUST default to 1000 for OTEL_SPAN_LINK_COUNT_LIMIT (not OpenTelemetry default)
    • Distribution MUST be unset (unlimited) for all others (not OpenTelemetry default)
  • Zipkin exporter
    • Distribution MUST NOT list Zipkin exporter as supported (not supported by Smart Agent)
  • OTEL_TRACES_EXPORTER
    • Distribution MUST default to otlp
    • Distribution MUST offer jaeger-thrift-splunk that defaults to http://127.0.0.1:9080/v1/trace

❓ Is this done: "Distribution MUST log a warning when the service.name resource attribute is not set. The warning message MUST clearly describe how to set the attribute or link to relevant documentation." ?

❌ What about: "Distribution MUST default to 1000 for OTEL_SPAN_LINK_COUNT_LIMIT (not OpenTelemetry default)" ? Reference: https://github.com/open-telemetry/opentelemetry-js/blob/v0.25.0/packages/opentelemetry-core/src/utils/environment.ts

@pellared
Copy link
Contributor

pellared commented Sep 22, 2021

Legend:

❓ a question
❌ does not meet the GDI specs
⚠️ not breaking GDI specs however, may be worth addressing

@seemk
Copy link
Contributor

seemk commented Sep 24, 2021

The SPLUNK_LOGS_INJECTION env var is a remnant of the SFX lib. However it now only controls service name, environment and version injection as trace_id and span_id are injected via the logging instrumentations themselves. Rauno already pointed this out and think we can either get rid of it or move it to advanced configuration while enabling it by default.

@rauno56
Copy link

rauno56 commented Sep 26, 2021

Yes. We're in the process of removing those.

@theletterf
Copy link
Contributor

Please see: signalfx/splunk-otel-js#294 — docs are being modified there.

@pellared
Copy link
Contributor

pellared commented Sep 27, 2021

  • OTEL_TRACES_EXPORTER
    • Distribution MUST default to otlp

@flands @MrAlias Splunk OTel Java and Python are using the grpc protocol. Splunk OTel JS is currently using http/protobuf. Is it OK? The specification does not define which one should be used #131. My concern is that having different default settings may confuse the users.

@flands
Copy link
Contributor

flands commented Sep 27, 2021

We should default to gRPC -- opened: #132

@rauno56
Copy link

rauno56 commented Sep 28, 2021

Notes

@rauno56
Copy link

rauno56 commented Oct 19, 2021

@pellared, a heads-up about the last PR to be merged! ☝️

@pellared
Copy link
Contributor

AFAIK we still cannot go GA until OTLP is not stable. https://github.com/open-telemetry/opentelemetry-js/tree/main/experimental/packages

@rauno56
Copy link

rauno56 commented Oct 19, 2021

@pellared, this is not a issue actually because the only reason OTLP Exporter is not ready is the metrics support in it(they're working on separating it currently) and we are not using metrics exporter.

@MrAlias
Copy link
Contributor

MrAlias commented Oct 22, 2021

@pellared, this is not a issue actually because the only reason OTLP Exporter is not ready is the metrics support in it(they're working on separating it currently) and we are not using metrics exporter.

Has the OTel-js SIG officially stated this? I am hesitant to release something as stable without dependencies also being stable. When the Go SIG split apart this functionality in their OTLP exporter it meant packaging and API changes for the traces portion.

@rauno56
Copy link

rauno56 commented Oct 25, 2021

Yes. There's a PR open with that.
open-telemetry/opentelemetry-js#2485 (comment)

If that's the only blocker for us I'd love to get a tentative approval for this GA request here with the clause "All OTel deps have to be stable for GA" at least.

@pellared
Copy link
Contributor

Unassigning myself as I currently have no time to review 😞

@pellared pellared removed their assignment Oct 25, 2021
@jtmal-signalfx
Copy link
Author

We're waiting for OTel upstream to split the OTLP exporter into the stable part and the experimental part.

@jtmal-signalfx
Copy link
Author

PR still not merged in OTel.

@rauno56
Copy link

rauno56 commented Nov 5, 2021

It is now.

@rauno56
Copy link

rauno56 commented Dec 20, 2021

Once open-telemetry/opentelemetry-js#2626 is merged an released version 0.27 of the exporter will be released as 1.0 as well and we can bump the version among our deps as well.

@MrAlias, @pellared, how can we proceed?

@jtmal-signalfx
Copy link
Author

US people are off till Jan 3, and Robert isn't working on GA reviews anymore I think. I'll keep pinging Steve and Michał about this.

@MrAlias
Copy link
Contributor

MrAlias commented Jan 24, 2022

Repository

Teams

  • MUST have a maintainers team and teams MAY be shared between repositories
    • MUST have -maintainers suffix
    • MUST include at least two currently full-time Splunkers
    • MUST NOT include any non-full-time Splunkers
  • SHOULD have an approvers team and teams MAY be shared between repositories
    • MUST have -approvers suffix

Verified: https://github.com/orgs/signalfx/teams/gdi-js-approvers, https://github.com/orgs/signalfx/teams/gdi-js-maintainers

Permissions

  • MUST grant Admin permission
    level

    to admins team
  • MUST grant Maintain permission
    level

    to maintainers team
  • MUST grant Write permission
    level

    to approvers team and the signalfx/docs, signalfx/gdi-specification-approvers and signalfx/gdi-specification-maintainers teams
  • MUST NOT grant Write, Maintain, Admin permission
    level

    to any other user nor team
  • MUST NOT grant any permission (including Read) to any individual

Branch protection

  • MUST have a primary branch named main
  • MUST NOT allow anyone (including administrators) pushing directly to main
  • MUST require status checks to pass before merge to main
  • MUST require at least one CODEOWNER to approve a PR prior to merge
  • MUST require signed commits on main
  • MUST NOT allow merge commit (squash or rebase merging only)

I do not have access to these settings (permissions or branch protection). @flands can you please verify?

Dependencies

  • MUST lock the versions of all build dependencies (e.g. libraries, binaries,
    scripts, docker images) or vendor them; EXCEPTION: tools that are
    available out-of-the-box

Verified: package-lock.json

  • MUST have Dependabot configured

Verified: dependabot.yml

GitHub Actions

I do not have access to these settings. @flands can you please verify?

GitHub Applications

  • MUST have Jira GitHub App configured

@jtmal-signalfx or @rauno56 can you verify and post confirmation of this?

  • SHOULD have Codecov GitHub App configured

⚠️ I think it is configured here? @jtmal-signalfx or @rauno56 can you verify this?

  • SHOULD have Lychee Link Checker GitHub Action configured

❌ This does not look configured.

Required Files

  • MUST have a CHANGELOG.md updated for every release
    • The CHANGELOG.md is intended to be consumed by humans, and not machines.
    • The file SHOULD contain an Unreleased section at the top, which includes changes that
      have not yet been released.
    • The file MUST be in reverse chronological order, with the most recent
      releases at the top of the file, after the Unreleased section.
    • Each release SHOULD be separated by a line separator (---) from the other relases.
    • Each release SHOULD contain separate sections for each major functionality area (if applicable).
      The following sub-sections MAY be used, as appropriate or specified.
      • General - General comments about the release that users should know about.
      • Breaking Changes - Any changes that will break backward compatibility
        with previous versions. MUST list all breaking changes.
      • Bugfixes - Details of bugs that were fixed. SHOULD list all bug fixes.
      • Enhancements - New features that have been added to the repository. SHOULD
        list all new features.
    • The CHANGELOG.md SHOULD NOT list every PR, but only changes significant
      from an end-user point of view. Anyone who is interested in all the details
      of every change in the repository can use the git log for that.

Verified: CHANGELOG.md

Verified: CODE_OF_CONDUCT.md

Verified: CONTRIBUTING.md

  • MUST have a .github/CODEOWNERS file with a maintainers team
  • SHOULD have an approvers team in CODEOWNERS

Verified: CODEOWNERS

Verified: LICENSE

  • SHOULD have a MIGRATING.md if applicable

Verified: MIGRATING.md

  • MUST have a README.md
    • MUST have a badge on the README.md with build status
      • CI and PR builds and all tests/checks that are executed in them MUST be
        publicly accessible by anyone.
    • MUST have a badge on the README.md with GDI specification version supported
    • SHOULD have a badge on the README.md with code coverage, if appropriate.
    • SHOULD have badges on the README.md for other relevant things including artifacts
    • MUST have getting started information in README.md
    • MUST have troubleshooting information in README.md
    • MUST link to official Splunk docs in README.md
    • MUST have license information in README.md

❌ Missing "badge on the README.md with GDI specification version"

Otherwise verified: README.md

  • MAY have a RELEASING.md file that documents the release process, but this
    file MUST NOT document private processes. For repositories that use private release
    jobs, the RELEASING.md file SHOULD be absent or, if included, just contain the following note:

    The release process involves signing built packages and binaries and thus
    must be kept private and not exposed publicly.

Verified: RELEASING.md

  • MUST add the SECURITY.md
    • SHOULD add dependabot information to SECURITY.md if applicable

Verified: SECURITY.md

  • SHOULD NOT contain comprehensive application examples. Application examples
    showing multi-system interactions or even cross-language use cases SHOULD be
    put in the Splunk OpenTelemetry example
    repository
    .
    Smaller, developer focused, examples MAY be included in the repository if it
    is customary to do so for the coding language used.

IDK, looks fine to me: https://github.com/signalfx/splunk-otel-js/tree/v0.15.0/examples

GitHub Releases

  • MUST have a signature or a checksum with signature for all release artifacts
    • SHOULD use Splunk signing key

Looks to be the case for all current published releases (i.e. https://github.com/signalfx/splunk-otel-js/releases/download/v0.15.0/checksums.txt)

❌ Many releases do not use signed tags. These need to be recreated.

  • MUST state version of OpenTelemetry repository built against if applicable

Verified.

n/a

Instrumentation Libraries

  • MUST document all supported configuration parameters

Verified: https://github.com/signalfx/splunk-otel-js/blob/v0.15.0/docs/advanced-config.md#list-of-settings

  • MUST document how to configure manual instrumentation

Verified: https://github.com/signalfx/splunk-otel-js#manually-instrument-an-application

  • MUST document how to configure log correlation

Verified: https://github.com/signalfx/splunk-otel-js#correlate-traces-and-logs

  • MUST define minimum supported version of each auto-instrumentation framework
    • SHOULD define maximum supported version of each auto-instrumentation framework

❌ This does not seem to be satisfied: https://github.com/signalfx/splunk-otel-js#default-instrumentation-packages

@jtmal-signalfx or @rauno56 can you please double check this requirement and point me to docs that satisfy this if I missed them?

@MrAlias
Copy link
Contributor

MrAlias commented Jan 24, 2022

Semantic Conventions

OpenTelemetry Resource Attributes

Description: Required and recommended OpenTelemetry resource semantic
conventions
.

All Splunk distributions of OpenTelemetry,

  • MUST set the following OpenTelemetry resource attributes according to the
    OpenTelemetry Semantic
    Conventions
    :

    • telemetry.sdk.name
    • telemetry.sdk.version
    • telemetry.sdk.language
  • SHOULD set the following resource attributes when applicable:

    • telemetry.auto.version

Note: this section does not apply to Real User Monitoring libraries, as they do
not use the OpenTelemetry Resource.

@jtmal-signalfx or @rauno56 can you verify this is satisfied? I do not have enough familiarity with the JS implementation to do so.

Splunk Resource Attributes

Description: Set of attributes used to uniquely identify a Splunk distro
version in combination with OpenTelemetry's telemetry.sdk.* attributes.

Attribute Type Description Examples Required
splunk.distro.version string The version number of the Splunk distribution being used. 1.5.0 Yes

Note: this section does not apply to Real User Monitoring libraries, as they do
not use the OpenTelemetry Resource.

❌ A search of the repository did not return an implementation of this. @jtmal-signalfx or @rauno56 can you verify this is satisfied or needs to be added?

@MrAlias
Copy link
Contributor

MrAlias commented Jan 24, 2022

To be sure, the profiling capabilities are not being included here because they do not look to be ready for a stable release. @rauno56 or @jtmal-signalfx can you verify the plan is to not include profiling in the stable release?

@MrAlias
Copy link
Contributor

MrAlias commented Jan 24, 2022

Versioning

All GDI repositories MUST be versioned according to Semantic Versioning
2.0
. GDI repositories are versioned
separately from OpenTelemetry repositories as Splunk-specific breaking changes
MAY be introduced. GDI repositories MUST indicate what version of OpenTelemetry
repositories they are based on through release notes and SHOULD indicate through
logging. Additional version number constraints can be found in the sections
below.

Verified.

@MrAlias MrAlias self-assigned this Jan 24, 2022
@rauno56
Copy link

rauno56 commented Mar 8, 2022

@MrAlias, thanks for the reveiw. I collected your comments in the following list:

@rauno56
Copy link

rauno56 commented May 20, 2022

@MrAlias, what comes to releases, since they're old ones and no one in the node ecosystem uses GH Releases for any authority I'll skip redoing the releases.

I would kindly ask someone(@iNikem / @flands ?) with permission to

  1. Enable Require signed commits if it's not already.
  2. Check whether the Jira GH App is configured - I don't have the access to the settings page and don't know how to check it otherwise. I don't know how this app would be useful though.

@mzajacsplunk
Copy link

  1. I checked and "Require signed commits" is correctly set
  2. I checked and the Jira GH app is configured

@rauno56
Copy link

rauno56 commented Jun 7, 2022

Cool! Then we're all set.

@MrAlias
Copy link
Contributor

MrAlias commented Jun 8, 2022

Cool! Then we're all set.

Agreed! 🎉

Closing with the conclusion that this is complete, the JS distribution is ready for GA!

@MrAlias MrAlias closed this as completed Jun 8, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

8 participants