Add ingestion of X-Ray segments via UDP #502

Merged: 14 commits merged on Aug 10, 2020

Contributor

@hencrice hencrice commented Jul 23, 2020

Description:
This is largely a port of the X-Ray segment ingestion code from the existing X-Ray Daemon codebase. The main differences are:

  1. In `poll()`, I use a local buffer instead of a buffer pool. Details are in the comments for the function.
  2. Switched `read()` to a more idiomatic Go return signature.
  3. Updated the util package to be more idiomatic.
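
To illustrate the first two points, here is a minimal sketch of a poll loop that reuses one local buffer and calls a `read()` returning `(int, error)`. It is not the code from this PR: the `poller` type, `maxSegmentSize`, and `fakeConn` are hypothetical names, and the connection is modeled as an `io.Reader` (which `*net.UDPConn` satisfies) so the sketch runs without a real socket.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// maxSegmentSize is an assumed upper bound for one datagram (64 KiB).
const maxSegmentSize = 64 * 1024

// poller is a hypothetical stand-in for the UDP poller in this PR.
type poller struct {
	conn io.Reader // a *net.UDPConn would satisfy this interface
}

// read fills buf with one datagram and returns the byte count and any error,
// following the usual Go (n, err) convention.
func (p *poller) read(buf []byte) (int, error) {
	n, err := p.conn.Read(buf)
	if err != nil {
		return 0, fmt.Errorf("read from socket: %w", err)
	}
	return n, nil
}

// poll reads datagrams until the connection reports an error and passes a
// copy of each one to handle. The fake connection signals the end with io.EOF.
func (p *poller) poll(handle func([]byte)) error {
	buf := make([]byte, maxSegmentSize) // one local buffer, reused every iteration
	for {
		n, err := p.read(buf)
		if err != nil {
			if errors.Is(err, io.EOF) {
				return nil
			}
			return err
		}
		seg := make([]byte, n) // copy so the caller may keep the slice
		copy(seg, buf[:n])
		handle(seg)
	}
}

// fakeConn yields one queued datagram per Read call, then io.EOF.
type fakeConn struct{ datagrams [][]byte }

func (f *fakeConn) Read(b []byte) (int, error) {
	if len(f.datagrams) == 0 {
		return 0, io.EOF
	}
	n := copy(b, f.datagrams[0])
	f.datagrams = f.datagrams[1:]
	return n, nil
}

func main() {
	p := &poller{conn: &fakeConn{datagrams: [][]byte{[]byte("seg-1"), []byte("seg-2")}}}
	err := p.poll(func(seg []byte) { fmt.Printf("got %q\n", seg) })
	fmt.Println("poll returned:", err)
}
```

Copying each datagram out of the shared buffer trades one allocation per segment for not needing a pool; whether that matches the trade-off made in the PR is an assumption.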

Note that the receiver is still incomplete, so I did not change the README file.

References for this port:

  1. https://github.com/aws/aws-xray-daemon/blob/master/cmd/tracing/daemon.go#L257
  2. https://github.com/aws/aws-xray-daemon/blob/master/pkg/socketconn/udp/udp.go
  3. https://github.com/aws/aws-xray-daemon/blob/master/pkg/tracesegment/tracesegment.go
  4. https://github.com/aws/aws-xray-daemon/blob/master/pkg/util/util.go

Link to tracking Issue:
Closes #430

Testing:
New unit tests added.

Documentation:
No README update needed.

@hencrice hencrice requested a review from a team July 23, 2020 23:07
codecov bot commented Jul 23, 2020

Codecov Report

Merging #502 into master will increase coverage by 0.15%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master     #502      +/-   ##
==========================================
+ Coverage   86.26%   86.42%   +0.15%     
==========================================
  Files         195      199       +4     
  Lines       10632    10753     +121     
==========================================
+ Hits         9172     9293     +121     
  Misses       1128     1128              
  Partials      332      332              
| Flag | Coverage Δ |
|---|---|
| #integration | 71.09% <ø> (ø) |
| #unit | 86.25% <100.00%> (+0.15%) ⬆️ |

| Impacted Files | Coverage Δ |
|---|---|
| receiver/awsxrayreceiver/factory.go | 100.00% <100.00%> (ø) |
| ...xrayreceiver/internal/tracesegment/tracesegment.go | 100.00% <100.00%> (ø) |
| ...eiver/awsxrayreceiver/internal/udppoller/poller.go | 100.00% <100.00%> (ø) |
| receiver/awsxrayreceiver/internal/util/util.go | 100.00% <100.00%> (ø) |
| receiver/awsxrayreceiver/receiver.go | 100.00% <100.00%> (ø) |
| exporter/awsxrayexporter/translator/cause.go | 94.73% <0.00%> (-5.27%) ⬇️ |

... and 1 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 32c9e6a...1c879b8.

Contributor

@anuraaga anuraaga left a comment

Thanks for helping with this!

Review threads (outdated, resolved):
receiver/awsxrayreceiver/internal/socketconn/udp/udp.go (3 threads)
receiver/awsxrayreceiver/internal/util/util.go (3 threads)
receiver/awsxrayreceiver/receiver.go (4 threads)
hencrice added 4 commits July 27, 2020 13:31
1. Add Amazon copyright header
2. Moved references to commit message
3. Refactored a large chunk of receiver out to a separate UDP poller
@hencrice hencrice requested a review from anuraaga July 27, 2020 22:31
Contributor

@anuraaga anuraaga left a comment

Thanks a lot for the cleanups!

@hencrice
Contributor Author

@james-bebbington Could you take a look? This PR is mostly vendor-specific code.

Contributor

@pjanotti pjanotti left a comment

Hi @hencrice - some small suggestions. I also need to check with someone from CNCF/OpenTelemetry if the added copyright notices are ok.

@pjanotti
Contributor

@open-telemetry/admins @open-telemetry/technical-committee this PR is adding copyright notices below the CNCF/OpenTelemetry one. I would like a confirmation if that is ok or not. Example: https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/502/files#diff-cfed4d263740d3decbf9e18976f13188R1-R23

@hencrice
Contributor Author

> @open-telemetry/admins @open-telemetry/technical-committee this PR is adding copyright notices below the CNCF/OpenTelemetry one. I would like a confirmation if that is ok or not. Example: https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/502/files#diff-cfed4d263740d3decbf9e18976f13188R1-R23

Thanks for bringing this up. I'm curious as well, since this is essentially a port of some existing X-Ray daemon functionality, with some relatively big changes to the original code.

Member

@bogdandrutu bogdandrutu left a comment

Blocking this PR until I get the confirmation about license. Sorry for the inconvenience.

// See the License for the specific language governing permissions and
// limitations under the License.

// Copyright 2018-2018 Amazon.com, Inc. or its affiliates. All Rights Reserved.
Member

@lizthegrey @ccaraman @sarahnovotny do you know if this is the right way to do this?

Contributor Author

I would love to know as well. The only reason I put this kind of header in some files is because some of the files are largely a port of the code in the existing X-Ray daemon with some relatively big changes. The X-Ray daemon itself is licensed under Apache v2.

Contributor Author

@hencrice hencrice Jul 30, 2020

@bogdandrutu @lizthegrey @ccaraman @sarahnovotny any update?

I've reached out to @alolita on the AWS side regarding the use of the Amazon copyright header, and she will follow up with the OT committee.

@mtwo
Member

mtwo commented Jul 29, 2020

Assigned to @pjanotti in today's meeting

@hencrice
Contributor Author

hencrice commented Jul 29, 2020

> Assigned to @pjanotti in today's meeting

Oh no! I missed the SIG. Meant to ask about the licensing header issue mentioned above. @bogdandrutu is there an ETA on how to proceed with the licensing issue?

Review thread (outdated, resolved): receiver/awsxrayreceiver/config.go
rlen, err := p.read(bufPointer)
if errors.As(err, &errIrrecv) {
	p.logger.Error("irrecoverable socket read error. Exiting poller", zap.Error(err))
	return
Member

Should a new poller be launched if it returns here?

Contributor Author

@hencrice hencrice Jul 31, 2020

Hey @kbrockhoff thanks for the feedback.

I don't think so in this case. If the error is irrecoverable (i.e. a non-transient net.Error in read() of udppoller), the UDP socket is probably closed or broken to the degree that launching more goroutines will not make any progress.

I can add a TODO here to think about attempting to reopen the UDP socket with the same address. But there's no guarantee that will work either.
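
The distinction between transient and irrecoverable read errors discussed above can be sketched as follows. This is an illustration under assumptions, not the PR's implementation: `errIrrecoverable`, `classify`, `isIrrecoverable`, and `fakeTimeout` are hypothetical names, and the sketch treats only timeouts as transient (it uses `net.Error.Timeout()` because `Temporary()` is deprecated in recent Go releases).

```go
package main

import (
	"errors"
	"fmt"
	"net"
)

// errIrrecoverable is a hypothetical wrapper marking errors the poller
// should not retry.
type errIrrecoverable struct{ err error }

func (e *errIrrecoverable) Error() string { return "irrecoverable: " + e.err.Error() }
func (e *errIrrecoverable) Unwrap() error { return e.err }

// classify returns timeouts unchanged (assumed transient) and wraps every
// other non-nil error as irrecoverable.
func classify(err error) error {
	if err == nil {
		return nil
	}
	var nerr net.Error
	if errors.As(err, &nerr) && nerr.Timeout() {
		return err
	}
	return &errIrrecoverable{err: err}
}

// isIrrecoverable reports whether err wraps an errIrrecoverable, mirroring
// the errors.As check in the snippet under review.
func isIrrecoverable(err error) bool {
	var target *errIrrecoverable
	return errors.As(err, &target)
}

// fakeTimeout is a stand-in transient error that satisfies net.Error.
type fakeTimeout struct{}

func (fakeTimeout) Error() string   { return "i/o timeout" }
func (fakeTimeout) Timeout() bool   { return true }
func (fakeTimeout) Temporary() bool { return true }

// errSocketClosed reuses the standard library's sentinel for a closed socket.
var errSocketClosed error = net.ErrClosed

func main() {
	for _, err := range []error{fakeTimeout{}, errSocketClosed} {
		if wrapped := classify(err); isIrrecoverable(wrapped) {
			fmt.Println("exiting poller:", wrapped)
		} else {
			fmt.Println("retrying after:", wrapped)
		}
	}
}
```

Wrapping at the `read()` boundary keeps the poll loop's decision to a single `errors.As` call, at the cost of deciding up front which errors count as transient.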

Member

The X-Ray daemon takes the approach that if there is a socket, connection, or configuration issue, it kills the entire instance. This approach will not work with the collector because other receivers/exporters may still be working. The collector should not stop processing metrics or logs just because traces are failing.

Tearing down and cleaning up a non-working socket and building a new one is probably the best approach for the final state. I am OK with just adding a TODO for now.

To make this work smoothly, we likely need to add a monitor that watches all the receivers/exporters. If multiple are failing, the monitor will flip the health status to failing.
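
As a rough illustration of the tear-down-and-rebuild idea in this comment, the sketch below closes the old socket and makes a bounded number of attempts to open a new one. Everything here is hypothetical (`openFunc`, `reopen`, `fakeSocket`, `flakyOpener`); a real receiver would pass a function that calls something like `net.ListenUDP` with the original address, and, as noted earlier in the thread, there is no guarantee that reopening succeeds.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// openFunc is a hypothetical factory that opens a fresh socket.
type openFunc func() (io.Closer, error)

// reopen closes the old socket (best effort) and tries up to maxAttempts
// times to open a replacement, returning the last error on failure.
func reopen(old io.Closer, open openFunc, maxAttempts int) (io.Closer, error) {
	if old != nil {
		_ = old.Close() // the old socket is assumed broken; ignore close errors
	}
	lastErr := errors.New("no attempts made")
	for i := 0; i < maxAttempts; i++ {
		conn, err := open()
		if err == nil {
			return conn, nil
		}
		lastErr = err
	}
	return nil, fmt.Errorf("reopen failed after %d attempts: %w", maxAttempts, lastErr)
}

// fakeSocket records whether Close was called.
type fakeSocket struct{ closed bool }

func (f *fakeSocket) Close() error { f.closed = true; return nil }

// flakyOpener returns an openFunc that fails the given number of times
// before succeeding.
func flakyOpener(failures int) openFunc {
	return func() (io.Closer, error) {
		if failures > 0 {
			failures--
			return nil, errors.New("address already in use")
		}
		return &fakeSocket{}, nil
	}
}

func main() {
	old := &fakeSocket{}
	conn, err := reopen(old, flakyOpener(2), 3)
	fmt.Println("old closed:", old.closed, "new socket:", conn != nil, "err:", err)
}
```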

Contributor Author

Love the idea of a health monitor. We can even come up with an interface and force all plugins to implement:

  1. health ping/status report
  2. best-effort restart when the plugin is unhealthy
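
One possible shape for such an interface and monitor is sketched below. No such interface exists in this PR; `HealthReporter`, `Monitor`, the `threshold` field, and `fakeComponent` are all hypothetical names chosen for illustration.

```go
package main

import "fmt"

// HealthReporter is a hypothetical interface each plugin would implement.
type HealthReporter interface {
	Name() string
	Healthy() bool  // health ping/status report
	Restart() error // best-effort restart when unhealthy
}

// Monitor watches a set of components. threshold is the assumed number of
// failing components at which the overall status flips to failing.
type Monitor struct {
	components []HealthReporter
	threshold  int
}

// Check tries to restart each unhealthy component once, then reports the
// overall status and the names of components that are still failing.
func (m *Monitor) Check() (healthy bool, failing []string) {
	for _, c := range m.components {
		if c.Healthy() {
			continue
		}
		if err := c.Restart(); err == nil && c.Healthy() {
			continue
		}
		failing = append(failing, c.Name())
	}
	return len(failing) < m.threshold, failing
}

// fakeComponent is a test double; restartable controls whether Restart works.
type fakeComponent struct {
	name        string
	healthy     bool
	restartable bool
}

func (f *fakeComponent) Name() string  { return f.name }
func (f *fakeComponent) Healthy() bool { return f.healthy }
func (f *fakeComponent) Restart() error {
	if !f.restartable {
		return fmt.Errorf("%s: restart failed", f.name)
	}
	f.healthy = true
	return nil
}

func main() {
	m := &Monitor{threshold: 2, components: []HealthReporter{
		&fakeComponent{name: "otlp", healthy: true},
		&fakeComponent{name: "awsxray", restartable: true},
		&fakeComponent{name: "jaeger"},
	}}
	healthy, failing := m.Check()
	fmt.Println("healthy:", healthy, "failing:", failing)
}
```

With a threshold of 2, a single failing receiver leaves the overall status healthy, which matches the point that one broken trace receiver should not take the whole collector down.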

@lizthegrey
Member

> > Assigned to @pjanotti in today's meeting
>
> Oh no! I missed the SIG. Meant to ask about the licensing header issue mentioned above. @bogdandrutu is there an ETA on how to proceed with the licensing issue?

If they're both Apache licensed, there's no reason to repeat the Apache boilerplate twice. However, with respect to giving Amazon special mention as a copyright holder separate from just being "the opentelemetry authors", the simplest way would be to have an Amazon contributor, covered by the CLA Amazon has on file with the CNCF/OTel, delete the Amazon-specific mention and replace it with "copyright (firstyear)-2020 the otel authors".

@hencrice
Contributor Author

hencrice commented Aug 7, 2020

@pjanotti @kbrockhoff @bogdandrutu I have removed the Amazon copyright header. This PR is ok to merge now. Thanks!

@pjanotti
Contributor

pjanotti commented Aug 7, 2020

@hencrice let's wait also on a green-light from @kbrockhoff

@anuraaga
Contributor

anuraaga commented Aug 8, 2020

If possible let's get a sign-off from @alolita too.

@alolita can you confirm that Amazon is transferring the copyright for existing code from the x-ray daemon to OpenTelemetry? New code is automatically handled by the CLA but I haven't seen any notion in this thread confirming we're fine for copied code.

As in open-telemetry/community#305 handling copied code is not trivial - even now the Java agent takes a stance of having an employee from datadog do the actual merge when copying in code, which won't apply here as it stands. @lizthegrey makes a good point that the word OpenTelemetry authors includes Amazon so the copyright isn't lost. It makes sense to me but I'm not comfortable without explicit wording from the Amazon side.

@hencrice
Contributor Author

hencrice commented Aug 10, 2020

> If possible let's get a sign-off from @alolita too.
>
> @alolita can you confirm that Amazon is transferring the copyright for existing code from the x-ray daemon to OpenTelemetry? New code is automatically handled by the CLA but I haven't seen any notion in this thread confirming we're fine for copied code.
>
> As in open-telemetry/community#305 handling copied code is not trivial - even now the Java agent takes a stance of having an employee from datadog do the actual merge when copying in code, which won't apply here as it stands. @lizthegrey makes a good point that the word OpenTelemetry authors includes Amazon so the copyright isn't lost. It makes sense to me but I'm not comfortable without explicit wording from the Amazon side.

Hey @anuraaga, thanks for bringing this up. Since this is a port of existing X-Ray daemon functionalities, I heavily referenced the original implementation so it's great that we want to be confident in the copyright. Here's what @alolita wrote in the opentelemetry-devs Slack channel:
> Hi Bingfeng - the OT guidance is to not include any Amazon copyrights in the source files. They remove it if we have any copyrights associated with Amazon. Waiting for our lawyers to give clear guidance. In the meantime, @yenlinc @nicfisch don't add a copyright but please ensure you are members. We can always push back later and add if our lawyers work it out.

This is why I removed the header for now.

@hencrice
Contributor Author

@kbrockhoff could you take another pass? If it's ok, could you merge this? Thanks!

@hencrice
Contributor Author

> @hencrice let's wait also on a green-light from @kbrockhoff

Hey @pjanotti he approved it a few days ago

@bogdandrutu bogdandrutu merged commit 4b46c21 into open-telemetry:master Aug 10, 2020