-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add ingestion of X-Ray segments via UDP #502
Add ingestion of X-Ray segments via UDP #502
Conversation
This is largely a port of the X-Ray segment ingestion code from the existing X-Ray Daemon codebase. The main differences are: 1. in the `poll()`, I resorted to not having a buffer pool but use a local buffer instead. Details can be found in the comments for the function. 2. switched `read()` to use a more Go idiomatic return signature Closes #430.
Codecov Report
@@ Coverage Diff @@
## master #502 +/- ##
==========================================
+ Coverage 86.26% 86.42% +0.15%
==========================================
Files 195 199 +4
Lines 10632 10753 +121
==========================================
+ Hits 9172 9293 +121
Misses 1128 1128
Partials 332 332
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for helping with this!
1. Add Amazon copyright header 2. Moved references to commit message 3. Refactored a large chunk of receiver out to a separate UDP poller
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for the cleanups!
@james-bebbington Could you take a look? This PR is mostly vendor-specific code. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @hencrice - some small suggestions. I also need to check with someone from CNCF/OpenTelemetry if the added copyright notices are ok.
@open-telemetry/admins @open-telemetry/technical-committee this PR is adding copyright notices below the CNCF/OpenTelemetry one. I would like a confirmation if that is ok or not. Example: https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/502/files#diff-cfed4d263740d3decbf9e18976f13188R1-R23 |
Thanks for bringing this up. I'm curious as well since this is essentially a port of some of the existing X-Ray daemon functionalities, with some relatively big changes to the original code. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Blocking this PR until I get the confirmation about license. Sorry for the inconvenience.
// See the License for the specific language governing permissions and | ||
// limitations under the License. | ||
|
||
// Copyright 2018-2018 Amazon.com, Inc. or its affiliates. All Rights Reserved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lizthegrey @ccaraman @sarahnovotny do you know if this is the right way to do this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would love to know as well. The only reason I put this kind of header in some files is because some of the files are largely a port of the code in the existing X-Ray daemon with some relatively big changes. The X-Ray daemon itself is licensed under Apache v2.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bogdandrutu @lizthegrey @ccaraman @sarahnovotny any update?
I've reached out to @alolita on AWS side regarding the use of the Amazon copyright header. And she will follow up with OT committee.
Assigned to @pjanotti in today's meeting |
Oh no! I missed the SIG. Meant to ask about the licensing header issue mentioned above. @bogdandrutu is there an ETA on how to proceed with the licensing issue? |
rlen, err := p.read(bufPointer) | ||
if errors.As(err, &errIrrecv) { | ||
p.logger.Error("irrecoverable socket read error. Exiting poller", zap.Error(err)) | ||
return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should a new poller be launched if it returns here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @kbrockhoff thanks for the feedback.
I don't think so in this case because if it's an irrecoverable error (i.e. non-transient net.Error
in read()
of udppoller
), that implies the UDP socket is probably closed or broken to the degree that launching more go routines will not make more progress.
I can add a TODO here to think about attempting to reopen the UDP socket with the same address. But there's no guarantee that will work either.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The X-Ray daemon takes the approach that if is there is a socket, connection or configuration issue then it kills the entire instance. This approach will not work with the collector because there may be other receivers/exporters that are still working. The collector should not stop processing metrics or logs just because traces are failing.
Tearing down and cleaning up a non-working socket and building a new one is probably the optimum approach for the final state. I am OK with just adding a TODO for now.
To make this work smoothly, we likely need to add a monitor which watches all the receivers/exports. If multiple are failing then the monitor will flip the health status to failing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Love the idea of a health monitor. We can even come up with an interface and force all plugins to implement:
- health ping/status report
- best-effort restart when the plugin is unhealthy
If they're both Apache Licensed there's no reason to repeat the Apache boilerplate twice. However, wrt giving Amazon special mention as a copyright holder separate from just being "the opentelemetry authors", the simplest way would be to get the deletion of the amazon specific mention and replacement with copyright (firstyear)-2020 the otel authors done by an Amazon contributor covered by the CLA amazon has on file with the CNCF/OTel. |
@pjanotti @kbrockhoff @bogdandrutu I have removed the Amazon copyright header. This PR is ok to merge now. Thanks! |
@hencrice let's wait also on a green-light from @kbrockhoff |
If possible let's get a sign-off from @alolita too. @alolita can you confirm that Amazon is transferring the copyright for existing code from the x-ray daemon to OpenTelemetry? New code is automatically handled by the CLA but I haven't seen any notion in this thread confirming we're fine for copied code. As in open-telemetry/community#305 handling copied code is not trivial - even now the Java agent takes a stance of having an employee from datadog do the actual merge when copying in code, which won't apply here as it stands. @lizthegrey makes a good point that the word OpenTelemetry authors includes Amazon so the copyright isn't lost. It makes sense to me but I'm not comfortable without explicit wording from the Amazon side. |
Hey @anuraaga, thanks for bringing this up. Since this is a port of existing X-Ray daemon functionalities, I heavily referenced the original implementation so it's great that we want to be confident in the copyright. Here's what @alolita wrote in the opentelemetry-devs Slack channel: This is why I removed the header for now. |
@kbrockhoff could you take another pass? If it's ok, could you merge this? Thanks! |
Hey @pjanotti he approved it a few days ago |
The `go.opentelemetry.io/otel/exporter/trace/jaeger` package was mistakenly released with a `v1.0.0` tag instead of `v0.1.0`. This resulted in all subsequent releases not becoming the default latest, meaning that `go get`s pulled in the incompatible `v0.1.0` release of that package when pulling in more recent packages from other otel packages. Renaming the `exporter` directory to `exporters` fixes this issue by consequentially renaming the package. Additionally, this action also renames *all* exporters. This is understood to be a disruptive action to existing users as they will need to update any dependencies they currently have on our exporters. However, it was decided to take this action regardless. The need to resolve the existing issue explained above is highly important, and given the Alpha state of this project these kinds of breaking changes should be expected (though not without reason). Resolves #331 Co-authored-by: Rahul Patel <[email protected]>
Description:
This is largely a port of the X-Ray segment ingestion code from the existing X-Ray Daemon codebase. The main differences are:
poll()
, I resorted to not having a buffer pool but use a local buffer instead. Details can be found in the comments for the function.read()
to use a more Go idiomatic return signatureutil
package to be more idiomatic.Note that the receiver is still incomplete. So I did not change the README file.
References for this port:
Link to tracking Issue:
Closes #430
Testing:
New unit tests added.
Documentation:
No README update needed.