-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added null guards to 'Process' when processing an incoming span #1723
Added null guards to 'Process' when processing an incoming span #1723
Conversation
Another solution to this is what was suggested by @pavolloffay in the issue, by making sure the Given that all clients currently fail in case the service name is empty, I'd argue that the right thing to do when the |
Codecov Report
@@ Coverage Diff @@
## master #1723 +/- ##
==========================================
- Coverage 98.36% 98.32% -0.05%
==========================================
Files 193 193
Lines 9358 9364 +6
==========================================
+ Hits 9205 9207 +2
- Misses 119 122 +3
- Partials 34 35 +1
Continue to review full report at Codecov.
|
Does the storage work fine when the service name is nill? If no, we could have a sanitizer which sets the service name to some value. |
Some storages fail (ES), some work (Memory). It's not really consistent across storage mechanisms. |
The client libraries also fail if the service name is null. In the codebase we do not do nil checks on the arguments. As the storage might also fails when the service name is empty I think maybe a better would be to set it to some value or fail early in the ingestion at one place. |
In theory, each method should guard itself against bad data ("be conservative in what you do, be liberal in what you accept from others"). We could add a check at the |
I don't think it's necessary to spread this check so much around. Storage implementations have a contract with the caller, Process is a required element in the data model. We should only check this on the receiving side, as close as possible to the servers, and not infect the rest of the business logic with input validation. |
Signed-off-by: Juraci Paixão Kröhling <[email protected]>
I just updated the PR to do the check before the call to Here's the error message for the current state of this PR when a batch with a nil
And here are the metrics that are recorded when this happens: $ curl localhost:14269/metrics 2>/dev/null | grep __unknown
jaeger_collector_spans_received_total{debug="false",format="proto",svc="__unknown",transport="grpc"} 1
jaeger_collector_spans_saved_by_svc_total{debug="false",result="err",svc="__unknown"} 1
jaeger_collector_traces_received_total{debug="false",format="proto",sampler_type="unknown",svc="__unknown",transport="grpc"} 1
jaeger_collector_traces_saved_by_svc_total{debug="false",result="err",sampler_type="unknown",svc="__unknown"} 1 |
@pavolloffay , @yurishkuro , I believe this is now in line with that you suggested. Could you please take a look? |
Which problem is this PR solving?
__unknown
. Previously, such problems were not apparent via metrics.Before this PR, the collector would crash at different points (see #1722) depending on the storage being used. After this PR, the following message is shown in the logs:
And this counter is incremented: