-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add gRPC plugin proto changes #1323
Conversation
@@ -0,0 +1,199 @@ | |||
syntax = "proto3"; | |||
|
|||
package jaeger.api_v2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unsure that this is correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think jaeger.storage.v1
rpc GetTrace(GetTraceRequest) returns (GetTraceResponse); | ||
rpc GetServices(GetServicesRequest) returns (GetServicesResponse); | ||
rpc GetOperations(GetOperationsRequest) returns (GetOperationsResponse); | ||
rpc FindTraces(FindTracesRequest) returns (stream Trace); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Returning a stream here solves the issue surfaced in the other PR where grpc message limits were being hit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hm, not sure it completely solves those issues unless we allow multiple Trace entries in the batch for the same trace. A single trace can still easily be larger than the msg limit.
Since this is a storage API, I would rather go with returning stream FindTracesResponseChunk
where each chunk contains []Span
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of curiosity what sort of size (and number of spans) would you expect to see the largest of traces reach? As traces contain more than just spans I think I'll have to give this a bit more thought.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we have seen traces in production with 100,000 spans. At average span size of around 500 bytes, that's 50M. Even 10k spans would add up to 5Mb.
I think eventually we'd want to make the chunk size configurable, but I would start with 1000 or even 100 spans.
@@ -0,0 +1,199 @@ | |||
syntax = "proto3"; | |||
|
|||
package jaeger.api_v2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think jaeger.storage.v1
|
||
package jaeger.api_v2; | ||
|
||
option go_package = "proto"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
perhaps proto -> api_v1, so that Go package is api_v1
, similar to model/proto/api_v2.proto
google.protobuf.Timestamp end_timestamp = 1 [ | ||
(gogoproto.stdtime) = true, | ||
(gogoproto.nullable) = false | ||
]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
did you use some auto-formatter that creates these indentations? in the model.proto we have it more readable
message Log {
google.protobuf.Timestamp timestamp = 1 [
(gogoproto.stdtime) = true,
(gogoproto.nullable) = false
];
repeated KeyValue fields = 2 [
(gogoproto.nullable) = false
];
}
rpc GetTrace(GetTraceRequest) returns (GetTraceResponse); | ||
rpc GetServices(GetServicesRequest) returns (GetServicesResponse); | ||
rpc GetOperations(GetOperationsRequest) returns (GetOperationsResponse); | ||
rpc FindTraces(FindTracesRequest) returns (stream Trace); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hm, not sure it completely solves those issues unless we allow multiple Trace entries in the batch for the same trace. A single trace can still easily be larger than the msg limit.
Since this is a storage API, I would rather go with returning stream FindTracesResponseChunk
where each chunk contains []Span
message FindTraceIDsResponse { | ||
oneof response { | ||
FindTraceIDsSuccess success = 1; | ||
GrpcStoragePluginError error = 2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
couldn't we instead use grpc Status? It already includes an error message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something to note here is that the gogo plugin of Status is go 1.11+ (https://github.com/gogo/status)
// rpc InsertProbabilitiesAndQPS(InsertProbabilitiesAndQPSRequest) returns (InsertProbabilitiesAndQPSResponse); | ||
// rpc GetThroughput(GetThroughputRequest) returns (GetThroughputResponse); | ||
// rpc GetProbabilitiesAndQPS(GetProbabilitiesAndQPSRequest) returns (GetProbabilitiesAndQPSResponse); | ||
// rpc GetLatestProbabilities(GetLatestProbabilitiesRequest) returns (GetLatestProbabilitiesResponse); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's remove this
rpc WriteSpan(WriteSpanRequest) returns (WriteSpanResponse); | ||
} | ||
|
||
service GrpcSpanReaderStoragePlugin { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure we need Grpc and StoragePlugin prefix/suffix, they will be obvious from the package name.
rpc FindTraceIDs(FindTraceIDsRequest) returns (FindTraceIDsResponse); | ||
} | ||
|
||
service GrpcDependenciesStoragePlugin { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any reason we don't want this split into reader and writer, similar to span storage?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Happy to do that. Do we consider dependency store writer as out of scope for this changeset?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can skip it for now.
to resolve Gopkg.lock conflict you may want to merge master, |
import "google/protobuf/timestamp.proto"; | ||
import "google/protobuf/duration.proto"; | ||
import "google/protobuf/empty.proto"; | ||
import "google/rpc/status.proto"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a go 1.11+ requirement (https://github.com/gogo/status)
message WriteSpanResponse { | ||
oneof response { | ||
google.protobuf.Empty success = 1; | ||
google.rpc.Status error = 2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, this is not what I meant. This Status object is transported by gRPC natively, we don't need to use it in our response payloads.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you saying to basically remove the *Response messages and simply return *Success from the rpc functions? I've just had a play doing this and it makes the jaeger plugin code that talks to the generated proto code much cleaner.
]; | ||
} | ||
|
||
message GetDependenciesSuccess { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
GetDependenciesResponse ? Here and in all other responses.
message FindTracesResponseChunk { | ||
repeated bytes spans = 1 [ | ||
(gogoproto.nullable) = false, | ||
(gogoproto.customtype) = "github.com/jaegertracing/jaeger/model.Span" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is the right dependency. The type should be Span from model.proto
, the fact that it is backed by some specific Go type is irrelevant. Otherwise how could one implement a plugin in Java?
option (gogoproto.goproto_registration) = true; | ||
|
||
message GetDependenciesRequest { | ||
google.protobuf.Timestamp end_timestamp = 1 [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor, but I think we should clean up the API and use start_time
and end_time
(don't need to call them timestamp
, it's just the type of the value, not the meaning of it).
|
||
service SpanWriterPlugin { | ||
// spanstore/Writer | ||
rpc WriteSpan(WriteSpanRequest) returns (google.protobuf.Empty); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's define an empty WriteSpanResponse instead of google.protobuf.Empty
, it is better for extensibility.
looks good overall |
please resolve conflicts, the DependencyLink type has been extended in 1.10 |
@@ -22,7 +22,7 @@ option (gogoproto.sizer_all) = true; | |||
option (gogoproto.goproto_registration) = true; | |||
|
|||
message GetDependenciesRequest { | |||
google.protobuf.Timestamp end_timestamp = 1 [ | |||
google.protobuf.Timestamp end_time = 1 [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I meant replace lookback
with start_time
(and swap the order). The Query Service API in the future will be converted to similar gRPC API, but it will need to work over JSON. It's less cognitive load to have two timestamp-formatted fields in JSON than to have a lookback in nanoseconds.
@@ -32,7 +32,7 @@ message GetDependenciesRequest { | |||
]; | |||
} | |||
|
|||
message GetDependenciesSuccess { | |||
message GetDependenciesResponse { | |||
repeated bytes dependencies = 1 [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DependencyLink?
rpc GetTrace(GetTraceRequest) returns (GetTraceSuccess); | ||
rpc GetServices(GetServicesRequest) returns (GetServicesSuccess); | ||
rpc GetOperations(GetOperationsRequest) returns (GetOperationsSuccess); | ||
rpc GetTrace(GetTraceRequest) returns (GetTraceResponse); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest implementing this similar to FindTraces, as a stream of chunks, because even a single trace could be larger than the default message limit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something that I'm unsure about and may need consideration is that having a Trace as a stream of Span means that the ProcessMap and Warnings fields are lost on the Trace.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's true, but these two concepts only apply to the current JSON API used by the UI, they don't apply to the storage API (even though the storage API returns model.Trace
, it never actually populates ProcessMap or warnings, that's done post-retrieval).
|
||
package jaeger.storage.v1; | ||
|
||
option go_package = "api_v1"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe storage_v1
would be better?
Makefile
Outdated
protoc \ | ||
$(PROTO_INCLUDES) \ | ||
-I plugin/storage/grpc/proto \ | ||
--gogo_out=plugins=grpc,$(PROTO_GOGO_MAPPINGS):$(PWD)/plugin/storage/grpc/proto \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer folder name to match the package, e.g. plugin/storage/grpc/proto/storage_v1
} | ||
|
||
message FindTraceIDsResponse { | ||
repeated bytes trace_id = 1 [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
trace_ids
model/proto/model.proto
Outdated
string child = 2; | ||
uint64 call_count = 3; | ||
string source = 4 [ | ||
(gogoproto.customtype) = "DependencyLinkSource", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought that this would be the best of both worlds re: the conversation above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well, if we do this, I would recommend adding a simple marshaling/unmarshaling test, similar to model/ids_test.go
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, not just for this, but also for importing model.Span
. I found it not particularly useful just adding gogoproto annotations without testing that serialization works as expected
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not following what you mean by "importing model.Span
". In the plugin proto file? I don't believe that we actually use gogoproto annotations for importing Span there?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean this:
message GetTraceRequest {
bytes trace_id = 1 [
(gogoproto.nullable) = false,
(gogoproto.customtype) = "github.com/jaegertracing/jaeger/model.TraceID",
(gogoproto.customname) = "TraceID"
];
}
From my previous experience with gogoproto, even though protoc
may compile it, you won't know if it will serialize correctly until you try it in unit tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and similar for spans
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've been looking at this every which way, and it just bothers me that we're introducing this complication of custom type without any benefits whatsoever. The field should be just a string
, so that an external tool (e.g. a service mesh) could submit graph links with another value of Source field. Having strong type in the model
doesn't do anything useful. The only thing we need is the constant JaegerDependencyLinkSource = "jaeger"
to use as default.
Signed-off-by: Charles Dixon <[email protected]>
Signed-off-by: Charles Dixon <[email protected]>
Signed-off-by: Charles Dixon <[email protected]>
Signed-off-by: Charles Dixon <[email protected]>
Signed-off-by: Charles Dixon <[email protected]>
Signed-off-by: Charles Dixon <[email protected]>
Signed-off-by: Charles Dixon <[email protected]>
Signed-off-by: Charles Dixon <[email protected]>
Signed-off-by: Charles Dixon <[email protected]>
Signed-off-by: Charles Dixon <[email protected]>
You probably don't need to rebase, just squash & merge master. One other option could be to squash only commits starting from the one where the sign-off was missed, e.g. |
I'm unsure why |
Are you using the same protocol? Did you run "make proto-install" locally before generating? |
Thanks, that was the step I was missing. The plugin proto generated files are generating without the license header until |
We avoid changing generated files, even with tools (eg https://github.com/jaegertracing/jaeger/blob/master/thrift-gen/jaeger/constants.go). We need to add them to exclusion lists in the Makefile so that go fmt and lint don't run on them. |
Signed-off-by: Charles Dixon <[email protected]>
Signed-off-by: Charles Dixon <[email protected]>
Signed-off-by: Charles Dixon <[email protected]>
Hi @yurishkuro before I fix this up by merging master in, are there any changes here that you're waiting on me for? |
Makefile
Outdated
$(PROTOC) \ | ||
$(PROTO_INCLUDES) \ | ||
-I plugin/storage/grpc/proto \ | ||
--gogo_out=plugins=grpc,$(PROTO_GOGO_MAPPINGS):$(PWD)/plugin/storage/grpc/proto/storage_v1 \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in order to avoid code coverage drop, we should place the generated files under proto-gen/storage_v1/
Makefile
Outdated
$(PROTO_INCLUDES) \ | ||
-I plugin/storage/grpc/proto \ | ||
--gogo_out=plugins=grpc,$(PROTO_GOGO_MAPPINGS):$(PWD)/plugin/storage/grpc/proto/storage_v1 \ | ||
plugin/storage/grpc/proto/storage.proto |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please fix indentation in these two lines to align with L395
Signed-off-by: Charles Dixon <[email protected]>
Signed-off-by: Charles Dixon <[email protected]>
Signed-off-by: Charles Dixon <[email protected]>
Signed-off-by: Charles Dixon <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🥇 🎉
Which problem is this PR solving?
Short description of the changes