bigquery: infer schema from protoc generated struct #1119
Comments
Out of curiosity, what version of protoc are you using, and are you using the latest protoc-gen-go?
Oh, sorry. I'm using proto3 with libprotoc 3.6.0 and the latest version of protoc-gen-go (1.2.0 I suppose, since
@pongad Any idea what's up with this?
Hi, any news on this?
Hi @simpajj - sorry, no news yet. We've had a bout of out of office activity these last two weeks. We're catching up on issues this week and next, and hope to get to this soon.
@jadekler Alright :) Thanks for the info!
Unfortunately, We could add special logic that says if the struct implements |
That would certainly help us out, if it is deemed as a good enough heuristic. I suppose another alternative would be for the golang protobuf compiler plugin to provide an option for skipping the generation of the |
In the future, the XXX fields will disappear entirely (golang/protobuf#276). Adding options to make them go away at the cost of reduced protobuf functionality is just a temporary hack.
Well, there are discussions for an "opaque" API that has no exported fields, where everything is accessed through getter/setter methods. Also, we plan on having official support for dynamic messages, which also would not have any exported fields. It seems to me that if you wanted to specially treat protobuf messages, you should go all the way and support them fully. With the upcoming v2 API, you can iterate over all the protobuf fields and derive your schema from that. This approach seems to be a lot more robust.
SGTM. We will wait for proto API v2, then use that to implement this feature.
We're aiming for a stable v2 release, hopefully at the end of Q1 of 2019.
Thanks. We eagerly await the new API and will wait on this issue until its release.
cc @shollyman
Looks like we're not there yet on the proto v2 work; https://github.com/golang/protobuf/issues is tracking a variety of blocking issues.
@shollyman @jadekler @jba Apologies for the broad ping, just want to check in on this issue! Now that the v2 API is out and the XXX fields are gone, it would be absolutely superb if the schema inference understood well-known-types (timestamps most of all) but possibly also the google.type messages from https://github.com/googleapis/googleapis. At my current shop we're using a hand-rolled protoc plugin to generate BigQuery conversion functions that correctly handle types like google.protobuf.Timestamp and google.type.Date. I believe if support for these types were added to the core BigQuery library, we could do away with this plugin altogether. From a wider ecosystem perspective, I think supporting the google.protobuf and google.type messages would also promote usage of protobuf in general, since long-term warehousing of protobuf messages in BigQuery would become next to trivial. I can only assume this is how it already works inside Google with Dremel? Curious to hear your thoughts here!
Update: My team have started putting together a BigQuery encoding for protobufs here, using the v2 reflection APIs: https://github.com/einride/protobuf-bigquery-go It's still under development and not complete, but we're already getting some mileage out of it. I do think that functionality like this could live in the main bigquery package (all the dependencies are already there, and I think reading/writing protobufs to BigQuery is a common use case). For example as a new Until then, we're happy to accept pull requests on the encoding above, to make it more complete and robust. |
Client
BigQuery
Describe Your Environment
macOS 10.13.6
Expected Behavior
bigquery.InferSchema() works for structs generated using protoc.
Actual Behavior
googleapi: Error 400: Field EventTime.XXX_NoUnkeyedLiteral is type RECORD but has no schema, invalid
The protobuf compiler generates the following field when generating a Go struct:
This field prevents me from inferring the BigQuery schema from the struct. It would be nice to be able to infer the schema from structs generated by the protobuf compiler. Otherwise I'd have to manually specify the schema, which means that I also need to keep it in sync with any changes made to the protobuf message.