Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support protoc --proto_path #8

Closed
mmmdreg opened this issue May 4, 2015 · 22 comments
Closed

Support protoc --proto_path #8

mmmdreg opened this issue May 4, 2015 · 22 comments

Comments

@mmmdreg
Copy link

mmmdreg commented May 4, 2015

When you define protos importing definitions from another project, protoc --proto_path=blah allows you to depend on these without re-compiling them.

In your plugin, adding another srcDir will compile the dependent protos, resulting in duplicates in a multi module project. The maven proto plugin correctly adds dependent protos to the include path without compiling them, as does the Thomas lee gradle-protoc-plugin.

Would be great if you can include the ability to modify the include path only, without adding the included proto files as arguments requiring compilation. Whether this is done using simple filesets or dependency archives isn't an issue.

Thanks

@zhangkun83
Copy link
Collaborator

Such functionality is necessary. I am a little struggling on how it should be provided.

Conceptually it is a compile-time dependency for protoc compilation. The dependencies block sounds the proper place for it. However, the current plugin already allows you to specify dependencies for a source set, and the dependent protos will not only be included in --proto_path, but also be compiled. So it actually defines dependencies for javac compilation of the source set, which is also a valid functionality.

It would be ideal if dependencies allowed you to specify extra attributes for a dependency, but it doesn't seem possible. It looks the only option is to have something like:

sourceSets {
  main {
    proto {
      importDir 'path/to/dir'
    }
  }
}

@ejona86 @aantono any ideas?

@ejona86
Copy link
Collaborator

ejona86 commented May 8, 2015

In earlier discussion with protobuf team I suggested we primarily provide this functionality through deps in maven central, like JARs with the protos inside. I've not looked at the suggested config yet.

@zhangkun83
Copy link
Collaborator

In earlier discussion with protobuf team I suggested we primarily provide this functionality through deps in maven central, like JARs with the protos inside. I've not looked at the suggested config yet.

For this use case we actually want to compile the protos from the JARs, which is what the plugin is doing right now.

@mmmdreg can you point me to a project that would benefit from this feature? This would help me decide what the best solution is.

@mmmdreg
Copy link
Author

mmmdreg commented May 11, 2015

It's internal so I can't share specifics but a simplified form looks like:

SharedProject

  • src/main/java
  • src/main/proto

DependentProject1 (swing client)

  • src/main/java
  • src/main/proto

DependentProject2 (swing client)

  • src/main/java
  • src/main/proto

PackagedProject (Java applet)
-- creates package out of the above 3.

The protos in the dependent projects import some protos from the shared project.

If we were to prevent proto compilation on the shared project, the packaged project will have to deal with the copies compiled by both dependent projects. It's cleaner to simply allow import without compile.

Refactoring suggestions are welcome but the immediate goal is to achieve functionality similar to our maven build with minimal structural changes. The application is a bit legacy.

Maybe two types of configuration dependencies for the two cases could work. Also it would be nice to create dependencies on configurations of sibling projects without needing to package the protos into archives.

eg protobuf project(path : ':sharedProject', configuration: 'protos')

On 2015/05/09, at 7:22, Kun Zhang [email protected] wrote:

In earlier discussion with protobuf team I suggested we primarily provide this functionality through deps in maven central, like JARs with the protos inside. I've not looked at the suggested config yet.

For this use case we actually want to compile the protos from the JARs, which is what the plugin is doing right now.

@mmmdreg can you point me to a project that would benefit from this feature? This would help me decide what the best solution is.


Reply to this email directly or view it on GitHub.

@ejona86
Copy link
Collaborator

ejona86 commented May 12, 2015

For this use case we actually want to compile the protos from the JARs, which is what the plugin is doing right now.

But I'm saying we want the import from deps on Maven Central. Also, I didn't think the current compile from JAR supported building with JAR on Maven Central; does it?

@zhangkun83
Copy link
Collaborator

But I'm saying we want the import from deps on Maven Central. Also, I didn't think the current compile from JAR supported building with JAR on Maven Central; does it?

The dependencies API supports pulling from Maven Central. Not sure if the way the plugin uses the API pulls artifacts. If not, it should be trivial to make it do so.

@ejona86
Copy link
Collaborator

ejona86 commented May 12, 2015

So how do we see someone using importDir along with a JAR dependency?

@zhangkun83
Copy link
Collaborator

@mmmdreg ideally DependentProject1 and DependentProject2 would automatically include the proto source dirs from their depended projectSharedProject in their --proto_path. If that's feasible, you won't need an interface for it.

@mmmdreg
Copy link
Author

mmmdreg commented May 28, 2015

I think the issue was the build of the dependent projects is including the artifacts (ie compiled Java) from the shared project, so adding to the proto_path and compiling them again will mean duplicate classes as well as wasted effort. It's possible to exclude these duplicates but that seems unnecessary too.

Maven proto plugin by default adds proto definitions in any maven dependencies to the include path so the shared compiled protos compiled once and included as a normal dependency and the dependent protos are compiled based on the included shared protos without recompiling.

For now I will just use gradle-protoc-plugin but in any case it'll be nice if you can expose this protoc feature at some point.

@zhangkun83
Copy link
Collaborator

I thought adding the proto directories of the shared project to the dependent projects' --proto_path would allow you to compile the dependent projects' proto files, which may import proto files from the shared project, without re-compiling the proto files from the shared project. Is it the case?

@mmmdreg
Copy link
Author

mmmdreg commented May 28, 2015

Oh right, misread your post. That's exactly right. :)

@aantono
Copy link
Contributor

aantono commented Jul 9, 2015

One issue we have run into while figuring out how to share message definitions was the question of where should the Java code be generated, compiled and shared. The problem is that the generated Java code by protoc is reliant upon a specific version of a protoc compiler, which in turn expects that the generated code is compiled against a specific version of the protobuf-java library. But not only should it be compiled against that version it also MUST be the same version that is used at runtime, otherwise you can get Linking errors if you for example used protoc version 2.3 to generate the Java code, but at runtime you end up using version 2.5 of the library.

The general solution that we came up with was to NEVER share the generated Java classes, but instead always share the proto artifacts and each deployable service (not a library or api) should ALWAYS generate their own Java version of the protos and compile them against whatever version of supported protoc they wish.

For libraries and apis we recommend in general not to make signatures that depend on specific generated protoc Java classes, but instead define their own interface facades, so that the consumer end service, if they choose to, can provide a wrapper implementation that uses the underlying Proto object as a data source. This also has a nice benefit of having a place to define various business logic and other processing functions, since Protobuf is a data structure and not a rich object.

To simplify the coding effort and the code maintenance, we rely on the protoc code generator plugins to apply code templates that encapsulate the wrapper class code, so it all gets embedded into the generated Java class, i.e. in a money.proto we would define a Money message. The rich Java Money object would also have various behavior functions, like add, divide, convert, etc. So we have a Money.template that defines an implementation of api.Money interface which internally wraps the proto.Money protobuf message and uses its data during invocation of the various functional methods like add, etc. It also adds a toMoney() method, so you can easily get a reference to a rich instance by calling api.Money javaMoney = protobufMoney.toMoney(). The various libraries do not directly depend on proto.Money class, but instead all code to api.Money, and rely on the consuming services to provide the implementation.

Good news is that the code templates can also be packaged as artifacts and shared as build dependencies. :) So if it seems like a complex setup, it is only sounds like it from the explanation, but in reality is very simple and has very low cost of maintenance, while virtually preventing any case of Java Serialization or signature incompatible Linking Errors to throw a wrench into a gearbox. Since protos are really a communication data protocol, this allows for isolation of the wire data and in-JVM rich implementation. While we want to standardize and optimize the wire protocol, we are all about in-JVM flexibility of usage, so trying to have EVERY repo depend on exactly the same version of a specific library is way to tight of a coupling. This design also aids with being able to create easy test mocks, since the interfaces don't have to be Proto Wrappers, but can just be locally created mocks, etc. All-in-all, we praise this solution as a triple-win scenario.

Feel free to ask clarifying questions in case I've rambled through some things without clearly explaining the details, I would be happy to explain and provide more info.

@ejona86
Copy link
Collaborator

ejona86 commented Jul 9, 2015

I agree with the idea behind "never share the generated Java classes," and I've spent a lot of time educating others as to why it is required and why it sucks (but is the only short-term option).

However, protobuf-java will soon 1) be shipping .proto files for well-known protobufs and 2) include pre-compiled code in the runtime. We also have cases of multiple projects within a single application are wanting to use common protos and files that depend on those. Using --proto_path actually works fine in these scenarios and prevents multiple copies of proto-generated .class files without adding another layer of abstraction.

We are going to need to differentiate whether we generate code or not for "dependencies." If we are generating code for a "dependency" it isn't acting like a dependency as much as an extension of the source.

@aantono
Copy link
Contributor

aantono commented Jul 9, 2015

@ejona86, completely agree with the need to differentiate. IMHO it seems that if you declare something as protobuf configuration dependency in Gradle, it is indeed an external artifact dependency. The dependency does not always have to be in a form of .class files, it could be anything else like .xml or even .proto :)
If you want to simply adjust a --proto_path on a local file system, then it is a sourceSet configuration manipulation, so it should be handled by some kind of sourceSet or other config setting adjustment.
Another approach could be to have 2 different Gradle artifact configurations. One could be protobuf and another could be protobufInclude, so any external archives declared for protobufInclude will be extracted into a directory and automatically added to -I instead of direct protoc arguments.

Thoughts?

P.S.
In general I've always been a bit careful with using -I because that assumes that we are going to generate partially compilable code and expect that the rest of the protobuf Java classes will come from elsewhere. If we follow the recommendation of NEVER sharing generated Java classes, then where is that elsewhere would be???

@ejona86
Copy link
Collaborator

ejona86 commented Jul 9, 2015

@aantono, dependencies need not be external, and external artifacts need not be "dependencies." As we've already done for protoc itself, we can grab artifacts even if they aren't in the dependencies {} section.

Having a separate protobufInclude could work.

Concerning the P.S., this is why I brought up protobuf-java itself. It is going to become a common occurrence to use -I with it. For the built-in protos they are already including the .proto files in the protobuf-java JAR for use with protoc.

@aantono
Copy link
Contributor

aantono commented Jul 9, 2015

Does makes sense to support both (all 3) then... One thing we do have to do now is to copy the descriptor.proto from the RPM into the protos-data.tar since we heavily use options, etc, so need to have descriptor messages to compile against. :)

@ejona86
Copy link
Collaborator

ejona86 commented Jul 11, 2015

You can already see that descriptor.proto is in protobuf-java's JAR. This is the same case as other well-known types like any, empty, and wrappers, so it would be a "protobuf" dependency or a "protobufInclude" dependency (depending on how we design things).

It seems like we are on the same page as far as what features to support, but how we expose the features to the user isn't quite agreed.

@aantono
Copy link
Contributor

aantono commented Jul 12, 2015

Agreed. I think we just need to decide what would be the most appropriate and intuitive way to define the configuration that makes sense.

@zhangkun83
Copy link
Collaborator

You can already see that descriptor.proto is in protobuf-java's JAR. This is the same case as other well-known types like any, empty, and wrappers, so it would be a "protobuf" dependency or a "protobufInclude" dependency (depending on how we design things).

There is a third option -- we search in the compile dependency for .proto files, and include them in --proto_path automatically. The rationale is, if you use a library that publishes .proto files along with compiled generated classes (protobuf-java being an example), you will add it to compile dependency anyway, and it would be cumbersome if you have to add it to either protobuf or protobufInclude dependency too. This is also the idea from my previous comment.

@aantono
Copy link
Contributor

aantono commented Jul 17, 2015

This sounds good, but I have a fear that searching through every single dependency jar for included *.proto files might have a negative performance impact. It is not unusual to have multiple hundred dependency jars, so not sure how performant would that be. Plus that could result in a negative side-effect, where an undesired *.proto file contained in the jar would get pulled in, without any way to exclude it, if needed.

@ejona86 thoughts?

@ejona86
Copy link
Collaborator

ejona86 commented Jul 17, 2015

I had a bit of concern about searching through the JAR files as well, but really, that is already done during compilation. In addition, ZIP does not have a tree of directories; it just has a list of files (some with / in the name). So scanning through all JARs should be similar cost to looking for a file in the classpath.

Since this would only be used with -I, if we extract extra .protos they will simply be ignored.

Overall, sounds fine.

@zhangkun83
Copy link
Collaborator

This has converged with #15.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants