Conversation

@jakelandis
Contributor

This commit introduces a 3rd party plugin to co-locate
scala docs with java docs. https://github.com/lightbend/genjavadoc

Fixes: #1363

@jakelandis
Contributor Author

Use the following commands to test:

./gradlew elasticsearch-spark-20:javadoc
open spark/sql-20/build/docs/javadoc/index.html

./gradlew elasticsearch-spark-13:javadoc
open spark/sql-13/build/docs/javadoc/index.html

@mark-vieira
Contributor

mark-vieira commented Oct 15, 2019

While reading some of the documentation I gotta say this sounds pretty gnarly. What's the main motivation of this over just using scaladoc? Certainly there's awkwardness here, since we are presenting a Scala API as though it were a Java API. Not to mention we are adding to the already atrocious compile times of the Scala compiler by adding this Java source generation that is only ever used for JavaDoc generation but now runs on every compile.

This motivation also makes no sense to me:

This project’s goal is the creation of real Javadoc for Scala projects. While Scaladoc—the native API documentation format of Scala—has several benefits over Javadoc, Java programmers are very much used to having access to API documentation in a syntax matching their programming language of choice.

But, this isn't Java code, it's Scala. So why are they doing all this hackiness to make a Scala API more amenable to Java programmers? I mean, aren't the consumers of this API Scala developers, and thereby more inclined to want the more robust Scaladoc vs a watered-down and potentially inaccurate Javadoc representation of the API? Is the assumption here that while we are implementing the API in Scala, it can still technically be consumed by Java code thereby targeting a Java audience? Is this accurate in this case? Do we expect the consumers of this stuff to be primarily Java folks or Scala folks?

@mark-vieira
Contributor

mark-vieira commented Oct 15, 2019

To add, my comments are mainly rooted in my ignorance of the Scala ecosystem. If this is a common pattern for documentation of APIs implemented in Scala then by all means, I'll defer to the Scala experts here on whether this is a good idea. I simply found this method of JavaDoc generation to be a bit odd and want to ensure we have strong motivations for introducing this since it carries some cost.

}

javadoc {
    dependsOn scaladoc // ensures that scala compiles and scala docs are valid
Contributor


We don't need to run scaladoc to generate the javadocs. We just need to compile. This should depend on compileScala instead.

Contributor Author


Since nothing else actually runs the scala docs, I thought that generating them next to the java docs (even if not used) would serve as a good test. But it is unnecessary and I will change it.
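The change being agreed to here might be sketched as follows (Gradle Groovy; `compileScala` is the Scala plugin's standard compile task, and the generated-sources path is an assumption based on the directory listings shown in this thread):

```groovy
// Sketch: generate javadoc from the Java sources that the genjavadoc
// compiler plugin emits during Scala compilation. Depending on compileScala
// (rather than scaladoc) is enough, since the plugin runs as part of compilation.
javadoc {
    dependsOn compileScala
    // Hypothetical path; genjavadoc's output directory can be set via -P:genjavadoc:out=...
    source = files("${buildDir}/generated/java")
}
```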

@jakelandis
Contributor Author

I mean, aren't the consumers of this API Scala developers

Nope. Any consumer that wants to use this library with Spark will need to use this API. That could be Java or Scala devs. My guess is most are Java devs. If we think most are Scala devs, we should look at publishing the scala docs jar.

If this is a common pattern for documentation of APIs implemented in Scala then by all means

Not sure if this is a common pattern, but the library is from Lightbend (formerly known as Typesafe), which is well known.

@mark-vieira
Contributor

Nope. Any consumer that wants to use this library with Spark will need to use this API. That could be Java or Scala devs. My guess is most are Java devs. If we think most are Scala devs, we should look at publishing the scala docs jar.

Gotcha, understood. Yeah, in that case I would agree then that having the API in Javadoc format would be most accessible. I assume while you were testing this the compile-time cost wasn't appreciable? If it is, there are perhaps things we could do, like make the compiler plugin conditional on the javadoc task being part of the task graph.
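The conditional idea floated here could be sketched roughly like this (hypothetical; `scalaCompilerPlugin` is the configuration name used elsewhere in this PR, and the task-graph hook uses the standard Gradle API):

```groovy
// Sketch: only pass the genjavadoc plugin to scalac when a javadoc task is
// actually scheduled to run, so everyday compiles skip the extra generation work.
gradle.taskGraph.whenReady { graph ->
    if (graph.allTasks.any { it.name == 'javadoc' }) {
        tasks.withType(ScalaCompile) {
            scalaCompileOptions.additionalParameters =
                (scalaCompileOptions.additionalParameters ?: []) +
                ["-Xplugin:" + configurations.scalaCompilerPlugin.asPath]
        }
    }
}
```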

@jbaiera
Member

jbaiera commented Oct 17, 2019

Nope. Any consumer that wants to use this library with Spark will need to use this API. That could be Java or Scala devs. My guess is most are Java devs. If we think most are Scala devs, we should look at publishing the scala docs jar.

We should probably avoid making these assumptions without any data. My gut feeling is that it's split 40/40/20 across Scala, Python, and Java (respectively), but I don't have any data to back this up other than past issues I can think of.

While reading some of the documentation I gotta say this sounds pretty gnarly.

From what I'm reading, it "emits structurally equivalent Java code for all Scala sources of a project, keeping the Scaladoc comments (with a few format adaptions)." Is this structural Java code created alongside the regular compiler class file output, or is this structural Java code an intermediary step? I'm assuming it's the former, but I want us to be sure. Scala has some wacky language features in it that just do not translate well to structural Java code.

@jbaiera
Member

jbaiera commented Oct 17, 2019

Nope. Any consumer that wants to use this library with Spark will need to use this API. That could be Java or Scala devs.

Is the Java API documentation missing because it's nested under the Scala source? I want to make sure I'm understanding this clearly since I admittedly haven't spent much time looking into the Javadoc/Scaladoc problem for this part of the project.

@jakelandis
Contributor Author

I assume while you were testing this the compile-time cost wasn't appreciable?

The javadoc task doesn't get called until 'pack', which should only happen during the release, and doesn't add any appreciable time. I also checked compileScala in case the new params impacted it in a funny way, which they do not.

Is this structural Java code created alongside the regular compiler class file output, or is this structural Java code an intermediary step?

I'm not sure I completely follow the question, but the plugin puts the generated java code in a defined directory.

$ls spark/sql-20/build/classes
scala

$ls spark/sql-20/build/generated
java    sources

Just to be certain, I diffed the generated artifacts and the only difference is the addition of the scala javadoc. (the left is with the changes here)
[screenshot: side-by-side diff of the generated artifacts]

Is the Java API documentation missing because it's nested under the Scala source?

Where is the Java API documentation missing?

My gut feeling is that it's split 40/40/20 across Scala, Python, and Java (respectively)

We should probably consider adding the proper scala docs jar to the things that are published? However, I don't think this PR and that are mutually exclusive.

@jakelandis jakelandis added v7.6.0 and removed v7.5.0 labels Oct 18, 2019
@mark-vieira
Contributor

The javadoc task doesn't get called until 'pack', which should only happen during the release, and doesn't add any appreciable time. I also checked compileScala in case the new params impacted it in a funny way, which they do not.

Right, since this is implemented as a Scala compiler plugin, those Java source files are generated whenever the Scala source is compiled, regardless of whether we're actually generating JavaDocs or not, so that's the bit I would expect to have a cost. I suspect one of the most costly parts of the Scala compile process is building the AST, which needs to be done regardless of this generation step, so in a way it makes sense that it doesn't add much time to the build. 👍

@jbaiera
Member

jbaiera commented Oct 21, 2019

I'm not sure I completely follow the question, but the plugin puts the generated java code in a defined directory.

Ok cool, this was my major paranoia. I was mostly worried that the compiler plugin was taking the Scala source, turning it into Java source, and then converting the resulting Java source into bytecode/class files. But it sounds like the Scala source still compiles to bytecode and class files directly, with the added generation of some Java files for use in the javadoc process.

Member

@jbaiera jbaiera left a comment


One small nit, but it looks like an easy change. Thanks for diving into this!

"-Ywarn-numeric-widen",
"-Xfatal-warnings"
"-Xfatal-warnings",
"-Xplugin:" + configurations.scalaCompilerPlugin.asPath,
Member


I see that in the sql-13 build file we add these with tasks.withType(...) instead of appending here. I think I'm fine with either, but we should try to do the same in both files.
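For reference, the `tasks.withType(...)` form mentioned here might look roughly like this (a sketch only; the flag list is illustrative, taken from the options visible in this diff):

```groovy
// Sketch: apply the scalac options to every ScalaCompile task in one place,
// matching the style used in the sql-13 build file rather than appending inline.
tasks.withType(ScalaCompile) { ScalaCompile task ->
    task.scalaCompileOptions.additionalParameters = [
        "-Ywarn-numeric-widen",
        "-Xfatal-warnings",
        "-Xplugin:" + configurations.scalaCompilerPlugin.asPath
    ]
}
```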

Contributor Author


done 2e42e73

@jakelandis
Contributor Author

@jbaiera - should be ready for another look. thanks!

Member

@jbaiera jbaiera left a comment


LGTM

@jakelandis jakelandis merged commit f54c5e8 into elastic:master Oct 29, 2019
@jakelandis jakelandis deleted the add_scala_docs_to_java_docs branch October 29, 2019 16:43
jakelandis added a commit to jakelandis/elasticsearch-hadoop that referenced this pull request Oct 29, 2019
This commit introduces a 3rd party plugin to co-locate
scala docs with java docs. https://github.com/lightbend/genjavadoc

Fixes: elastic#1363
jakelandis added a commit that referenced this pull request Oct 29, 2019
This commit introduces a 3rd party plugin to co-locate
scala docs with java docs. https://github.com/lightbend/genjavadoc

Fixes: #1363
jakelandis added a commit that referenced this pull request Nov 25, 2019
#1368 introduces a plugin that documents scala apis
for java developers. This plugin does not support 2.10, and
this variant was missed in the initial testing.

This commit for 7.x conditionally uses this plugin based on the
scala major version. 2.10 support has been removed for 8.x.
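The 7.x conditional described in that follow-up commit might be sketched like this (property names and the plugin version are assumptions; genjavadoc is published under the `com.typesafe.genjavadoc` group with the full Scala version in the artifact name):

```groovy
// Sketch: skip genjavadoc entirely for Scala 2.10, which the plugin does not support.
if (scalaMajorVersion != '2.10') {
    dependencies {
        // Artifact name carries the full Scala version; the version number here is illustrative.
        scalaCompilerPlugin "com.typesafe.genjavadoc:genjavadoc-plugin_${scalaVersion}:0.13"
    }
    tasks.withType(ScalaCompile) {
        scalaCompileOptions.additionalParameters +=
            ["-Xplugin:" + configurations.scalaCompilerPlugin.asPath]
    }
}
```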


Successfully merging this pull request may close these issues.

Spark javadoc is incomplete

3 participants