Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove shaded content and restore dependency list to unshaded maven artifact #3935

Merged
merged 13 commits into from
Oct 14, 2022

Conversation

niloc132
Copy link
Contributor

Uses rules_jvm_external's java_export rule to mark a given java_library as an artifact for export to maven. This rule will find any dependencies that were originally fetched from maven (by looking at the maven_coordinates tags attached to their java_library), and exclude them from the output jar, plus also populate a pom.xml file template with those dependencies.

For consistency, renamed the "unshaded" deploy jar to "uberjar", so as to be clear that the dependencies have been merged, but not relocated.

Both shaded and unshaded were built and installed locally (via mvn install:install-file) and tested with the sample project at https://gist.github.com/niloc132/1244ad7b72ec951800c5c1c12bfe08bc to confirm that a stage1 build would first write typedast content, and then a stage2+3 would read them and finish the job.

Fixes #3896

@@ -12,6 +12,12 @@ http_archive(
url = "https://github.com/bazelbuild/rules_jvm_external/archive/%s.zip" % RULES_JVM_EXTERNAL_TAG,
)

load("@rules_jvm_external//:repositories.bzl", "rules_jvm_external_deps")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note for review: without these two load+invocations, bazel build //... no longer works, since the java_export rule declares some extra rules for deployment that we actually don't need. If there is a way to suppress those from existing, or if we aren't bothered that //... can't build, this can be removed.

main_class = "com.google.javascript.jscomp.CommandLineRunner",
runtime_deps = [":compiler_lib"],
)

# Produce an unshaded export jar based on :compiler_lib
load("@rules_jvm_external//:defs.bzl", "java_export")
version = "$(COMPILER_VERSION)" or "1.0-SNAPSHOT"
Copy link
Contributor Author

@niloc132 niloc132 Apr 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note for review: This was added for completeness, using the same variable name and default value as was found in bazel/sonatype_artifact_bundle.bzl. Strictly speaking, these coords have no effect, as long as only one java_export rule exists in the entire project.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You say "for completeness", but isn't this the jar that you wanted when you created this PR?
Isn't this the one that is expected to have only closure-compiler code and leave all of its dependencies out?

If not, which one is?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah sorry for lack of clarity on the self-review content: the "completeness" here was the redundant version that is also found in bazel/sonatype_artifact_bundle.bzl.

From what I can tell, in a "do everything the way that bazelbuild/rules_jvm_external would expect", this java_export (and matching ones for other artifacts) would be the way to create maven-consumable artifacts, and the sonatype_artifact_bazel file (and the .github/ci_support/deploy_sonatype_snapshot_bundles.js file) would probably go away or become a lot less complex.

I don't like duplicated logic (and constants), but couldn't find a clearer way to express this in bazel such that the version was shared between the two places.

@niloc132
Copy link
Contributor Author

Build failed with

Run mkdir -p bazel-bin-unsymlink
cp: cannot stat 'bazel-bin/compiler_unshaded_deploy.jar': No such file or directory

I'll push a commit to fix the paths that are being referenced to "unshaded" content.

@niloc132
Copy link
Contributor Author

Added a commit that replaces nearly every "compiler_unshaded_deploy.jar" reference with "compiler_shaded.jar", as they almost certainly all expect to find the entire classpath in a single jar, rather than an actual unshaded jar.

I did notice that in package.json, the build script is non-functional, it will try to call maven instead of bazel. That probably should be corrected, I can sneak it into this PR, or another followup?

@niloc132
Copy link
Contributor Author

 > google-closure-compiler-linux
yarn run v1.22.18
$ node ./test.js
google-closure-compiler-linux
  ✓ compiler binary exists
Exception in thread "main" java.lang.IllegalArgumentException: class com.google.javascript.jscomp.jarjar.org.kohsuke.args4j.spi.BooleanOptionHandler does not have the proper constructor
	at com.google.javascript.jscomp.jarjar.org.kohsuke.args4j.OptionHandlerRegistry.getConstructor(OptionHandlerRegistry.java:111)
	at com.google.javascript.jscomp.jarjar.org.kohsuke.args4j.OptionHandlerRegistry.access$000(OptionHandlerRegistry.java:43)
	at com.google.javascript.jscomp.jarjar.org.kohsuke.args4j.OptionHandlerRegistry$DefaultConstructorHandlerFactory.<init>(OptionHandlerRegistry.java:217)
	at com.google.javascript.jscomp.jarjar.org.kohsuke.args4j.OptionHandlerRegistry.registerHandler(OptionHandlerRegistry.java:138)
	at com.google.javascript.jscomp.jarjar.org.kohsuke.args4j.OptionHandlerRegistry.initHandlers(OptionHandlerRegistry.java:72)
	at com.google.javascript.jscomp.jarjar.org.kohsuke.args4j.OptionHandlerRegistry.<init>(OptionHandlerRegistry.java:67)
	at com.google.javascript.jscomp.jarjar.org.kohsuke.args4j.OptionHandlerRegistry.getRegistry(OptionHandlerRegistry.java:57)
	at com.google.javascript.jscomp.jarjar.org.kohsuke.args4j.CmdLineParser.createOptionHandler(CmdLineParser.java:178)
	at com.google.javascript.jscomp.jarjar.org.kohsuke.args4j.CmdLineParser.addOption(CmdLineParser.java:146)
	at com.google.javascript.jscomp.jarjar.org.kohsuke.args4j.ClassParser.parse(ClassParser.java:34)
	at com.google.javascript.jscomp.jarjar.org.kohsuke.args4j.CmdLineParser.<init>(CmdLineParser.java:96)
	at com.google.javascript.jscomp.jarjar.org.kohsuke.args4j.CmdLineParser.<init>(CmdLineParser.java:71)
	at com.google.javascript.jscomp.CommandLineRunner$Flags.<init>(CommandLineRunner.java:957)
	at com.google.javascript.jscomp.CommandLineRunner.<init>(CommandLineRunner.java:1455)
	at com.google.javascript.jscomp.CommandLineRunner.main(CommandLineRunner.java:2239)

From this latest failure and taking a quick look at the closure-compiler-npm repo, it appears that the graal build assumes an uberjar, but not shaded build. Does it seem likely that we would need three flavors, unshaded and shaded (as the terms meant prior to c9f3d32, and as they generally mean in the general java ecosystem), plus also either a single jar with the full runtime classpath contained within it, or an ordered list of jars to use, which we can somehow pass to the graal build?

My guess right now is that in google/closure-compiler-npm, the build-scripts/reflection-config.json needs to be updated to reflect the package to the shaded jars. I'm currently trying to set up the rest of the build so I can confirm that - the simplest fix seems to be to duplicate all org.kuhsuke.args4j.* packages for a short time, until this release is out, then delete the unshaded packages.

Alternatively, I'm still looking for how the closure-compiler-npm build handled the unshaded jars pre-c9f3d327, but still had the full classpath. My best guess at the moment is that the pom file change is something of a red herring, and that the npm output was exclusively using bazel output (see https://github.com/google/closure-compiler/blob/c9f3d327be6146a7d4de1f3d7c03feb36f49524a^/.github/workflows/ci.yaml#L133 for example, the commit before the pom change, which explicitly uses the "unshaded" deploy jar). If npm wants non-jarjar'd uberjars jars, I can update these scripts - specifically, this diff would use compiler_uberjar_deploy.jar where previously it used compiler_unshaded_deploy.jar.

@h-joo h-joo requested a review from brad4d April 20, 2022 16:30
@brad4d brad4d self-assigned this Apr 20, 2022
@niloc132
Copy link
Contributor Author

niloc132 commented May 5, 2022

Haven't heard back in a while, so I tried the simple replacement of compiler_uberjar_deploy.jar in for compiler_unshaded_deploy.jar that was used before this diff, since that seems to be what it actually wants. We'll see what CI says about this.

@brad4d
Copy link
Contributor

brad4d commented May 6, 2022

Sorry. This PR is on my to-do list, but other things keep pushing it down before I can get to it.

@brad4d
Copy link
Contributor

brad4d commented May 7, 2022

I've sent an internal change for review that will remove the bad "script" entries from package.json and the corresponding yarn commands from the README.md file.
Thanks for pointing those out.

@@ -147,3 +153,15 @@ http_archive(
load("@com_github_johnynek_bazel_jar_jar//:jar_jar.bzl", "jar_jar_repositories")

jar_jar_repositories()

# Declare protobuf-java from maven central, so the unshaded maven jar can recognize external dependencies
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we care about maven being able to find protobuf, but we don't care about guava, auto:value, etc?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do care about those - guava and autovalue are indeed picked up by the pom automatically, as we would like them to be. Either run the build and examine the generated pom, or check out the deployed one I mentioned at https://repo1.maven.org/maven2/com/vertispan/javascript/closure-compiler-unshaded/v20220502-1/closure-compiler-unshaded-v20220502-1.pom

The difference is where closure gets those dependencies from in the bazel build - the current WORKSPACE.bazel calls @google_bazel_common//:workspace_defs.bzl's google_common_workspace_rules (see https://github.com/google/bazel-common/blob/82a7dd0f/workspace_defs.bzl#L59-L191 for the version of the function currently in use), except here:

def google_common_workspace_rules():
    """Defines WORKSPACE rules for Google open-source libraries.
    Call this once at the top of your WORKSPACE file to load all of the repositories. Note that you
    should not refer to these repositories directly and instead prefer to use the targets defined in
    //third_party.
    """
    #...
    maven_import(
        group_id = "com.google.guava",
        artifact_id = "guava",
        version = "31.0.1-jre",
        licenses = ["notice"],
        sha256 = "d5be94d65e87bd219fb3193ad1517baa55a3b88fc91d21cf735826ab5af087b9",
    )
    #...
    maven_import(
        group_id = "com.google.auto.value",
        artifact_id = "auto-value",
        version = "1.6",
        licenses = ["notice"],
        sha256 = "fd811b92bb59ae8a4cf7eb9dedd208300f4ea2b6275d726e4df52d8334aaae9d",
    )

By virtue of using the maven_import defined earlier in that file (see line 55), a tag is set with the appropriate maven coords:

        tags = ["maven_coordinates=" + coordinates],

With that tag set, the java_export rule from @rules_jvm_external will correctly detect that these dependencies originally come from maven and emit dependency tags.

On the other hand, the same file also defines a rule for protobufs, which closure-compiler uses - actually it defines two (still a lie, it defines three, but two of the rules point to the same actual dependency), but only one is wired up with a maven coords tag:
https://github.com/google/bazel-common/blob/82a7dd0f/workspace_defs.bzl#L354-L368

    maven_import(
        group_id = "com.google.protobuf",
        artifact_id = "protobuf-java",
        version = "3.7.0",
        licenses = ["notice"],
        sha256 = "dc7f93e3a3dc2c11be5ba9672af3e26410f0a3289312dbf2260d4d8a0c711a51",
    )

    for protobuf_repo in ("com_google_protobuf", "com_google_protobuf_java"):
        http_archive(
            name = protobuf_repo,
            sha256 = "6b6bf5cd8d0cca442745c4c3c9f527c83ad6ef35a405f64db5215889ac779b42",
            strip_prefix = "protobuf-3.19.3",
            urls = ["https://github.com/protocolbuffers/protobuf/archive/v3.19.3.zip"],
        )

Since closure can't actually get by with such an old build of protobuf as 3.7.0, BUILD.bazel pointed to @com_google_protobuf (and uses "bazelbuild/rules_java" as @protobuf_java_rules to access the java_proto_library() def, plus "bazelbuild/rules_proto" as @protobuf_proto_rules for proto_library and co).

(If you recall, this was the third "I'm really not sure about this, and there may well be a better way to convince bazel to do this properly" that I put in the issue, and I still suspect this to be a bug in bazel_common.)

If this is a bug in bazel_common, it has been there for a while, but it seems deliberate that these are 12 minor versions apart, and that no maven coords are attached (my understanding of bazel conventions is that there should be an alias() or something pointing at protobuf_repo + "//:protobuf_java (which points at https://github.com/protocolbuffers/protobuf/blob/v3.19.3/BUILD#L829-L833 fwiw), and that this generated alias would in turn have the maven_coordinates tag.

Apologies if this is too much detail for the answer, I thought this was clear in the issue, so wanted to be unambiguous.

main_class = "com.google.javascript.jscomp.CommandLineRunner",
runtime_deps = [":compiler_lib"],
)

# Produce an unshaded export jar based on :compiler_lib
load("@rules_jvm_external//:defs.bzl", "java_export")
version = "$(COMPILER_VERSION)" or "1.0-SNAPSHOT"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You say "for completeness", but isn't this the jar that you wanted when you created this PR?
Isn't this the one that is expected to have only closure-compiler code and leave all of its dependencies out?

If not, which one is?

@brad4d
Copy link
Contributor

brad4d commented May 12, 2022

@ChadKillingsworth this change will need to be synchronized with a change to https://github.com/google/closure-compiler-npm since that repo builds and uses compiler_unshaded_deploy.jar.
I think it should work to have it just use compiler_uberjar_deploy.jar instead.

Do you have any concerns about that?

It looks like the uberjar just includes some additional .proto files and META-INF/maven/protobuf-java/pom.(xml|properties) files.

Here's a diff of the jar contents.

unshadedto_uberjar_diff.txt

@ChadKillingsworth
Copy link
Collaborator

Just now seeing this. I have no concerns as long as a build target exists to produce the full jar. There will need to be path updates to the build made to the npm repo, but those changes would be pretty trivial.

@brad4d
Copy link
Contributor

brad4d commented Oct 13, 2022

And I just now saw Chad's response. :)

I've imported this PR and I'm sending it through internal review and testing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

unshaded jar contains Apache Ant, GSON, etc.
3 participants