[Bug]: hamcrest required in class path in 2.55.0 #30617

Closed
16 tasks
Abacn opened this issue Mar 12, 2024 · 9 comments · Fixed by #30635

Comments

@Abacn
Contributor

Abacn commented Mar 12, 2024

What happened?

Found during 2.55.0 RC1 release validation against the Dataflow templates.

The following used to work in Beam 2.54.0:

// Main.java
PipelineOptions options = PipelineOptionsFactory.fromArgs(argv).withValidation().create();

// build.gradle
dependencies {
    implementation(group: 'org.apache.beam', name: 'beam-sdks-java-io-google-cloud-platform', version: "$beam_version") {
        exclude group: "org.hamcrest", module: "hamcrest"
    }
    implementation(group: 'org.apache.beam', name: 'beam-sdks-java-core', version: "$beam_version") {
        exclude group: "org.hamcrest", module: "hamcrest"
    }
    implementation(group: 'org.apache.beam', name: 'beam-runners-direct-java', version: "$beam_version") {
        exclude group: "org.hamcrest", module: "hamcrest"
    }
}

However, in Beam 2.55.0 RC1, an exception is thrown (see https://github.com/GoogleCloudPlatform/DataflowTemplates/actions/runs/8251849699?pr=1361):

Exception in thread "main" java.lang.NoClassDefFoundError: org/hamcrest/Matcher
        ...
        at org.apache.beam.sdk.options.PipelineOptionsFactory.<clinit>(PipelineOptionsFactory.java:543)
        at com.github.abacn.MinimumStreaming.main(MinimumStreaming.java:19)
Caused by: java.lang.ClassNotFoundException: org.hamcrest.Matcher
        at java.net.URLClassLoader.findClass(URLClassLoader.java:387)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:419)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:352)
        ... 34 more

The interface that failed to be created is TestBigQueryOptions, which extends TestPipelineOptions. In TestPipelineOptions, some methods return org.hamcrest.Matcher:

@Default.InstanceFactory(AlwaysPassMatcherFactory.class)

It seems Beam actually always requires org.hamcrest to be present on the runtime class path. Indeed, another minimal test fails on both 2.54.0 and 2.55.0:

PipelineOptions options = PipelineOptionsFactory.fromArgs(argv).withValidation().as(TestPipelineOptions.class);
dependencies {
    implementation(group: 'org.apache.beam', name: 'beam-sdks-java-core', version: "$beam_version") {
        exclude group: "org.hamcrest", module: "hamcrest"
    }
    implementation(group: 'org.apache.beam', name: 'beam-runners-direct-java', version: "$beam_version") {
        exclude group: "org.hamcrest", module: "hamcrest"
    }
}
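
To confirm whether org.hamcrest.Matcher is visible at runtime before PipelineOptionsFactory is touched, a small probe like the one below may help. This is only a sketch: the class name HamcrestClasspathCheck and the helper isOnClasspath are made up for illustration and are not part of Beam.

```java
public class HamcrestClasspathCheck {

    /**
     * Returns true if the named class can be located by this class's loader.
     * The class is not initialized, so no static initializers run.
     */
    public static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, HamcrestClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // ClassNotFoundException: the class is absent.
            // LinkageError: the class is present but cannot be linked.
            return false;
        }
    }

    public static void main(String[] args) {
        String name = "org.hamcrest.Matcher";
        System.out.println(name + (isOnClasspath(name) ? " is" : " is NOT")
                + " on the runtime class path");
    }
}
```

Running this with the same runtime class path as the failing pipeline should show whether the hamcrest exclusion in build.gradle took effect.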

Issue Priority

Priority: 2 (default / most bugs should be filed as P2)

Issue Components

  • Component: Python SDK
  • Component: Java SDK
  • Component: Go SDK
  • Component: Typescript SDK
  • Component: IO connector
  • Component: Beam YAML
  • Component: Beam examples
  • Component: Beam playground
  • Component: Beam katas
  • Component: Website
  • Component: Spark Runner
  • Component: Flink Runner
  • Component: Samza Runner
  • Component: Twister2 Runner
  • Component: Hazelcast Jet Runner
  • Component: Google Cloud Dataflow Runner
@Abacn
Contributor Author

Abacn commented Mar 12, 2024

Found that the beam-sdks-java-harness jar contained org.hamcrest classes in 2.54.0 but no longer does in 2.55.0, possibly as a side effect of #29924.

However, a Beam artifact should not leak third-party classes that were not renamed, so the previous behavior should be considered a bug. The fix should be made on the template side.
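
One way to check a given jar for leaked, un-renamed hamcrest classes is to list its entries under org/hamcrest/. The sketch below uses only java.util.jar from the JDK; the class name JarPackageScan and the method classesUnder are hypothetical, and the jar path is whatever local copy of the artifact you want to inspect.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.stream.Collectors;

public class JarPackageScan {

    /** Lists the .class entries in the jar whose entry path starts with the given prefix. */
    public static List<String> classesUnder(Path jar, String prefix) throws IOException {
        try (JarFile jarFile = new JarFile(jar.toFile())) {
            // Collect inside the try block so the jar is still open while streaming.
            return jarFile.stream()
                    .map(JarEntry::getName)
                    .filter(name -> name.startsWith(prefix) && name.endsWith(".class"))
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path to the jar to inspect (supplied by the caller)
        List<String> leaked = classesUnder(Paths.get(args[0]), "org/hamcrest/");
        System.out.println(leaked.size() + " class entries found under org/hamcrest/");
        leaked.forEach(System.out::println);
    }
}
```

Running it against the harness jar from each release would be one way to compare what the two versions bundle.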

@Abacn
Contributor Author

Abacn commented Mar 12, 2024

This is also related to #25806. Some test fixtures that depend on hamcrest / junit currently live in the main scope.

@Abacn
Contributor Author

Abacn commented Mar 12, 2024

also related to #18336

@Abacn
Contributor Author

Abacn commented Mar 13, 2024

Here is a minimal example:

public static void main(String[] argv) {
    PipelineOptionsFactory.fromArgs(argv);
}
dependencies {
    implementation(group: 'org.apache.beam', name: 'beam-sdks-java-io-google-cloud-platform', version: "$beam_version")
}

which passes in 2.54.0 but fails in 2.55.0 RC1.

@Abacn
Contributor Author

Abacn commented Mar 13, 2024

This is because DataflowTemplate/v1 has a LocalSpannerIO under the same package name org.apache.beam.sdk.io.gcp.spanner. So bad

(commented on the wrong issue)

@Abacn Abacn added this to the 2.55.0 Release milestone Mar 13, 2024
@kennknowles
Member

The file being in the jar is wrong, but the dependency is not necessarily wrong. I think TestBigQuery is intended for users.

@kennknowles
Member

See #18593

@Abacn
Contributor Author

Abacn commented Mar 14, 2024

reopen for cherry pick

@Abacn
Contributor Author

Abacn commented Mar 15, 2024

Note: even though this fixes the minimal working example in #30617 (comment), the template still needs to remove the test-scope declaration for the org.hamcrest dependency (or remove the explicit declaration altogether). This is because a test-scoped declaration in Maven removes the transitive compile dependency from the uber jar; see https://stackoverflow.com/questions/75333105/maven-test-dependency-removes-transitive-compile-dependency-from-uberjar

It would therefore be good to add an item to CHANGES.md as a reminder.
