Java_tools tests don't pass on windows #12244

Closed
comius opened this issue Oct 9, 2020 · 30 comments
Assignees
Labels
area-Windows Windows-specific issues and feature requests team-OSS Issues for the Bazel OSS team: installation, release processBazel packaging, website type: bug

Comments

@comius
Contributor

comius commented Oct 9, 2020

Description of the problem:

The java_tools tests don't pass on Windows. They pass on Linux and macOS.
The first failing commit is e906f89 ("Update java_toolchain to properly declare config transitions").

Failing build: https://buildkite.com/bazel-trusted/java-tools-binaries-java/builds/146#3727aefe-9cca-4ffc-8aad-bd77a8841428

The failing test is test_build_hello_world_with_remote_embedded_tool_targets (bazel_java_test.sh:1493).
It builds a simple java_library and java_binary with the command:
bazel build //java/main:main_deploy.jar --define EXECUTOR=remote

The define makes the build compile singlejar from source instead of using a precompiled version.

Dependency chain:

//java/main:main_deploy.jar (c100cb1)
//java/main:main (c100cb1)
@bazel_tools//tools/jdk:current_java_toolchain (c100cb1)
@bazel_tools//tools/jdk:legacy_current_java_toolchain (c100cb1)
@bazel_tools//tools/jdk:remote_toolchain (c100cb1)
@remote_java_tools_linux//:toolchain (c100cb1)
@remote_java_tools_linux//:singlejar (68ac4b4)
@remote_java_tools_linux//:singlejar_cc_bin (68ac4b4)
@remote_java_tools_linux//:output_jar (68ac4b4)
@remote_java_tools_linux//:combiners (68ac4b4)

Emitted error:

ERROR: C:/tools/msys64/home/b/_bazel_b/7qk42fzu/execroot/io_bazel/_tmp/cb0c59f5da4da9f39b5558b460a48a99/root/7mljusu5/external/local_java_tools/BUILD:733:11: C++ compilation of rule '@local_java_tools//:combiners' failed (Exit 2): cl.exe failed: error executing command C:/Program Files (x86)/Microsoft Visual Studio/2019/BuildTools/VC/Tools/MSVC/14.27.29110/bin/HostX64/x64/cl.exe /nologo /DCOMPILER_MSVC /DNOMINMAX /D_WIN32_WINNT=0x0601 /D_CRT_SECURE_NO_DEPRECATE ... (remaining 27 argument(s) skipped)

--
  | bazel-out/x64_windows-opt-exec-2B5CBBC6/bin/external/local_java_tools/_virtual_includes/combiners\src/tools/singlejar/combiners.h(24): fatal error C1083: Cannot open include file: 'src/tools/singlejar/transient_bytes.h': No such file or directory
  | Target //java/main:main_deploy.jar failed to build
  | Use --verbose_failures to see the command lines of failed build steps.
  | INFO: Elapsed time: 2.046s, Critical Path: 1.01s
  | INFO: 49 processes: 48 internal, 1 worker.
  | FAILED: Build did NOT complete successfully
  | FAILED: Build did NOT complete successfully

src/tools/singlejar/transient_bytes.h is declared in the cc_library's hdrs and works on other OSes. The library uses strip_include_prefix. Possibly a problem with the directory separator?

@davido
Contributor

davido commented Oct 12, 2020

@comius @katre

Would it be an option to temporarily disable this test on Windows, so that it doesn't delay the new java_tools release and the switch to it by default in the upcoming Bazel release (3.8?), which would support the new and shiny Java 15 toolchain?

Let's face it: we don't really care about Windows, do we?

@katre
Member

katre commented Oct 13, 2020

How can I reproduce this? Where is java-tools-binaries-java coming from?

@comius
Contributor Author

comius commented Oct 13, 2020

Sorry, I don't have a short reproduction. The toolchain is effectively executing src/upload_all_java_tools.sh, but on Windows:

bazel build //src:java_tools_java11_zip
bazel test //src/test/shell/bazel:bazel_java_test_local_java_tools_jdk11 --define LOCAL_JAVA_TOOLS_ZIP_URL=file://./bazel-bin/src/java_tools_java11.zip

@philwo Do you have an easy way to test manually on Windows?

@katre
Member

katre commented Oct 13, 2020

That gets me started, thanks.

@katre
Member

katre commented Oct 13, 2020

I managed to follow the test file and run the same commands as test_build_hello_world_with_remote_embedded_tool_targets (which, judging from the command line, appears to be the failing test).

On the Windows testing VM, this passes: it builds singlejar and ijar and produces the requested deploy jar.

I'm not sure what differs between my playground environment and the actual test VM.

@katre
Member

katre commented Oct 13, 2020

I see: my version is using @remote_java_tools_windows, not @remote_java_tools_linux, so somewhere the remote execution is set up incorrectly.

@katre
Member

katre commented Oct 13, 2020

Although note that I checked with cquery, not query, so it's correctly using configured target semantics:

PS C:\Users\jcater\java_tools> C:\Users\jcater\bazel\bazel-bin\src\bazel.exe cquery 'somepath(//java/main:main_deploy.jar,@remote_java_tools_windows//:combiners)' --define=EXECUTOR=remote
WARNING: C:/users/jcater/_bazel_jcater/7hkznrqn/external/remote_java_tools_windows/BUILD:733:11: in hdrs attribute of cc_library rule @remote_java_tools_windows//:combiners: Artifact 'external/remote_java_tools_windows/java_tools/src/tools/singlejar/zip_headers.h' is duplicated (through '@remote_java_tools_windows//:transient_bytes' and '@remote_java_tools_windows//:zip_headers'). Since this rule was created by the macro 'cc_library', the error might have been caused by the macro implementation
INFO: Analyzed 2 targets (0 packages loaded, 5 targets configured).
INFO: Found 2 targets...
//java/main:main_deploy.jar (a37cf84)
//java/main:main (a37cf84)
@bazel_tools//tools/jdk:current_java_toolchain (a37cf84)
@bazel_tools//tools/jdk:legacy_current_java_toolchain (a37cf84)
@bazel_tools//tools/jdk:remote_toolchain (a37cf84)
@remote_java_tools_windows//:toolchain (a37cf84)
@remote_java_tools_windows//:singlejar (4521be5)
@remote_java_tools_windows//:singlejar_cc_bin (4521be5)
@remote_java_tools_windows//:output_jar (4521be5)
@remote_java_tools_windows//:combiners (4521be5)
INFO: Elapsed time: 0.318s
INFO: 0 processes.
INFO: Build completed successfully, 0 total actions
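
To compare this chain with the one in the original report (which goes through @remote_java_tools_linux), the cquery output can be reduced to the external repositories and configuration hashes it mentions. This is only a minimal sketch: it assumes each result line has the form "label (hash)" as in the output above, and the helper names parse_cquery_lines and external_repos are made up for illustration.

```python
import re

# Assumption: a cquery result line looks like "<label> (<configuration hash>)".
LINE_RE = re.compile(r"^(?P<label>(?:@[\w.\-]+)?//\S*)\s+\((?P<config>[0-9a-f]+)\)$")


def parse_cquery_lines(output):
    """Return (label, config) pairs, skipping INFO/WARNING and other lines."""
    pairs = []
    for line in output.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            pairs.append((match.group("label"), match.group("config")))
    return pairs


def external_repos(pairs):
    """Collect external repository names (the part between '@' and '//')."""
    return {label[1:label.index("//")] for label, _ in pairs if label.startswith("@")}


# A few lines taken from the output above.
SAMPLE = """\
INFO: Found 2 targets...
//java/main:main_deploy.jar (a37cf84)
@bazel_tools//tools/jdk:remote_toolchain (a37cf84)
@remote_java_tools_windows//:toolchain (a37cf84)
@remote_java_tools_windows//:combiners (4521be5)
INFO: Elapsed time: 0.318s
"""

pairs = parse_cquery_lines(SAMPLE)
print(sorted(external_repos(pairs)))
print(sorted({config for _, config in pairs}))
```

Running the same helper over both chains makes the linux/windows repository mismatch visible without reading the labels line by line.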

@katre
Member

katre commented Oct 14, 2020

Also, I can't run the //src/test/shell/bazel:bazel_java_test_local_java_tools_jdk11 test, because it's hitting an error:

==================== Test output for //src/test/shell/bazel:bazel_java_test_local_java_tools_jdk11:
WARNING: Arguments do not specify tests!
ls: cannot access 'C:/Program Files/Java/jdk*': No such file or directory
INFO[bazel_java_test_local_java_tools_jdk11 2020-10-14 14:02:08 (+0000)] bazel binary is at /c/users/jcater/bazel/src/test/shell/bin
INFO[bazel_java_test_local_java_tools_jdk11 2020-10-14 14:02:08 (+0000)] setting up client in C:/users/jcater/_bazel_jcater/vg5n45g3/execroot/io_bazel/_tmp/cb0c59f5da4da9f39b5558b460a48a99/workspace

Java integration tests

C:/users/jcater/bazel/src/test/shell/unittest.bash: line 587: perl: command not found
-- Test log: -----------------------------------------------------------
------------------------------------------------------------------------
FAILED: terminated because this command returned a non-zero status:
C:\Users\jcater\_bazel_jcater\vg5n45g3\execroot\io_bazel\bazel-out\x64_windows-fastbuild\bin\src\test\shell\bazel\bazel_java_test_local_java_tools_jdk11:1629: in call to main
================================================================================
Target //src/test/shell/bazel:bazel_java_test_local_java_tools_jdk11 up-to-date:
  bazel-bin/src/test/shell/bazel/bazel_java_test_local_java_tools_jdk11
  bazel-bin/src/test/shell/bazel/bazel_java_test_local_java_tools_jdk11.exe
INFO: Elapsed time: 6.500s, Critical Path: 5.55s
INFO: 2 processes: 1 internal, 1 local.
INFO: Build completed, 1 test FAILED, 2 total actions
//src/test/shell/bazel:bazel_java_test_local_java_tools_jdk11            FAILED in 5.5s
  C:/users/jcater/_bazel_jcater/vg5n45g3/execroot/io_bazel/bazel-out/x64_windows-fastbuild/testlogs/src/test/shell/bazel/bazel_java_test_local_java_tools_jdk11/test.log

@katre
Member

katre commented Oct 14, 2020

I think the perl error is from trying to write out the test XML and failing (see https://cs.opensource.google/bazel/bazel/+/master:src/test/shell/unittest.bash;drc=784385700a425632348e5bcad1b1555b35569da6;l=587).

@katre
Member

katre commented Oct 14, 2020

This is the standard Windows GCE VM image, and it does not appear to have perl, but I don't actually know where perl would be installed.

@comius
Contributor Author

comius commented Oct 14, 2020

PS: Did you notice there is one suspicious backslash in the original output? bazel-out/x64_windows-opt-exec-2B5CBBC6/bin/external/local_java_tools/_virtual_includes/combiners\src/tools/singlejar/combiners.h
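
One quick way to check whether the mixed separators change which file the path names is to normalize it with Python's pathlib, which treats both / and \ as separators for Windows paths. This is a minimal sketch for inspecting the string only. It says nothing about how cl.exe itself resolves the path, and the helper name normalized is made up for illustration.

```python
from pathlib import PureWindowsPath

# Path copied from the compiler error; note the single backslash
# between "combiners" and "src".
MIXED = ("bazel-out/x64_windows-opt-exec-2B5CBBC6/bin/external/local_java_tools/"
         "_virtual_includes/combiners\\src/tools/singlejar/combiners.h")


def normalized(path):
    """Rewrite a Windows path using forward slashes only."""
    return PureWindowsPath(path).as_posix()


# Compare the mixed form against the same path written with '/' throughout.
print(normalized(MIXED))
print(normalized(MIXED) == normalized(MIXED.replace("\\", "/")))
```

If both forms normalize to the same string, the backslash alone does not point at a different file.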

@katre
Member

katre commented Oct 14, 2020

That is definitely worrying, but if I can't reproduce the error, I have no way to figure out where it's coming from.

@katre
Member

katre commented Oct 14, 2020

@meteorcloudy has rebuilt the testing VM, so I'm trying again.

@katre
Member

katre commented Oct 14, 2020

I can now run the test and see the failure. I can't reproduce it outside of a test, which points to some kind of environment mismatch. I'll keep looking.

@katre
Member

katre commented Oct 14, 2020

The error I see is similar but not identical:

ERROR: C:/users/jcater/_bazel_jcater/vg5n45g3/execroot/io_bazel/_tmp/cb0c59f5da4da9f39b5558b460a48a99/root/bfofw67v/external/local_java_tools/BUILD:759:11: C++ compilation of rule '@local_java_tools//:mapped_file' failed: (Exit 2): cl.exe failed: error executing command C:/Program Files (x86)/Microsoft Visual Studio/2019/BuildTools/VC/Tools/MSVC/14.27.29110/bin/HostX64/x64/cl.exe /nologo /DCOMPILER_MSVC /DNOMINMAX /D_WIN32_WINNT=0x0601 /D_CRT_SECURE_NO_DEPRECATE ... (remaining 38 argument(s) skipped)
external/local_java_tools/java_tools/src/tools/singlejar/mapped_file.cc(18): fatal error C1083: Cannot open include file: 'src/tools/singlejar/mapped_file_windows.inc': No such file or directory
Target //java/main:main_deploy.jar failed to build
Use --verbose_failures to see the command lines of failed build steps.

@katre
Member

katre commented Oct 15, 2020

Okay, lots of debugging later, I think I've found the problem:

At d5d65ec (one commit before e906f89), the test passes, and the full path to mapped_file_windows.inc is C:/users/jcater/_bazel_jcater/vg5n45g3/execroot/io_bazel/_tmp/cb0c59f5da4da9f39b5558b460a48a99/root/bfofw67v/execroot/main/bazel-out/host/bin/external/local_java_tools/_virtual_includes/mapped_file/src/tools/singlejar/mapped_file_windows.inc (241 characters).

At e906f89, the test fails, and the full path is C:/users/jcater/_bazel_jcater/vg5n45g3/execroot/io_bazel/_tmp/cb0c59f5da4da9f39b5558b460a48a99/root/bfofw67v/execroot/main/bazel-out/x64_windows-opt-exec-2B5CBBC6/bin/external/local_java_tools/_virtual_includes/mapped_file/src/tools/singlejar/mapped_file_windows.inc (266 characters).

So the switch to exec transitions lengthens the path (host becomes x64_windows-opt-exec-2B5CBBC6, which is 25 characters longer), pushing it past Windows' MAX_PATH limit of 260 characters.
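
The length comparison can be reproduced with a few lines of Python. This is a minimal sketch, assuming the limit in play is the classic Win32 MAX_PATH of 260 characters (which includes the terminating NUL); the helper names header_path and fits_in_max_path are made up for illustration.

```python
# Assumption: the classic Win32 limit. MAX_PATH counts the terminating NUL,
# so the longest usable path is 259 characters.
MAX_PATH = 260

# Common prefix and suffix of the two paths quoted above.
TEST_ROOT = ("C:/users/jcater/_bazel_jcater/vg5n45g3/execroot/io_bazel/_tmp/"
             "cb0c59f5da4da9f39b5558b460a48a99/root/bfofw67v/execroot/main")
HEADER = ("external/local_java_tools/_virtual_includes/mapped_file/"
          "src/tools/singlejar/mapped_file_windows.inc")


def header_path(config_dir):
    """Build the full header path for a given bazel-out configuration directory."""
    return f"{TEST_ROOT}/bazel-out/{config_dir}/bin/{HEADER}"


def fits_in_max_path(path):
    """True if the path (plus its terminating NUL) fits in MAX_PATH."""
    return len(path) < MAX_PATH


for config_dir in ("host", "x64_windows-opt-exec-2B5CBBC6"):
    path = header_path(config_dir)
    print(config_dir, len(path), fits_in_max_path(path))
```

With the two configuration directories from this thread, the lengths come out as the 241 and 266 characters quoted above.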

One obvious solution is to keep TEST_TMPDIR out of C:/users/jcater/_bazel_jcater/vg5n45g3/execroot/io_bazel/_tmp and place it directly under C:/tmp instead. In fact, when I reset TEST_TMPDIR to C:/tmp in the test, it passes.
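
To get a feel for how much headroom the shorter TEST_TMPDIR buys, the failing path can be re-rooted under C:/tmp. This is only a rough sketch under two assumptions that the thread does not confirm: that the hash-named directory under _tmp is the TEST_TMPDIR, and that the layout below it stays the same after the move. The helper name relocate is made up for illustration.

```python
# Assumption: the hash-named directory under _tmp is the TEST_TMPDIR.
OLD_TMPDIR = ("C:/users/jcater/_bazel_jcater/vg5n45g3/execroot/io_bazel/_tmp/"
              "cb0c59f5da4da9f39b5558b460a48a99")
NEW_TMPDIR = "C:/tmp"

# The failing path from e906f89, expressed relative to OLD_TMPDIR.
FAILING_PATH = (OLD_TMPDIR + "/root/bfofw67v/execroot/main/bazel-out/"
                "x64_windows-opt-exec-2B5CBBC6/bin/external/local_java_tools/"
                "_virtual_includes/mapped_file/src/tools/singlejar/"
                "mapped_file_windows.inc")


def relocate(path, old_root, new_root):
    """Replace the old_root prefix of path with new_root."""
    if not path.startswith(old_root + "/"):
        raise ValueError(f"{path!r} is not under {old_root!r}")
    return new_root + path[len(old_root):]


relocated = relocate(FAILING_PATH, OLD_TMPDIR, NEW_TMPDIR)
print(len(FAILING_PATH), "->", len(relocated))
```

Under these assumptions the relocated path is well below the limit, which is consistent with the test passing once TEST_TMPDIR points at C:/tmp.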

I'm going to reassign this to @meteorcloudy to determine the right way to fix Bazel. This isn't actually a problem with e906f89; it just happens to be exposed there.

@katre katre assigned meteorcloudy and unassigned katre Oct 15, 2020
@katre katre added area-Windows Windows-specific issues and feature requests and removed team-Configurability platforms, toolchains, cquery, select(), config transitions labels Oct 15, 2020
@meteorcloudy
Member

Thanks for figuring this out!
So sad we are hit by the long path issue on Windows again 😞

@meteorcloudy
Member

https://github.com/bazelbuild/continuous-integration/blob/master/pipelines/java_tools-binaries.yml
I'm surprised the java_tools binaries pipeline doesn't use our normal pipeline setup, which means it doesn't get some of the workarounds for long path issues that we implemented in bazelci.py. FYI @philwo

I'll try to fix this problem by writing a normal yml file for this pipeline.

@comius
Contributor Author

comius commented Oct 15, 2020 via email

@meteorcloudy
Member

I looked into the pipeline setup for https://buildkite.com/bazel-trusted/java-tools-binaries-java
It basically runs the upload_all_java_tools.sh which does three things:

  • Build the java_tools
  • Run //src/test/shell/bazel:bazel_java_test_local_java_tools_jdk${java_version} with the java tools built in the previous step
  • Upload the java_tools to gs://bazel-mirror/bazel_java_tools

This setup is indeed very complicated and cannot easily be translated to a normal yaml config file, because the later steps depend on the output of the first one.
My first question is: do we really need to upload the java tools to bazel-mirror for every commit?
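
The dependency between the three steps listed above can be sketched as a plan of command lines. This is an illustration rather than the real upload_all_java_tools.sh: the build and test commands follow the ones quoted earlier in this thread, while the upload command and its destination object name are assumptions, and plan_steps is a made-up helper that only builds the argument lists without running anything.

```python
def plan_steps(java_version):
    """Return the three pipeline steps as argv lists, in execution order."""
    # Output of step 1; steps 2 and 3 both consume it.
    zip_path = f"bazel-bin/src/java_tools_java{java_version}.zip"

    build = ["bazel", "build", f"//src:java_tools_java{java_version}_zip"]
    test = [
        "bazel", "test",
        f"//src/test/shell/bazel:bazel_java_test_local_java_tools_jdk{java_version}",
        "--define", f"LOCAL_JAVA_TOOLS_ZIP_URL=file://./{zip_path}",
    ]
    # Assumption: the real upload command and object name are not shown here.
    upload = ["gsutil", "cp", zip_path, "gs://bazel-mirror/bazel_java_tools/"]
    return [build, test, upload]


for step in plan_steps("11"):
    print(" ".join(step))
```

Because the test and upload steps both reference the zip produced by the build step, they cannot be scheduled as independent jobs without first passing that artifact along.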

@comius
Contributor Author

comius commented Oct 16, 2020 via email

@meteorcloudy
Member

meteorcloudy commented Oct 16, 2020

Great, then I'll see how I can make the test work without the previous build step.

@comius
Contributor Author

comius commented Oct 19, 2020

Hey @meteorcloudy, any progress on this? Would it make sense for me to remove the test from upload_all_java_tools.sh while you configure it to run on the main branch? That way I could proceed with the release.

@meteorcloudy
Member

Yes, that sounds good. I may need more time to figure out how to run the test properly.

@meteorcloudy
Member

meteorcloudy commented Oct 19, 2020

I wonder whether we actually need to enable the //src/test/shell/bazel:bazel_java_test_local_java_tools_jdk11 test. We already have the following on the master branch:

//src/test/shell/bazel:bazel_java_test                                   PASSED in 120.4s
//src/test/shell/bazel:bazel_java_test_defaults                          PASSED in 30.8s
//src/test/shell/bazel:bazel_java_test_jdk11_prebuilt_toolchain_head     PASSED in 117.8s
//src/test/shell/bazel:bazel_java_test_jdk11_toolchain_head              PASSED in 127.9s
//src/test/shell/bazel:bazel_java_test_jdk14_prebuilt_toolchain_head     PASSED in 90.4s
//src/test/shell/bazel:bazel_java_test_jdk14_toolchain_head              PASSED in 140.1s
//src/test/shell/bazel:bazel_java_test_jdk15_prebuilt_toolchain_head     PASSED in 126.6s
//src/test/shell/bazel:bazel_java_test_jdk15_toolchain_head              PASSED in 126.9s

@meteorcloudy
Member

I don't actually get the purpose of this https://buildkite.com/bazel-trusted/java-tools-binaries-java pipeline.
The description says "Temporary pipeline for building java_tools binaries on all platforms".

Did we set this up because we planned to move the java tools to a separate repo? Is that still the plan? If not, maybe we can just remove this pipeline?

@comius
Contributor Author

comius commented Oct 19, 2020

Perhaps https://github.com/bazelbuild/java_tools/blob/master/docs/release.md helps to explain.

I don't know about the old plans. @philwo recently asked me the same questions.
I was thinking of splitting java_tools into a Java part, which is identical across operating systems, and a small native part (ijar, singlejar, local_jar), and then even pulling the Java part into Bazel if possible...

@meteorcloudy
Member

I see, in create_java_tools_release.sh, there is

gsutil -q cp -n "${gcs_bucket}/${rc_url}" "${gcs_bucket}/${release_artifact}"

So we do need this pipeline to upload artifacts for java tools, but I believe the test is covered by the other bazel_java_tests in the master pipeline. #12300 seems to be the right fix for this issue, so we don't need to rewrite the yaml config file.

bazel-io pushed a commit that referenced this issue Oct 19, 2020
…va_tools.sh…

The tests don't pass on Windows because an outdated CI setup runs into the Windows path limit, see #12244. Removing the tests to be able to make a release. Similar tests are already executed on the main branch.

Closes #12300.

PiperOrigin-RevId: 337851055
@jin jin added the team-OSS Issues for the Bazel OSS team: installation, release processBazel packaging, website label Oct 21, 2020
@philwo
Member

philwo commented Nov 26, 2020

What's the state of this? Now that the test is disabled, can we close this?

@meteorcloudy
Member

I think so.
