-
Notifications
You must be signed in to change notification settings - Fork 2.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Build on Windows #4843
Build on Windows #4843
Conversation
Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). 📝 Please visit https://cla.developers.google.com/ to sign. Once you've signed (or fixed any issues), please reply here with What to do if you already signed the CLAIndividual signers
Corporate signers
ℹ️ Googlers: Go here for more info. |
@googlebot I signed it! |
This comment has been minimized.
This comment has been minimized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like a promising start, although it seems some things are missing?
I rebuilt it with my local copy of latest tensorflow with patch tensorflow/tensorflow#44731, and jax now imports. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice!
You say this allows JAX to build, which is a great start. Does it also allow some/all of the tests to pass?
build/build.py
Outdated
@@ -206,14 +208,40 @@ def check_bazel_version(bazel_path, min_version, max_version): | |||
build --define=no_ignite_support=true | |||
build --define=grpc_no_ares=true | |||
|
|||
build --define=with_xla_support=true | |||
build --action_env=TF_ENABLE_XLA=1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm surprised these two are needed, but I would believe you if you say they are...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They seems to be left over during playing with bazelrc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed
build/install_xla_in_source_tree.py
Outdated
r = runfiles.Create() | ||
|
||
def _is_windows(): | ||
return sys.platform.startswith("win32") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: To be consistent with the other Python in the project, use 2-space indentation.
build/install_xla_in_source_tree.py
Outdated
copy(r.Rlocation("org_tensorflow/tensorflow/compiler/xla/python/xla_extension.so")) | ||
patch_copy_xla_client_py() | ||
|
||
if _is_windows(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd just negate the if condition and skip the pass
case.
build/install_xla_in_source_tree.py
Outdated
def patch_copy_xla_client_py(dst_dir=os.path.join(args.target, "jaxlib")): | ||
with open(r.Rlocation("org_tensorflow/tensorflow/compiler/xla/python/xla_client.py")) as f: | ||
src = f.read() | ||
src = src.replace("from tensorflow.compiler.xla.python import xla_extension as _xla", "from . import xla_extension as _xla") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not for this PR, but I'm reminded that we can probably avoid this by just changing the code in the TF tree.
jax/random.py
Outdated
@@ -82,10 +82,10 @@ def PRNGKey(seed: int) -> jnp.ndarray: | |||
if isinstance(seed, (int, np.ndarray)): | |||
# Special handling of raw integer values, which may have be 64bit even | |||
# when jax_enable_x64=False and we don't want to drop the top 32 bits | |||
k1 = convert(np.bitwise_and(np.right_shift(seed, 32), 0xFFFFFFFF)) | |||
k1 = convert(np.bitwise_and(np.right_shift(seed, 32), np.uint32(0xFFFFFFFF))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of curiosity, what caused this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jakevdp as an FYI
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And I was curious why 0x7FFFFFFF is not 28bytes python object, then I found https://stackoverflow.com/a/10365639/2091555
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still not sure why this would differ between windows and linux/mac, however. That said, the change seems fine.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like great progress: I see from your test log that a majority of the tests pass!
I'd be happy to merge this as is, if you like (with the remaining items fixed) and to keep iterating in future PRs. Or you can keep iterating in this one. Up to you!
build/install_xla_in_source_tree.py
Outdated
|
||
|
||
def copy(src_file, dst_dir=os.path.join(args.target, "jaxlib")): | ||
# NOTE: this might be useful for debugging on Windows to inspect into the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd just remove this kind of debugging code. It's better not to leave it cluttering things up.
jax/random.py
Outdated
@@ -82,10 +82,10 @@ def PRNGKey(seed: int) -> jnp.ndarray: | |||
if isinstance(seed, (int, np.ndarray)): | |||
# Special handling of raw integer values, which may have be 64bit even | |||
# when jax_enable_x64=False and we don't want to drop the top 32 bits | |||
k1 = convert(np.bitwise_and(np.right_shift(seed, 32), 0xFFFFFFFF)) | |||
k1 = convert(np.bitwise_and(np.right_shift(seed, 32), np.uint32(0xFFFFFFFF))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still not sure why this would differ between windows and linux/mac, however. That said, the change seems fine.
Just a heads-up: I made an additional edit on top of your It appears we need both |
6292492
to
2b38e6a
Compare
with and 15 of them should be caused by missing tensorflow. |
This looks pretty promising. Should we merge it as is and work on the remaining issues in subsequent PRs? |
really excited to find this. Thanks a lot for your guy's work. Is there a quick manual on how to test the build on windows? |
I might add some building instruction in this PR and then I think we are good to go. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking pretty good! A couple of small edits.
docs/developer.rst
Outdated
Additional Notes for Building ``jaxlib`` from source on Windows | ||
............................................................... | ||
|
||
On Windows, following `Install Visual Studio <https://docs.microsoft.com/en-us/visualstudio/install/install-visual-studio?view=vs-2019>`_ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some minor grammar edits:
"On Windows, follow "
docs/developer.rst
Outdated
`CUDA Installation Guide <https://docs.nvidia.com/cuda/cuda-installation-guide-microsoft-windows/index.html>`_ | ||
to setup CUDA environment. | ||
|
||
Install `bazelisk <https://github.com/bazelbuild/bazelisk/releases/latest>`_. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually suspect this step is optional now the build.py
script will download bazel for you on Windows.
docs/developer.rst
Outdated
pacman -S patch realpath | ||
|
||
|
||
Once everything is installed. Open PowerShell, make sure you've added MSYS2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once everything is installed, open PowerShell, and make sure MSYS2 is in the path of the current session.
docs/developer.rst
Outdated
|
||
Once everything is installed. Open PowerShell, make sure you've added MSYS2 | ||
path to current session. Ensure ``bazel``, ``patch`` and ``realpath`` are | ||
accessible. Activate conda environment. Folllowing command build with cuda |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Activate the conda environment. The following command builds with CUDA enabled::
docs/developer.rst
Outdated
--cuda_compute_capabilities='6.1' ` | ||
--cuda_version='10.1' ` | ||
--cudnn_version='7.6.5' ` | ||
--bazel_options='--copt' ` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I would leave these out by default, and instead I would say something like "To build with debug information, add the flag: --bazel_options=--copt=/Z7
."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These version things seem to be needed currently. Blame bazel xD
docs/developer.rst
Outdated
enabled with PDB generated, adjust it to whatever suitable for you.:: | ||
Once everything is installed. Open PowerShell, and make sure MSYS2 is in the | ||
path of the current session. Ensure ``bazel``, ``patch`` and ``realpath`` are | ||
accessible. Activate the conda environment. The folllowing command build with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit:
"The following command builds with CUDA enabled; adjust it to whatever is suitable for you::"
(Typo in "following"; "builds" needs an "s", and we need a break before "adjust")
Please rebase on top of github head and squash your commits. |
Would you mind removing the |
1. Build on Windows 2. Fix OverflowError When calling `key = random.PRNGKey(0)` OverflowError: Python int too large to convert to C long for casting value 4294967295 (0xFFFFFFFF) from python int to int32. 3. fix file path in regex of errors_test 4. handle ValueError of os.path.commonpath
Thanks @cloudhan ! This is awesome work! |
FYI, #4962 added the |
It seems this comes from cudaV11.x not storing its version info in cuda.h, and thus bazel cannot locate it anymore. Still stranded by it |
@steveyang687 Same error here, did you find a fix for this? I also could not install realpath with MSYS2, the package does not seem to exist; there is also no reference to it at the MSYS2 package repository . Edit: you need to remove the accents from the version information, i.e. write: instead of: However, it will then (still) fail on the absence of the realpath package: |
|
Thanks, I installed coreutils from msys2 bash but I still get the same error though, probably because realpath still can't be found when executing the script. Any idea how I can make sure realpath is accessible? |
I tried building from source and after a while I am getting this error: I was using Bazel 3.7.2 but the same thing happens with Bazel 4.0.0. Any idea what might be happening? |
@elchorro That might be due to upstream tensorflow introduce some conflicting configurations in some corner cases. You might want to play around with |
I just build and install the CPU-only version since I dont use it for computation-intensive jobs lol :-( |
@steveyang687 @elephaint Is |
Hi, I tried to build the "vanilla" version but I got the following error. Please, give me some ideas. Thanks!
|
Hi, tried with all "--noenable" and got a different error. Any help?
|
I have a question about where is the appropriate environment to build JAXLIB. For my Windows 10 installation I could do it in:
|
After launching the compilation with:
at some point I get the following error:
Does anyone know how to fix it? Full build output:
|
I managed to fix the above error and it finally compiled. Basically the problem was that the compilation tries to create symbolic links, but in Windows a process is not allowed to do so unless it's running with administrative rights, or the Developer Mode is enabled (see https://docs.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development). Once I activated the Developer Mode and I restarted the shell the compilation process went ahead with no errors. In case of any other errors in Windows, make sure to run the compilation within a Hope it helps!
|
@illtellyoulater Thanks, I'll add that to the docs (#10068) |
In short: The solution to my problem was to uninstall MSYS2 including deleting the I got this error:
It was trying to use The command I used: I ran the command right in the Visual Studio Code terminal. I tried the whole An earlier problem I ran into was the (I wrote all this expecting it to work, but I ran into a bunch of other problems and I finally gave up. But hopefully this is useful.) The full log:
|
Here's another problem I solved. It ran into an infinite loop until I removed the log:
|
Fix #438, Enables building jax natively on Windows.