
[torch] Adjust env vars used for builds with aotriton enabled. #1432

Merged
ScottTodd merged 1 commit into ROCm:main from ScottTodd:torch-windows-aotriton-env-vars
Sep 9, 2025

Conversation

@ScottTodd
Member

Motivation

Progress on #1040, getting closer to enabling aotriton in PyTorch on Windows.

Technical Details

This will supersede #1409 and is dependent on pytorch/pytorch#162330.

I believe the UTF-8 change helps with logging errors like the one below, which appear when copying files with Unicode characters in their names:

Message: '%s %s -> %s'
Arguments: ('copying', 'torch\\lib\\aotriton.images\\amd-gfx11xx\\flash\\bwd_kernel_dq\\FONLY__\uff0afp32@16_48_0_T_T_1___gfx11xx.aks2', 'build\\lib.win-amd64-cpython-312\\torch\\lib\\aotriton.images\\amd-gfx11xx\\flash\\bwd_kernel_dq')
--- Logging error ---
Traceback (most recent call last):
  File "C:\Users\Nod-Shark16\AppData\Local\Programs\Python\Python312\Lib\logging\__init__.py", line 1163, in emit
    stream.write(msg + self.terminator)
  File "C:\Users\Nod-Shark16\AppData\Local\Programs\Python\Python312\Lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeEncodeError: 'charmap' codec can't encode character '\uff0a' in position 73: character maps to <undefined>
Call stack:
  File "D:\b\pytorch_main\setup.py", line 1785, in <module>
    main()
  File "D:\b\pytorch_main\setup.py", line 1766, in main
    setup(
  File "D:\projects\TheRock\external-builds\pytorch\3.12.venv\Lib\site-packages\setuptools\__init__.py", line 117, in setup
    return distutils.core.setup(**attrs)
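
The failure in the log above can be reproduced in isolation. The following is a minimal sketch, not the change made in this PR: it only shows that the fullwidth asterisk (`\uff0a`) in the kernel file name has no mapping in `cp1252` but encodes fine as UTF-8. The helper name `can_encode` is made up for illustration. Python's UTF-8 mode (`PYTHONUTF8=1` or `-X utf8`) is one standard way to make the standard streams use UTF-8 on Windows; whether that is the exact variable this PR sets is an assumption.

```python
import sys

# File name taken from the log above; \uff0a is FULLWIDTH ASTERISK.
PROBLEM_NAME = "FONLY__\uff0afp32@16_48_0_T_T_1___gfx11xx.aks2"


def can_encode(text, encoding):
    """Return True if `text` can be encoded with `encoding` (hypothetical helper)."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    # The legacy Windows code page from the traceback cannot represent the name.
    print("cp1252:", can_encode(PROBLEM_NAME, "cp1252"))
    # UTF-8 can represent any Unicode code point.
    print("utf-8: ", can_encode(PROBLEM_NAME, "utf-8"))
    # Shows whether this interpreter was started in UTF-8 mode.
    print("utf8_mode:", sys.flags.utf8_mode)
    print("stdout encoding:", sys.stdout.encoding)
```

Running this with and without `PYTHONUTF8=1` should show the `utf8_mode` flag changing, which is a quick way to confirm what a build environment is using.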

Test Plan

Tested with local builds on Windows with and without --enable-pytorch-flash-attention-windows.

Test Result

Builds succeeded, ComfyUI generated images on my gfx1100 GPU (needed TORCH_ROCM_AOTRITON_ENABLE_EXPERIMENTAL=1 for aotriton on that GPU).
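
For anyone wanting to try the same thing, here is a rough sketch of opting in to the experimental path from Python and forcing the flash attention backend for one call. It assumes a PyTorch build recent enough to ship `torch.nn.attention.sdpa_kernel`; the function name `try_flash_attention` and the tensor shape are made up for illustration. Setting the variable before importing `torch` is a conservative choice, not something confirmed in this PR.

```python
import os

# Opt in to experimental aotriton support (needed on gfx1100 per the note above).
# Set before importing torch to be safe; the exact read time is an assumption.
os.environ["TORCH_ROCM_AOTRITON_ENABLE_EXPERIMENTAL"] = "1"


def try_flash_attention():
    """Run one SDPA call restricted to the flash backend.

    Returns the output shape, or None if torch, a GPU, or a flash kernel
    is unavailable. Hypothetical helper for illustration only.
    """
    try:
        import torch
        from torch.nn.attention import SDPBackend, sdpa_kernel
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None

    # ROCm devices are exposed through the "cuda" device type in PyTorch.
    q = torch.randn(1, 8, 128, 64, device="cuda", dtype=torch.float16)
    try:
        with sdpa_kernel(SDPBackend.FLASH_ATTENTION):
            out = torch.nn.functional.scaled_dot_product_attention(q, q, q)
    except RuntimeError:
        # Raised when no kernel is available for the requested backend.
        return None
    return tuple(out.shape)


if __name__ == "__main__":
    print("flash attention output shape:", try_flash_attention())
```

A `None` result means the flash backend could not be used in this environment; it does not say why, so the console warnings from PyTorch are still worth reading.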

Submission Checklist

@ScottTodd
Member Author

I think this is fine to merge without waiting for the upstream PR.

@ScottTodd ScottTodd merged commit 202084d into ROCm:main Sep 9, 2025
5 checks passed
@ScottTodd ScottTodd deleted the torch-windows-aotriton-env-vars branch September 9, 2025 18:00
@github-project-automation github-project-automation Bot moved this from TODO to Done in TheRock Triage Sep 9, 2025
@Nem404

Nem404 commented Sep 9, 2025

> I think this is fine to merge without waiting for the upstream PR.

I hope Jeff merges pytorch/pytorch#162330 before tomorrow's CI run. Can't wait to try AOTriton's perf boost.

@ScottTodd
Member Author

We'll still need to flip the --enable-pytorch-flash-attention-windows flag here for nightly releases to get aotriton:

    - name: Build PyTorch Wheels
      id: build-pytorch-wheels
      # Using 'cmd' here is load bearing! There are configuration issues when
      # run under 'bash': https://github.com/ROCm/TheRock/issues/827#issuecomment-3025858800
      shell: cmd
      run: |
        echo "Building PyTorch wheels for ${{ inputs.amdgpu_family }}"
        python ./external-builds/pytorch/build_prod_wheels.py ^
          build ^
          --install-rocm ^
          --index-url "${{ inputs.cloudfront_url }}/${{ inputs.amdgpu_family }}/" ^
          --pytorch-dir ${{ env.CHECKOUT_ROOT }}/torch ^
          --pytorch-audio-dir ${{ env.CHECKOUT_ROOT }}/audio ^
          --pytorch-vision-dir ${{ env.CHECKOUT_ROOT }}/vision ^
          --clean ^
          --output-dir ${{ env.PACKAGE_DIST_DIR }} ^
          ${{ env.optional_build_prod_arguments }}
        python ./build_tools/github_actions/write_torch_versions.py --dist-dir ${{ env.PACKAGE_DIST_DIR }}

I don't have any objections to at least trying that. I'll post a PR. Might not get around to testing / reviewing / etc. before tonight's release build, depending on when the pytorch PR is reviewed.

@Nem404

Nem404 commented Sep 9, 2025

> I don't have any objections to at least trying that. I'll post a PR. Might not get around to testing / reviewing / etc. before tonight's release build, depending on when the pytorch PR is reviewed.

Not a problem if it takes 1 or 2 more days. But up until now, aotriton was just out there, and no one knew when it would be included in the wheels. I can't believe we're this close to closing #1040 as solved :)

ScottTodd added a commit that referenced this pull request Sep 12, 2025
…#1437)

## Motivation

Fixes #1040, enabling aotriton for
flash attention in pytorch (if it works). This is expected to improve
performance in workloads like ComfyUI image generation by roughly 60%
(e.g. 12.6 it/s to 20.0 it/s).

## Technical Details

Follow-up to #1432 and depends on
pytorch/pytorch#162330.

Note that support is experimental for some GPUs like gfx1100, so the
`TORCH_ROCM_AOTRITON_ENABLE_EXPERIMENTAL=1` environment variable may be
needed to try aotriton on those systems.

## Test Plan

Trigger either
https://github.com/ROCm/TheRock/actions/workflows/build_windows_pytorch_wheels.yml
or
https://github.com/ROCm/TheRock/actions/workflows/release_windows_pytorch_wheels.yml
across the matrix of GPU families once that PyTorch PR is merged.

We're still going to need automated tests and documentation for this.
I'd like numerics tests running somewhere and documentation that shows
how to check which pytorch features are enabled in the wheels that a
user installs.
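
As a starting point for that kind of documentation, here is a minimal sketch of a feature report a user could run against an installed wheel. It is not an existing TheRock or PyTorch tool; `describe_torch_features` is a hypothetical name. Note that `flash_sdp_enabled()` reports whether the backend is allowed at runtime, not whether the wheel was built with it, so the sketch also probes `is_flash_attention_available()` when the installed PyTorch version provides it.

```python
def describe_torch_features():
    """Collect a small report on attention-related features of the installed torch.

    Hypothetical helper for illustration; returns a plain dict.
    """
    try:
        import torch
    except ImportError:
        return {"torch_installed": False}

    cuda_backend = torch.backends.cuda
    report = {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # Set on ROCm builds, None on CUDA builds.
        "hip_version": getattr(torch.version, "hip", None),
        "gpu_available": torch.cuda.is_available(),
        # Runtime toggles: whether these SDPA backends are allowed to be used.
        "flash_sdp_enabled": cuda_backend.flash_sdp_enabled(),
        "mem_efficient_sdp_enabled": cuda_backend.mem_efficient_sdp_enabled(),
    }

    # Build-time availability probe; guarded because older versions may lack it.
    probe = getattr(cuda_backend, "is_flash_attention_available", None)
    report["flash_attention_built"] = probe() if callable(probe) else None
    return report


if __name__ == "__main__":
    for key, value in describe_torch_features().items():
        print(f"{key}: {value}")
```

A report like this covers only what PyTorch exposes through its public Python API; confirming that aotriton kernels are actually selected would still need the numerics and performance tests mentioned above.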

## Test Result

Test runs:

* https://github.com/ROCm/TheRock/actions/runs/17660396787 using this
branch and `7.0.0rc20250908` for gfx110X-dgpu
* ~~https://github.com/ROCm/TheRock/actions/runs/17660456285 using the
branch and `7.0.0rc20250908` for gfx1151~~
* https://github.com/ROCm/TheRock/actions/runs/17662170140 using the
branch and `7.0.0rc20250908` for gfx1151
* Tests not running should be fixed with
#1469

(may need to retrigger to pick up fixes for flaky checkouts)

## Submission Checklist

- [x] Look over the contributing guidelines at
https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.

Labels

None yet

Projects

Status: Done


3 participants