Bump PT 2025131 and ET pins 20250209 #1493

Jack-Khuu · 2025-02-11T00:01:11Z

ET Pin 2025-02-09: Bumping to https://hud.pytorch.org/pytorch/executorch/commit/791472d6706b027552f39f11b28d034e4839c9af

Bumping the PT pin to match the one used by the ET commit above: https://github.com/pytorch/executorch/blob/791472d6706b027552f39f11b28d034e4839c9af/install_requirements.py#L70

pytorch-bot · 2025-02-11T00:01:15Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchchat/1493

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure

As of commit 8625843 with merge base 53a1004:

NEW FAILURE - The following job has failed:

pull / test-torchao-aoti-experimental (macos-14-xlarge) (gh)
Process completed with exit code 134.

This comment was automatically generated by Dr. CI and updates every 15 minutes.
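
For context on the exit code above: POSIX-style shells report 128 plus the signal number when a process is killed by a signal, so 134 corresponds to signal 6 (SIGABRT). A minimal sketch of that arithmetic follows; the helper name `signal_from_exit` is hypothetical and not part of torchchat or the CI scripts.

```shell
# Hypothetical helper (not part of torchchat): map a shell exit status
# to the name of the signal that terminated the process, if any.
signal_from_exit() {
  code="$1"
  if [ "$code" -gt 128 ]; then
    # Exit statuses above 128 encode 128 + signal number.
    # `kill -l N` prints the name of signal N without the SIG prefix.
    kill -l "$((code - 128))"
  else
    echo "none"
  fi
}

# 134 - 128 = 6, which is SIGABRT on both Linux and macOS.
signal_from_exit 134
```

This matches the "Abort trap: 6" lines that appear in the job logs for the `aoti_run` step.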

Jack-Khuu · 2025-02-11T18:56:31Z

Failing on https://github.com/pytorch/torchchat/actions/runs/13253427250/job/37044260670?pr=1493

torchchat/.github/workflows/pull.yml

Lines 1119 to 1121 in fd04123

    
    echo "Export and run AOTI (C++ runner)"
    python torchchat.py export stories110M --output-aoti-package-path ./model.pt2 --dtype float32 --quantize '{"embedding:wx": {"bitwidth": 2, "groupsize": 32}, "linear:a8wxdq": {"bitwidth": 3, "groupsize": 128, "has_weight_zeros": false}}'
    ./cmake-out/aoti_run ./model.pt2 -z ./tokenizer.model -t 0 -i "${PRMT}"

~~Looks like this might be due to an AO mismatch error~~
~~Probably need both this and #1458~~

Edit: Unrelated

cc: @metascroy

Jack-Khuu · 2025-02-11T23:08:47Z

Tested locally, isolated from the AO changes; this suggests that #1458 is unrelated.

(Just bumping the PT pin causes the failure with the runner.)

Jack-Khuu · 2025-02-12T01:42:40Z

Error when using the AOTI runner with the linked torchao lib. Traces back to the change in how pytorch/pytorch detects OpenMP: pytorch/pytorch#145870 (cc: @malfet)

Without Brew install: https://github.com/pytorch/torchchat/actions/runs/13273334566/job/37057693025?pr=1493

dyld[8590]: Library not loaded: /opt/homebrew/opt/libomp/lib/libomp.dylib
  Referenced from: <E04D3A6F-A452-31EF-9520-27C6B4140221> /Users/runner/work/torchchat/torchchat/torchao-build/cmake-out/lib/libtorchao_ops_aten.dylib
  Reason: tried: '/opt/homebrew/opt/libomp/lib/libomp.dylib' (no such file), '/System/Volumes/Preboot/Cryptexes/OS/opt/homebrew/opt/libomp/lib/libomp.dylib' (no such file), '/opt/homebrew/opt/libomp/lib/libomp.dylib' (no such file)
/Users/runner/work/_temp/2d8a0077-1581-4972-9650-72fbb0b54b33.sh: line 6:  8590 Abort trap: 6           ./cmake-out/aoti_run ./model.pt2 -z ./tokenizer.model -t 0 -i "${PRMT}"

With Brew install: https://github.com/pytorch/torchchat/actions/runs/13275987426/job/37065581082?pr=1493

OMP: Error #15: Initializing libomp.dylib, but found libomp.dylib already initialized.
OMP: Hint This means that multiple copies of the OpenMP runtime have been linked into the program. That is dangerous, since it can degrade performance or cause incorrect results. The best thing to do is to ensure that only a single OpenMP runtime is linked into the process, e.g. by avoiding static linking of the OpenMP runtime in any library. As an unsafe, unsupported, undocumented workaround you can set the environment variable KMP_DUPLICATE_LIB_OK=TRUE to allow the program to continue to execute, but that may cause crashes or silently produce incorrect results. For more information, please see http://openmp.llvm.org/
/Users/runner/work/_temp/53841c20-106a-4387-9bec-5985e8418fa9.sh: line 6:  8899 Abort trap: 6           ./cmake-out/aoti_run ./model.pt2 -z ./tokenizer.model -t 0 -i "${PRMT}"

@swolchok I saw you had fun with this last week: pytorch/executorch#8098

Thoughts on how to unblock?
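
The second log says more than one OpenMP runtime ended up in the process. A minimal sketch for checking which OpenMP runtimes a given binary links against is shown here, assuming macOS's `otool -L` is available. The helper name `list_omp_runtimes` is hypothetical; it only filters text, so it can be fed any `otool -L` output.

```shell
# Hypothetical helper (not part of torchchat): read `otool -L` style output
# on stdin and print the distinct OpenMP runtime paths it mentions.
list_omp_runtimes() {
  # Each dependency line looks like:
  #   <tab>/path/to/lib.dylib (compatibility version X, current version Y)
  # so the first whitespace-separated field is the library path.
  awk '/libomp|libiomp|libgomp/ { print $1 }' | sort -u
}

# Usage on macOS, with paths taken from the logs above:
#   otool -L torchao-build/cmake-out/lib/libtorchao_ops_aten.dylib | list_omp_runtimes
#   otool -L cmake-out/aoti_run | list_omp_runtimes
```

If the two commands report different libomp paths (for example a Homebrew path and an `@rpath` entry), that would be consistent with the "multiple copies of the OpenMP runtime" hint, though it does not by itself show which library should change.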

malfet · 2025-02-12T01:44:56Z

> Error when using the AOTI runner with the linked torchao lib. Traces back to the change in how pytorch/pytorch detects OpenMP: pytorch/pytorch#145870 (cc: @malfet)
>
> Without Brew install: https://github.com/pytorch/torchchat/actions/runs/13273334566/job/37057693025?pr=1493
>
> dyld[8590]: Library not loaded: /opt/homebrew/opt/libomp/lib/libomp.dylib
>   Referenced from: <E04D3A6F-A452-31EF-9520-27C6B4140221> /Users/runner/work/torchchat/torchchat/torchao-build/cmake-out/lib/libtorchao_ops_aten.dylib
>   Reason: tried: '/opt/homebrew/opt/libomp/lib/libomp.dylib' (no such file), '/System/Volumes/Preboot/Cryptexes/OS/opt/homebrew/opt/libomp/lib/libomp.dylib' (no such file), '/opt/homebrew/opt/libomp/lib/libomp.dylib' (no such file)
> /Users/runner/work/_temp/2d8a0077-1581-4972-9650-72fbb0b54b33.sh: line 6:  8590 Abort trap: 6           ./cmake-out/aoti_run ./model.pt2 -z ./tokenizer.model -t 0 -i "${PRMT}"

[Edit] How is torchao built? I.e., why does it link itself against libomp? It should just borrow the dependency from Torch (where it's bundled as part of the nightlies; I just checked that's the case).
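
One way to check the bundling claim locally is to list the OpenMP runtime files that ship inside the installed torch package. This is a rough sketch: the helper name `find_omp_libs` is hypothetical, and the usage assumes that `torch` is importable and keeps its shared libraries in a `lib` directory next to `torch/__init__.py`.

```shell
# Hypothetical helper (not part of torchchat): list OpenMP runtime files
# found anywhere under the given directory.
find_omp_libs() {
  dir="$1"
  find "$dir" \( -name 'libomp*' -o -name 'libiomp*' -o -name 'libgomp*' \) | sort
}

# Usage (assumption: torch's shared libraries live in <torch package dir>/lib):
#   torch_lib="$(python -c 'import os, torch; print(os.path.join(os.path.dirname(torch.__file__), "lib"))')"
#   find_omp_libs "$torch_lib"
```

An empty result would suggest the installed build does not bundle an OpenMP runtime in that directory, in which case a dependent library would have to find one elsewhere.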

Jack-Khuu · 2025-02-12T07:54:28Z

Pointer to the CMake build of torchao:

torchchat/torchchat/utils/scripts/install_utils.sh

Lines 186 to 209 in fd04123

    
    install_torchao_aten_ops() {
      local device=${1:-cpu}
      if [[ "$device" == "cpu" ]]; then
        echo "Building torchao custom ops for ATen"
        pushd ${TORCHCHAT_ROOT}/torchao-build/src/ao/torchao/experimental
      elif [[ "$device" == "mps" ]]; then
        echo "Building torchao mps custom ops for ATen"
        pushd ${TORCHCHAT_ROOT}/torchao-build/src/ao/torchao/experimental/ops/mps
      else
        echo "Invalid argument: $device. Valid values are 'cpu' or 'mps'." >&2
        return 1
      fi
      CMAKE_OUT_DIR=${TORCHCHAT_ROOT}/torchao-build/cmake-out
      cmake -DCMAKE_PREFIX_PATH=${MY_CMAKE_PREFIX_PATH} \
        -DCMAKE_INSTALL_PREFIX=${CMAKE_OUT_DIR} \
        -DCMAKE_BUILD_TYPE="Release" \
        -S . \
        -B ${CMAKE_OUT_DIR} -G Ninja
      cmake --build  ${CMAKE_OUT_DIR} --target install --config Release
      popd
    }

> [Edit] How is torchao built? I.e., why does it link itself against libomp? It should just borrow the dependency from Torch (where it's bundled as part of the nightlies; I just checked that's the case).

I'm not familiar with the linking.
@metascroy Can you help answer? Jerry is out, but cc: @jcaip to loop in AO.

jcaip · 2025-02-12T20:06:35Z

cc @malfet I see that there's this line in Utils.cmake which is responsible for the custom linking.

https://github.com/pytorch/ao/blame/d3306b22b0e9cba09762c335757c1dcfbd96f170/torchao/experimental/Utils.cmake#L24

Maybe a noob question: should that be target_link_libraries(${target_name} PRIVATE "${TORCH_LIBRARIES}"), like line 21 above, to borrow the dependency from Torch?

Bump PT 2025131 and ET pins 20250209

072d5a8

Jack-Khuu requested review from shoumikhin, Gasoonjia and manuelcandales February 11, 2025 00:01

facebook-github-bot added the CLA Signed label Feb 11, 2025

shoumikhin approved these changes Feb 11, 2025

Update install instructions for et

df2fedf

Split up torchao test

3e884ab

Jack-Khuu added 2 commits February 11, 2025 16:15

Test installing libomp

ba5774d

Gate omp install

8625843
