{math}[GCCcore/13.2.0] ArmComputeLibrary v23.08 #21309

Open
migueldiascosta wants to merge 4 commits into easybuilders:develop from migueldiascosta:20240904163603_new_pr_ArmComputeLibrary2308

@migueldiascosta
Member

@migueldiascosta migueldiascosta commented Sep 4, 2024

(created using eb --new-pr)

The motivation for this easyconfig was that on Arm (at least on a64fx, but this probably also applies to other Arm processors) a pip-installed PyTorch was several times faster than an easybuilt one. An analysis with perf showed that ACL was being used (along with a recent OpenBLAS with support for ARM_SVE, but that should be taken care of by using PyTorch with a more recent toolchain and OpenBLAS, e.g. #20489).

This is not the most recent version of ACL, but PyTorch 2.3 (the one in #20489) says that the maximum supported version is 23.08

Using this with PyTorch 2.3 requires setting USE_MKLDNN=ON, USE_MKLDNN_ACL=ON, USE_MKLDNN_CBLAS=ON, and a patch derived from Ryo-not-rio/oneDNN@ca60ff4 to the bundled oneDNN
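For illustration, the settings above could be collected into an environment prefix for the PyTorch build command. This is only a sketch: it assumes the PyTorch build picks these `USE_*` settings and `ACL_ROOT_DIR` (mentioned in a later comment in this thread) up from the environment, and the helper name `build_env_prefix` and the install path are hypothetical, not part of this PR.

```python
import shlex

# Flags quoted from the PR description above.
MKLDNN_ACL_FLAGS = {
    "USE_MKLDNN": "ON",
    "USE_MKLDNN_ACL": "ON",
    "USE_MKLDNN_CBLAS": "ON",
}


def build_env_prefix(acl_root, flags=None):
    """Render KEY=VALUE pairs that can be prefixed to a build command.

    acl_root is the ACL installation directory (hypothetical path in the
    example below); values are shell-quoted so paths with spaces survive.
    """
    env = {"ACL_ROOT_DIR": acl_root}
    env.update(MKLDNN_ACL_FLAGS if flags is None else flags)
    return " ".join(f"{key}={shlex.quote(value)}" for key, value in env.items())


# Hypothetical install location, for illustration only.
print(build_env_prefix("/path/to/ArmComputeLibrary"))
```

The patch to the bundled oneDNN is still needed on top of this; the sketch only covers the build switches.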

@migueldiascosta migueldiascosta marked this pull request as draft September 4, 2024 08:36

buildopts = "os=linux arch=armv8a build=native multi_isa=1 "
buildopts += "Werror=0 debug=0 neon=1 opencl=0 embed_kernels=0 "
buildopts += "fixed_format_kernels=1 openmp=1 cppthreads=0 "
Member Author

from https://github.com/pytorch/pytorch/blob/main/.ci/docker/common/install_acl.sh#L13-L16

in particular, arch=armv8a multi_isa=1 should be more generic without losing functionality/performance:

https://github.com/ARM-software/ComputeLibrary/blob/de7288cb71e6b9190f52e50a44ed68c309e4a041/docs/user_guide/library.dox#L567-L578

benchmarks on a64fx compared to arch=armv8.2-a-sve didn't show any difference
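To make the effective SCons options easier to inspect, here is a small sketch that splits the three `buildopts` lines quoted in this review excerpt into key/value pairs. The helper name `parse_scons_opts` is made up for illustration, and the full easyconfig may contain further `buildopts` lines that are not shown here.

```python
# The three buildopts lines quoted in the review excerpt above.
buildopts = "os=linux arch=armv8a build=native multi_isa=1 "
buildopts += "Werror=0 debug=0 neon=1 opencl=0 embed_kernels=0 "
buildopts += "fixed_format_kernels=1 openmp=1 cppthreads=0 "


def parse_scons_opts(opts):
    """Split a whitespace-separated 'key=value' string into a dict."""
    return dict(item.split("=", 1) for item in opts.split())


parsed = parse_scons_opts(buildopts)
# The two options discussed in this comment.
print(parsed["arch"], parsed["multi_isa"])
```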

@migueldiascosta
Member Author

Test report by @migueldiascosta
SUCCESS
Build succeeded for 1 out of 1 (1 easyconfigs in total)
cna0003.deucalion.macc.fccn.pt - Linux Rocky Linux 8.5, AArch64, UNKNOWN, Python 3.6.8
See https://gist.github.com/migueldiascosta/4f83eebbd8ba97bd57069d1d5a16be30 for a full test report.

@migueldiascosta migueldiascosta marked this pull request as ready for review September 11, 2024 03:59
@migueldiascosta migueldiascosta added this to the release after 4.9.3 milestone Sep 13, 2024
@boegel boegel modified the milestones: release after 4.9.4, release after 5.0.0 Mar 18, 2025
@Flamefire
Contributor

When we set $ACL_ROOT_DIR we do not need the part of the patch where the FindACL is changed, see easybuilders/easybuild-easyblocks#4096

@boegel
Member

boegel commented Mar 25, 2026

@migueldiascosta Is it worth still merging this now that we have a newer version merged? See

@Flamefire
Contributor

@boegel That one is for a different toolchain. Do you suggest to update the version for the toolchain used in this PR?

@migueldiascosta
Member Author

I'm ok with closing this PR, since we're not likely to enable ACL for PyTorch/2.3-foss-2023b (the one this was originally targeted at)

@Flamefire
Contributor

Flamefire commented Apr 2, 2026

Why not? Given the huge performance difference, I'd actually update all PyTorch easyconfigs to use imkl on x86 and ACL on Arm, maybe starting at 2023a, as 2022b is the oldest active one

As for versions, I'd use the PyPI PyTorch packages as reference

@migueldiascosta
Member Author

@Flamefire just thought we would likely not bother. let me fix the shared library extension in this PR then, same as in the merged one

@github-actions github-actions bot added update and removed new labels Apr 2, 2026
@github-actions

github-actions bot commented Apr 2, 2026

Updated software ArmComputeLibrary-23.08-GCCcore-13.2.0.eb

Diff against ArmComputeLibrary-25.02-GCCcore-13.3.0.eb

easybuild/easyconfigs/a/ArmComputeLibrary/ArmComputeLibrary-25.02-GCCcore-13.3.0.eb

diff --git a/easybuild/easyconfigs/a/ArmComputeLibrary/ArmComputeLibrary-25.02-GCCcore-13.3.0.eb b/easybuild/easyconfigs/a/ArmComputeLibrary/ArmComputeLibrary-23.08-GCCcore-13.2.0.eb
index 0c178b7c0b..8a0c0b24dd 100644
--- a/easybuild/easyconfigs/a/ArmComputeLibrary/ArmComputeLibrary-25.02-GCCcore-13.3.0.eb
+++ b/easybuild/easyconfigs/a/ArmComputeLibrary/ArmComputeLibrary-23.08-GCCcore-13.2.0.eb
@@ -1,21 +1,21 @@
 easyblock = 'SCons'
 
 name = 'ArmComputeLibrary'
-version = '25.02'
+version = '23.08'
 
 homepage = 'https://github.com/ARM-software/ComputeLibrary'
 description = """The Arm Compute Library is a collection of low-level machine learning functions optimized for
  Arm® Cortex®-A, Arm® Neoverse® and Arm® Mali™ GPUs architectures."""
 
-toolchain = {'name': 'GCCcore', 'version': '13.3.0'}
+toolchain = {'name': 'GCCcore', 'version': '13.2.0'}
 
 source_urls = ['https://github.com/ARM-software/ComputeLibrary/archive/refs/tags/']
 sources = ['v%(version)s.tar.gz']
-checksums = ['339376cd05b5efe83a3909333956d7663022f0dd8c7977a35e04b35551546be6']
+checksums = ['62f514a555409d4401e5250b290cdf8cf1676e4eb775e5bd61ea6a740a8ce24f']
 
 builddependencies = [
-    ('binutils', '2.42'),
-    ('SCons', '4.9.0'),
+    ('binutils', '2.40'),
+    ('SCons', '4.6.0'),
 ]
 
 prefix_arg = 'install_dir='

@Flamefire
Contributor

I expect that fewer tests rather than more will fail, so changing those ECs will be little work with high gain, which makes it worth going back as far as reasonably possible.
Are you going to create PRs for ACL in the other toolchains? I.e. 12.3, 14.2, 14.3. Then I'll open the PRs to add them to PyTorch

@migueldiascosta
Member Author

I can create those ACL PRs, yes. For PyTorch-2.1.2-foss-2023a.eb (GCCcore 12.3.0) though, not sure which ACL version to use, there was no .ci/docker/common/install_acl.sh on that version of PyTorch, any suggestions?

@migueldiascosta
Member Author

from https://github.com/pytorch/pytorch/blob/v2.1.2/cmake/public/ComputeLibrary.cmake looks like anything higher than ACL 21.02 should be ok, but probably safer to use exactly ACL 21.02 for PyTorch 2.1.2

@Flamefire
Contributor

I found a way: Extract the wheel and run strings torch.libs/libarm_compute-*.so | grep arm_compute_version

  • 2.3.0: arm_compute_version=v23.08
  • 2.1.2: arm_compute_version=v23.05.1
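The same check can be scripted without unpacking the wheel by hand. A minimal Python sketch, assuming the wheel bundles the library as `torch.libs/libarm_compute-*.so` and that the marker has the `arm_compute_version=vXX.YY` form shown in the list above; the function name `acl_versions_in_wheel` is made up for illustration:

```python
import fnmatch
import re
import zipfile

# Marker string as quoted in this thread, e.g. "arm_compute_version=v23.08".
VERSION_RE = re.compile(rb"arm_compute_version=(v\d+(?:\.\d+)*)")


def acl_versions_in_wheel(wheel_path):
    """Return the sorted ACL version strings found in a wheel's libarm_compute.

    A wheel is a zip archive, so the bundled shared library can be read
    directly and searched for the version marker, much like
    `strings ... | grep arm_compute_version` does.
    """
    found = set()
    with zipfile.ZipFile(wheel_path) as wheel:
        for name in wheel.namelist():
            if fnmatch.fnmatch(name, "torch.libs/libarm_compute-*.so"):
                data = wheel.read(name)
                for match in VERSION_RE.finditer(data):
                    found.add(match.group(1).decode("ascii"))
    return sorted(found)
```

Called on a downloaded aarch64 PyTorch wheel, this should return something like `['v23.08']` if the marker is present, and an empty list if the wheel does not bundle ACL under that path.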
