Add support for Mac OSX on Apple Silicon #465

Merged Feb 2, 2024 (21 commits)
Changes from 15 commits
6 changes: 6 additions & 0 deletions doc/changelog.rst
@@ -25,6 +25,8 @@ Description
- Updated SmartSim's machine learning backends
- Added ONNX support for Python 3.10
- Added support for Python 3.11
- Added support for SmartSim with Torch on Apple Silicon


Detailed Notes

@@ -41,13 +43,17 @@ Detailed Notes
there is now an available ONNX wheel for use with Python 3.10, and wheels for
all of SmartSim's machine learning backends with Python 3.11.
(SmartSim-PR451_) (SmartSim-PR461_)
- SmartSim can now be built and used on platforms using Apple Silicon
(ARM64). Currently, only the pyTorch backend is supported. Note that libtorch
Member

Correct me if I'm wrong, but isn't the preferred capitalization "PyTorch"? At least I thought that was how they titled their releases.

Suggested change
- (ARM64). Currently, only the pyTorch backend is supported. Note that libtorch
+ (ARM64). Currently, only the PyTorch backend is supported. Note that libtorch

Member Author

You're totally correct

Member

Re-surfacing this pyTorch -> PyTorch

will be downloaded from a CrayLabs github repo. (SmartSim-PR465_)


.. _SmartSim-PR446: https://github.com/CrayLabs/SmartSim/pull/446
.. _SmartSim-PR448: https://github.com/CrayLabs/SmartSim/pull/448
.. _SmartSim-PR451: https://github.com/CrayLabs/SmartSim/pull/451
.. _SmartSim-PR453: https://github.com/CrayLabs/SmartSim/pull/453
.. _SmartSim-PR461: https://github.com/CrayLabs/SmartSim/pull/461
.. _SmartSim-PR465: https://github.com/CrayLabs/SmartSim/pull/465


0.6.0
118 changes: 87 additions & 31 deletions smartsim/_core/_install/builder.py
@@ -84,6 +84,7 @@ class BuildError(Exception):

class Architecture(enum.Enum):
    X64 = ("x86_64", "amd64")
    ARM64 = ("arm64",)

    @classmethod
    def from_str(cls, string: str, /) -> "Architecture":
@@ -366,6 +367,8 @@ class RedisAIBuilder(Builder):

    def __init__(
        self,
        _os: OperatingSystem = OperatingSystem.from_str(platform.system()),
        architecture: Architecture = Architecture.from_str(platform.machine()),
        build_env: t.Optional[t.Dict[str, t.Any]] = None,
        torch_dir: str = "",
        libtf_dir: str = "",
@@ -376,7 +379,14 @@ def __init__(
        verbose: bool = False,
    ) -> None:
        super().__init__(build_env or {}, jobs=jobs, verbose=verbose)

        self.rai_install_path: t.Optional[Path] = None
        if _os not in OperatingSystem:
            raise BuildError(f"Unsupported operating system: {_os}")
        self._os = _os
        if architecture not in Architecture:
            raise BuildError(f"Unsupported architecture: {architecture}")
        self._architecture = architecture

        # convert to int for RAI build script
        self._torch = build_torch
@@ -385,10 +395,21 @@ def __init__(
        self.libtf_dir = libtf_dir
        self.torch_dir = torch_dir

        # TODO: It might be worth making these constructor args so that users
        # of this class can configure exactly _what_ they are building.
        self._os = OperatingSystem.from_str(platform.system())
        self._architecture = Architecture.from_str(platform.machine())
        # Sanity checks
        self._check_backends_arm64()

    def _check_backends_arm64(self) -> None:
        if self._architecture == Architecture.ARM64:
            unsupported = []
            if self.build_tf:
                unsupported.append("Tensorflow")
            if self.build_onnx:
                unsupported.append("ONNX")
            if unsupported:
                raise BuildError(
                    f"The {', '.join(unsupported)} backends are not "
                    "supported on ARM64. Run with `smart build --no_tf`"
                )
Member

I like the decision to try and move as much of the validation of dependencies into the constructor to error out early!

But I don't necessarily love the existence of this method. It's going to become very difficult to maintain if we were to expand out into different architectures w/o full dependency support and/or we want to become more prescriptive in our builds (e.g. different versions of libraries for different architectures, etc.)

What would you think of, instead, expanding the _RAIBuildDependency interface so that it included an __is_satisfiable__ method:

class _RAIBuildDependency(ABC):
    ...

    @abstractmethod
    def __is_satisfiable__(self) -> bool: ...

which can then be implemented by all dependencies that implement the interface:

class _TFArchive(_WebTGZ, _RAIBuildDependency):
    ...

    def __is_satisfiable__(self) -> bool:
        try:
            self.url
        except BuildError:
            return False
        else:
            return True

# And presumably very similar implementations 
# for `_PTArchiveLinux`, `_PTArchiveMacOSX`, and `_ORTArchive`

and then checking for dependency satisfiability in RedisAIBuilder.__init__ with a simple:

class RedisAIBuilder(Builder):
    ...

    def __init__(self, ...) -> None:
        ...
        if not all(
            dep.__is_satisfiable__() for dep
            in sequence_of_deps_that_the_rai_builder_needs
        ):
            raise BuildError(f"{type(self).__name__} failed to satisfy "
                              "dependencies for this platform")
            # We can workshop the error message if we want to be a bit
            # more prescriptive, but you get the idea!
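
For readers who want to run the idea, here is a minimal, self-contained sketch of the satisfiability check proposed above. It does not use SmartSim's real classes: `Dependency`, `FakeArchive`, `check_dependencies`, and the `example.invalid` URL are hypothetical stand-ins, and the method is named `is_satisfiable` rather than the dunder form in the comment.

```python
from abc import ABC, abstractmethod
import typing as t


class BuildError(Exception):
    """Stand-in for SmartSim's BuildError (assumption for this sketch)."""


class Dependency(ABC):
    """Hypothetical, simplified analogue of _RAIBuildDependency."""

    @property
    @abstractmethod
    def url(self) -> str:
        """Download URL; raises BuildError if none exists for this platform."""

    def is_satisfiable(self) -> bool:
        # Resolving the URL is the step that fails on an unsupported platform.
        try:
            self.url
        except BuildError:
            return False
        return True


class FakeArchive(Dependency):
    """Illustrative dependency that only ships an x86_64 archive."""

    def __init__(self, arch: str) -> None:
        self.arch = arch

    @property
    def url(self) -> str:
        if self.arch != "x86_64":
            raise BuildError(f"No archive available for {self.arch}")
        return f"https://example.invalid/archive-{self.arch}.tgz"


def check_dependencies(deps: t.Sequence[Dependency]) -> None:
    """Raise early if any dependency cannot be satisfied."""
    if not all(dep.is_satisfiable() for dep in deps):
        raise BuildError("Failed to satisfy dependencies for this platform")
```

Calling `check_dependencies([FakeArchive("arm64")])` raises `BuildError` before any download is attempted, which is the early-failure behavior the comment is after.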

Member

Per conversation yesterday, this solution is not viable as we do not have all the information necessary at __init__ time to construct the dependencies (e.g. we do not know if we are building for GPU), which should definitely be changed, but is also definitely not in the scope of this PR.

Switching to having every _RAIBuildDependency expose a static method that returns the platform/architecture combinations it can be built for. This moves the platform checks into each dependency, which will be at least slightly more maintainable until we can move more build details into RedisAIBuilder.__init__.
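
A rough sketch of what that static-method approach could look like, assuming simplified stand-ins for the real types. `TorchArchive`, `TFArchive`, `supported_platforms`, and `unsupported_on` are hypothetical names, and the platform sets only mirror what this PR describes (PyTorch on all three tested platforms, TensorFlow not on ARM64).

```python
import enum
import typing as t


class OperatingSystem(enum.Enum):
    LINUX = "linux"
    DARWIN = "darwin"


class Architecture(enum.Enum):
    X64 = "x86_64"
    ARM64 = "arm64"


Platform = t.Tuple[OperatingSystem, Architecture]


class TorchArchive:
    """Hypothetical dependency: buildable on every platform in this PR."""

    @staticmethod
    def supported_platforms() -> t.FrozenSet[Platform]:
        return frozenset(
            {
                (OperatingSystem.LINUX, Architecture.X64),
                (OperatingSystem.DARWIN, Architecture.X64),
                (OperatingSystem.DARWIN, Architecture.ARM64),
            }
        )


class TFArchive:
    """Hypothetical dependency: no ARM64 support."""

    @staticmethod
    def supported_platforms() -> t.FrozenSet[Platform]:
        return frozenset(
            {
                (OperatingSystem.LINUX, Architecture.X64),
                (OperatingSystem.DARWIN, Architecture.X64),
            }
        )


def unsupported_on(platform_: Platform, deps: t.Iterable[type]) -> t.List[str]:
    """Return the names of dependencies that cannot be built for platform_."""
    return [
        dep.__name__ for dep in deps if platform_ not in dep.supported_platforms()
    ]
```

Because the method is static, the builder can ask each dependency class about a platform without constructing it, so the check does not need details such as the target device that are unknown at `__init__` time.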


    @property
    def rai_build_path(self) -> Path:
@@ -436,6 +457,8 @@ def fail_to_format(reason: str) -> BuildError: # pragma: no cover
            raise fail_to_format(f"Unknown operating system: {self._os}")
        if self._architecture == Architecture.X64:
            arch = "x64"
        elif self._architecture == Architecture.ARM64:
            arch = "arm64v8"
        else:  # pragma: no cover
            raise fail_to_format(f"Unknown architecture: {self._architecture}")
        return self.rai_build_path / f"deps/{os_}-{arch}-{device}"
@@ -450,13 +473,17 @@ def _get_deps_to_fetch_for(
        # dependency versions were declared in single location.
        # Unfortunately importing into this module is non-trivial as it
        # is used as script in the SmartSim `setup.py`.
        fetchable_deps: t.Sequence[t.Tuple[bool, _RAIBuildDependency]] = (
            (True, _DLPackRepository("v0.5_RAI")),
            (self.fetch_torch, _PTArchive(os_, device, "2.0.1")),
            (self.fetch_tf, _TFArchive(os_, arch, device, "2.13.1")),
            (self.fetch_onnx, _ORTArchive(os_, device, "1.16.3")),
        )
        return tuple(dep for should_fetch, dep in fetchable_deps if should_fetch)

        # DLPack is always required
        fetchable_deps: t.List[_RAIBuildDependency] = [_DLPackRepository("v0.5_RAI")]
        if self.fetch_torch:
            fetchable_deps.append(choose_pt_variant(os_, arch, device, "2.0.1"))
        if self.fetch_tf:
            fetchable_deps.append(_TFArchive(os_, arch, device, "2.13.1"))
        if self.fetch_onnx:
            fetchable_deps.append(_ORTArchive(os_, device, "1.16.3"))

        return tuple(fetchable_deps)

    def symlink_libtf(self, device: str) -> None:
        """Add symbolic link to available libtensorflow in RedisAI deps.
@@ -756,31 +783,12 @@ def _extract_download(
            zip_file.extractall(target)


@t.final
@dataclass(frozen=True)
class _PTArchive(_WebZip, _RAIBuildDependency):
    os_: OperatingSystem
    architecture: Architecture
    device: TDeviceStr
    version: str

    @property
    def url(self) -> str:
        if self.os_ == OperatingSystem.LINUX:
            if self.device == "gpu":
                pt_build = "cu117"
            else:
                pt_build = "cpu"
            # pylint: disable-next=line-too-long
            libtorch_arch = f"libtorch-cxx11-abi-shared-without-deps-{self.version}%2B{pt_build}.zip"
        elif self.os_ == OperatingSystem.DARWIN:
            if self.device == "gpu":
                raise BuildError("RedisAI does not currently support GPU on Macos")
            pt_build = "cpu"
            libtorch_arch = f"libtorch-macos-{self.version}.zip"
        else:
            raise BuildError(f"Unexpected OS for the PT Archive: {self.os_}")
        return f"https://download.pytorch.org/libtorch/{pt_build}/{libtorch_arch}"

    @property
    def __rai_dependency_name__(self) -> str:
        return f"libtorch@{self.url}"
@@ -793,6 +801,54 @@ def __place_for_rai__(self, target: t.Union[str, "os.PathLike[str]"]) -> Path:
        return target


@t.final
class _PTArchiveLinux(_PTArchive):
    @property
    def url(self) -> str:
        if self.device == "gpu":
            pt_build = "cu117"
        else:
            pt_build = "cpu"
        # pylint: disable-next=line-too-long
        libtorch_archive = (
            f"libtorch-cxx11-abi-shared-without-deps-{self.version}%2B{pt_build}.zip"
        )
        return f"https://download.pytorch.org/libtorch/{pt_build}/{libtorch_archive}"


@t.final
class _PTArchiveMacOSX(_PTArchive):
    @property
    def url(self) -> str:
        if self.device == "gpu":
            raise BuildError("RedisAI does not currently support GPU on Mac OSX")
        if self.architecture == Architecture.X64:
            pt_build = "cpu"
            libtorch_archive = f"libtorch-macos-{self.version}.zip"
            root_url = "https://download.pytorch.org/libtorch"
            return f"{root_url}/{pt_build}/{libtorch_archive}"
        if self.architecture == Architecture.ARM64:
            libtorch_archive = f"libtorch-macos-arm64-{self.version}.zip"
            # pylint: disable-next=line-too-long
            root_url = (
                "https://github.com/CrayLabs/ml_lib_builder/releases/download/v0.1"
            )
            return f"{root_url}/{libtorch_archive}"

        raise BuildError(f"Unsupported architecture for PyTorch: {self.architecture}")


def choose_pt_variant(
    os_: OperatingSystem, arch: Architecture, device: TDeviceStr, version: str
) -> t.Union[_PTArchiveLinux, _PTArchiveMacOSX]:
    if os_ == OperatingSystem.DARWIN:
        return _PTArchiveMacOSX(arch, device, version)
    if os_ == OperatingSystem.LINUX:
        return _PTArchiveLinux(arch, device, version)

    raise BuildError(f"Unsupported OS for PyTorch: {os_}")


@t.final
@dataclass(frozen=True)
class _TFArchive(_WebTGZ, _RAIBuildDependency):
92 changes: 83 additions & 9 deletions tests/install/test_builder.py
@@ -34,10 +34,22 @@
import pytest

import smartsim._core._install.builder as build
from smartsim._core._install.buildenv import RedisAIVersion

# The tests in this file belong to the group_a group
pytestmark = pytest.mark.group_a

RAI_VERSIONS = RedisAIVersion("1.2.7")
valid_platforms = (
    dict(_os=build.OperatingSystem.DARWIN, architecture=build.Architecture.X64),
    dict(_os=build.OperatingSystem.DARWIN, architecture=build.Architecture.ARM64),
    dict(_os=build.OperatingSystem.LINUX, architecture=build.Architecture.X64),
)
extract_names = lambda a, b: ".".join((a.name, b.name)).lower()
platform_ids = (extract_names(*platform.values()) for platform in valid_platforms)
for_each_platform = pytest.mark.parametrize(
    "platform", valid_platforms, ids=platform_ids
)

for_each_device = pytest.mark.parametrize("device", ["cpu", "gpu"])

@@ -56,23 +68,21 @@
@pytest.mark.parametrize(
    "mock_os", [pytest.param(os_, id=f"os='{os_}'") for os_ in ("Windows", "Java", "")]
)
def test_rai_builder_raises_on_unsupported_op_sys(monkeypatch, mock_os):
    monkeypatch.setattr(platform, "system", lambda: mock_os)
def test_os_enum_raises_on_unsupported(monkeypatch, mock_os):
    with pytest.raises(build.BuildError, match="operating system") as err_info:
        build.RedisAIBuilder()
        build.OperatingSystem.from_str(mock_os)


@pytest.mark.parametrize(
    "mock_arch",
    [
        pytest.param(arch_, id=f"arch='{arch_}'")
        for arch_ in ("i386", "i686", "i86pc", "aarch64", "arm64", "armv7l", "")
        for arch_ in ("i386", "i686", "i86pc", "aarch64", "armv7l", "")
    ],
)
def test_rai_builder_raises_on_unsupported_architecture(monkeypatch, mock_arch):
    monkeypatch.setattr(platform, "machine", lambda: mock_arch)
def test_arch_enum_raises_on_unsupported(monkeypatch, mock_arch):
    with pytest.raises(build.BuildError, match="architecture"):
        build.RedisAIBuilder()
        build.Architecture.from_str(mock_arch)


@pytest.fixture
@@ -84,6 +94,7 @@ def p_test_dir(test_dir):
def test_rai_builder_raises_if_attempting_to_place_deps_when_build_dir_dne(
    monkeypatch, p_test_dir, device
):
    monkeypatch.setattr(build.RedisAIBuilder, "_check_backends_arm64", lambda a: None)
    monkeypatch.setattr(
        build.RedisAIBuilder,
        "rai_build_path",
@@ -99,6 +110,7 @@ def test_rai_builder_raises_if_attempting_to_place_deps_in_nonempty_dir(
    monkeypatch, p_test_dir, device
):
    (p_test_dir / "some_file.txt").touch()
    monkeypatch.setattr(build.RedisAIBuilder, "_check_backends_arm64", lambda a: None)
    monkeypatch.setattr(
        build.RedisAIBuilder, "rai_build_path", property(lambda self: p_test_dir)
    )
@@ -111,6 +123,27 @@ def test_rai_builder_raises_if_attempting_to_place_deps_in_nonempty_dir(
        rai_builder._fetch_deps_for(device)


invalid_build_arm64 = [
    dict(build_tf=True, build_onnx=True),
    dict(build_tf=False, build_onnx=True),
    dict(build_tf=True, build_onnx=False),
]
invalid_build_ids = [
    ",".join([f"{key}={value}" for key, value in d.items()])
    for d in invalid_build_arm64
]


@pytest.mark.parametrize("build_options", invalid_build_arm64, ids=invalid_build_ids)
def test_rai_builder_raises_if_unsupported_deps_on_arm64(monkeypatch, build_options):
    with pytest.raises(build.BuildError, match=r"backends are not supported on ARM64"):
        build.RedisAIBuilder(
            _os=build.OperatingSystem.DARWIN,
            architecture=build.Architecture.ARM64,
            **build_options,
        )


def _confirm_inst_presence(type_, should_be_present, seq):
    expected_num_occurrences = 1 if should_be_present else 0
    occurrences = filter(lambda item: isinstance(item, type_), seq)
@@ -133,8 +166,10 @@ def _confirm_inst_presence(type_, should_be_present, seq):
@toggle_build_pt
@toggle_build_ort
def test_rai_builder_will_add_dep_if_backend_requested_wo_duplicates(
    device, build_tf, build_pt, build_ort
    monkeypatch, device, build_tf, build_pt, build_ort
):
    monkeypatch.setattr(build.RedisAIBuilder, "_check_backends_arm64", lambda a: None)

    rai_builder = build.RedisAIBuilder(
        build_tf=build_tf, build_torch=build_pt, build_onnx=build_ort
    )
@@ -149,8 +184,9 @@ def test_rai_builder_will_add_dep_if_backend_requested_wo_duplicates(
@toggle_build_tf
@toggle_build_pt
def test_rai_builder_will_not_add_dep_if_custom_dep_path_provided(
    device, p_test_dir, build_tf, build_pt
    monkeypatch, device, p_test_dir, build_tf, build_pt
):
    monkeypatch.setattr(build.RedisAIBuilder, "_check_backends_arm64", lambda a: None)
    mock_ml_lib = p_test_dir / "some/ml/lib"
    mock_ml_lib.mkdir(parents=True)
    rai_builder = build.RedisAIBuilder(
@@ -171,6 +207,7 @@ def test_rai_builder_will_not_add_dep_if_custom_dep_path_provided(
def test_rai_builder_raises_if_it_fetches_an_unexpected_number_of_ml_deps(
    monkeypatch, p_test_dir
):
    monkeypatch.setattr(build.RedisAIBuilder, "_check_backends_arm64", lambda a: None)
    monkeypatch.setattr(
        build.RedisAIBuilder, "rai_build_path", property(lambda self: p_test_dir)
    )
@@ -205,3 +242,40 @@ def _some_long_io_op(_):
    build._threaded_map(_some_long_io_op, [])
    end = time.time()
    assert end - start < sleep_duration


def test_correct_pt_variant_os():
    # Check that all Linux variants return Linux
    for linux_variant in build.OperatingSystem.LINUX.value:
        os_ = build.OperatingSystem.from_str(linux_variant)
        assert isinstance(
            build.choose_pt_variant(
                os_, build.Architecture.X64, "cpu", RAI_VERSIONS.torch
            ),
            build._PTArchiveLinux,
        )
    # Check that ARM64 and X86_64 Mac OSX return the Mac variant
    all_archs = (build.Architecture.ARM64, build.Architecture.X64)
    for arch in all_archs:
        os_ = build.OperatingSystem.DARWIN
        assert isinstance(
            build.choose_pt_variant(os_, arch, "cpu", RAI_VERSIONS.torch),
            build._PTArchiveMacOSX,
        )


def test_PTArchive_MacOSX_url():
    os_ = build.OperatingSystem.DARWIN
    arch = build.Architecture.X64
    pt_version = RAI_VERSIONS.torch

    pt_linux_cpu = build._PTArchiveLinux(build.Architecture.X64, "cpu", pt_version)
    x64_prefix = "https://download.pytorch.org/libtorch/"
    assert x64_prefix in pt_linux_cpu.url

    pt_macosx_cpu = build._PTArchiveMacOSX(build.Architecture.ARM64, "cpu", pt_version)
    arm64_prefix = "https://github.com/CrayLabs/ml_lib_builder/releases/download/"
    assert arm64_prefix in pt_macosx_cpu.url


def test_PTArchive_MacOSX_gpu_error():
    with pytest.raises(build.BuildError, match="support GPU on Mac OSX"):
        build._PTArchiveMacOSX(build.Architecture.ARM64, "gpu", RAI_VERSIONS.torch).url