On MacOS on ARM hardware, the "processors.isa", "os.architecture", "architecture", and "hardwareisa" facts report incorrect values when running under Rosetta 2 #2716

Open
zbentley opened this issue May 17, 2024 · 12 comments
Labels
bug Something isn't working

Comments

@zbentley

Describe the Bug

On an ARM MacOS environment, facter's builtin processors.isa and os.architecture facts report the architecture being emulated by Rosetta, not the architecture of the host hardware.

Rosetta 2 is MacOS's system for the execution of Intel binaries on ARM-based (M1/2/3) Macs.

When a binary is Intel-only, or when a binary is "universal" and contains segments for both Intel and ARM, Rosetta 2 can be used to run it as x86_64.

Expected Behavior

  • The computer's processors don't change to contain different silicon based on whether an emulation layer is present, so processors.isa (and hardwareisa) should not change based on whether facter is being run in an emulated capacity.
  • The architecture for which MacOS was installed cannot change, so os.architecture (and architecture) should not change based on whether facter is being run in an emulated capacity.

Steps to Reproduce

  1. Use MacOS 13 or later on an ARM (M1, M2, M3, etc.) Mac. The arch command run without arguments should report arm64 or arm64e.
  2. Install Rosetta on that Mac: softwareupdate --install-rosetta
  3. Install facter.
  4. Run facter in the arm64 architecture (should be the default), and observe that the isa/architecture related facts match the computer's hardware:
> arch -arch arm64 facter | grep 'isa\|arch'
architecture => "arm64",
isa => "arm",
  5. Run facter via Rosetta emulation, and observe that the isa/architecture facts are now incorrect:
> arch -arch x86_64 facter | grep 'isa\|arch'
architecture => "x86_64",
isa => "i386",

Environment

MacOS 14.5, also reproduced on MacOS 13.

I was unable to install facter directly via gem as the dependency on the now-deprecated hpricot prevented compilation on my machine.

Instead, I reproduced this using two puppetlabs-official distributions of facter: facter 4.6.1 via brew install --cask puppet-agent (which ships with an ARM-only ruby 2.7.8p225 (2023-03-30 revision 1f4d455848) [arm64-darwin23]), and facter 4.7.0 via brew install --cask puppet-bolt (which ships with an x86_64 ruby 2.7.8p225 (2023-03-30 revision 1f4d455848) [x86_64-darwin21]). The x86-ness of the latter isn't implicated here, and is likely temporary pending resolution of a low-priority issue I reported.

Additional Context

This seems to be due to the os/processor facts' dependency on the uname provider (e.g. here). uname is not an appropriate way to get hardware information on MacOS. It reports information about the current runtime/emulation context, not about the machine itself:

> uname -a
Darwin atropos.local 23.5.0 Darwin Kernel Version 23.5.0: Wed May  1 20:12:58 PDT 2024; root:xnu-10063.121.3~5/RELEASE_ARM64_T6000 arm64
> arch -arch arm64 uname -a
Darwin atropos.local 23.5.0 Darwin Kernel Version 23.5.0: Wed May  1 20:12:58 PDT 2024; root:xnu-10063.121.3~5/RELEASE_ARM64_T6000 arm64
> arch -arch x86_64 uname -a
Darwin atropos.local 23.5.0 Darwin Kernel Version 23.5.0: Wed May  1 20:12:58 PDT 2024; root:xnu-10063.121.3~5/RELEASE_ARM64_T6000 x86_64
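
As a rough illustration of asking the kernel via sysctl instead of uname, here is a minimal Ruby sketch. It is not Facter's implementation. The helper names (read_sysctl, host_architecture) are hypothetical, and it assumes two things worth verifying on the target OS version: that hw.optional.arm64 reports 1 on Apple silicon hardware regardless of translation, and that sysctl.proc_translated reports 1 for a process running under Rosetta. The reader is injectable so the classification logic can be exercised off a Mac.

```ruby
require 'open3'

# Hypothetical helper: run `sysctl -n KEY` and return the trimmed output,
# or nil when the key is missing or sysctl is not installed.
def read_sysctl(key)
  out, status = Open3.capture2e('sysctl', '-n', key)
  status.success? ? out.strip : nil
rescue SystemCallError
  nil
end

# Classify a Darwin host from two sysctl values.
# Assumption: a missing hw.optional.arm64 key means an Intel Mac.
def host_architecture(reader = method(:read_sysctl))
  arm_capable = reader.call('hw.optional.arm64') == '1'
  translated  = reader.call('sysctl.proc_translated') == '1'
  {
    hardware: arm_capable ? 'arm64' : 'x86_64',
    rosetta: translated
  }
end
```

Calling host_architecture with no arguments shells out to sysctl; passing a lambda lets the same logic be checked with canned values.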
@zbentley zbentley added the bug Something isn't working label May 17, 2024
@zbentley
Author

The speed processor fact also seems to be selectively present based on emulation:

> diff <(arch -arch arm64 facter) <(arch -arch x86_64 facter)
< ... snip temporally-dependent info >
<   architecture => "arm64",
---
>   architecture => "x86_64",
498c498
<   hardware => "arm64",
---
>   hardware => "x86_64",
520c520
<   isa => "arm",
---
>   isa => "i386",
533a534
>   speed => "2.40 GHz",

@zbentley
Author

A workaround for this issue is to override the affected facts with the values they return when facter is forced to run as the native architecture. An example of such a custom fact file is below (in Puppet, I store and access this fact as close to the entry points to my Puppet manifest evaluation as possible):

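# Collect all facts once from a natively-launched facter (arch -64 selects the
# native 64-bit architecture). --no-ruby disables custom Ruby facts, which
# keeps this file from invoking itself recursively.
Facter.add('_oldfacts') do
  confine kernel: 'Darwin'
  setcode do
    Puppet::Util::Json.load(Facter::Core::Execution.execute('arch -64 facter --no-ruby --show-legacy --no-cache --no-external-facts --no-color --json'))
  end
end

# Override each affected fact with the value reported by the native run.
['architecture', 'hardwareisa', 'hardwaremodel', 'processors', 'os'].each do |fact|
  Facter.add(fact) do
    confine kernel: 'Darwin'
    setcode do
      Facter.value('_oldfacts').fetch(fact)
    end
  end
end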

@zbentley
Author

Upon digging into it a little, I'm honestly not wild about uname for getting processor info in general: https://github.com/coreutils/coreutils/blob/master/src/uname.c#L317

While MacOS uses a BSD-based uname and not necessarily the above source code, the fact that the Linux edition is also so willing to fall back to the compilation architecture of the binary rather than asking the kernel via sysctl doesn't give me a ton of faith in the approach.

@cthorn42
Collaborator

Might be related to #2703, our process has some outdated tooling that we need to update. Once the above issue is resolved we'll try to reproduce this issue you have described here.

@zbentley
Author

I don't understand how this would interact with #2703 (other than that I had to get pre-built Facter for the to-reproduce steps). What does the documentation generation gem have to do with uname's flaws?

@tvpartytonight
Contributor

Thanks @zbentley for bringing this up; have you tried building the gem excluding documentation, which would exclude ronn and the transitive dependency on hpricot? That might allow you to build it for your use case.

While we agree this is likely an improvement, we do not anticipate addressing this any time soon, so hopefully you can build it without ronn.

@zbentley
Author

@tvpartytonight the issue arises from Rosetta, not any of the gems against which this package is compiled.

Facter provides these values by shelling out to uname. uname, on MacOS, is a universal binary which can run as either x86 or ARM. Unfortunately, uname reports architecture/processor values according to the architecture it was launched with, not the architecture that exists in the hardware. As a result, uname is unsuitable for returning information about the host platform (doesn't stop every tool you've ever heard of from depending on it, though; it's far from just Facter).

Regardless of what gems facter is built with, if I do, say, an x86 Bash spawning an ARM Ruby to run Facter, and Facter in turn spawns the universal binary uname, uname will launch in x86 mode even if its immediate parent is ARM.

@zbentley
Author

This can be reproduced without Facter at all, via the following on an M1 mac:

zac@atropos ~ ∴ sh -c 'ruby -e "puts %x(uname -a)"'
Darwin atropos.local 23.5.0 Darwin Kernel Version 23.5.0: Wed May  1 20:12:58 PDT 2024; root:xnu-10063.121.3~5/RELEASE_ARM64_T6000 arm64
zac@atropos ~ ∴ arch -arch x86_64 sh -c 'ruby -e "puts %x(uname -a)"'
Darwin atropos.local 23.5.0 Darwin Kernel Version 23.5.0: Wed May  1 20:12:58 PDT 2024; root:xnu-10063.121.3~5/RELEASE_ARM64_T6000 x86_64
zac@atropos ~ ∴ arch -arch arm64 sh -c 'ruby -e "puts %x(uname -a)"'
Darwin atropos.local 23.5.0 Darwin Kernel Version 23.5.0: Wed May  1 20:12:58 PDT 2024; root:xnu-10063.121.3~5/RELEASE_ARM64_T6000 arm64

The problem is that Facter uses uname for rather a lot of facts, with the presumption that uname returns truths about the host system's hardware. It does not; rather, it returns different information depending on how it was invoked.

@joshcooper
Contributor

@zbentley Running puppet in Rosetta isn't something we support and is only going to cause problems when trying to manage the OS (similar to WOW32 on Windows). We ship both x86_64 and ARM agents, so why not install from https://downloads.puppet.com/mac/puppet8/14/arm64?

@zbentley
Author

zbentley commented Jun 8, 2024

Many Puppet installation guides direct users to install Puppet in such a way that a Rosetta environment is used:

  1. See e.g. the issue I reported for Bolt's autoinstallation of Puppet: Provide MacOS puppet-bolt Homebrew cask installers with the arm64 architecture bolt#3264
  2. Until recently the Puppet docs recommended the installation of Puppet on MacOS (brew install --cask puppet-agent) using a formula that preferred x86 architectures on ARM machines.
  3. Many users did or still do install Homebrew itself as an x86 program suite on MacOS. In such circumstances (which older ARM Macs are very easily accidentally "grandfathered into" via several common upgrade workflows, or just by having been initially provisioned back when Homebrew/Puppet/Zsh/etc. only offered x86 versions), even the current formula for Puppet agent will be installed as x86 on an ARM mac.
  4. Anecdotally, a variety of colleagues and companies I'm familiar with who are using Puppet for MacOS provisioning have run Puppet via Rosetta (either due to Bolt limitations, Homebrew issues, or just by accident) for several years, though I imagine they will be pleased to make use of faster native binaries.

@joshcooper In light of how easy it is to accidentally run Puppet under Rosetta, and how many folks and official docs have been recommending Puppet installations under Rosetta, I'd urge you to reconsider making Rosetta execution unsupported.

If that's not something you're interested in, could we make the unsupportedness louder? In other words, if Facter cannot reliably be run in Rosetta without breakage as described here, could we make facter and/or Puppet itself fail to supply architecture-specific facts entirely if running in Rosetta?

Ordinarily that isn't something I'd suggest, preferring to avoid extra work/complexity and trust users to manage their runtime environments correctly. However, given how seamless Rosetta's integration with MacOS is, it's extremely common for programs to run under Rosetta by accident, so I think in this case it makes more sense to error or warn loudly when that's happening in an unsupported way.
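
To make the "fail loudly" idea concrete, here is a minimal Ruby sketch of a guard that withholds architecture-specific facts and prints a warning when the process is known to be translated. Everything in it is an assumption for illustration: guard_arch_facts and ARCH_FACTS are hypothetical names, the fact list is just a few legacy names mentioned in this thread, and how the translated flag gets computed is left to the caller. It is not an existing Facter or Puppet API.

```ruby
# Legacy fact names taken from this thread; illustrative, not exhaustive.
ARCH_FACTS = %w[architecture hardwareisa hardwaremodel].freeze

# Hypothetical guard: when the process is translated, warn and drop the
# architecture-specific facts rather than report emulated values.
def guard_arch_facts(facts, translated:, warn_io: $stderr)
  return facts unless translated

  warn_io.puts('warning: facter appears to be running under Rosetta; ' \
               'architecture-specific facts are withheld')
  facts.reject { |name, _value| ARCH_FACTS.include?(name) }
end
```

Dropping the facts means a manifest that depends on them fails visibly instead of acting on x86_64 values on ARM hardware. Whether to warn or to raise is a policy choice this sketch does not settle.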

@joshcooper
Contributor

Thanks for listing out the issues @zbentley! That said I think we should work on adding ARM support where it's missing (like bolt) and fixing homebrew installation instructions so that users install the correct package for their architecture. It's only a matter of time before

Many users did or still do install Homebrew itself as an x86 program suite on MacOS... even the current formula for Puppet agent will be installed as x86 on an ARM mac.

I thought all of our current formulas for puppet7 and up support ARM. Or are you saying that if you install x86 Homebrew on macOS ARM, you'll get x86 Puppet due to https://github.com/puppetlabs/homebrew-puppet/blob/24f8e42983872f5f5184edfe140945a6b9202a91/Casks/puppet-agent-7.rb#L5?

could we make the unsupportedness louder?

Yes that would make sense. What is the best way to detect when facter is running in Rosetta?

Related, can we improve our homebrew formulas so that they always install arm packages on arm hosts, even when homebrew is x86?

@zbentley
Author

Thanks for replying! You're correct that Puppet will currently be installed by Homebrew as ARM; I was either mistaken or using outdated distributions at the original time of writing.

Even if present-day installers have the right arch, I still suspect this issue will remain widespread for quite some time, though: since the inaccurate facter values will occur if any ancestor of the current process is x86, that means that this will affect:

  • People with old distributions of Puppet (hopefully small and shrinking over time).
  • People running Puppet from within an x86 shell (e.g. people who ran brew install zsh or similar back when Homebrew didn't have native binaries for many things, or who intentionally or unintentionally use or used x86 Homebrew to get a launcher).
  • People running Puppet as a result of some other x86 harness (e.g. any distribution of Bolt at the time of this writing, or a ruby/python/perl/etc. script running in an x86 interpreter).

What is the best way to detect when facter is running in Rosetta?

I'm not sure if it's the best way, but a reliable technique I have used is to compare the output of some command with and without an arch -64 prefix (which instructs use of the native 64-bit architecture). So if you compare e.g. the output of sysctl hw.optional.arm64 or uname with and without that prefix, you can determine whether the current process is running under Rosetta.
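
The comparison described here could look roughly like the following Ruby sketch. The names capture and emulated? are hypothetical, the default probe (uname -m) is one of the commands suggested above, and the sketch assumes that arch -64 launches the probe as the native architecture even when the caller is translated. The runner is injectable so the comparison can be checked without a Mac.

```ruby
require 'open3'

# Hypothetical helper: run a command given as an argv array and return
# its trimmed stdout, or nil if it fails or cannot be launched.
def capture(argv)
  out, status = Open3.capture2(*argv)
  status.success? ? out.strip : nil
rescue SystemCallError
  nil
end

# Compare the probe's output in the inherited context with its output when
# forced native via `arch -64`. Returns true if they differ, false if they
# match, and nil if either run failed (so "unknown" stays distinguishable).
def emulated?(probe = %w[uname -m], runner = method(:capture))
  inherited = runner.call(probe)
  native    = runner.call(%w[arch -64] + probe)
  return nil if inherited.nil? || native.nil?

  inherited != native
end
```

Returning nil on failure keeps a missing arch command (for example on Linux) from being mistaken for a native run.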

Can we improve our homebrew formulas so that they always install arm packages on arm hosts, even when homebrew is x86?

I think the same technique would work in a Homebrew formula, though I'm not sure if it's blessed by the Homebrew maintainers.


4 participants