With macOS 14.4, Apple introduced new API in CoreAudio that allows any app to capture audio from other apps or the entire system, as long as the user has given the app permission to do so.
Unfortunately, this new API is poorly documented, and the nature of CoreAudio makes it really hard to figure out exactly how to set things up so that your app can use this new functionality.
This project is provided as documentation for this new API to help developers of audio apps.
Here’s a brief summary of the new API added in macOS 14.4 and how to put everything together.
As you’d expect, recording audio from other apps or the entire system requires a permission prompt.
The message for this prompt is defined by adding the `NSAudioCaptureUsageDescription` key to the app’s Info.plist. This key is not listed in the Xcode dropdown, so you have to enter it manually.
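As a rough illustration, the raw Info.plist entry could look like the fragment below. The description string is a placeholder chosen for this example, not the text this project uses.

```xml
<!-- Shown to the user in the audio capture permission prompt. -->
<key>NSAudioCaptureUsageDescription</key>
<string>Placeholder: explain why the app records audio from other apps.</string>
```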
There’s no public API to request audio recording permission or to check if the app has that permission. This project implements permission check/request using private API from the TCC framework, but there is a build-time flag to disable private API usage, in which case the permission will be requested the first time audio recording is started in the app.
Assuming the app has audio recording permission, setting up and recording audio from other apps can be done by performing the following steps:
- Get the PID of the process you wish to capture
- Use `kAudioHardwarePropertyTranslatePIDToProcessObject` to translate the PID into an `AudioObjectID`
- Create a `CATapDescription` for the object ID above, and set (or just get) its `uuid` property, which will be needed later
- Call `AudioHardwareCreateProcessTap` with the tap description to create the tap, which gets its own `AudioObjectID`
- Create a dictionary for your aggregate device that includes `[kAudioSubTapUIDKey: <your tap description uuid string>]` in its `kAudioAggregateDeviceTapListKey` (you probably want to configure other things, such as setting `kAudioAggregateDeviceIsPrivateKey` to true so that it doesn’t show up globally)
- Call `AudioHardwareCreateAggregateDevice` with the dictionary above
- Read `kAudioTapPropertyFormat` from the process tap to get its `AudioStreamBasicDescription`, then create an `AVAudioFormat` matching the description; this will be needed later
- Create an `AVAudioFile` for writing with your desired settings
- Call `AudioDeviceCreateIOProcIDWithBlock` to set up a callback for your aggregate device
- Inside the callback, create an `AVAudioPCMBuffer` passing in your format; you can use `bufferListNoCopy` with a `nil` deallocator, then just call `write(from:)` on your audio file, passing in the buffer
- Call `AudioDeviceStart` with the aggregate device and IO proc ID
- Remember to call all your `Audio...Stop` and `Audio...Destroy` cleanup functions
- Let the `AVAudioFile` deinit to close it
- Now you have an audio file with a recording from the system or app
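The steps above can be sketched in Swift roughly as follows. This is a minimal sketch, not the project's actual implementation: `ProcessRecorder`, `TapError`, and `check` are hypothetical names invented for this example, the aggregate device dictionary contains only a few keys (a real app may need more, such as a main sub-device), and the file settings assume the tap delivers Float32 PCM. It also assumes the app already has audio recording permission and that `url` points to a container suited to PCM audio, such as a `.wav` file.

```swift
import AVFoundation
import CoreAudio

struct TapError: Error { let message: String }

/// Throws unless a CoreAudio call returned `noErr`.
func check(_ status: OSStatus, _ what: String) throws {
    guard status == noErr else { throw TapError(message: "\(what) failed (\(status))") }
}

/// Hypothetical helper type (not part of this project) that follows the steps above.
@available(macOS 14.4, *)
final class ProcessRecorder {
    private var tapID = AudioObjectID(kAudioObjectUnknown)
    private var deviceID = AudioObjectID(kAudioObjectUnknown)
    private var procID: AudioDeviceIOProcID?
    private var file: AVAudioFile?

    func start(pid: pid_t, url: URL, queue: DispatchQueue) throws {
        // Translate the PID into the AudioObjectID of the process.
        var address = AudioObjectPropertyAddress(
            mSelector: kAudioHardwarePropertyTranslatePIDToProcessObject,
            mScope: kAudioObjectPropertyScopeGlobal,
            mElement: kAudioObjectPropertyElementMain)
        var qualifier = pid
        var processObject = AudioObjectID(kAudioObjectUnknown)
        var size = UInt32(MemoryLayout<AudioObjectID>.size)
        try check(AudioObjectGetPropertyData(
            AudioObjectID(kAudioObjectSystemObject), &address,
            UInt32(MemoryLayout<pid_t>.size), &qualifier, &size, &processObject), "PID translation")
        guard processObject != AudioObjectID(kAudioObjectUnknown) else {
            throw TapError(message: "no audio process object for PID \(pid)")
        }

        // Describe and create the tap; its UUID links it to the aggregate device.
        let tapDescription = CATapDescription(stereoMixdownOfProcesses: [processObject])
        tapDescription.uuid = UUID()
        try check(AudioHardwareCreateProcessTap(tapDescription, &tapID), "tap creation")

        // Create a private aggregate device whose tap list references the tap.
        // Assumption: these keys suffice for illustration; real setups may need more.
        let deviceDescription: [String: Any] = [
            kAudioAggregateDeviceNameKey: "ExampleTap-\(pid)",
            kAudioAggregateDeviceUIDKey: UUID().uuidString,
            kAudioAggregateDeviceIsPrivateKey: true,
            kAudioAggregateDeviceTapListKey: [
                [kAudioSubTapUIDKey: tapDescription.uuid.uuidString]
            ],
        ]
        try check(AudioHardwareCreateAggregateDevice(deviceDescription as CFDictionary, &deviceID),
                  "aggregate device creation")

        // Read the tap's stream format and build a matching AVAudioFormat.
        var streamDescription = AudioStreamBasicDescription()
        address.mSelector = kAudioTapPropertyFormat
        size = UInt32(MemoryLayout<AudioStreamBasicDescription>.size)
        try check(AudioObjectGetPropertyData(tapID, &address, 0, nil, &size, &streamDescription),
                  "tap format read")
        guard let format = AVAudioFormat(streamDescription: &streamDescription) else {
            throw TapError(message: "unsupported tap format")
        }

        // Open the output file; these settings are one possible choice.
        let settings: [String: Any] = [
            AVFormatIDKey: streamDescription.mFormatID,
            AVSampleRateKey: format.sampleRate,
            AVNumberOfChannelsKey: format.channelCount,
        ]
        let file = try AVAudioFile(forWriting: url, settings: settings,
                                   commonFormat: .pcmFormatFloat32, interleaved: format.isInterleaved)
        self.file = file

        // Wrap each incoming buffer list without copying and append it to the file.
        let status = AudioDeviceCreateIOProcIDWithBlock(&procID, deviceID, queue) { _, inputData, _, _, _ in
            guard let buffer = AVAudioPCMBuffer(pcmFormat: format, bufferListNoCopy: inputData,
                                                deallocator: nil) else { return }
            try? file.write(from: buffer) // errors are ignored in this sketch
        }
        try check(status, "IO proc creation")
        try check(AudioDeviceStart(deviceID, procID), "device start")
    }

    func stop() {
        // Tear down in reverse order of creation.
        if let procID {
            AudioDeviceStop(deviceID, procID)
            AudioDeviceDestroyIOProcID(deviceID, procID)
        }
        AudioHardwareDestroyAggregateDevice(deviceID)
        AudioHardwareDestroyProcessTap(tapID)
        procID = nil
        file = nil // drop our reference so the AVAudioFile can deinit and close
    }
}
```

A caller would create a `ProcessRecorder`, call `start(pid:url:queue:)` with a serial dispatch queue, and call `stop()` when done. Error handling inside the IO callback is deliberately omitted here; a real app would likely want to log or surface write failures.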
Thanks to @WFT for helping me with this project.