I have recently posted about multicore issues for realtime audio on Apple silicon, because with the release of MacOS 14 (codename Sonoma) including a Graphical User Interface (GUI) bug affecting overall performance, I initially thought this problem was fixed. But well, it was not… Here is the next (last?) episode of the story…
The Issue
It has been an ongoing problem for a while on Apple Silicon machines (probably since MacOS Ventura, as we did not experience this issue with our initial Apple Silicon benchmark on Big Sur and Monterey) – Intel systems are currently not impacted. The symptom is that you get random audio dropouts in some audio apps, especially when working under low CPU load (yeah, this does not sound intuitive, but that’s the reality!).
This only happens for audio apps which perform parallel audio processing to get the best out of modern CPUs that are capable of working simultaneously on multiple tasks, and rely on Apple’s Core Audio to work with audio devices. You may typically have experienced the problem with our apps and plug-ins that have the multicore option, such as PatchWork, Axiom, MB-7 Mixer or Late Replies. But some major DAWs have been impacted too (I have seen some official advice about reducing the buffer size to fix it – forcing the CPU to work harder and actually increasing CPU load).
The Root Cause
Apple Silicon processors are asymmetric: they have both efficiency cores (E-cores), suitable for background tasks with low processing demands, and performance cores (P-cores), designed for heavy-duty processing tasks. This means real-time tasks (tasks that have to be executed with time constraints, such as real-time audio) must be scheduled differently from other tasks.
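To give an idea of what "time constraints" means here, the sketch below computes the time budget an audio callback has to fill one buffer. The function names, buffer sizes and sample rates are illustrative assumptions, not values taken from any particular app or plug-in.

```c
#include <assert.h>
#include <stdint.h>

/* Time available to process one audio buffer, in nanoseconds.
   Hypothetical helper: frames / sample rate, rounded down. */
static uint64_t buffer_deadline_ns(uint32_t frames, uint32_t sample_rate_hz) {
    assert(sample_rate_hz > 0);
    return ((uint64_t)frames * 1000000000ull) / sample_rate_hz;
}

/* Number of times per second the audio thread has to wake up and deliver. */
static double callbacks_per_second(uint32_t frames, uint32_t sample_rate_hz) {
    assert(frames > 0);
    return (double)sample_rate_hz / (double)frames;
}
```

With these example numbers, 128 frames at 48 kHz leave about 2.67 ms per callback, 375 times per second. Halving the buffer size halves the budget and doubles the number of wake-ups, which is why a smaller buffer makes the CPU work harder.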
Developers have had a few ways to tell the system about the type of tasks they are doing for a while, but MacOS does not seem to use them (although the features are available in the Unix parts of MacOS, namely pthreads): setting these properties seems to have absolutely no impact on how threads are scheduled in audio apps.
In 2020, with the release of the M1 processor and MacOS 11 (codename Big Sur), Apple introduced a new concept called “Audio WorkGroups” for this purpose (which has some advantages over the aforementioned solutions for sure, but also has limitations, and requires significant new development work from app and plug-in developers).
Even if you are not a software developer, you may want to watch this WWDC video, as it gives an idea of what happens in real-time audio apps and plug-ins using multicore features, and how these Workgroups are supposed to help for real time audio tasks:
While this new concept looked like an optional feature at the time (the MacOS scheduler seemed to work fine for real-time audio without using it), something has definitely changed since (probably to support Audio Units V3 that are hosted in another process, and maybe to make MacOS more energy-efficient?), and you cannot do without it anymore in audio applications.
No Hope?
Some time ago (as soon as these issues appeared, actually), it was decided to bite the bullet and try to migrate the entire Blue Cat Audio multicore infrastructure to this new system to solve these problems that started to impact more and more customers.
Although it may seem very simple from the video above, it actually took a long time to start getting actual results, mainly because this feature is very poorly documented (some parts of the documentation are outdated or contradictory), and also because we tried several other solutions at the same time, trying to stay away from this “new feature” that could just be a temporary thing (trust me, it has happened a lot in the past).
Believe it or not, the video above is the only reliable piece of information available, and yet it is missing many details – and the devil is in the details, as you probably know. Also, this is a recent addition to the system, and not many developers are actually impacted by such a dedicated feature (there are not that many multicore-capable audio plug-ins available), so you won’t find many examples (that actually work). And last but not least, there are actually parts of our plug-ins and apps (such as network connectivity) that are quite unique and do not fall into the “simple fix” category.
Tips for fellow developers: do not waste your time with QoS or real-time thread attributes, they are completely useless for this problem. The latter do, however, have to be set on a thread (any random value will be fine) before it can join a workgroup. Oh, and I don’t think it is documented anywhere, but workgroups are reference-counted objects: do use retain and release or you will get into trouble!
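As an illustration of the retain/release tip, here is a minimal sketch of a worker thread joining and leaving a workgroup. It uses `os_workgroup_join`, `os_workgroup_leave`, `os_retain` and `os_release` from the macOS SDK. The `worker_workgroup_t` type and the two helper functions are hypothetical names made up for this example, and the sketch assumes the workgroup handle has already been obtained from the host or audio device and that the thread already has real-time attributes set. The stand-ins in the `#else` branch are only there so the sketch builds on other systems. This is not the actual Blue Cat Audio implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#if defined(__APPLE__)
#include <os/object.h>     /* os_retain, os_release */
#include <os/workgroup.h>  /* os_workgroup_join, os_workgroup_leave (macOS 11+) */
#else
/* Stand-in types and functions so the sketch also builds off macOS. */
typedef struct stub_workgroup *os_workgroup_t;
typedef struct { int unused; } os_workgroup_join_token_s;
static void *os_retain(void *object) { return object; }
static void os_release(void *object) { (void)object; }
static int os_workgroup_join(os_workgroup_t wg, os_workgroup_join_token_s *token) {
    (void)wg; (void)token;
    return ENOTSUP;
}
static void os_workgroup_leave(os_workgroup_t wg, os_workgroup_join_token_s *token) {
    (void)wg; (void)token;
}
#endif

/* Hypothetical per-thread bookkeeping for one worker. */
typedef struct {
    os_workgroup_t workgroup;         /* retained for as long as we are joined */
    os_workgroup_join_token_s token;  /* filled in by os_workgroup_join */
    bool joined;
} worker_workgroup_t;

/* Call on the worker thread itself. Returns 0 on success, an errno value otherwise. */
static int worker_join_workgroup(worker_workgroup_t *w, os_workgroup_t wg) {
    assert(w != NULL);
    if (wg == NULL || w->joined) return EINVAL;
    w->workgroup = os_retain(wg);     /* keep the object alive while we use it */
    int err = os_workgroup_join(w->workgroup, &w->token);
    if (err != 0) {
        os_release(w->workgroup);     /* join failed: drop our reference */
        w->workgroup = NULL;
        return err;
    }
    w->joined = true;
    return 0;
}

/* Call on the same thread that joined, before it exits. */
static void worker_leave_workgroup(worker_workgroup_t *w) {
    assert(w != NULL);
    if (!w->joined) return;
    os_workgroup_leave(w->workgroup, &w->token);
    os_release(w->workgroup);         /* balance the retain done at join time */
    w->workgroup = NULL;
    w->joined = false;
}
```

Each worker thread would call `worker_join_workgroup` once at startup and `worker_leave_workgroup` before exiting. Holding its own reference means the worker does not depend on whoever handed over the workgroup keeping it alive.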
Anyway, as explained in a previous post, this infrastructure also has a problem: the solutions provided by Apple are only available for standalone applications or Audio Units plug-ins. Nothing for others (VST, AAX, VST3…). Steinberg stated that they would not add this capability to VST3 and roll out their own solution (?!?), and since they are still trying to deprecate VST2, there is no hope there. It means that if you load a VST plug-in inside an app, using multicore features inside will lead to similar issues as before. Pro Tools is a very different beast due to its tight integration with hardware, and does not seem to be impacted so far, so it is unclear how this may (or may not) work there at some point.
Saved by Open Source?
With a solution for Audio Units and apps (I am afraid there might still be potential problems to be seen in the future though… Apple breaking their operating system with almost every new version), it is still rather difficult to actually release a fix without supporting other plug-in formats as well: as explained earlier, Apple provides absolutely no way to work with these workgroups for anything other than Core Audio applications or Audio Unit plug-ins, and that’s a pity, because many customers are using different systems.
There are however parts of the MacOS system that are open source (mostly because it relies on Unix and other open source projects). After a deep journey in the public MacOS source code to try to understand how to use these audio workgroups properly, a generic solution for all cases was found! Well… it is actually a “hack”, as it relies on how workgroups are implemented in the MacOS system: it is always a bad idea to rely on the undocumented implementation of a feature, but in this case Apple gives us no choice. And for this particular piece of code, it seems pretty safe to rely on it (it is not likely to break, and has not changed at all since it was added, remaining stable for 4 versions of the system).
New Previews Available!
So here we are: we now have a solution that seems to be working pretty well for all cases so far, including VST or VST3 plug-ins loaded via a plug-in loader such as PatchWork, and we have just released some previews for the plug-ins below that should finally fix the problem. If you are running an Apple Silicon Mac and want to fully benefit from multicore processing, check them out! It may also improve audio performance on Intel Macs running MacOS 11 (codename Big Sur) or newer.
- Fixed multicore processing and Network Audio Server performance: PatchWork, Axiom.
- Fixed multicore processing: MB-7 Mixer and Late Replies.
- Improved low latency network connection reliability for Connector.
Please report back and tell us how this all works for you!
Can you share your solution for VST3? (If not here, maybe on the JUCE forums if you use it)
So far, as you mentioned, we can only register threads to the hardware callback audio workgroup shared by AU plugins, as far as I know, and for VST3 we’d need to have the plugin wrapper provide the audio hardware workgroup ID/token (unless there is some other way for plugin code to find it without user input).
Sure, I will (not using JUCE here though, so I am not on the forum). I am just waiting for customer feedback on the current previews, as there are too many hosts out there to test them all, and the “hack” may depend on how the host works. It’s no rocket science though. Anyone who can read C code and is curious enough to search the web for how workgroups are implemented should find a solution pretty quickly.