Multi-core audio optimizations for Mac M1/2 CPUs

@taktik thanks for the bug fixes!

sorry to bother, but do you think multicore optimization would be feasible in the future? (from "What feature are you hoping for in upcoming Renoise releases?", #636 by neopan)

Unfortunately, there is no simple switch/way to do this. This old quote from the Multicore FAQ is still true:

If you have 10 VSTs running and nine of them use almost no CPU power, but one uses most of it, Renoise can’t magically make the heavy VST faster. Only the VST itself could do that.

Also, to create “independent streams” you can’t split tracks that feed into each other (-> groups/sends), so some tracks have to be computed on the same CPU to keep the signal path connected.
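To make the two points from the FAQ concrete, here is a rough Python sketch. It is only an illustration, not how Renoise schedules audio: it assumes every set of tracks connected through groups/sends is treated as one stream, and the track names and CPU costs are made up. A real host may still parallelize parts of a connected stream.

```python
from collections import defaultdict

def independent_streams(tracks, routes):
    """Group tracks into streams that share no group/send connection."""
    parent = {t: t for t in tracks}

    def find(t):
        # Follow parent links to the stream's representative track.
        while parent[t] != t:
            parent[t] = parent[parent[t]]
            t = parent[t]
        return t

    # Tracks that feed into each other end up in the same stream.
    for src, dst in routes:
        parent[find(src)] = find(dst)

    streams = defaultdict(list)
    for t in tracks:
        streams[find(t)].append(t)
    return sorted(sorted(s) for s in streams.values())

def best_case_cost(streams, cost):
    """Even with unlimited cores, the heaviest stream sets the limit."""
    return max(sum(cost[t] for t in s) for s in streams)

# Hypothetical song: one heavy synth, everything else is cheap.
tracks = ["lead", "bass", "drums", "pad", "reverb_send"]
routes = [("lead", "reverb_send"), ("pad", "reverb_send")]
cost = {"lead": 50, "bass": 2, "drums": 3, "pad": 4, "reverb_send": 5}

streams = independent_streams(tracks, routes)
print(streams)
print(best_case_cost(streams, cost), "of", sum(cost.values()))
```

In this made-up song the heaviest stream costs 59 of 64 units in total, so spreading the work over more cores can save 5 units at most.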

I’m sure there will be further optimisations here and there, but we need concrete examples and ideas (songs/setups where the current CPU handling seems to be bad). Then we could see what’s going on here and how we can improve things.


maybe this is not the best place to discuss this… but i get about 4x as many plugins running in Reaper compared to Renoise, as in the example i posted, with various complex multi group/send routings in Reaper. Also my impression is that Renoise is much more dependent on single core speed than Reaper (the performance boost between my old 2011 i7 mbp or i7 4790k hackintosh and a new M2 mbp is not that significant with Renoise, while it is huge with Reaper). This is even without DIVA style cpu bomb plugins, just a lot of plugins.

don’t want to sound negative, i love Renoise and feel free to correct me, but to my feeling it never played as well with the advent of multicore CPUs compared to other DAWs. which is especially a pity now in an era where the number of cores far outweighs raw GHz speeds. imho.

(don’t shoot me Taktik, i love Renoise and appreciate all your hard work <3)

This is good to know, thanks!

Would you mind sharing a similar Reaper and Renoise song that clearly shows this? This could be a good starting point for debugging and improving things.

The new M1s with their drastic enforcement of CPU power saving are also a special case.

Always make sure you only enable as many CPU threads for audio in Renoise as there are performant (fast) cores. Mixing low-power cores with fast cores will probably make things worse for real-time audio applications.
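One way to look up the number of fast cores on an Apple Silicon Mac is the `hw.perflevel0.physicalcpu` sysctl key. The small Python sketch below reads it; the helper name is made up, and on systems where the key does not exist it simply falls back to the total core count.

```python
import os
import subprocess

def performance_core_count():
    """Return the number of performance cores, or all cores if unknown."""
    try:
        out = subprocess.run(
            ["sysctl", "-n", "hw.perflevel0.physicalcpu"],
            capture_output=True, text=True, check=True,
        ).stdout.strip()
        return int(out)
    except (OSError, subprocess.CalledProcessError, ValueError):
        # Not an Apple Silicon Mac, or sysctl is unavailable:
        # fall back to the total number of logical cores.
        return os.cpu_count() or 1

print(performance_core_count())
```

The printed value is the number to compare against the CPU setting in the Renoise audio preferences.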

We are already making use of audio workgroups in Renoise (see "Understanding Audio Workgroups" in the Apple Developer Documentation), but maybe Reaper does something else on top as well.

Either way we do need more info and testing here before doing anything.


thanks Taktik! i’m on holiday now and have a busy schedule in the weeks after, but i’ll try to cobble an example together (with identical 3rd party plugins only, so the ‘test’ is more objective)

just tested switching # of cores: on a base M2 pro, switching between 6 cores (number of performance cores) and 10 (total cores) gives the same amount of cpu usage. interestingly, switching to 4 cores gives almost the same cpu %, difference about 2-3%.

also Reaper is exceptional regarding audio performance. before arm macs it ran much better than Logic for me on intel CPUs, and on this machine it is still much more performant than Ableton (i have the impression Logic performance improved a lot with arm macs)


One more important note when comparing CPU usages between hosts:
Please make sure you are using the same audio device, sample rate, buffer size and similar number of enabled CPUs for audio settings.

Also don’t compare the CPU percentage displays of the hosts. Use the task manager / process explorer instead. Hosts may show different kinds of CPU load depending on the energy throttling.

The most reliable and important way to test CPU loads is to find the point when you start to hear crackles in the audio output stream. This basically is the upper limit.
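Why sample rate and buffer size matter for such comparisons can be shown with a bit of arithmetic. The Python sketch below uses a simplified model that ignores driver and scheduling overhead: a buffer has to be computed before the previous one has finished playing, otherwise the output crackles. The function names and the numbers are only examples.

```python
def buffer_deadline_ms(frames, sample_rate):
    """Time available to compute one audio buffer, in milliseconds."""
    return 1000.0 * frames / sample_rate

def will_crackle(processing_ms, frames, sample_rate):
    """True if computing a buffer takes longer than playing it back."""
    return processing_ms > buffer_deadline_ms(frames, sample_rate)

# Example: 480 frames at 48 kHz leave 10 ms per buffer.
print(buffer_deadline_ms(480, 48000))
print(will_crackle(12.0, 480, 48000))  # misses the deadline
print(will_crackle(8.0, 480, 48000))   # finishes in time
```

The same song leaves less headroom at a smaller buffer size or a higher sample rate, so hosts can only be compared with identical settings.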


here’s a quick crappy headphones test file. balance is off but plugin presets and OS settings etc should be the same. mbp m2 pro 10 cores ventura 13.4.1. renoise 17ms latency, reaper no block request, headphones out.

renoise reports 49-50% cpu internally - activity monitor reports about 60% + about 70% plugin processes (hard to keep track of 'em) = about 130%

reaper reports about 7-8% cpu internally - activity monitor reports about 80%

edit: @taktik i know it’s hugely subjective to compare cpu performance, also confusing with separate plugin processes in activity monitor. but to me it (subjectively) feels that i can run about 3x-4x more plugins with reaper. i can keep on building this test file if you’d like (sorry not much time now), pretty sure renoise will start to xrun much earlier.

To simplify testing I’d first avoid the external plugin processes: disable plugin sandboxing in Renoise and avoid using intel plugins on the M1. Do the same for Reaper.

I think this is the most confusing thing here.

The CPU meter in Renoise reflects the CPU usage of the performance/energy throttled CPU → what CPU time is currently available.

The CPU meter in Reaper seems to reflect the CPU usage, assuming the CPU would run at full speed
→ what CPU time is theoretically available when the CPUs are running at full speed. But it’s not actually running at full speed.

The system will try to slow the CPUs down as much as possible to save power. So ideally the CPU meter in Renoise should be at 80% all the time and never above 90-100, if the system is doing a good job of reducing power consumption.
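The difference between the two kinds of meter can be put into numbers. The Python sketch below is only a model of the guess described above, not how Renoise or Reaper actually compute their meters, and the clock speeds and cycle counts are made up.

```python
def meter_throttled(work_cycles, deadline_s, current_hz):
    """Load relative to what the CPU can do at its current, throttled clock."""
    return 100.0 * work_cycles / (current_hz * deadline_s)

def meter_full_speed(work_cycles, deadline_s, max_hz):
    """Load relative to what the CPU could do at its maximum clock."""
    return 100.0 * work_cycles / (max_hz * deadline_s)

# Hypothetical buffer: 8 million cycles of work, 20 ms deadline,
# CPU currently throttled to 1.0 GHz out of a possible 3.2 GHz.
work, deadline = 8e6, 0.020
print(meter_throttled(work, deadline, 1.0e9))
print(meter_full_speed(work, deadline, 3.2e9))
```

With these numbers the same workload reads roughly 40% on the first kind of meter and roughly 12.5% on the second, which is why the percentage displays of two hosts can't be compared directly.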

If that’s true, it seems more like a bug than a lack of fine-tuning. Let’s try to find out. But maybe this is also caused by Reaper’s Anticipative FX Processing feature? There’s no such feature in Renoise (automatic offline rendering).


To simplify testing I’d first avoid the external plugin processes: disable plugin sandboxing in Renoise and avoid using intel plugins on the M1. Do the same for Reaper.

sandboxing in renoise is disabled, i’m only using arm native plugins. Reaper does not have a plugin sandboxing setting. Even with arm plugins only, renoise generates several plugin processes in activity monitor.

I think this is the most confusing thing here.

i posted activity monitor cpu also. i know cpu load is subjective, and a dynamic thing. i already tested this a few times at home when power is connected (= cpu not throttling per Apple). but when i hit renoise at about 78% internally it will start to xrun, no matter what. Besides that it’s hard to measure cpu in activity monitor because Renoise shows a bunch of separate plugin processes, Reaper does not.

But maybe this is also caused by Reaper’s Anticipative FX Processing feature?

just ran the test with anticipative FX Processing off, activity monitor reports about 65% instead of 80%. so it’s even lower.

Edit: my bad, didn’t double check plugin sandboxing in Renoise. I’ll test tonight and will report!

@taktik finally could test again with plugin sandboxing off. power connected. 48 kHz, 25ms latency (there’s not much cpu difference between 20-30ms at medium usage on this mbp), 6 cores enabled.

renoise reports 35% internally, 82% in activity monitor with the test. ran an older project running on the edge of xruns (about 78-80% cpu peak internally), hovers around 50% cpu internally now with sandboxing off. task manager reports about 135%.

so the sandboxing thing def helps a ton, my bad Taktik (migrated renoise from another arm mac where i was switching between buggy intel/arm plugins, i remember now that sandboxing helped a lot then)

But then i started to stress test 3 older projects with these settings, projects that were on the edge of xruns before the plugin sandbox switch. i filled them with plugins until crackles occurred (quite evenly, not throwing 10 plugs on one track but about one 4voice synth + 1 hefty fx + 1 send per track. made sure i’m only using arm native plugins).
in all cases with all projects i get crackle/xruns at 290% - 300% in activity monitor, hmm. so out of curiosity i opened one of the biggest reaper projects on my disk. that one reports about 653% cpu in activity monitor.

so correct me if i’m wrong but could it be possible that Renoise is only able to use 3 cores on this system?
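The arithmetic behind this question, as a small Python sketch: Activity Monitor counts 100% as one fully busy core, so the reported percentage divided by 100 gives the equivalent number of fully loaded cores. This is only a rough reading, since the same total can also come from more cores that are each partly busy.

```python
def busy_core_equivalent(activity_monitor_percent):
    """Convert an Activity Monitor CPU % into fully busy core equivalents."""
    return activity_monitor_percent / 100.0

# Figures from the tests above (10 core M2 pro, so 1000% is the ceiling).
for label, percent in [("renoise at first crackles", 300),
                       ("big reaper project", 653)]:
    print(label, "->", busy_core_equivalent(percent), "cores")
```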

thanks!

In my experience Reaper is also more efficient than Renoise and Bitwig.
I always thought this is because Reaper calculates audio in advance. I can make a test with e.g. Diva on the weekend on my Linux PC

Would be cool if somebody else could test this on arm Mac (i fried my previous M1 mbp)

This might be an interesting, topic related read:

totally don’t know if this is the culprit, but if reaper can max out all cores on < Sonoma i don’t see any reason why Renoise wouldn’t be able to (in my tests it just maxed out at 300% or 3 cores). but tbh i’m fine with that, as long as i know more or less what the upper limit is before running into it.