ReBench - benchmarking Renoise and your CPU.


(Ledger) #41

Well, IMHO FireWire is quite a “good” investment currently, since prices for used FireWire audio interfaces are dropping massively, thanks to Apple no longer supporting it (only officially; there is still a FireWire driver in OS X 10.11) and trying to force people to use useless Thunderbolt + 20 adapters.

A PCIe 1x FireWire card with a TI chipset is about 10 €. So for around 150-200 €, you will get an audio interface which cost over 1000 € some years ago. Of course, check first whether there is still a driver for recent Win/OS X.

Interesting to know. It would be good to know what the benchmarks were for different interfaces / connectors too, though we’re getting into a large number of variables with converters / drivers etc. I’ve also got too much invested in the Crimson, which has good AD/DA converters and an analog volume control, which saves speakers/ears from any nasty surprises! It’s certainly not perfect, as the monitor outputs run quite hot signal-wise, so it requires further gain staging after the output via monitor / amp volumes. Also the volume pot needed some attention with some DeOxit DN5 spray after it started to crackle badly, though it seems to be behaving better now.

If I was looking again I’d probably be looking at the Audient ID14/22, but unfortunately there is no analog volume control on those… gladly I’m not looking at the moment for more decisions! My head is still spinning trying to get up to date with all the latest pewter stuff!

Anyway, with these new i7 level chips any losses seem to be diminished. All my CPU meter anxiety has really been put to bed for a while! If you really need more power there is always Haswell-E, or if you’ve got money to burn, a high end Xeon E5 or two!


(Zer0 Fly) #42

Hi, some Linux tests here. Haven’t seen any here yet.

  • Renoise 3.1

  • Second Hand Workstation, it is one old horse

  • Xeon W3690, six cores, hyperthreading and turbo boost disabled, so lower single-core and not the maximum multicore performance. steampowered.

  • old m-audio pci sound card

  • two kernels benched, official ubuntu 4.2.0-lowlatency (“LL”) and self-built 4.1.15-rt realtime kernel (“RT”)

  • base is xubuntu 14.04 system, some tweaks applied to ensure better lowlatency operation, but not tuned like a cnc driver

  • all operation on 64bits

  • only jackd operation has been measured. alsa proved weaker and unstable in its results - sometimes on par or close, but randomly giving much lower results when tests are repeated

  • x-runs occurred during tests (sometimes shortly after the beginning, at pattern 12 or so!), but this seems to be a renoise-specific problem - I’ve tested by listening to the drone the test produces, and could only hear real glitches just the moment before hitting the 90% mark

  • 256 pattern version was used, took ages to load and much much longer to close renoise (but renoise won’t crash), it made renoise consume 5.5 gigs of memory.

  • settings were chosen by the latency value QjackCtl presents, which is very close to a digital feedback roundtrip latency I once measured by making a loop in my soundcard, which has a hardware mixer in it. “100ms” is the buffer setting “2buffers2048samples@44.1kHz” and “10ms” is “2buffers256samples@44.1kHz”. But maybe that is only half of the latency, for playback alone; I’m unsure about it atm, and about what comparable settings win/mac people would use.
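
For reference, the “10ms” and “100ms” labels above can be reproduced from the buffer settings with simple arithmetic. A minimal Python sketch, assuming latency is just frames × buffers / sample rate (the function name is made up for illustration, and driver or converter delays are ignored):

```python
def buffer_latency_ms(frames_per_buffer: int, sample_rate_hz: float, num_buffers: int = 1) -> float:
    """Time needed to play num_buffers buffers of the given size, in milliseconds."""
    return 1000.0 * num_buffers * frames_per_buffer / sample_rate_hz


# The two jackd settings from the post, at 44.1 kHz
for frames in (256, 2048):
    one = buffer_latency_ms(frames, 44100)
    two = buffer_latency_ms(frames, 44100, num_buffers=2)
    print(f"{frames} samples: {one:.1f} ms per buffer, {two:.1f} ms for 2 buffers")
# 256 samples: 5.8 ms per buffer, 11.6 ms for 2 buffers
# 2048 samples: 46.4 ms per buffer, 92.9 ms for 2 buffers
```

So “10ms” is really about 11.6 ms and “100ms” about 92.9 ms when both buffers are counted.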

Results:

One Core:

“10ms”: 39/39 rpts LL/RT

“100ms”: 52/53 rpts LL/RT

Four Cores:

“10ms”: 136/137 rpts LL/RT

“100ms”: 183/185 rpts LL/RT

Six (full, no HT) Cores:

“10ms”: 199/200 rpts LL/RT

“100ms”: 86.9%/86.6% @ pattern 257 “tah, linux ruley”

So you see an old sixcore machine can make teh ruley if all cores are used, and it has blown through the 256 pattern test. Renoise will properly use any number of cores you feed it. You will need an even bigger test (or heavier dsp per channel) for newer generation sixcore cpus, maybe soon even for four-core operation. Also an rt kernel won’t be detrimental to the possible workload, but rather improves performance a bit through its better, much more aggressive task scheduling compared to the “soft” lowlatency kernel. Though the reason I installed it was simply to have an option to record drop-out-free at very low latencies (musician monitors mic/instrument with software effects while playing). Maybe I’ll try setting the machine to pure single core operation with turbo boost enabled (OS will see a single core only, too), or enable hyperthreading (12 cores, bwahaha, make amiga ruley even more!) for some fun.
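
To put a number on “Renoise will properly use any number of cores”, here is a rough Python sketch computing speedup and per-core efficiency from the LL kernel results above. It assumes the reached pattern count (RPTS) is roughly proportional to the DSP load the machine can carry, which nothing in this thread confirms; the helper name is hypothetical.

```python
# RPTS from the lowlatency ("LL") kernel runs above, keyed by number of cores
results_ll = {
    "10ms": {1: 39, 4: 136, 6: 199},
    "100ms": {1: 52, 4: 183},  # the six-core run hit the end of the test
}


def scaling(rpts_by_cores):
    """Return {cores: (speedup vs one core, efficiency per core)}."""
    base = rpts_by_cores[1]
    return {
        cores: (rpts / base, rpts / base / cores)
        for cores, rpts in rpts_by_cores.items()
    }


for label, rpts_by_cores in results_ll.items():
    for cores, (speedup, efficiency) in scaling(rpts_by_cores).items():
        print(f"{label} {cores} cores: x{speedup:.2f} speedup, {efficiency:.0%} per-core efficiency")
```

Under that assumption the six-core run at “10ms” comes out at roughly 5.1 times the single-core result, i.e. about 85% efficiency per core.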


(Ledger) #43

Nice! I didn’t understand half the parameters set up there but that is an impressive end result!

Shows that Passmark should be taken with a huge grain of salt with regard to audio performance, though I’m sure your careful optimizations made a big difference too.

http://www.cpubenchmark.net/compare.php?cmp[]=1275&cmp[]=2246

Intel Xeon W3690 @ 3.47GHz: 9,681
Intel Xeon E3-1231 v3 @ 3.40GHz: 9,619


(Zer0 Fly) #44

hm, I’ve read somewhere that a hyperthreading core has about half (or one third) of the throughput of a “real” core. But it depends a lot on what the other cores do at that moment. That’s why I disabled ht on mine - it might speed up rendering, but it is no good bet for latency sensitive stuff like realtime audio, and 6 cores are more than enough for me.

So your 4/8 cores result might be roughly comparable to my 6 core result? But only very roughly! To compare directly maybe we’d have to bench 3 full cores of mine vs 3 of yours, so both machines have free cores, with hyperthreading disabled on your pc. Or I try to enable HT on mine, and see a full bench - passmark should have measured performance with turbo boost and HT on with both chips. Then the values could be compared/scaled by the single core passmark scores. You can see the effect of having a newer cpu generation in our single core benchmark values. Likewise, intel chips can have more power than amd ones although the amd is running at a much higher clock. Also those cpu scores are probably made for raw computing throughput, and not for latency/responsiveness tasks.

Yes, and my machine is reasonably tuned, and with all that renoising activating those tunings I already fear the next electricity bill. It is also that you can really tune linux much more delicately and thoroughly than windows, and maybe even better than a mac. I also vaguely remember other programs where I read in the past that the linux versions had more performance, simply because the compiler for linux did a better job and the operating system was less bloated. For realtime performance, the operating system point probably weighs even heavier.

I’m looking forward to other linux/3.1 benchmark results now. Did someone already compare differences between renoise 3.0 and 3.1? I think there were changes in the EQ, and I’ve seen the EQ device is one part of the testing effect chains?


(Ledger) #45

Here we go:

Previous Best

Xeon 1231 v3

HYPER THREADING ENABLED (4 cores = 8 hyperthreaded cores)

Crimson unplugged and pattern ed out of view,

RAM in proper Dual Channel operation,

High Performance profile in win 7 64bit

AntiVirus Realtime still running


1 core 10ms = 51
8 cores 10ms = 156

1 core 100ms = 60
8 cores 100ms = 227


HYPER THREADING DISABLED

Crimson unplugged and pattern ed out of view,

RAM in proper Dual Channel operation,

High Performance profile in win 7 64bit

AntiVirus Realtime still running


1 core 10ms = 51 (=)
3 cores 10ms = 138 (n/a)
4 cores 10ms = 147 (-9)

1 core 100ms = 61 (+1)
3 cores 100ms = 170 (n/a)
4 cores 100ms = 208 (-19)


Edit: corrected: Wrong previous best posted, so HT disabled loses 8 RPTS on Max cores – not as much as I would have thought though.

OK, I’ve re-run the Hyperthreading enabled test in the same conditions with AV on. Funnily enough I got the best results so far (maybe windows was doing some housekeeping the first time or it was full moon or something :badteeth: … )
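
For anyone who wants the HT difference as a percentage rather than raw RPTS, a small Python sketch using the max-core figures as posted above (8 threads with HT vs 4 cores without). The helper name is made up, and it treats RPTS as a linear measure of capacity, which is an assumption.

```python
# Max-core RPTS from the post above
ht_on = {"10ms": 156, "100ms": 227}   # 8 hyperthreaded cores
ht_off = {"10ms": 147, "100ms": 208}  # 4 physical cores


def ht_gain_percent(rpts_on: float, rpts_off: float) -> float:
    """Relative gain of the HT-enabled run over the HT-disabled run, in percent."""
    return 100.0 * (rpts_on - rpts_off) / rpts_off


for label in ht_on:
    gain = ht_gain_percent(ht_on[label], ht_off[label])
    print(f"{label}: HT adds {ht_on[label] - ht_off[label]} RPTS ({gain:.1f}%)")
# 10ms: HT adds 9 RPTS (6.1%)
# 100ms: HT adds 19 RPTS (9.1%)
```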


(somemoron) #46

Here are my scores:

AMD FX-8350 (eight core) @ 4.00 GHz (stock)
Renoise 3.1.0 Win 7(64)

1 core (10ms): 30

8 cores (10ms): 97

1 core (100ms): 43

8 Cores (100ms): 149

I also have a laptop (rather old though)

Intel i7 L640@2.13ghz

Renoise 3.1.0 Win 7(64)

1 core (10ms): 21

4 cores (10ms): 42

1 core (100ms): 35

4 Cores (100ms): 64

Thanks for providing the benchmark Cactoos!


(iconoclast) #47

Intel i9-7940X, HyperThreading enabled, stock settings (3.1 GHz base, 4.3 GHz turbo boost 2, 4.5 GHz turbo boost 3, yadda yadda).

RME Fireface UFX via USB

Windows 10 Pro

1 CPU, 11 ms (512 samples)

50 (crackling starts at 24)

WTF is going on here. I’m not throttling either, temps are fine.

28 CPUs, 11 ms (512 samples)

256 (heard a faint crackling at 252) - maxed out

This benchmark, with all cores enabled, barely tickled the 7940X, one core was maxed at around 4-4.2 GHz whereas the other 13 were still at 1.2 GHz.

But I have to say the benchmark doesn’t really reflect the real world much, because the default settings with dynamic clocks on this CPU are pretty unusable in Renoise. I don’t know if it’s Windows 10, bad CPU scaling in Renoise (probably not, restricting it to 4C/8T like the 2600K doesn’t help), or the Fireface, or a combination. Needless to say I’m pretty sad that I get worse dropouts with this machine than with my old i7-2600K at the same buffer depth. Turning off all dynamic clock voodoo and fixing all 14 cores at 3.1 GHz improves things and gets me back to how my 2600K was, with the additional benefit that I can run 96 kHz without problems (not that I really want to), but that’s not really how these HEDT CPUs were designed to operate.


(spacedrone808) #48

AMD RYZEN 1700 STOCK FREQ/X370/64GB RAM/INTEGRATED AUDIO

Win7 x64 SP2+

2 small apps in background

All software settings set to recommended. Overload protection at 90%.

ReBench.xrns

1 core -> 48

16 core [8 real/8 ht] -> 161 reached the end with load 54%

ReBench - Version extended to 256 tracks and 1537 effects

1 core -> 44

16 core [8 real/8 ht] -> 204

AMD RYZEN 1700 3.8 GHz (+800 MHz over stock on a cheap air cooler)

Closed all background apps, but network is on + vpn active.

ReBench.xrns

1 core -> 54

16 core [8 real/8 ht] -> 161 reached the end with load 53%

ReBench - Version extended to 256 tracks and 1537 effects

1 core -> 50

16 core [8 real/8 ht] -> 240

Looks like AMD has become very dangerous to arrogant Intel.

I suppose that ThreadRipper is an even more dangerous beast.

“Insiders” should reconsider their insane pricing for nothing, ohh sorry for better 3d games performance per single core, but who needs that sort of stuff…

PS. THANKS GOES OUT TO CACTOOS FOR GREAT BENCH!


(spacedrone808) #49

aero disabling? really?? Is this still 1995 ?

Yeap, I disabled this crap also. The shiny GUI not only decreases performance, but actually takes away a bit of the working area of the screen.


(ffx) #50

Is aero still available in win10 as a service?


(spacedrone808) #51

No information. I am using Win7.


(pat) #52

mid-2015 MBP, 2.8 GHz i7, 8 cores, onboard sound, 10ms – 216 RPTS (it started stuttering a bit around 175 or so)

I upped it to 48 kHz and it’s 192 RPTS.

44.1 kHz and 93ms latency (max on my system) = 254 RPTS
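
The drop from 216 to 192 RPTS at 48 kHz is in the ballpark of what you would expect if DSP work per second simply grows with the sample rate. A back-of-the-envelope Python sketch under that assumption (and assuming RPTS is proportional to the load the machine can carry; the function name is hypothetical):

```python
def predict_rpts(rpts_at_base: float, base_rate_hz: float, new_rate_hz: float) -> float:
    """Estimate RPTS at a new sample rate, assuming DSP load scales linearly with the rate."""
    return rpts_at_base * base_rate_hz / new_rate_hz


estimate = predict_rpts(216, 44100, 48000)
print(f"estimated at 48 kHz: {estimate:.2f} RPTS, measured: 192 RPTS")
# estimated at 48 kHz: 198.45 RPTS, measured: 192 RPTS
```

The measured value is a little below the estimate, so the linear model is only a rough guide.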


(Man) #53

i7 6800K (6 cores) @ 3.4 GHz, 16 GB RAM, Win10 Pro 64-bit, ASIO driver with Komplete Audio 6 audio interface, sample rate 44100, buffer size 192 samples: 226 RPTS.

It seems a decent gaming pc makes for a decent Renoise pc as well.


(speedraver) #54

Thanks for the benchmark, that was an interesting one! I’ve used the 256 one on a Threadripper 1950X (16 cores / 32 threads) overclocked to 3.95 GHz with P-States disabled (always running at 3.95 GHz, no downclocking - most important for Renoise, as the CPU does not clock up when Renoise puts a load on it, for whatever reason, but others have reported this here too with other CPUs) and your recommended settings (DirectSound / 10ms, etc.).

32 threads (16 cores) @ 10ms DirectSound = 256 (CPU usage on last pattern is 50%)

1 thread @ 10ms DirectSound = 52

Cheeeers :slight_smile:


(ffx) #55

Nice, Threadripper seems to be a real multicore monster. But my single core performance still outperforms yours, which is kinda sad, since I use an i7 4770 (now already old). What kind of audio interface do you use, and did you set the windows power profile to “desktop”?


(speedraver) #56

@ffx Yes, naturally the i7s, especially newer ones (though generation 4 like yours should not really be faster, unless it’s overclocked), are a bit faster on a per-core basis (that’s the same in all other benchmarks too), but they have only a few cores, that’s the problem and the reason why I’ve gone with AMD this time (always been on Intel before, the last AMD I had was an Athlon XP, haha). Curious though, how many patterns can you do on a single core on yours at 10ms DirectSound? I was on an i7 3770K before, overclocked to 4.4 GHz, and the single core performance is exactly the same as on my new Threadripper 1950X (overclocked to 3962 MHz).

I don’t think it’s sad to have 16 real cores of this caliber, haha, and the single core performance on the TR 1950X is more than enough to run a single VSTi + a huge processing chain (this is the maximum that is concentrated on one core in the worst case, as Instrument -> Effects needs serial processing) even at 88200 Hz (that’s what I am using these days, as many VSTis sound better aliasing-wise than at 44100 Hz). But I can now run at least 16 VSTis + their processing on that CPU without any of them competing for processing power, plus the 16 extra HT threads which bring a little extra room too. In fact my current song (solely using VSTis and VSTs) goes to a max CPU load of 60% in the chorus section; that is with about 6 VSTis going, each with about 5 effects on its chain. And all of that for very nice bang for the buck on that CPU, compared to what Intel has to offer right now. I don’t game on this machine, so more cores = more VSTis is what it is for me, regardless of whether the single cores are a bit slower.

You could also go for an AMD Epyc with 32 cores and 64 threads; the single core performance is even slower (as it is with Intel’s Xeons), but it would again fit my usage scenario and would allow me even more plugin instances. For now, though, I am rather happy, as I am not maxing out my current setup at all.
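
The point about serial Instrument -> Effects chains can be illustrated with a toy model: each track is one indivisible job, so more cores help with many tracks but never with a single heavy one. This Python sketch is only an illustration with made-up per-track costs and a simple greedy assignment; it is not how Renoise actually schedules its work.

```python
import heapq


def per_core_load_ms(track_costs_ms, cores):
    """Greedily give each whole track (heaviest first) to the least loaded core."""
    loads = [0.0] * cores
    heapq.heapify(loads)
    for cost in sorted(track_costs_ms, reverse=True):
        lightest = heapq.heappop(loads)
        heapq.heappush(loads, lightest + cost)
    return sorted(loads, reverse=True)


def fits(track_costs_ms, cores, buffer_ms):
    """True if the busiest core finishes its tracks within one audio buffer."""
    return per_core_load_ms(track_costs_ms, cores)[0] <= buffer_ms


# Hypothetical per-buffer processing cost of six tracks, in milliseconds
tracks = [4.0, 3.0, 3.0, 2.0, 2.0, 1.0]
print(fits(tracks, 1, 5.8))   # False: 15 ms of work on one core
print(fits(tracks, 4, 5.8))   # True: busiest core carries 4 ms
print(fits([7.0], 16, 5.8))   # False: one serial chain cannot be split
```

In this model, single-core speed sets the limit for the heaviest chain, and core count sets the limit for how many such chains can run side by side.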

No, I am using my custom power plan which, apart from having all energy savings turned off, has some extra settings that usually can’t be changed at all, like how many cores go into core parking and under which load, which strategy is used for re-activating them, etc. I am a sysadmin in real life and do this stuff day in and day out, so I am running a highly customized Win 10 with almost everything ripped out (www.ntlite.com) too. Anyhow, P-States are disabled too for Renoise, hence the CPU always runs at 3962 MHz (the max I could get without having to boost the voltage EXTREMELY, which I did not want as I wanted to keep this as energy efficient as possible). So that should not stop it from reaching higher results. I also specifically selected DirectSound and 10ms; with ASIO and 7ms I can make it to pattern 56, not a huge difference though.

My audio interface is my old beloved E-MU 0404 USB. I never found a more modern interface with better converters and op-amps (in or out) yet, and RightMark Audio Analyzer comes to the same conclusion as well :-) OK, some current cards from Lynx are on par, but cost $1000, while this one cost me $70 on eBay 3 years ago ;) Here are the RightMark results for mine:

Summary

Frequency response (from 40 Hz to 15 kHz), dB: +0.02, -0.06 (Excellent)
Noise level, dB (A): -113.4 (Excellent)
Dynamic range, dB (A): 113.4 (Excellent)
THD, %: 0.0009 (Excellent)
THD + Noise, dB (A): -97.2 (Excellent)
IMD + Noise, %: 0.0012 (Excellent)
Stereo crosstalk, dB: -109.9 (Excellent)
IMD at 10 kHz, %: 0.0013 (Excellent)
General performance: Excellent

I was using a Native Instruments Komplete Audio 6 interface before, which is much more current and hence should be much better (you’d think). The results there looked like this:

Summary

Frequency response (from 40 Hz to 15 kHz), dB: +0.03, -0.25 (Very good)
Noise level, dB (A): -94.1 (Very good)
Dynamic range, dB (A): 94.1 (Very good)
THD, %: 0.011 (Good)
THD + Noise, dB (A): -74.7 (Average)
IMD + Noise, %: 0.023 (Good)
Stereo crosstalk, dB: -91.6 (Excellent)
IMD at 10 kHz, %: 0.018 (Very good)
General performance: Very good


(ffx) #57

Hm, then I think the surprising difference for one core is caused by the type of audio interface: USB still has a serious amount of cpu overhead, while I use firewire (on macos), which has very little overhead. I set my cpu timing to “turbo” in the bios; I don’t know if that means overclocked, but the OS shows the typical frequencies for this cpu type.

I think next time I really will buy such an AMD cpu. Also supported by hackintosh btw.


(speedraver) #58

Hehe, who knows what they come up with next at AMD… And really, it is not surprising at all; as I’ve mentioned, the single core speed of the TR 1950X is equal to the i7 4XXX CPUs, just that you get 16 cores + 16 HT threads on the TR 1950X compared to “just” 4 or 6 (8 / 12) on the i7 CPUs :slight_smile: Also, there are differences in how efficiently the compiler translates certain instructions for the CPU (and AMD Threadripper is not fully supported by the compilers yet, so as soon as this happens, speeds will get even better too). But you can keep anything *tosh away from me, I am all Windows and hate Macs + Apple, haha :slight_smile:


(gezmond) #59

Win 10 64 bit - Renoise 3.1 - AMD RYZEN 5 1600 STOCK

ReBench - Version extended to 256 tracks and 1537 effects

10ms latency

1 CORE - 47 RPTS

6 CORES - 214 RPTS

100ms latency

1 CORE - 73 RPTS

6 CORES - Ta! Amiga Rulez!!! 59% CPU usage


(Borodin) #60

Ryzen 2600 at stock, 16 GB RAM at 2993 MHz

Win 10 64bit - Renoise 3.1

ReBench 256

Direct Sound:

10ms:

Single Core - 48 RPTS

6 Core - 194 RPTS

100ms:

Single Core - 77 RPTS

6 Core - Ta! Amiga Rules!!! (Max CPU 56%)

Babyface Pro ASIO

256 samples (5.8ms):

Single Core - 57 RPTS

6 Core - 256 RPTS (hit 90% on last pattern!)

512 samples (11.6ms):

Single Core - 71 RPTS

6 Core - Ta! Amiga Rulez!!! (Max CPU 73%)

Thanks to Cactoos for the great benchmark!