


[fixed rc1] Linux: Cpu peak with LP Moog when automating resonance


  • This topic is locked
12 replies to this topic

#1 kytdkut

kytdkut

    Member

  • Normal Members
  • PipPip
  • 62 posts
  • Gender:Not Telling
  • Location:Buenos Aires

Posted 10 December 2015 - 01:30

I don't know if any of you will be able to replicate this, but please see the attached song.

I'm running Linux x64. Using ALSA instead of JACK causes the same issue.

 

This CPU peak results in one or more xruns, which is how I noticed it.

 

 

Notice how there is a valley between each resonance automation gesture in the attached song. If you erase that valley and leave only the up-down gesture, no xrun occurs.

 

Also, using any other filter works as intended.

 

Hope I have described this issue successfully.

 

 

Cheers!




#2 taktik

taktik

    Renoise Developer

  • Admins
  • PipPipPipPipPipPipPipPipPipPipPipPipPipPipPipPip
  • 15040 posts
  • Gender:Male
  • Location:Berlin, Germany
  • Interests:füße waschen

Posted 10 December 2015 - 10:40

Can't see a peak here, but automating filter parameters costs a bit of CPU time too. Maybe just enough to lead to xruns in your specific setup?

Can anyone else replicate this with a different setup?

Btw: If you want to lower the CPU usage for this specific instrument, don't set the Analog Filter device's oversampling rate to 8x, but keep it at 2x or disable it. An oversampling amount of 8x is pretty hefty and usually not worth it.

#3 taktik

taktik

    Renoise Developer

  • Admins
  • PipPipPipPipPipPipPipPipPipPipPipPipPipPipPipPip
  • 15040 posts
  • Gender:Male
  • Location:Berlin, Germany
  • Interests:füße waschen

Posted 10 December 2015 - 11:57

Oh, and we did some tweaks for exactly this kind of automation for b6 too. So if you are still running b5 or older betas, please try this again with b6.

#4 Meef Chaloin

Meef Chaloin

    Big Masta Member

  • Normal Members
  • PipPipPipPipPipPipPipPipPip
  • 524 posts
  • Gender:Male

Posted 10 December 2015 - 12:45

I think I can replicate this: I hear a glitch when the valleys occur. I don't know if they are xruns though, as JACK isn't reporting them as such.

Linux 64 bit, b6.



#5 kytdkut

kytdkut

    Member

  • Normal Members
  • PipPip
  • 62 posts
  • Gender:Not Telling
  • Location:Buenos Aires

Posted 10 December 2015 - 14:05

My setup is OK (i3 370m). I've never had problems loading lots of VSTs, or even running JACK with a 16-sample buffer, which I do frequently. Using a 512-sample buffer stops the xruns from happening.

I've recorded a screen capture showing a DSP load comparison between filters in my setup: https://drive.google...YUQ5ZFc2QlpTY0U (download, it's only 3 megs)

 

 

Thanks!

 

PS: removed the analog filter in the effects section

 

 

Edit: forgot to add that I'm using b6.


Edited by kytdkut, 10 December 2015 - 15:35.


#6 Meef Chaloin

Meef Chaloin

    Big Masta Member

  • Normal Members
  • PipPipPipPipPipPipPipPipPip
  • 524 posts
  • Gender:Male

Posted 10 December 2015 - 23:38

The glitches I'm getting are much more pronounced than that. I tried to record the output of Renoise into Audacity, but strangely the glitches don't appear in the recording. There are no glitches if I render it either. Cadence shows no xruns, I am using conservative JACK settings, and the CPU in Cadence only peaks around 60%. I couldn't replicate it with any filter other than the LP Moog. I noticed that changing the release and sustain both had an effect on the glitches; short times made the glitches go away completely.

 

I'm quite confused about it. It definitely seems to be specific to the LP Moog, but I can't work out why I can't record it.



#7 taktik

taktik

    Renoise Developer

  • Admins
  • PipPipPipPipPipPipPipPipPipPipPipPipPipPipPipPip
  • 15040 posts
  • Gender:Male
  • Location:Berlin, Germany
  • Interests:füße waschen

Posted 11 December 2015 - 11:07

Thanks for the details and screen capture. Pretty weird that it indeed only seems to happen with the Moog LP.

Still can't replicate this here on Windows, but will do a few more tests on Linux - maybe it's some Linux specific code generation quirk.

#8 neopan

neopan

    Advanced Member

  • Normal Members
  • PipPipPip
  • 84 posts

Posted 12 December 2015 - 15:31

I can replicate this too. However, if I set the lower automation points to 8% instead of 0%, there are no crackles. It seems to happen only when the automation gets close to 0% (this is on a weak dual-core 1.2 Chromebook with JACK/Debian, though, but with no other xruns).



#9 taktik

taktik

    Renoise Developer

  • Admins
  • PipPipPipPipPipPipPipPipPipPipPipPipPipPipPipPip
  • 15040 posts
  • Gender:Male
  • Location:Berlin, Germany
  • Interests:füße waschen

Posted 14 December 2015 - 17:20

Turned out to be a Linux x86_64 problem only. Worth explaining what's going on here - among all the "usual bugs" this one is pretty rare and shiny, from a technical point of view ;)

Problem is that glibc's "pow" function gets really slow (really really goddamn slow), depending on which values you feed it with (well-formed floating point numbers, not denormal numbers or such).

When automating resonance in Renoise's Moog LP the following statement is applied along the line:

pow(1.0 + r*4.0, 0.45); // with 0 <= r <= 1

If r goes down to zero, which happens in this example song, pow's first parameter goes down to 1 and glibc's x86_64 pow implementation starts getting slower and slower. Extraordinarily slow at some point, which then leads to xruns.

Here's someone with the same problem, with examples and some more info: http://entropymine.c...rsener/slowpow/

EDIT: Oh it's actually documented now http://man7.org/linu...an3/pow.3.html:

On 64-bits, pow() may be more than 10,000 times slower for some
(rare) inputs than for other nearby inputs. This affects only pow(),
and not powf() nor powl().


We can of course work around this in this special case, but this is a general problem with pow on Linux then, so it needs a general solution. Will do more tests and research here...



#10 ffx

ffx

    Composes without Wires burns Directly from Brain to DVD that is already in Store Member

  • Normal Members
  • PipPipPipPipPipPipPipPipPipPipPipPipPipPip
  • 3490 posts
  • Gender:Not Telling

Posted 14 December 2015 - 17:42

But surely not OS X too, since it uses the same compiler?


Test system: macOS 10.13.4, HFS+. Firewire Audio, i7 4770, 8GB Ram, GTX1050 2GB, 48kHz


#11 kytdkut

kytdkut

    Member

  • Normal Members
  • PipPip
  • 62 posts
  • Gender:Not Telling
  • Location:Buenos Aires

Posted 14 December 2015 - 19:46

Awesome. It stops happening if you change the bottom value to 0.001 instead of 0.000 either in the macro or in the automation. I can't hear a difference between 0.000 and 0.001.
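
In code terms, this user-side workaround amounts to clamping the resonance value before it reaches the filter. A minimal sketch, assuming a hypothetical clamp_resonance helper and taking 0.001 as the floor simply because that is the value that worked here; it is not known whether this avoids every slow input:

```c
/* Lowest resonance value passed on to the filter. 0.001 is the value
   reported to work in this post, not a verified safe bound. */
#define RESONANCE_FLOOR 0.001

/* Hypothetical helper: keeps r inside [RESONANCE_FLOOR, 1.0]. */
static double clamp_resonance(double r)
{
    if (r < RESONANCE_FLOOR)
        return RESONANCE_FLOOR;
    if (r > 1.0)
        return 1.0;
    return r;
}
```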

 

Thanks for the detailed info!



#12 Zer0 Fly

Zer0 Fly

    Guruh Motha Fakka is Levitating and Knows Everything About Renoise Member

  • Normal Members
  • PipPipPipPipPipPipPipPipPipPipPipPipPip
  • 1257 posts
  • Gender:Not Telling
  • Location:Am Mind
  • Interests:Buddha

Posted 15 December 2015 - 12:23

Strange that you seem to use the generic (very precise, but slow) libm versions instead of taking special measures for such code. I always thought that in DSP, replacing or overloading the standard math library functions with faster or custom ones would be one of the earlier steps to take to raise performance in general. For example: overloading with the float versions (i.e. powf, and then using the builtin), a wrapper library compiled with "-ffast-math", or even better, pure intrinsics/builtins and/or a special SSE math library.

 

I feel for you, because if you now take such a step on a larger scale, every DSP using it needs to go back on the test bench to check for possible precision issues, since the faster versions of the math functions treat precision rather laxly. That is why it should be done early. But do it: it is worth it, and simple to do apart from the additional stability and sound quality testing! Not only for fixing this bug, but for a general performance boost if done right. Overtime under the Christmas tree?



#13 taktik

taktik

    Renoise Developer

  • Admins
  • PipPipPipPipPipPipPipPipPipPipPipPipPipPipPipPip
  • 15040 posts
  • Gender:Male
  • Location:Berlin, Germany
  • Interests:füße waschen

Posted 15 December 2015 - 12:29

We can't optimize everything just in case, only the relevant things. We're already using -ffast-math here and are also already using specialized math libraries for many DSP tasks. But most such "speedups" are done via approximations, which usually is not what you want to use by default but only when "necessary". This problem with glibc's pow (Linux x86_64 only) is just weird and a very special case.