Renoise tools - speed optimization initiative


12 replies to this topic

#1 joule

joule

    Guruh Motha Fakka is Levitating and Knows Everything About Renoise Member

  • Normal Members
  • 1491 posts
  • Gender:Not Telling
  • Location:Sweden
  • Interests:music, philosophy, engineering

Posted 02 February 2016 - 11:02

Maybe I am the only one of us that writes pretty bad code, but wouldn't it be a good idea to gather some information on Lua speed optimizations in general, and for Renoise in particular?

 

Below are a few resources I have found on optimizing speed in Lua. Feel free to post more links and advice: things you have learnt, or just find important to share, about speed optimization and its finer details, in Lua in general or in the Renoise API in particular.

 

http://www.lua.org/gems/sample.pdf

http://stackoverflow...f-a-lua-program

http://lua-users.org...ationCodingTips

https://springrts.co...Lua_Performance

 

What do you believe is the most abused/non-optimized task in Renoise tools? For my own tools I would suspect that it is pattern iteration and searching for specific pattern data by iterating through tables. (I hereby pledge to revise some of my tools with regard to this :)


Edited by joule, 02 February 2016 - 11:16.


#2 fladd

fladd

    Guruh Motha Fakka is Levitating and Knows Everything About Renoise Member

  • Normal Members
  • 1225 posts
  • Gender:Male
  • Location:The Netherlands

Posted 02 February 2016 - 11:30

Quoting joule: "Maybe I am the only one of us that writes pretty bad code"

 

You have seen my tools, right? :-)



#3 joule


Posted 02 February 2016 - 11:39

I haven't looked at any of your code, and maybe I shouldn't then :)

 

Ugliness aside, naturally what you would want to focus the most on is to optimize the loops. This evening, I will have a go at trying to dump pattern data to strings and parsing/searching from there instead of the simple way by intensively accessing tables. I'll post any results if beneficial.



#4 ffx

ffx

    Guruh Motha Fakka is Levitating and Knows Everything About Renoise Member

  • Normal Members
  • 2936 posts
  • Gender:Not Telling
  • Interests:macOS fanboying

Posted 03 February 2016 - 10:56

Nah, I don't think code optimization within Lua will bring much benefit. IMO it's a total waste of time.

Optimization can be done much more effectively on the side of Renoise's API, the Lua interpreter, the XML song parser or the GUI.

There are a lot of examples where you see that the Lua API in Renoise is really slow. Examples:

- Building context menus: the GUI slows this trivial operation down, in extreme ways.
- Advanced edit operations over the whole song, or changing the instrument number order. You will notice serious lags here.
Etc.

Edited by ffx, 03 February 2016 - 10:58.


MacOS 10.12.6 Retina, Renoise 3.1 64 bit   -   Tuned Shortcuts | Multi-Jump From/To Send | Quick Template | Insert Native DSP Menu (incl. deprecated)


#5 joule


Posted 03 February 2016 - 11:00

My findings:

 

* renoise.song().pattern_iterator:lines_in_song() is much "faster" than renoise.song().pattern_iterator:note_columns_in_song(). I would prefer using lines_in_song() if application permits.

 

* Making a custom pattern iterator seems to be a better choice than using the renoise pattern_iterator.

local x = os.clock()

function test() -- standard iteration
  local iter = renoise.song().pattern_iterator:lines_in_song()
  for _, val in iter do
    -- do nothing
  end
end

function test2() -- custom iteration, optimization commented out
  for seq, pattern_index in ipairs(renoise.song().sequencer.pattern_sequence) do
    for track_index = 1, renoise.song().sequencer_track_count do
      -- optimization: uncomment this line and the matching "end" to skip empty tracks
      -- if not renoise.song():pattern(pattern_index):track(track_index).is_empty then
      for line_index, line in ipairs(renoise.song():pattern(pattern_index):track(track_index).lines) do
        -- do nothing
      end
      -- end
    end
  end
end

--test()
test2()

print(string.format("elapsed time: %.2f\n", os.clock() - x))

The custom iteration also allows for further optimization (like the check I've commented out in test2()), if the application permits. Depending on the application, you might also want to optimize further by filtering out track types (groups, sends, master).

 

Benchmarks on my system (song: Medievil Music - Access Pwd.xrns)

Standard iteration: 1.59s

Custom iteration: 1.36s

Custom iteration optimized: 0.62s

 

Does anyone disagree? :)


Edited by joule, 03 February 2016 - 11:12.


#6 danoise

danoise

    Probably More God or Borg Than Human Member

  • Renoise Team
  • 6419 posts
  • Gender:Male
  • Location:Berlin
  • Interests:wildlife + urban trekking

Posted 03 February 2016 - 13:28

To focus on performance is a very nice idea. I'm happy to share my own findings :-)

 

1. If you need to random access note/effect columns, do it via methods 

Documentation clearly states that random access to the following objects is inefficient: 

renoise.song().patterns[].tracks[].lines[].note_columns[index]
renoise.song().patterns[].tracks[].lines[].effect_columns[index]
renoise.song().instruments[].phrases[].lines[index]
renoise.song().patterns[].tracks[].lines[index]

The reason is that, when asked for an item within an array, the API first needs to construct the whole array and return it, after which you select a single entry.

In such a simple case, it's much better to access the object via its associated getter method: 
renoise.song().patterns[].tracks[].lines[]:note_column(index)
renoise.song().patterns[].tracks[].lines[]:effect_column(index)
renoise.song().instruments[].phrases[]:line(index)
renoise.song().patterns[].tracks[]:line(index)

2. Schedule your updates 

 

This also happens to be the golden rule of the Lua Gems book: Performance tip #1: "Don't do it", followed by #2: "Do it later". 

I do this throughout all my realtime-performing tools - it's not hard, just involves a bit more code. 
 
Basically, what you need to do is to centralize the updates that your tool needs to perform, and call this method through the idle notifier.
For instance, a simple tool might have an update method which sets user interface components to their most current state. 
A more advanced tool could split the update method into several smaller bits, making each update a lighter task to perform. 
 
The trick is that you raise a flag (a boolean variable) when the update needs to happen.
For example, the flag could be raised by specifying something like this: update_requested = true
 
Then, within a fraction of a second, the idle notifier will pick this up, call the update method and reset the flag:
if (update_requested) then
  -- do something cool, then reset flag 
  update_requested = false
end

 

The advantage is that you could ask for updates any number of times between idle notifiers, but the scheduled update will only happen once.

 

This approach can also be adapted to situations where a notifier gets triggered very often. For example, the line notifier famously floods you with messages when you e.g. clear a pattern - one message for each line in each track. This can easily get into the hundreds - you would definitely want to schedule/delay the response in such a case.
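As a rough illustration of the flag-based scheduling described above, here is a minimal plain-Lua sketch. The names request_update, update and on_idle are hypothetical, and the idle notifier is only simulated by calling on_idle() by hand; in an actual tool the handler would presumably be attached via renoise.tool().app_idle_observable:add_notifier(on_idle).

```lua
-- Minimal sketch of a flag-driven update scheduler (hypothetical names).
-- The idle notifier is simulated here by calling on_idle() manually.

local update_requested = false
local update_count = 0

-- Cheap: may be called any number of times between idle ticks.
local function request_update()
  update_requested = true
end

-- The expensive work goes here (e.g. refreshing the user interface).
local function update()
  update_count = update_count + 1
end

-- Idle handler: performs at most one update per tick, then resets the flag.
local function on_idle()
  if update_requested then
    update_requested = false
    update()
  end
end

-- Simulate a flood of notifications, e.g. one per cleared line.
for _ = 1, 512 do
  request_update()
end

on_idle() -- the flood collapses into a single update
on_idle() -- nothing is pending, so nothing happens

print(update_count)
```

However many requests arrive between two idle ticks, the costly update runs only once per tick.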

 

3. Cache Renoise API methods and objects when possible

 

It's a fact that the Lua language itself is extremely fast. Same is true for C++. The bottleneck is mostly between the two: everything has to be looked up, converted by luabind. 

So, another way to improve the performance of scripts is to cache those objects that don't change during the execution of a script. 

 

For example, the call to renoise.song() is obviously a method call. And yes, this comes with a cost, albeit a very small one. 

I'm pretty sure you wouldn't see much of an impact with simple tools. But the larger a tool becomes, the more calls need to be made, and things start to accumulate.

 

For this reason, it's recommended that you avoid code like this:

for i = 1,100 do
  if(renoise.song().patterns[i]) then
    -- there was a pattern here
  end
end

It can very easily be changed into:

local rns = renoise.song()
for i = 1,100 do
  if(rns.patterns[i]) then
    -- there was a pattern here
  end
end

Some of my own tools have taken this to the extreme and contain just a single reference to the Renoise song object, used throughout the tool. 

 

But this comes with a different kind of complexity as you'd then have to watch out that you don't invoke any anonymous code that contain such references.

Why? Because the references will become invalid the moment that the song (or whatever you were referencing) is gone.

 

If you have worked with notifiers, you will know that referencing objects can be tricky - making sure that notifiers are always suppressed or removed when their associated objects have gone. 
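One way to keep a single cached song reference valid is to refresh it whenever a new song is loaded. The sketch below is plain Lua with hypothetical stand-ins: get_song() plays the role of renoise.song(), and on_new_document is called by hand. In an actual tool the handler would presumably be attached with renoise.tool().app_new_document_observable:add_notifier(on_new_document).

```lua
-- Hypothetical stand-ins: current_song plays the role of the loaded song,
-- and get_song() the role of renoise.song().
local current_song = { name = "first song" }
local function get_song()
  return current_song
end

-- The single cached reference used throughout the tool.
local rns = get_song()

-- Refresh the cached reference whenever a new song is loaded, so that
-- no code keeps working with the old, invalid object.
local function on_new_document()
  rns = get_song()
end

-- Simulate the user loading another song.
current_song = { name = "second song" }
on_new_document()

print(rns.name)
```

Any code that captured the old reference in a closure would still need the same treatment, which is the complexity mentioned above.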

 

4. Profile your code 

 

Urgh, this is the one aspect of programming that I find intensely boring. I actually prefer writing unit tests.

But yeah, it's the scientific approach  ;)  
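For simple cases, profiling does not need more than os.clock(), as used in the benchmarks earlier in this thread. The helper below is a minimal sketch with a hypothetical name (time_it); the two workloads are arbitrary examples and not taken from any tool.

```lua
-- Hypothetical helper: runs fn `runs` times and returns the CPU time used.
local function time_it(label, runs, fn)
  local start = os.clock()
  for _ = 1, runs do
    fn()
  end
  local elapsed = os.clock() - start
  print(string.format("%s: %.3f s for %d runs", label, elapsed, runs))
  return elapsed
end

-- Arbitrary example data: 1000 short strings.
local parts = {}
for i = 1, 1000 do
  parts[i] = tostring(i)
end

-- Candidate 1: join all strings in one call.
local t_concat = time_it("table.concat", 100, function()
  return table.concat(parts)
end)

-- Candidate 2: append the strings one by one.
local t_append = time_it("repeated ..", 100, function()
  local s = ""
  for i = 1, #parts do
    s = s .. parts[i]
  end
  return s
end)
```

Measuring both candidates on the same data, in the same session, is what makes the numbers comparable.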

 

--

 

 

Quoting joule: renoise.song().pattern_iterator:lines_in_song() is much "faster" than renoise.song().pattern_iterator:note_columns_in_song()

 

Well, they are different - the lines_in_song one doesn't go as "deep" as the "note_columns_in_song". 

I would be surprised if the latter one performed worse than making the first one do what the second one does. 

 

But you have a point with the "custom iterator", if you know exactly what kind of task you set out to do you can optimize the hell out of it. 

Reminds me a bit of ffx' idea for a "jquery-alike" syntax... but I am sure he would find it too slow  :lol:

 

Edit: 


Quoting ffx: "There are a lot of examples where you see that the Lua API in Renoise is really slow"
 
You're right that in some cases, optimizing lua code is not the way forward. But when creating a gazillion menu entries, the API is not to blame - this is simply Renoise being slow in doing this particular type of task. I would view this as an opportunity to optimize Renoise - especially as this particular type of UI update can't be scheduled (unlike others, such as updating the status bar) 
 
Another way of thinking about improving API performance is to consider which things could be pulled into Lua, processed there, and then delivered back once done. Most obviously, this is true when processing samples - it's done frame by frame. If we could instead process chunks, we could avoid the overhead caused by the Lua / C++ bindings. 
 
Even with something like LuaJIT (which otherwise promises a massive increase in speed), this overhead makes that increase not so overwhelming - unless you are doing "pure" Lua stuff. 

Edited by danoise, 03 February 2016 - 13:48.


Tracking with Stuff. API wishlist | Soundcloud


#7 joule


Posted 18 November 2016 - 16:04

Regarding ipairs iteration, something that might be worth mentioning is that example 1 is faster than example 2.

 

Note that this doesn't seem to be true for renoise.song() tables (.lines, .note_columns, etc.). I am guessing that iteration over these is customized and optimized behind the scenes. Otherwise:

 

Example 1:

local my_table = { 1, 42, 3, 7, 10 }
for i = 1, #my_table do
  local val = my_table[i]
  -- code
end

Example 2:

local my_table = { 1, 42, 3, 7, 10 }
for i, val in ipairs(my_table) do
  -- code
end
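To check this on your own machine, a small benchmark along the following lines can be used. It is a sketch in plain Lua: the table size N and the function names are arbitrary choices, and the timings will vary between systems and Lua versions.

```lua
-- Compare a numeric for loop with ipairs on a plain Lua table.
local N = 1000000
local my_table = {}
for i = 1, N do
  my_table[i] = i
end

-- Example 1 style: numeric for loop with explicit indexing.
local function sum_numeric(t)
  local sum = 0
  for i = 1, #t do
    sum = sum + t[i]
  end
  return sum
end

-- Example 2 style: ipairs iteration.
local function sum_ipairs(t)
  local sum = 0
  for _, val in ipairs(t) do
    sum = sum + val
  end
  return sum
end

-- Returns the elapsed CPU time and the function's result.
local function measure(fn, t)
  local start = os.clock()
  local result = fn(t)
  return os.clock() - start, result
end

local t1, r1 = measure(sum_numeric, my_table)
local t2, r2 = measure(sum_ipairs, my_table)
print(string.format("numeric for: %.3f s, ipairs: %.3f s", t1, t2))
```

Both functions compute the same sum, so any difference in time comes from the iteration style alone.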


#8 joule


Posted 29 November 2016 - 10:46

I found something recently that is worth mentioning regarding line iteration (scanning for stuff) (again!). This can speed up some things pretty drastically when doing heavy iterations:

 

If your application accesses more than one property within a pattern line (scanning more than one note column, or searching for values in the whole song), strongly consider using :lines_in_range(). The tostring(line) result can be used with string pattern matching for very fast scanning of pattern data, compared to the standard nested iterations. You access one string instead of iterating over all note and effect columns.

 

This can be a huge optimization under the right circumstances and has perhaps been a bit neglected (?). A code example can be seen here: http://forum.renoise...e-2#entry352478

 

(The reason why it's faster is, I guess, that native tostring methods in the song object are not just simple Lua metamethods, but internal C++ functions that are a lot faster than a Lua table concatenation would be, for example. So it's often worth taking advantage of when possible. Perhaps there are also other usable "C++ metamethods" that have been neglected and can be exploited for optimizations?)
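The scanning idea can be sketched in plain Lua as follows. The strings in the lines table are hypothetical stand-ins for tostring(line) output (the real column layout produced by Renoise may differ), and find_lines is an illustrative helper, not part of the API.

```lua
-- Hypothetical line strings standing in for tostring(line) output;
-- the real column layout produced by Renoise may differ.
local lines = {
  "C-4 01 40 .. .. | --- .. .. .. ..",
  "--- .. .. .. .. | --- .. .. .. ..",
  "OFF .. .. .. .. | D#5 02 .. .. ..",
}

-- Returns the indices of all lines whose text contains `needle`.
-- The fourth argument (true) makes string.find do a plain substring
-- search, so characters like "-" and "." are not treated as patterns.
local function find_lines(line_strings, needle)
  local hits = {}
  for i = 1, #line_strings do
    if string.find(line_strings[i], needle, 1, true) then
      hits[#hits + 1] = i
    end
  end
  return hits
end

local note_offs = find_lines(lines, "OFF")
local d_sharp = find_lines(lines, "D#5")
print(#note_offs, #d_sharp)
```

One string search per line replaces a nested loop over every note and effect column, and the column objects only need to be touched for the lines that actually match.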


Edited by joule, 29 November 2016 - 15:02.


#9 joule


Posted 03 February 2017 - 12:20

Regarding caching (point 3 in Danoise's post, taken to the extreme):

 

An interesting technique I've tried speeds up API pattern data read access by approx. 10 times. Theoretically, you can quite easily cache the whole song data in a friendly format and update the cache dynamically with notifiers (negligible overhead, unless every tool starts using this technique to speed up any read access...)

 

For now, this technique is practically very useful for example in realtime tools reading heavily on specific tracks (iterating for voices, chords...), as all read access will happen in internal lua tables/objects that are 100% synced to the actual song data. Just replace any read access to target the cache instead of renoise.song().
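The general shape of such a cache can be sketched in plain Lua. This is not the framework mentioned in this post: the source table is a hypothetical stand-in for slow song data, and on_line_changed stands in for a handler that a real tool would attach to something like a pattern's line notifier.

```lua
-- `source` is a hypothetical stand-in for song data that is slow to read.
local source = { "C-4", "---", "E-4" }

-- Build the cache once, up front.
local cache = {}
for i = 1, #source do
  cache[i] = source[i]
end

-- Notifier handler: re-reads only the entry that changed.
local function on_line_changed(index)
  cache[index] = source[index]
end

-- Simulates an edit to the song followed by its change notification.
local function write_line(index, value)
  source[index] = value
  on_line_changed(index)
end

write_line(2, "D-4")

-- All read access now goes to the cache instead of the slow source.
print(cache[1], cache[2], cache[3])
```

The cache stays in sync as long as every change to the source reaches the handler, which is why the notifiers have to cover all the data being cached.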

 

PS. Again.. it would be great if tools could access each others environments. We could then make a general speed-up tool, acting as a "dependency", that would cache objects for other tools. Anyhow, I've made a simple framework for pattern data in case anyone is interested.


Edited by joule, 03 February 2017 - 12:21.


#10 Raul (ulneiz)

Raul (ulneiz)

    Guruh Motha Fakka is Levitating and Knows Everything About Renoise Member

  • Normal Members
  • 1011 posts
  • Gender:Male
  • Location:Spain

Posted 09 August 2017 - 14:01

I have consulted this topic several times with some hope. What ffx says is true. Some parts of the API are slow when operating across many patterns at once. Take 512 lines per pattern and multiply that by a high number of patterns, say 500 or 700, and you will see that many iteration functions become much slower. For example, deleting the entire volume column of the selected note column, or all the volume columns of an entire track.

 

However, while building tools, I have noticed marked improvements depending on the code used. I mean that a function can be written better to make it run faster, especially where iteration is concerned.

 

So while it sounds obvious, both things are needed: optimizing the Renoise API in certain cases, and also using better-optimized "Lua tricks" to improve performance.
 
On the other hand, I just want to share an impression. With 100 or 200 KB of polished code, a tool can offer a lot of controls or specific functions. It is very lightweight. When I did not know much about Lua, I thought that a good tool needed a lot more code, and therefore many more KB.
 
But what I want is one thing, and what I expect is something very different (I think some of you feel the same way). I have been reading many topics on the forums that have been ignored over and over again. There are many things...
 
This seems to be a topic addressed directly to taktik. He is not here. But it would be great to have more examples of optimization, because they really do their job when it comes to building better tools.

Development of my tool: GT16-Colors
My API wishlist R3.1 (updated 24 July 2017)
My Renoise 3.1 wishlist (updated 26 September 2017)

#11 Ledger

Ledger

    Guruh Motha Fakka Knows More About Renoise Than Taktik

  • Normal Members
  • 3379 posts
  • Gender:Male

Posted 13 August 2017 - 14:05

 
Just working my way through this thread. I added another loop, which is just a standard Lua for loop rather than ipairs. It seems to be just as quick with the optimization of weeding out empty tracks.
 
I also modified the code so that the tests run one after the other, resetting x to os.clock() at the start of each function.
 
I never really liked the ipairs syntax, but maybe I'm missing out on some advantages by not using it?

local x -- os.clock()

------------------------------------------------------------------
------------------------------------------------------------------
function test() -- standard iteration
  x = os.clock() -- reset x (clock stamp start)
  local iter = renoise.song().pattern_iterator:lines_in_song()
  for _, val in iter do
  -- do nothing
  end
  --print time elapsed
  print(string.format("elapsed time test(): %.2f\n", os.clock() - x))
end

-------------------------------------------------------------------
-------------------------------------------------------------------
function test2() -- custom iteration
  x = os.clock() -- reset x (clock stamp start)
  for seq, pattern_index in ipairs(renoise.song().sequencer.pattern_sequence) do
    for track_index = 1, renoise.song().sequencer_track_count do 
      if not renoise.song():pattern(pattern_index):track(track_index).is_empty then --optimization
        for line_index, line in ipairs(renoise.song():pattern(pattern_index):track(track_index).lines) do
        -- do nothing
        end
      end
    end
  end
  --print time elapsed
  print(string.format("elapsed time test(2): %.2f\n", os.clock() - x))
end

-------------------------------------------------------------------
-------------------------------------------------------------------
function test3() -- standard Lua loop
  x = os.clock() -- reset x (clock stamp start)
  local sequence = renoise.song().sequencer.pattern_sequence
  for seq_index = 1, #sequence do
    -- map the sequence slot to its pattern index (a slot number is not a pattern index)
    local pattern_index = sequence[seq_index]
    for track_index = 1, renoise.song().sequencer_track_count do
      if not renoise.song():pattern(pattern_index):track(track_index).is_empty then --optimization
        for line_index, line in ipairs(renoise.song():pattern(pattern_index):track(track_index).lines) do
        -- do nothing
        end
      end
    end
  end
  --print time elapsed
  print(string.format("elapsed time test(3): %.2f\n", os.clock() - x))
end

--run tests
test()  
test2()
test3()

Edited by Ledger, 13 August 2017 - 22:24.

--> Lua For Beginners <--
--> Lua for newbies <--

My Scripts On Forum

Top Tip!

 

cpu : Xeon 1231 v3, os : Win 7 64bit, audio: Audient iD4
posts as 4tune @ KvR and some other music related sites


#12 joule


Posted 14 August 2017 - 08:33

Quoting Ledger: "I never really liked the ipairs syntax but maybe I'm missing out on some advantages by not using it?"

ipairs has a "nice syntax" where you get both the key and the value in one go, but I don't use it anymore for the reason mentioned above.

 

This is my default practice, and should be the fastest:

local my_table = { 1, 42, 3, 7, 10 }

local val
for i = 1, #my_table do
  val = my_table[i]
  -- code
end

Of course, pairs is still useful, as it iterates non-indexed tables.



#13 Ledger


Posted 14 August 2017 - 11:58

I see. Yes, I don't use non-indexed tables much (I think I maybe tried a couple in some early scripts).

 

As the API doesn't use them either, the standard for loop seems to cover most needs nicely.

 

 

Still have to disagree on the ipairs being a nice syntax though :P

