The way Renoise and Task Manager (and system monitoring software in general) report CPU usage across multiple cores is quite different, and for a good reason.
Renoise does of course split the workload across multiple cores, but this is mostly done on a track-by-track basis. Most DSP is inherently sequential: each effect in a track’s chain needs the output of the one before it, so a single track can only be executed in a single thread and can’t be split between cores. This implies that if you’re using fewer tracks than your CPU has cores/threads, the remaining ones are just going to sit idle, because they can’t help with most of the audio processing anyway.
Now, the other thing about CPU usage in this context is that what matters is the slowest thread, that is, the track that requires the most processing. Since the CPU must be able to produce new audio buffers at a strict fixed rate, if just one of these threads lags behind, the entire audio output is going to suffer. For this reason Renoise and other DAWs effectively report the CPU usage of the worst performing thread. For example, you might have a track that requires 80% of the available time (say, 4 ms of the total 5 ms buffer length) to produce an audio buffer, while in parallel you have a dozen other tracks that each require only 5% of that time. That 80% track is still the weakest link, since that’s what your audio performance hinges on. By contrast, Task Manager reports the average CPU usage across all cores, which will always be at most what Renoise reports, and in most cases much lower.
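To put rough numbers on the example above, here’s a small Python sketch. It is only an illustration, not how Renoise or Task Manager actually measure anything: the function names are made up, and it assumes every track runs on its own thread on a 16-thread CPU with nothing else running.

```python
BUFFER_MS = 5.0  # time available to produce one audio buffer (from the example)

def daw_style_load(thread_times_ms, buffer_ms=BUFFER_MS):
    """Load as a DAW would roughly see it: the slowest thread vs. the deadline."""
    return max(thread_times_ms) / buffer_ms

def average_load(thread_times_ms, logical_cores, buffer_ms=BUFFER_MS):
    """Load as a system monitor would roughly see it: busy time averaged over all cores."""
    return sum(thread_times_ms) / (logical_cores * buffer_ms)

# One heavy track at 4 ms (80%) plus a dozen light tracks at 0.25 ms (5%) each.
# Assumption: one thread per track.
track_times_ms = [4.0] + [0.25] * 12

daw = daw_style_load(track_times_ms)                      # 0.8
avg = average_load(track_times_ms, logical_cores=16)      # 0.0875

print(f"DAW-style load:     {daw:.1%}")
print(f"Averaged over CPU:  {avg:.1%}")
```

So in this made-up case the DAW meter would sit at 80% while the system-wide average is under 9%, even though both are looking at the same work.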
On my 8-core / 16-thread Ryzen 2700X, if I change the thread allocation in Renoise from 16 to 8, or even from 16 to 4, there’s rarely any difference in performance or in the reported CPU usage. This is because, at least in my projects, there are often a few tracks that dominate the CPU load, and more threads are not going to help with that. At the same time, the tracks that are quicker to process are still handled fine by a lower thread count.
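The same effect can be shown with a toy scheduler. This is a sketch under loud assumptions: I don’t know how Renoise actually assigns tracks to threads, so the code below just uses a simple greedy rule (heaviest track first, always onto the least loaded thread), and the track costs are invented.

```python
import heapq

BUFFER_MS = 5.0  # assumed buffer length, as in the earlier example

def worst_thread_ms(track_costs_ms, thread_count):
    """Greedily spread tracks over threads and return the busiest thread's time.

    Hypothetical scheduling rule, not Renoise's: take tracks from heaviest to
    lightest and give each one to the currently least loaded thread.
    """
    loads = [(0.0, i) for i in range(thread_count)]  # (busy time, thread index)
    heapq.heapify(loads)
    for cost in sorted(track_costs_ms, reverse=True):
        load, idx = heapq.heappop(loads)             # least loaded thread
        heapq.heappush(loads, (load + cost, idx))
    return max(load for load, _ in loads)

# Invented project: two heavy tracks and a dozen light ones.
track_costs_ms = [4.0, 3.5] + [0.25] * 12

for threads in (16, 8, 4, 2):
    worst = worst_thread_ms(track_costs_ms, threads)
    print(f"{threads:2d} threads: worst thread {worst:.2f} ms "
          f"({worst / BUFFER_MS:.0%} of the buffer)")
```

With these numbers the worst thread stays at 4 ms whether there are 16, 8 or 4 threads, because the heaviest track sets the floor. Only when the count drops to 2 does the busiest thread go past the 5 ms deadline.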
When it comes to audio and DSP, single-threaded performance is still king, as long as you have enough threads to begin with (8 cores / 16 threads is easily sufficient nowadays for most tasks). I would still consider the higher core count CPUs however, because in addition to there being more cores, the cores themselves are also a bit faster.
Regarding managing CPU usage, try not to put too much load on your groups and buses, including the master bus, at least not up front, and preferably leave things like “mastering” until the end, once you’re ready to export. Track freezing would also be really helpful of course, but sadly Renoise doesn’t support it natively, and the workarounds and tools for it are still really poor.