Oops, sorry, I didn't think about this. Stage 0 will reset with each new note.
The multiply is fine as far as the sample rate goes; that is taken into account in the green process.
Yep, there are only 8 registers. This sometimes gets complex: we have to note which registers need to be kept from the beginning to the end (or for a certain time), and which are no longer used and can be reused for something else.
(That is, if we want to optimize and not reload them too much.)
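To make that bookkeeping concrete, here is a rough sketch in Python (not the actual assembler, just an illustration). It assumes a straight-line program where each value is written once, and the names `assign_registers`, `program` and `n_regs` are made up for the example. It records the last place each value is read, then hands its register back as soon as that point is reached.

```python
def assign_registers(instructions, n_regs=8):
    """Assign a register to every value in a straight-line program.

    instructions: list of (dest, [sources]) in program order.
    A load is written as (dest, []). Each name is assumed to be written once.
    Raises RuntimeError if more than n_regs values are live at the same time.
    """
    # Position of the last instruction that reads each value.
    last_use = {}
    for i, (_dest, srcs) in enumerate(instructions):
        for s in srcs:
            last_use[s] = i

    free = list(range(n_regs))  # registers not holding a live value
    live = {}                   # value name -> register currently holding it
    result = {}                 # value name -> register it was given

    for i, (dest, srcs) in enumerate(instructions):
        # A source read for the last time here frees its register,
        # so the destination of this same instruction may reuse it.
        for s in set(srcs):
            if s in live and last_use[s] == i:
                free.append(live.pop(s))
        if not free:
            raise RuntimeError("out of registers at instruction %d" % i)
        free.sort()
        reg = free.pop(0)       # always take the lowest free register
        live[dest] = reg
        result[dest] = reg
    return result


# Hypothetical program: c = a + b, then e = c + d.
program = [
    ("a", []),
    ("b", []),
    ("c", ["a", "b"]),
    ("d", []),
    ("e", ["c", "d"]),
]
regs = assign_registers(program, n_regs=8)
```

In this example `a` and `b` are not needed after `c` is computed, so `c` and `d` can take over their registers and nothing has to be reloaded.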
I'm really not sure you will gain much by using SSE for additions. The problem is that an addition does not cost much CPU, but to use SSE you need to add more code and more additions...
Also, if the additions are in poly white, you can't use SSE. It's only possible with mono blue.
Normally, the places where a lot of things are added together are not the best places to optimize.
But it's hard to say which places will give you the best result.
Do you use a lot of envelopes? It might seem strange because they are hopped, but they take a lot of CPU.
I think optimized versions exist, but I haven't tried them much.
Or a lot of LFOs added together? Maybe you could use hopped LFOs, add them, then lowpass the result.
(I'm not even sure the lowpass is necessary, but it would act as a kind of smoothing.)
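A rough sketch of the hopped LFO idea, written in Python just to show the principle (it is not how the actual components are implemented). The function names, the hop size of 16 and the one-pole coefficient are all assumptions picked for the example. The summed LFOs are only evaluated once per hop and held in between, then a simple one-pole lowpass smooths the steps.

```python
import math


def hopped_lfo_sum(freqs_hz, sample_rate, n_samples, hop=16):
    """Evaluate the sum of sine LFOs only every `hop` samples, holding the value in between."""
    out = []
    held = 0.0
    for n in range(n_samples):
        if n % hop == 0:
            t = n / sample_rate
            held = sum(math.sin(2.0 * math.pi * f * t) for f in freqs_hz)
        out.append(held)
    return out


def one_pole_lowpass(signal, coeff=0.05):
    """Simple smoothing filter: y[n] = y[n-1] + coeff * (x[n] - y[n-1])."""
    y = 0.0
    out = []
    for x in signal:
        y += coeff * (x - y)
        out.append(y)
    return out


# Three hypothetical LFO rates, 0.1 s of signal at 44.1 kHz.
stepped = hopped_lfo_sum([0.5, 1.3, 2.0], 44100, 4410, hop=16)
smoothed = one_pole_lowpass(stepped, coeff=0.05)
```

The sines are computed 16 times less often here, and the lowpass costs one multiply and two additions per sample. Whether that is a net win in a real schematic would have to be measured.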
Or maybe the best thing to do is to isolate as many parts of your schematic as possible.
Cut the volume with an amp and take only a small part of the schematic at a time to test its CPU cost.
Generally when I do that, I can isolate one part of the schematic that accounts for almost 2/3 of the CPU cost.
I'm surprised by the difference between the MV and prim dezippers.
I thought they were almost the same. And prim dezippers are reputed to take no CPU.
But looking at the code with the analyzer, they are really different.
Also, the prim produces straight lines where the MV produces curves. I prefer curves.
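To show what I mean by lines versus curves, here is a small Python sketch. It is not the real code of either dezipper, just two generic smoothing methods with made-up names and an arbitrary coefficient: a linear ramp that moves in equal steps, and a one-pole smoother whose steps shrink as it approaches the target.

```python
def linear_dezip(start, target, n_samples):
    """Move from start to target in equal increments (a straight line)."""
    step = (target - start) / n_samples
    return [start + step * (i + 1) for i in range(n_samples)]


def curve_dezip(start, target, n_samples, coeff=0.05):
    """Move toward target by a fixed fraction of the remaining distance (a curve)."""
    y = start
    out = []
    for _ in range(n_samples):
        y += coeff * (target - y)
        out.append(y)
    return out


line = linear_dezip(0.0, 1.0, 100)
curve = curve_dezip(0.0, 1.0, 100, coeff=0.05)
```

The line reaches the target exactly after `n_samples` and then stops dead, while the curve starts fast, slows down, and only approaches the target.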
