Stream FFT and iFFT

DSP related issues, mathematics, processing and techniques
Post Reply
User avatar
trogluddite
Posts: 1730
Joined: Fri Oct 22, 2010 12:46 am
Location: Yorkshire, UK

Stream FFT and iFFT

Post by trogluddite »

It's just a delicate little baby at the moment, so treat it with care... ;)
Stream FFT v001 Build 8.fsm
(158.47 KiB) Downloaded 2377 times
All schematics/modules I post are free for all to use - but a credit is always polite!
Don't stagnate, mutate to create!
tester
Posts: 1786
Joined: Wed Jan 18, 2012 10:52 pm
Location: Poland, internet

Re: Stream FFT and iFFT

Post by tester »

First what I of course did - was to change that value in order to see the crash. But - it did not crashed. It looks that this value is changable when the sound is offline.
Need to take a break? I have something right for you.
Feel free to donate. Thank you for your contribution.
User avatar
trogluddite
Posts: 1730
Joined: Fri Oct 22, 2010 12:46 am
Location: Yorkshire, UK

Re: Stream FFT and iFFT

Post by trogluddite »

Yes, that's right - it's only when the assembly blocks are running that there can be problems.
Because it is such an intensive algorithm, I left out the parts that check whether memory addresses are valid inside the code arrays - if they go out of range, there's an instant crash. Saves a fair chunk of CPU, but means that there are many hard-wired values that need to be right.

Ultimately, I'll probably just release a set of modules, each preset to a different buffer length - there's a big trade off with this job between versatility and CPU load due to the limitations of the FS assembly instruction set. There's still a lot more code to add yet too - FFT -> iFFT on its own is pretty dull; messing with the data in between is where things will get much more interesting.
All schematics/modules I post are free for all to use - but a credit is always polite!
Don't stagnate, mutate to create!
tester
Posts: 1786
Joined: Wed Jan 18, 2012 10:52 pm
Location: Poland, internet

Re: Stream FFT and iFFT

Post by tester »

//edit:

redirecting this to new topic.
viewtopic.php?f=4&t=1513#p6442
Last edited by tester on Wed Jun 19, 2013 10:39 am, edited 2 times in total.
Need to take a break? I have something right for you.
Feel free to donate. Thank you for your contribution.
User avatar
MyCo
Posts: 718
Joined: Tue Jul 13, 2010 12:33 pm
Location: Germany
Contact:

Re: Stream FFT and iFFT

Post by MyCo »

The problem with realtime FFT and iFFT is, you have to overlap the buffers and apply the correct window to the FFT. And finally you have to find the right spot where you start the overlapping FFT. That's actually a very hard task.

Have you done the conversion to SSE from x86 code? It's very hard to read, but it looks like you don't really use the benefits of SSE.

When I understand it correctly, you buffer the 1024 samples linear into "inArray" and at the end of the buffer you calculate the FFT. Wouldn't it be faster to buffer the incoming samples at the bitshifted positions in an Array? Then you could leave this step (Loop1) out of the main code.
User avatar
MyCo
Posts: 718
Joined: Tue Jul 13, 2010 12:33 pm
Location: Germany
Contact:

Re: Stream FFT and iFFT

Post by MyCo »

I think I've found a bug in the FFT code. On line 125:

Code: Select all

movaps xmm1,xmm3;


xmm3 wasn't set before
User avatar
MyCo
Posts: 718
Joined: Tue Jul 13, 2010 12:33 pm
Location: Germany
Contact:

Re: Stream FFT and iFFT

Post by MyCo »

Another related bug:

Code: Select all

 mov eax,1;
cmp eax,0;
jnz JgteMtest;


the Jump is always executed because 1-0 is not zero
User avatar
digitalwhitebyte
Posts: 106
Joined: Sat Jul 31, 2010 10:20 am

Re: Stream FFT and iFFT

Post by digitalwhitebyte »

@Trog:
your implementation seems to use this algorithm.

Code: Select all

class Array
   # DFT and inverse.
   #
   # Algorithm from
   # 'Meyberg, Vachenauer: Hoehere Mathematik II, Springer Berlin, 1991, page 332'
   #
   # See http://blog.mro.name/2011/04/simple-ruby-fast-fourier-transform/
   #
   def fft doinverse = true
      src = self
      # raise ArgumentError.new "Expected array input but was '#{src.class}'" unless src.kind_of? Array
      n = src.length
      nlog2 = Math.log( n ) / Math.log( 2 )
      raise ArgumentError.new "Input array size must be a power of two but was '#{n}'" unless nlog2.floor - nlog2 == 0.0
      n2 = n / 2
      phi = Math::PI / n2
      if doinverse
         phi = -phi
      else
         src.collect!{|c| c /= n.to_f}
      end
 
      # build a sine/cosine table
      wt = Array.new n2
      wt.each_index { |i| wt[i] = Complex(Math.cos(i * phi), Math.sin(i * phi)) }
 
      # bit reordering
      n1 = n - 1
      j = 0
      1.upto(n1) do |i|
         m = n2
         while j >= m
            j -= m
            m /= 2
         end
         j += m
         src[i],src[j] = src[j],src[i] if j > i
      end
 
      # 1d(N) Ueberlagerungen
      mm = 1
      begin
         m = mm
         mm *= 2
         0.upto(m - 1) do |k|
            w = wt[ k * n2 ]
            k.step(n1, mm) do |i|
               j = i + m
               src[j] = src[i] - (temp = w * src[j])
               src[i] += temp
            end
         end
         n2 /= 2
      end while mm != n
      src
   end
end

//usage
real = []
imag = []

@in.fft(false).each {|i| real.push i.real; imag.push i.imag}
output 0, real; output 1, imag
Last edited by digitalwhitebyte on Wed Jun 19, 2013 10:28 am, edited 1 time in total.
User avatar
trogluddite
Posts: 1730
Joined: Fri Oct 22, 2010 12:46 am
Location: Yorkshire, UK

Re: Stream FFT and iFFT

Post by trogluddite »

Hi Guys
MyCo wrote: you don't really use the benefits of SSE.

Saving that for later - I'll look into that once the other coding is done, but I thought I'd keep the other SSE channels free for now to give the option of parallel (e.g. stereo) operation.

Bugs! - nope!
There are a few unconditional jumps (why no 'jmp' opcode!) - they are just used to get the execution order right to switch between testing conditions at loop start/end; so the xmm3 value is actually set later in the code.

MyCo wrote:buffer the incoming samples at the bitshifted positions in an Array? Then you could leave this step (Loop1) out of the main code.

Yes, I have tried several methods of doing that - but oddly enough, they've all come out with higher CPU load so far, despite, as you say, less read/writes and way fewer opcodes. Seems that streamlining the memory pipeline is the crucial thing here - a handful less cache misses per iteration makes a huge difference.
I'll keep working on it though - that transfer is where scaling/windowing would be applied, so it needs tightening up!

digitalwhitebyte wrote:your implementation seems to use this algorithm.

Yup, that's the one!
All schematics/modules I post are free for all to use - but a credit is always polite!
Don't stagnate, mutate to create!
Tronic
Posts: 539
Joined: Wed Dec 21, 2011 12:59 pm

Re: Stream FFT and iFFT

Post by Tronic »

I think it is a good read for understanding the world of the FFT
The FFT Demystified
Post Reply