In this post we’ll provide a simple Fast Fourier Transform (FFT) example in C where we will sacrifice performance and external code dependencies for readability and simplicity. After understanding this simple FFT example it should trivial to modify for performance and adaptations for computer architecture. Please note this has been run on an Arduino and the performance is sub-par. I’d say this FFT implementation is probably best suited for a microcontroller.

## How to use the FFT Example

In the example below we’ll perform an FFT on a complex (real + imaginary) array of 32 elements. After the FFT has completed it will write over the provided arrays as the FFT is done “in place”.

A longer example to test the FFT is provided in the sections below. These tests show more in-depth how the FFT is used.

## C API Header of the FFT

We only need a single call to the function called `fft` and we have two helper functions `rearrange` and `compute`. The `rearrange` function will rearrange the elements in the array to make the butterfly computations much easier. The `compute` function does all the heavy lifting and is where the FFT is computed.

## C Implementation of the FFT

Now it’s time to look at the implementation. I wished I had with me the fancy diagrams of butterflies and twiddle factors I used on paper to implement this, but it’s been took long. To recreate the diagrams the best way would be to use the debugger and a sheet to paper as it computes an FFT of size 8. Then have the theory and math alongside you.

## Performance of the Example

Here are some numbers I got when running on an Arduino DUE 32-Bit ARM at 84MHz. Please note I’m not sure now what datatype I used. Above we have `float` but I’m quite sure it was changed to `int`. Below are some times I collected for each size.

1. FFT Size 8 took 4.5ms
2. FFT Size 16 took 11.5ms
3. FFT Size 32 took 27.5ms
4. FFT Size 64 took 61.5ms
5. FFT Size 128 took 133.5ms

The major factors to the speed are as follows:

1. Datatype. Going from char, int, float, double makes a huge difference especially if you don’t have a math co-processor. If you’d use this in real life, it would need to be modified and the datatype carefully considered depending on computer architecture.
2. Multiplication. This really hurts say on an Atmel Arduino if it has to do multiplies on an int.
3. Cosine/Sine generation. If you knew the size of the FFT then a look-up table would be a faster option than computing every time
4. Re-arranging. Having to re-arrange the data takes good time.

## Test Cases for the FFT

We will conclude with how the FFT example was tested and some test cases so the usage can be seen. Again, the type here is `float` and this will run well on a processor with a math co-processor. If you’re using a microcontroller then you’d want to consider type char or int.