This project has moved. For the latest updates, please go here.

Bug or By Design? Non-power-of-2 IntegralTransforms introduce noise at 46,459 elements (and beyond)

Aug 5, 2013 at 9:10 PM
I am new to the world of FFT. Whenever new to an area, I like to experiment. Not knowing the performance characteristics of Transform, DFT.BluesteinForward, or DFT.Radix2Forward I wanted to do some correctness tests and timings. I started with a really small sample size (50 samples) with all data points coming from a 1 Hz signal. So when applying the Transform, it was expected that only bin 1 would have data in it, and in fact that is what happened. After that I tried mixing in an additional signal, say 1 Hz and 7 Hz, with 500 samples. Again it worked. Then I tried increasing the sample count to what we are actually intending - 5 million samples - the result bins were all over the place. So I tried 500,000 and 50,000 with the same unexpected results. So I backed up a step and wrote a program that deterministically calculates results with sample sizes from 8 to ushort.MaxValue, with all data points coming from a 1 Hz signal. I was surprised by the results.

Between 8-46,458 samples, I always and only got back a non-zero value for bin 1. Between 46,459 and 65,535 (a span of 19,077 tests), only 150 of them produced "correct" results. From a quick check of the failures, they seemed to deviate more and more from the expected results (both in number of bins with values in them and the magnitudes of the values) the more samples in the test.

Given that Radix2 started to outperform Bluestein and Transform between 8,192 and 16,384, I went ahead and ran a "correctness" test using powers of 2 between 2^3 - 2^22 (4 million, which is close to what we want). For powers of 2, I always and only got back the correct bin (1).

Now I'll be the first to admit that I do not know how the math works behind the scenes. So my question is simply - is it a bug or by design that the results start to have noise in them at 46,459 samples and beyond, and if it is by design, why does it start there instead of earlier or later (what's the catch?)?

Aug 7, 2013 at 8:57 AM
Edited Aug 7, 2013 at 8:58 AM

This is certainly not by design/intended. It could either be a straight bug in the Bluestein implementation, or a numerical stability/robustness issue. Either way it should be fixed. Thanks for the thorough analysis so far already!

At some point we may also want to optionally leverage MKL's FFTW routines (just like with linear algebra where you can choose between the fully managed implementation or a native provider), so we'll have to rework these routines a bit anyway. Where applicable this should also bring a very significant perf boost.