Interesting. I've just implemented an algorithm (matrix profile) that uses the FFT to compute a large set of dot products between time series subsequences, where the length n of the time series can be in the hundreds of millions. Computing the sliding dot products of a length-m subsequence against the whole series as a fast convolution via FFT cuts the cost from O(nm) to O(n log n), with awesome speed gains at this scale. Throw in a GPU and it gets faster still, like processing 10 million data points in 0.1 seconds on a laptop.
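
For anyone curious what the FFT trick looks like, here is a minimal sketch in Python with NumPy. It is not my actual implementation: the function name `sliding_dot_product` and the zero-padding choice are assumptions made for illustration. The idea is that reversing the query turns the sliding dot product into a linear convolution, which becomes a pointwise multiply in the frequency domain.

```python
import numpy as np

def sliding_dot_product(query, series):
    """Dot product of `query` with every length-m window of `series`.

    Hypothetical helper for illustration; runs in O(n log n) instead of O(n*m).
    """
    query = np.asarray(query, dtype=float)
    series = np.asarray(series, dtype=float)
    m, n = len(query), len(series)
    size = n + m - 1  # length of the full linear convolution

    # Reversing the query turns convolution into correlation.
    fq = np.fft.rfft(query[::-1], size)   # zero-padded to `size`
    fs = np.fft.rfft(series, size)
    full = np.fft.irfft(fq * fs, size)

    # Entries m-1 .. n-1 are the windows that fully overlap the series.
    return full[m - 1 : n]
```

The result has n - m + 1 entries, one per window, and should agree with `np.correlate(series, query, mode="valid")` up to floating-point rounding. A GPU version would swap in a GPU FFT library, which is outside this sketch.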