The Fast Fourier Transform (FFT) algorithm to calculate discrete Fourier transforms (DFT) is a recent addition to ojAlgo.
The last blog post was about Image Processing (using Singular Value Decomposition, SVD). 2D Fourier transforms can also be used for image processing. Let’s have a look at how to do that.
In image processing, the most common way to represent an image is simply as a matrix of numbers. This is in the spatial domain where the rows and columns map to locations in the image and the numbers/elements to the colour at the respective locations. Sometimes image processing may be inefficient in the spatial domain. Transforming to some other domain may be beneficial.
The FFT is commonly used to transform an image between the spatial and frequency domains. It is the basis for many image filters used to remove noise, sharpen an image, analyse repeating patterns, or extract features. The FFT decomposes an image into sines and cosines of varying amplitudes and phases.
The concrete result of a 2D discrete Fourier transform is a matrix of complex numbers. The dimensions of that matrix is the same as that of the original image. The difference is that now the row/column indices corresponds to (repetition) frequencies rather than positions in the image. The norm of the complex numbers is the amplitude of that particular repetitive pattern. The phase somehow encodes details about the orientation and arrangement in the original image.
In the example program below an image is transformed, processed and then inverse-transformed. The code can be used (by you) with any square image. The first thing the program does is to convert that image to grey scale and resample to ensure that the number of pixels is a power of 2. When that’s done, this is the example’s initial image:
When Fourier transforming an image it is common to also shift the frequencies (rows and columns) so that the zero-frequency is at the centre of the matrix and higher frequencies towards the edges. Low frequencies represent gradual variations in the image; they contain the most information because they determine the overall shape or pattern in the image. High frequencies correspond to abrupt variations in the image; they provide more detail in the image, but also contain more noise. There’s normally a huge difference in amplitude between the low-frequencies (in the center of the image) and the high-frequencies.
The power spectrum is the squared amplitude of the transform (a matrix where the elements are the complex number norms, squared). This is often plotted in a topology chart or a colour coded image – white for large values and black for small (zero). Due to the huge range of values a base 10 logarithmic scale is used to display the values. These kind of images essentially always show some sort of peak (white spot) in the middle, and then some lighter pattern extending from that, and black/flat edges. Interpreting the exact pattern is for experts.
A simple image filter would define a circle, perhaps based on inspection of the power spectrum, and then treat the values inside or outside that circle differently. A filter that reduce or remove high frequencies (outside the circle) is a called a low-pass filter. A high-pass filter will leave high frequencies unchanged and reduce or remove low frequencies (inside the circle). A low-pass filter will remove detail and noise, as in the example here.
The Code
It is possible to construct an image by defining a combination of repetitive patterns (sines and cosines) in the frequency domain. This image was generated by the example code by setting 2 elements in a 1024 by 1024 frequency domain matrix, and the inverse-transforming that to spatial (normal image) domain. Try changing the values, the location of the values or adding more values.