Last updated on Thursday, September 28, 1995 at 6:05 PM.
Briefly skim over the discussion in Section 3.5 of your text of the Walsh, Hadamard, Haar, and Slant transforms to get the flavor of different transforms.
These examples help to illustrate the reason why transforms are useful: allow you to look at the information in the signal or image in a different way.
Suppose that instead of the Fourier basis functions we used
the following set of functions:

where

We can then define a forward transform

Eq. 9.3 is the Discrete Cosine Transform (DCT)
and Eq. 9.4 is the Inverse Discrete Cosine Transform
(inverse DCT).
The DCT is separable, so you can construct the 2-D DCT just as
we did for the 2-D Fourier Transform:

In principle, the DCT is similar to the Fourier Transform in the way it decomposes a signal or image into sinusoids.
While the Fourier Transform is a better analytical tool (primarily because of the convolution theorem), the DCT has a few advantages:
These two properties make the DCT the transform of choice for the JPEG and MPEG compression algorithms.
The Fourier Transform and the DCT (as well as the others in Section 3.5) all use a fixed set of basis functions for the transformation. What if we chose a set of basis functions based on the input image itself?
These different transforms all compact the information in the signal in one form or another, but the optimal compaction of information comes when we let the image itself tell us which basis functions give the best compaction.
Suppose that we have a multispectral image as shown on page 153 of your text. Each band of the spectrum has its own information, but it may be that the most information isn't found in one image band or another, but in a linear combination of information from the collective bands. How do we determine this?
One way is to find the covariance matrix C of the
pixels.
Since C is real and symmetric, it always has
eigenvectors.
If we sort the eigenvalues, we can rank the corresponding
eigenvectors. These eigenvectors indicate the spread of the
information content in the pixels-the eigenvector with the
largest variance contains the most information, the one with
the next largest variance contains the most information if all
components of the first one are removed, and so on until one is
left with the eigenvector with the smallest eigenvalue, which
corresponds to the dimension of least information.
Each eigenvector is a linear combination of the original basis vectors, so the components of each eigenvector indicate how much of each original image band goes in the optimal mix.
Transforming the image into the coordinate space whose basis vectors are the eigenvectors of the covariance matrix is called the Hotelling Transform, sometimes known as the Karhunen-Loeve Transform or principal-component analysis.
Notice that the basis vectors for such a transformation are different for each image-unlike the Fourier Transform or the DCT which use the same basis vectors. In this way, the Hotelling Tranformation does a "best fit" to the image by choosing a basis set that gives optimal compaction of the information.
Observe from Eqs 5.4 and 5.5 that to compute the Fourier Transform of an image, one needs the entire signal (image). Likewise, to reconstruct the signal (image), one needs all of the Fourier Transform components. That means that each point in the frequency domain represents information about the entire image. Even though a particular frequency coefficient may come from a particular place in the image, one cannot directly use the information in the frequency domain to determine where. How can we attempt to spatially localize frequency information?
Consider the following function:

for some
and
.
This is called a Gabor function (or sometimes a Gabor blob.
It is a sine wave of frequency
modulated (multiplied) by a
Gaussian of standard deviation
.
Since a Gabor function is the product of two other functions, its Fourier
Transform is the convolution of the Fourier Transforms of those functions.
Specifically, the Fourier Transform of a sine wave with frequency
is a single pulse at
,
and the Fourier Transform of a Gaussian of standard deviation
is
a Gaussian in the frequency domain of standard deviation
.
When you convolve these two, you get a Gaussian of standard deviation
and mean
.
By the Convolution Theorem, when you convolve a signal with a Gabor function
you multiply its Fourier Transform by this Gaussian bandpass filter at
.
Because the Gabor function has limited spatial extent (as limited by the
Gaussian modulation), it localizes information in the spatial domain.
Because it has limited frequency extent (as limited by the Gaussian
bandpass filter in the frequency domain), it also localizes information
in the frequency domain. This type of simultaneous localization of
information in both domains is called a co-joint representation.
The Gabor function doesn't perfectly localize information because it has a spatial spread. Likewise, it doesn't perfectly localize information in the frequency domain because of the spread of the bandpass filter.
Suppose we make the spatial spread smaller by using a smaller
for
the Gabor function. What happens in the frequency domain? (It gets worse.)
Consider what happens in the limit: as the Gaussian gets infinitely
small, you perfectly localize information spatially but need all of the
frequency domain to do it--just like the inverse Fourier Transform.
Suppose we make the frequency spread smaller by using a smaller
for the bandpass filter. What happens in the spatial domain? (Again, it
gets worse.)
Again, consider what happens in the limit: as the Gaussian gets infinitely
large, you perfectly localize information in the frequency domain but need
all of the spatial domain to do it--just like the Fourier Transform itself.
You can't simultaneously localize information in both domains--if you try to do better in one, you must get worse in the other. This phenomemon is called co-joint uncertainty.
There are many types of co-joint representations, but the most popular one in the literature today is the class called wavelets. We won't cover wavelets in this course, but when you do learn about them, keep this co-joint representation (and uncertainty) in mind.
© Bryan S. Morse, 1995