
Next: More Image Basics
Up: Lectures
Previous: Introduction

 

Image Examples and Basics

Last updated on September 7, 1995 at 6:30 PM


Reading

Gonzalez and Woods, Ch. 1, 2.2-2.3

Notice that for right now we're skipping over Section 2.1, which deals with human vision. It's not because this isn't important, it's because I want to come back to it later and give it a more thorough coverage.

 


Image Fundamentals

Images are essentially a function of some spatial coordinates:

$$f(\mathbf{x})$$

where $\mathbf{x}$ denotes a position in a space of some number of dimensions. Two dimensions are most common (like photographs, X-rays, etc.) because we're used to seeing two-dimensional projections of the world, but three dimensions are also common, especially in medical imaging or other modalities where you can sample values in more than two dimensions.

Rather than being continuous, these functions are sampled at discrete points (called picture elements, or pixels for short). Usually the sampling pattern is a rectangular grid, but it need not be. Researchers have experimented with other patterns, but usually end up coming back to the old familiar rectangular grid. These samples can thus be thought of as an n-dimensional array, and are usually stored as such.

One problem that you'll quickly run into is how to index the image array: if you store the array in row-major order, as most programming languages do, you'll find (row,column) indexing convenient. If, however, you prefer to think about the problem mathematically, Cartesian (x,y) coordinates are probably what you'd use. The two are reversed from each other (x is the column and y is the row), and this is a common source of errors in image programming.
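The two indexing conventions can be compared in a small sketch. It assumes a hypothetical 3-row by 4-column image stored as a flat, row-major list, with y counted downward from the top row; the helper names are made up for illustration.

```python
# Hypothetical 3-row x 4-column image stored as a flat, row-major list.
ROWS, COLS = 3, 4
pixels = list(range(ROWS * COLS))  # each pixel's value is its flat index

def at_row_col(row, col):
    """Array-style access: row first, then column."""
    return pixels[row * COLS + col]

def at_xy(x, y):
    """Cartesian-style access: x selects the column, y selects the row."""
    return at_row_col(y, x)

# The same pixel under both conventions: row 1, column 3 is (x=3, y=1).
same_pixel = at_row_col(1, 3) == at_xy(3, 1)
```

Calling at_row_col(3, 1) by mistake would ask for row 3, which does not exist in a 3-row image. On a square image the same mistake would silently return the wrong pixel instead.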


Image vs. Scene

In this course, the term image will refer to a discretely-sampled, n-dimensional array of values acquired through some imaging means.

The term scene will refer to the real-world source from which the image was acquired (if applicable).

In this way, we will distinguish between you and a picture of you.

 


Sampling and Image Resolution

A key element in describing an image is to describe the sampling.

One common (and wrong!) way to do this is to simply count the pixels. This gives you the image size, but not any information about the sampling. It's easy to think that a 1024x1024 image is sampled finer than a 64x64 one, but it's often not true.

Sampling is best described as the number of samples per spatial unit. A 64x64 image of the Talmage Building shows far more spatial detail than a 1024x1024 image of Provo.
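As a rough illustration of samples per spatial unit, the sketch below compares the two images. The scene widths (a 50 m building and a 10 km city) are assumptions picked only to make the arithmetic concrete; they are not measurements.

```python
def samples_per_meter(pixels_across, scene_width_m):
    """Sampling density: how many pixels cover each meter of the scene."""
    return pixels_across / scene_width_m

# Assumed scene widths, for illustration only.
building = samples_per_meter(64, 50.0)        # 64 pixels across ~50 m
city = samples_per_meter(1024, 10_000.0)      # 1024 pixels across ~10 km
```

Under these assumptions the small image has more than ten times the sampling density of the large one, even though it has far fewer pixels.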

But even this isn't enough. Every imaging device uses a finite-sized aperture for each pixel that integrates (averages or blurs) the information within the aperture. In other words, each pixel in an image isn't really a precise sample exactly at that position, but rather an average of the scene around that position.

Suppose you use an imaging device that uses a spatial aperture 1 mm across. Now suppose that you took 1000 samples all within one millimeter. Lots of samples, but you still really can't determine much within that one millimeter. (Actually, there are ways to try to, but we'll talk later about why this usually doesn't work well.)
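A one-dimensional sketch of this thought experiment follows. It assumes an idealized box-shaped aperture that averages everything within 1 mm, and a made-up scene of black and white stripes 0.1 mm wide; real apertures are not perfect boxes.

```python
import math

STRIPE_MM = 0.1     # assumed stripe width, much finer than the aperture
APERTURE_MM = 1.0   # aperture width from the text

def scene(x_mm):
    """Ideal scene: alternating white (1.0) and black (0.0) stripes."""
    return 1.0 if math.floor(x_mm / STRIPE_MM) % 2 == 0 else 0.0

def measure(center_mm, steps=200):
    """Approximate the aperture's average of the scene around center_mm."""
    start = center_mm - APERTURE_MM / 2
    dx = APERTURE_MM / steps
    return sum(scene(start + (k + 0.5) * dx) for k in range(steps)) / steps

# 1000 samples, all taken within a single millimeter.
samples = [measure(10.0 + i / 1000) for i in range(1000)]
spread = max(samples) - min(samples)
```

Every sample comes out close to mid-grey and the spread between them is tiny. The stripes are there in the scene, but the extra samples reveal nothing about them.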

What's really important then isn't just the number of samples or the spacing of the samples, but how well we can see small features in an image. This property is called resolution. Resolution is usually defined in terms of how many distinct alternating black and white lines can be identified within some spatial distance. We'll come back later and talk about resolution in more detail.

High-resolution imaging devices produce sharp, clear images; low-resolution devices produce blurry ones. This has absolutely nothing to do with the number of samples or their spacing!

 


Quantization

In addition to discrete spatial sampling, images also use discrete quantization of the value of the image function $f$.

Images can use as few as two discrete levels (one bit) to represent an image or as many as 256 (8 bit--typical greyscale), 4096 (12 bit--many medical images), or more (24 bit--typical color image).
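The level counts above are just powers of two. The sketch below computes them and maps a continuous value onto discrete levels; it assumes the incoming value has already been scaled to the range 0.0 to 1.0, and the function names are illustrative.

```python
def levels(bits):
    """Number of distinct values representable with the given bit depth."""
    return 2 ** bits

def quantize(value, bits):
    """Map a value in [0.0, 1.0] to an integer level in [0, 2**bits - 1]."""
    top = levels(bits) - 1
    clamped = min(max(value, 0.0), 1.0)   # out-of-range input is clipped
    return round(clamped * top)
```

For example, levels(12) gives the 4096 levels mentioned for medical images, and quantize(0.7, 1) snaps a light grey to white in a one-bit image.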

Image pixels also need not be scalar values. Several imaging modalities give vector-valued quantities for each pixel. (In a sense, that's what RGB color is.)

 


Sources of Images

We'll cover various imaging modalities in more depth later in the course, but it's useful at this point for you to understand the wide range of images that are out there.

As we discuss each imaging modality, try to keep in mind what each pixel represents and how it is obtained.

 

Cameras

The most common types of images we're used to seeing are from cameras. If you've looked at photographs, you've seen images.

Camera images are the most similar to what we see when we look out of our eyes: a two-dimensional projection of a three-dimensional world. One of the goals with these types of images is sometimes to extract the missing depth component (David Marr called this 2½-D vision). The depth isn't really missing--it just isn't explicit in the image, but most images are rich in depth cues that give us hints. We generally won't talk about depth perception in images this semester, but will next semester in CS 650.

Some camera images, such as those generated by scanners (okay--this is a broad definition of "camera"), image things that are already flat, so there isn't any depth information in the image.

 

Black and White / Greyscale

Some images only detect brightness--the color component is ignored. It was only natural that this type of image was used first, because it's easy to imagine chemicals or other detectors that react to light energy, but it's harder to develop detectors that react to different wavelengths.

The simplest types of images are those that are truly black and white (binary images). Each pixel can be represented by a single bit. If space permits, many imaging systems will still use a full byte to store each pixel because it is easier to access. (You don't have to use bit-level operations to get to particular pixel values.) However, if storage is at a premium, you can pack the bits, eight pixels to a byte, to save space. Even then, there are usually padding bits so that individual scan lines still start at byte or word boundaries.
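Bit packing with per-row padding can be sketched as follows. The bit order (leftmost pixel in the most significant bit) and padding each scan line to a whole byte are assumptions here; actual file formats differ on both.

```python
def pack_row(bits):
    """Pack one scan line of 0/1 pixels into bytes, leftmost pixel first.

    The last byte is zero-padded so the next row starts on a byte boundary.
    """
    packed = bytearray((len(bits) + 7) // 8)
    for i, bit in enumerate(bits):
        if bit:
            packed[i // 8] |= 0x80 >> (i % 8)
    return bytes(packed)

def get_pixel(packed, col):
    """Read one pixel back out using bit-level operations."""
    return (packed[col // 8] >> (7 - col % 8)) & 1

row = [1, 0, 1, 1, 0, 0, 0, 1, 1, 1]   # 10 pixels -> 2 bytes, 6 padding bits
packed_row = pack_row(row)
```

The shifting and masking in get_pixel is the access cost the text mentions, and it is why many systems accept the eightfold storage overhead of one byte per pixel.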

It's far more common that we get images that have varying shades of grey (greyscale images). The most common type uses one byte per pixel and stores pixel values in the range [0..255] (or [-128..127] for signed bytes).

Some devices generate more bits per pixel, frequently stored as unpacked two-byte pixels (unsigned short types in C).
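The sketch below shows how the same stored bytes read as unsigned or signed values, and how 12-bit pixel values fit unpacked into two-byte pixels. It uses Python's struct module; the little-endian byte order and the sample values are assumptions for illustration.

```python
import struct

# The same four bytes interpreted as unsigned and as signed pixels.
raw = bytes([0, 127, 128, 255])
unsigned_pixels = list(struct.unpack("4B", raw))
signed_pixels = list(struct.unpack("4b", raw))

# 12-bit values stored unpacked, one per two-byte unsigned pixel.
# "<" assumes little-endian byte order; "H" is an unsigned 16-bit integer.
pixels12 = [0, 1000, 4095]
stored = struct.pack("<3H", *pixels12)
restored = list(struct.unpack("<3H", stored))
```

Mixing up the signed and unsigned interpretations makes the brightest pixels read as negative values, so it is worth checking which one a given device or file format uses.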

 

Color

Camera images can also be in color. Usually they're imaged using (Red, Green, Blue) or RGB encoding, but they don't need to stay that way. You probably know that your color television uses red, green, and blue dots on its screen, but you may not know that the standard NTSC television signal doesn't use RGB encoding--it uses something called YIQ.
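To give a feel for what a change of color encoding looks like, here is a minimal sketch using the rgb_to_yiq function from Python's standard colorsys module. Its coefficients are one common approximation of the NTSC conversion, and the 8-bit wrapper function is an illustrative helper, not part of the library.

```python
import colorsys

def rgb8_to_yiq(r, g, b):
    """Convert 8-bit RGB to YIQ. Y is brightness; I and Q carry the color."""
    return colorsys.rgb_to_yiq(r / 255, g / 255, b / 255)

y_white, i_white, q_white = rgb8_to_yiq(255, 255, 255)
y_red, i_red, q_red = rgb8_to_yiq(255, 0, 0)
```

White comes out with full brightness and essentially no color signal, while pure red has low brightness and a strong I component. The Y channel on its own is a greyscale version of the image.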

Time permitting, we'll talk about color spaces later in the course.

 

Medical Images

Medical Images generally fall into two categories: transmission (projection) and emission.

 

Transmission (Projection)

Transmission or projection images use a radiation source placed behind the target to project radiation through the target. The structures of the target cast shadows on the other side, where the image is then taken.

The most common form of projection image is the everyday X-ray (more properly referred to as a radiograph). While medical imaging is often thought to be a hot, exciting, and rapidly progressing field, 90% of all medical images taken every day are simple X-rays.

Here is an example of an X-ray.

By taking a lot of X-ray images at a range of angles around the body, one can create a three-dimensional reconstruction of the anatomy. This process is known as Computed Tomography or CT (once known as Computed Axial Tomography or CAT scans--the "axial" and the "A" are usually left off these days).

Here is an example of a CT image and another. Notice that the image slices across the axis of the body instead of projecting through it.
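The difference between a slice and a projection can be sketched with a toy example. The 4x4 grid of density values is made up, and modelling a projection as a plain sum along each ray is a deliberate simplification; actual X-ray attenuation and CT reconstruction are more involved and use many more angles.

```python
# Hypothetical 4x4 slice through a body: larger numbers mean denser material.
slice_2d = [
    [0, 0, 0, 0],
    [0, 5, 1, 0],
    [0, 0, 2, 0],
    [0, 0, 0, 0],
]

def project_down(image):
    """Rays travel down the columns: one summed value per column."""
    return [sum(col) for col in zip(*image)]

def project_across(image):
    """Rays travel along the rows: one summed value per row."""
    return [sum(row) for row in image]

view_0 = project_down(slice_2d)      # what one X-ray view would record
view_90 = project_across(slice_2d)   # a second view, 90 degrees away
```

Each view collapses the slice to a single line of sums, so neither one alone says where along the ray the dense material sits. CT works backward from many such views to recover the slice itself.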

Ultrasound is a form of projection imaging using high-frequency sound waves, but rather than seeing shadows, it sees reflections (echoes).

 

Emission

Emission imaging involves putting the radioactive source inside the target. The radiation emitted by the target is directly imaged. Actually, not quite directly--as with projection imaging, the anatomy of the patient casts shadows as the radiation leaves the body.

While the important information in projection imaging is the anatomy (the shadows cast by the body), the important information in emission imaging is the distribution of the radiation source within the body. These radioactive isotopes are usually bound with specific pharmaceuticals that distribute themselves in certain ways according to body function.

Here is an example of an emission image from nuclear medical imaging. The noise is typical of nuclear medical images--we'll see why later.

Common forms of nuclear medical imaging are positron emission tomography (PET) and single-photon emission computed tomography (SPECT).

A recent form of emission imaging is Magnetic Resonance Imaging or MRI, which causes certain molecules of the body to resonate and give off low-level radiation in the FM radio range. This signal can be detected outside of the body and the distribution of the resonating molecules can be formed into an image. The most common form of MRI involves imaging single-proton nuclei in the body (i.e., hydrogen, which appears in large quantities in the body in the form of water). This makes it more sensitive to soft tissues than X-ray imaging.

Here is an example.

 

Multimodality Imaging and Registration

Since different images capture different features (CT capturing hard structure, MRI capturing soft tissues, PET/SPECT capturing physiological function, etc.) one can combine images from different modalities to give a more complete picture. This requires aligning the multiple images so that they correspond spatially--a process known as image registration.
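A very reduced sketch of registration follows. It assumes the two images differ only by a whole-pixel translation and come from the same modality, so pixel values can be compared directly. Images from different modalities generally need a different similarity measure, and real misalignments include rotation and scaling. All names and the test images are hypothetical.

```python
def mismatch(a, b, dx, dy):
    """Mean squared difference between a and b shifted by (dx, dy),
    computed only over the pixels where the two images overlap."""
    rows, cols = len(a), len(a[0])
    total, count = 0, 0
    for r in range(rows):
        for c in range(cols):
            rr, cc = r + dy, c + dx
            if 0 <= rr < rows and 0 <= cc < cols:
                total += (a[r][c] - b[rr][cc]) ** 2
                count += 1
    return total / count if count else float("inf")

def register(a, b, max_shift=2):
    """Brute-force search for the (dx, dy) that best aligns b with a."""
    shifts = [(dx, dy)
              for dy in range(-max_shift, max_shift + 1)
              for dx in range(-max_shift, max_shift + 1)]
    return min(shifts, key=lambda s: mismatch(a, b, *s))

fixed = [[0] * 6 for _ in range(6)]
fixed[1][1], fixed[1][2], fixed[2][1] = 9, 5, 3

# The same feature, moved 2 pixels right and 1 pixel down.
moving = [[0] * 6 for _ in range(6)]
moving[2][3], moving[2][4], moving[3][3] = 9, 5, 3

best_shift = register(fixed, moving)
```

Trying every shift is only practical for small search ranges; it is used here because it makes the idea of searching for the best spatial correspondence easy to see.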

One of the most interesting applications of multimodality imaging is the Visible Human Project conducted by the National Institutes of Health. This project is imaging a cadaver with MRI, CT, and cryosection imaging. The image from the last lecture is built from this data.

While multimodality imaging is common in medical imaging, it is also used in other areas.

 

Range Images

Sometimes it is useful to capture other information than light or other radiation. Here is a range image from Michigan State. Notice how the areas nearer to you are encoded as such by being "brighter" and the farther areas are encoded as "darker".
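The nearer-is-brighter encoding can be sketched as a simple linear mapping from distance to grey level. The near and far limits, the 8-bit output range, and the sample distances are assumptions for illustration; the range image shown may use a different mapping.

```python
def range_to_grey(distances, near, far):
    """Map distances to 8-bit grey levels: near -> 255 (bright), far -> 0 (dark)."""
    span = far - near
    greys = []
    for d in distances:
        d = min(max(d, near), far)          # clip to the working range
        greys.append(round(255 * (far - d) / span))
    return greys

# Hypothetical distances in meters along one scan line.
grey = range_to_grey([1.0, 2.0, 3.0, 5.0], near=1.0, far=5.0)
```

Each pixel value here stands for a distance rather than an amount of light, which is why a range image can be turned directly into a surface shape.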

Range images are useful for reconstructing the shape of objects in a scene. Here is an example of using range images to create 3-D object models for graphical rendering.

 

Aerial and Satellite Imaging

One of the most common forms of satellite imaging is LANDSAT, which produces several images using different forms of imaging: photographic, thermal, etc.

Other forms of aerial imaging include synthetic aperture radar.

 

Other Examples

You can see many other examples by browsing some of the on-line collections. The Computer Vision Home Page is a good starting point.

As you can see, images are a lot more than just camera pictures. They are often computed from other data (as in the case of CT or MRI images) or may themselves represent some entirely different kind of information (such as with a range image).


Vocabulary




© Bryan S. Morse, 1995