beginning gif up gif contents news

Next: Segmentation: Edge Linking and Boundary Detection
Up: Lectures
Previous: Image Compression

 

Segmentation: Introduction and Edge Detection

Last updated on Monday, October 23, 1995 at 9:00 AM.


Reading

Gonzalez and Woods, Ch. 7: Introduction, 7.1


Introduction

Segmentation is the process of identifying coherent regions in images that one hopes corresponds to objects.

Automatedf segmentation is probably the most difficult problem in computer vision (and the one that I find most interesting). There are, by my experience, three major reasons why automated segmentation is so hard.

  1. Lots of information is lost when 3-D scenes are projected to two dimensions. When objects cross in front of other objects (which we call "occlusion"), it's hard to keep the pieces together.
  2. Segmentation attempts to produce primitive object regions. The notion, however, of what constitutes a primitive object is nebulous.
  3. We use our brains extensively in our perceptual processes. We easily recognize that certain parts belong or don't belong together not because of similar properties of the regions, but because we know that they form parts of the same known and recognizable object. Endowing computers with such cognitive ability is currently beyond our possibilities. It is my personal opinion that until we can build computers that think like people we won't be able to build computers that see like people.

Still, there have been many successful applications of segmentation techniques in limited application domains. We will study some of these here.

One set of segmentation techniques focuses on finding specific features in an image: points, lines, edges, etc.


Detection Operators

Detection operators local neighborhood operators typically applied using convolution.

These detectors usually look like what they're finding. To understand why this is, consider the nine weights of a operator as a vector in a nine-dimensional space. Likewise, consider the nine pixels in any neighborhood as another vector in this space. The convolution operation (pointwise multiplication and addting up the results) is simply the dot product of these two vectors. In this sense, convolution (this dot product) measures the projection of the neighborhood onto the template. For unit length vectors, the idea response is, of course, the template itself.

However, longer vectors in slightly different directions may give a better result. For this reason, you'll sometimes see the results of one linear operator divided by another.


Point Detectors

The simplest point operator is a high value surrounded by low ones:
 

Figure 17.1: A point detector


Line Detection

Line detectors are shaped like the line they're trying to find: a line of high values surrounded by low ones:
 

Figure 17.2: Line detectors

These line detectors must, however, be tuned to the width of the line they are trying to detect. Consider what happens when the above detectors are applied to a 3-pixel wide line.


Edge Detectors

First-derivative Operators

Most edge detectors are based in some way on measuring the intensity gradient at a point in the image.

As we discussed earlier, the gradient operator is


When we apply this vector operator to a function, we get


As with any vector, we can computes its magnitude and orientation . The gradient orientation gives us the direction of steepest ascent for intensity in the image, and the magnitude gives us the steepness of that ascent.

We discussed earlier what some edge-detection operators look like: typically some sort of differencing operator. Examples of these include the Roberts, Kirsch, Prewitt, and Sobel operators.

Most edges are not, however, sharp dropoffs. They're often gradual transitions from one intensity to another. What usually happens in this case is that you get a rising gradient magnitude, a peak of the gradient magnitude, and then a falling gradient magnitude. Extracting the ideal edge is thus a matter of finding this curve with optimal gradient magnitude. Next lecture, we'll discuss ways of finding these optimal curves of gradient magnitude.

Second-derivative Operators

There are other operators that try to find these peaks in gradient magnitude directly.

Let's first consider the one-dimensional case. Finding the ideal edge is equivalent to finding the point where the derivative is a maximum (for a rising edge) or a minimum (for a falling edge). Figure 7.4 in your reading shows you what this looks like.

How do we find maxima or minima of one-dimensional functions? We differentiate them and find places where the derivative is zero. Differentiating the first derivative (gradient magnitude) gives us the second derivative. Finding optimal edges (maxima of gradient magnitude) is thus equivalent to finding places where the second derivative is zero.

When we apply differential operators to images, the zeroes rarely fall exactly on a pixel. Typically, they fall between pixels. We can isolate these zeroes by finding zero crossings: places where one pixel is positive and a neighbor is negative (or vica versa).

One nice property of zero crossings is that they provide closed paths (except, of course, where the path extends outside the image border). There are, however, two typical problems with zero-crossing methods:

  1. They produce two-pixel thick edges, and
  2. They can be extremely sensitive to noise.

For two dimensions, there is a single measure, similar to the gradient magnitude that measures second derivatives.

Consider the dot product of with itself:


We usually write as . It has special name and is called the Laplacian operator.

When we apply it to a function, we get


One interesting property of the Laplacian is that it is rotationally invariant--it doesn't depend on what directions you choose, so long as they are orthogonal. The sum of the second derivatives in any two orthogonal directions is the same.

We can find edges by looking for zero-crossings in the Laplacian of the image. Let's turn our attention then to neighborhood operators for measuring the Laplacian.

Laplacian Operators

One way to measure the second drivative in one dimension is to extend the forward-differencing first-derivative operator to the forward difference of two forward differences:


We can thus use a simple neighborhood operator to approximate the second derivative in one dimension. The Laplacian is one of these in the x direction added to one of these in the y direction:
 


As mentioned earlier, zero crossings of this operator can be used to find edges, but it is susceptible to noise. This sensitivity comes not just from the sensitivity of the zero-crossings, but also of the differentiation. In general, the higher the derivative, the more sensitive the operator.

Laplacian of Gaussian Operators

One way to deal with the sensitivity of the Laplacian operator is to do what you'd do with any noise: blur it.

If we first blur an image with a Gaussian smoothing operator and then apply the Laplacian operator, we can get a more robust edge-finder.

But why go to all the trouble? Convolution is associative and commutative, so we can combine the Gaussian smoothing operator with the Laplacian operator (by convolving them one with another) to form a single edge-finding operator.

But this is still too much work. Instead of approximating the Laplacian operator with forward differencing and then applying it to a Gaussian, we can simply differentiate the Gaussian


(where ) to get a closed-form solution for the Laplacian of Gaussian:


We can then approximate this over a discrete spatial neighborhood to give us a convolution kernel.

But wait! There's an even better way. We can not only calculate the closed-form solution for a Laplacian of Gaussian, we can compute its Fourier Transorm. We know that the Fourier Transform of a Gaussian is another Gaussian, and we also know that we can differentiate using a ramp function ( or ) in the frequency domain. Multiply together the spectrum of the image, the Fourier Transform of a Gaussian, and two differentiating ramps in the one direction and you have a second-derivative of Gaussian in one direction. Do the same thing in the other direction, add them together, and you have the Laplacian of Gaussian of the image. (See, that stuff from Ch. 3 comes in handy after all.)

Difference of Gaussians Operator

It can be shown that the Laplacian of a Gaussian is the derivative with respect to of a Gaussian. That is, it is the limit of one Gaussian minus a just every so smaller Gaussian. For this reason, we can approximate it as the difference of two Gaussians. This is called the Difference-of-Gaussians or DOG operator.

Also appearing in the literature is the Difference-of-LowPass filters or DoLP operator.

Laplacians and Gradients

Zero-crossings of Laplacian operators only tell you where the edge is (actually between where the edge is), not how strong the edge is. To get the strength of the edge, and to pick which of the two candidate pixels is the better approximation, we can look at the gradient magnitude at the position of the Laplacian zero crossings.

Combined Detection

If one operator works well, why not use several? This seems like a reasonable thing to do, but the way to do it is nontrivial. Section 7.1.4 of your text gives a detailed dsecription of this technique. We won't go into it in class, but understand that it is trying to combine information from multiple orthogonal operators in a vector space and then project the results to the "edge subspace".


The Effect of Window Size

The results you get with a , , or operator will often be different. What is the correct window size? Well, there isn't one standard one that will work under every circumstance. Sometimes you have to play with it for your application. Smart vision programs will often use multiple windows and combine the results in some way.


Vocabulary



beginning gif up gif contents news

Next: Segmentation: Edge Linking and Boundary Detection
Up: Lectures
Previous: Image Compression

© Bryan S. Morse, 1995