beginning gif up gif contents news

Next: Medial Representations
Up: Lectures
Previous: Segmentation: Region Growing

 

Boundary Representation

Last updated on Monday, November 6, 1995 at 9:00 AM.


Reading

Gonzalez and Woods, Ch 8 through 8.1.4.

 


Introduction

In the last chapter, we discussed ways of segmenting an image into boundaries or regions.

In this chapter, we'll discuss how to represent the objects in an image once they are identified.

It's simple to describe a region by simply listing its points, but that doesn't really tell us much about it. Sure, we could count the points to give the area, or we could match sets of points to see if there's an exact match, but otherwise there aren't many properties of the object that we could identify from just a list of points.

First, we'll discuss how to encode the boundary or region (representation), and then we'll discuss how to describe each in ways that distill certain properties from the object (descriptors).

In this lecture, we'll discuss boundary representations. In the next lecture we'll discuss something called medial representations.

 


Chain Codes

Remember how we compressed a signal by only encoding the difference between successive values, not the full values themselves? Well, we can do the same thing with boundaries. Once we have identified the boundary pixels of a region, we can encode them by only storing the relative difference as we go from pixel to pixel clockwise around the contour.

If the contour is 4-connected, we only need four values to encode the direction to the next pixel. If the contour is 8-connected, we need eight numbers. See Figs. 8.1 and 8.2 in your text for an example.

By encoding relative, rather than absolute position of the contour, the representation is translation invariant. We can match boundaries by comparing their chain codes, but with the following problems:

Derivatives of Chain Codes

We can deal (somewhat) with the rotational variance of chain codes by encoding not the relative direction, but differences in the successive directions. This can be computed by subtracting each element of the chain code from the previous one and taking the result modulo n, where n is the connectivity.

This differencing allows us to rotate the object in 90-degree increments and still compare the objects, but it doesn't get around the inherent sensitivity of chain codes to rotation on the discrete pixel grid.

Resampling Chain Codes

To deal wit the sensitivity issue, we can resample the boundary onto a coarser grid and then compute the chain codes of this coarser representation. This has the advantage of smoothing out small variations, but can (for small objects) lose significant structure.

 


psi-s Curves

Another method, similar to chain codes, is to store the tangent vector at every boundary point. In fact, we don't really care about the magnitude of the tangent, only the direction. So, we can encode the angle of the tangent as a function of arc length s around the contour. Hence, this representation is called a -s curve.

Since we're encoding the angle of the tangent vector, we can choose some arbitrary reference frame (zero-degree tangent). If we let the tangent at the starting point have an angle of zero, the rest of the curve is described relative to it. Thus, our -s curve always starts and ends at zero.

The -s representation has the following advantages:

We are still limited when comparing curves with different starting points, but if we rotate the s axis of the -s representation and shift the axis so that the starting point is at zero, we can compare such curves.

 


Polygonal Approximation

Another method for compacting the boundary information is to fit line segments or other primitives to the boundary. We then need only to store the parameters of these primitives instead of discrete points. This is useful because it reduces the effects of discrete pixelization of the contour.

We can fit such lines using either merging methods or splitting methods.

Merging Methods

Merging methods add successive pixels to a line segment if each new pixel that's added doesn't cause the segment to deviate from a straight line too much.

For example, choose one point as a starting point. For each new point that we add, let a line go from the start point to the new point. Then, compute the squared error or other such measure for every point along the segment/line. If the error exceeds some threshold, keep the line from the start to the previous point and start a new line.

If we have thick boundaries rather than single-pixel thick ones, we can still use a similar approach called tunneling. Imagine that we're trying to lay straight rods along a curved tunnel, and that we want to use as few as possible. We can start at one point and lay as long a straight rod as possible. Eventually, the curvature of the "tunnel" won't let us go any further, so we lay another rod and another until we reach the end.

Splitting Methods

Splitting methods work by first drawing a line from one point on the boundary to another. Then, we compute the perpendicular distance from each point along the segment to the line. If this exceeds some threshold, we break the line at the point of greatest error. We then repeat the process recursively for each of the two new lines until we don't need to break any more.

For a closed contour, we can find the two points that lie farthest apart and fit two lines between them, one for one side and one for the other. Then, we can apply the recursive splitting procedure to each side.

 


Other Signatures

In general, a signature is any 1-D function representing the 2-D boundary. Chain codes, differenced chain codes, -s curves, etc. are all signatures. Many other forms of signatures have been proposed in the computer vision literature.

One other form of signature is to encode the radial distance r from the center of the object as a function of angle theta. This method is rotationally variant, but if (as with signatures we've already looked at) you rotate the axis of the function you can compare at different angles. This particular method is also sensitive to scaling, but it has the nice property, not found in the others yet discussed, that a scaled version of the same object produces a scaled version of the signature. We can normalize the signature by some other feature of the object, such as maximum radial distance r, length of the longest axis, etc.

 


Boundary Segments

Yet another way of describing an object is to determine the convex hull of the object. A convex hull is a minimal convex shape entirely bounding an object. Imagine a rubber band snapped around the shape and you have the convex hull. We can look at where the convex hull touches the contour and use it to divide the boundary into segments. These points are the extremal points of the object.

 


Points of Extreme Curvature

Another useful method for breaking a boundary into segments is to identify the points of maximum positive or minimum negative curvature. These extremes are where the object has exterior or interior corners and make useful key points for analysis.

 


Conclusion

Notice that as we progress from one representation to another, we're basically trying to do three things:

  1. distill more and more shape information from the representation,
  2. increase the number of transformations to which the representation is invariant, and
  3. decrease the sensitivity of the representation to noise or other object perturbations.

Much of my own recent research has been oriented towards these areas, particularly the last one.

After we talk about medial representations, we'll start talking about extracting specific features of objcts that can be used to distinguish between one object and another: area, perimeter, etc.


Vocabulary



beginning gif up gif contents news

Next: Medial Representations
Up: Lectures
Previous: Segmentation: Region Growing

© Bryan S. Morse, 1995