beginning gif up gif contents news

Next: Segmentation: Thresholding
Up: Lectures
Previous: Segmentation: Edge Linking and Boundary Detection

 

Segmentation: The Hough Transform

Last updated on Friday, October 27, 1995 at 9:00 AM.


Reading

Gonzalez and Woods, 7.2.2, Notes


Introduction

Today we'll continue our discussion of how to group isolated edge points into image structures. So far, we've talked about edge linking, edge relaxation, and optimal edge determination using graph searching. All of these require some continuous path of edge pixels, near edge pixels, or edge-cost information.

The technique we'll be talkinga about today, called the Hough Transform, doesn't require connected or even nearby edge points.


Parametric Representations of Primitives

Let's first talk about finding edges that we know are straight lines. Consider a single isolated edge point --there could be an infinite number of lines that could pass through this point.

Each of these lines can be characterized as the solution to some particular equation. The simplest form in which to express a line is the slope-intercept form:


where m is the slope of the line and b is the y-intercept (the y value of the line when it crosses the y axis). Any line can be characterized by these two parameters m and b.

We can characterize each of the possible lines that pass through point as having coordinates in some slope-intercept space. In fact, for all the lines that pass through a given point, there is a unique value of b for m:


The set of values corresponding to the lines passing through point form a line in space.

Every point in image space corresponds to a line in parameter space and each point in space corresponds to a line in image space .


Accumulators

The Hough transform works by letting each feature point vote in space for each possible line passing through it. These votes are totalled in an accumulator.

Suppose that a particular has one vote--this means that there is a feature point through which this line passes. What if it has two votes? That would mean that two feature points lie on that line. If a position in the accumulator has n votes, this means that n feature points lie on that line.


The Hough Transform Algorithm

The algorithm for the Hough transform can be expressed as follows:


For finding lines, each feature point casts a line of votes in the accumulator.


A Better Way of Expressing Lines

The slope-intercept form of a line has a problem with vertical lines: both m and b are infinite.

Another way of expressing a line is in form:


One way of interpreting this is to drop a perpendicular from the origin to the line. is the angle that the perpendicular makes with the x-axis and is the length of the perpendicular. is bounded by and is bounded by the diagonal of the image.

Instead of making lines in the accumulator, each feature point votes for a sinusoid of points in the accumulator. Where these sinusoids cross, there are higher accumulator values. Finding maxima in the accumulator still equates to finding the lines.


Circles

We can extend the Hough transform to other shapes that can be expressed parametrically. For example, a circle of fixed radius can be described fully by the location of its center .

Think of each feature (edge) point on the circle as saying, "if I'm on the circle, the center must be in one of these places". It turns out that the locus of these votes is itself a circle.

But what about circles of unknown size? In this case, we need a third parameter: the radius of the circle. So, we can parameterize circles of arbitrary size by . Instead of casting votes in a circular pattern into a two-dimensional accumulator, we cast votes in circles of successively larger size in a three-dimensional accumulator.


More Complicated Shapes

We can extend the Hough transform to find shapes of arbitrary complexity, so long as we can describe the shape with some fixed number of parameters. The number of parameters required to describe the shape dictates the dimensionality of the accumulator.


The Generalized Hough Transform

Some shapes may not be easily expressed using a small set of parameters. In this case, we must explicitly list the points on the shape.

Suppose that we make a table that contains all of the edge pixels for our target shape. We can store for each of the pixels its position relative to some reference point for the shape. We can then feature point "think" as follows: "if I'm pixel i on the boundary, the reference ooint must be at ."

This is called the Generalized Hough Transform and can be expressed as follows:


Different Orientations

For the line-matching Hough Transform, the orientation of the line is one of the parameters. For the circular form of the transform, orientation is unnecessary. For other methods, you can choose whether or not to use orientation as one of your parameters.

If you don't use an orientation parameter, you will only be able to find matches in a specific orientation.

If you do include an orientation parameter, you'll have a larger-dimensional accumulator, but you'll be able to find matches in various orientations (as many as you want to discretely parameterize).


Discrete Accumulator

This discussion is all well and good for continuous parameterizations, but we need to discretize the parameter space. The granularity with which we discretize the accumulator for the parameter space determines how accurate we'll be able to position the sought-after target.


Smoothing the Accumulator

Because of noise, discretization of the image and accumulator, and factors inherent in the application itself, we sometimes want to allow a little tolerance in the fitting of lines or other shapes to the edge pixels. We can do this by allowing a feature point to vote note just for a sharp line or curve in the accumulator, but to cast fuzzy votes for nearby possibilities. In essence, this votes not just for all lines or other shapes that pass through the feature point but also for those that pass close by.

This casting of votes for nearby possibilities can be done directly during the voting phase of the transform or afterwards through a post-processing blurring of the accumulator. The way in which you blur the accumulator depends on what nearby possibilities you want to allow.


Finding Relative Maxima in the Accumulator

If you're just looking for one thing, simply pick the largest value in the accumulator and map it back to the image space. If you're looking for other shapes, you need to find all relative maxima.

As a further test, you may want to threshold the maxima that you find by the number of points that voted for it. Relative maxima with few votes probably aren't real matches.

You'll find that this part is the real trick of getting the Hough transform to work.


Grey-level Voting

The discussion so far has been of binary feature points casting binary votes. A more effective way of doing this is to use the full range of grey-level feature detection (i.e., gradient magnitude) and cast votes proportional to the strength of the feature.


Refining the Accumulator

In the Hough transform, feature points vote for all possibilities through that point. This can cause unwanted clutter in the accumulator and false matches.

A way to deal with this is to do a second voting phase. In this phase, each point examines the locus of points in the accumulator that it voted for and finds the maximum. It then casts a single vote into a second accumulator only for this maximum. In other words, it sees where there is agreement with its neighbors and then goes along with the crowd. Gerig (1987) has shown that this removes this unnecessary clutter and leadsto only a few isolated points in the accumulator.

As another approach, suppose that after we find the global maximum we remove all votes cast by feature points on or near the corresponding match in the image space. We can then continue the process by looking for the next largest global maximum, remembering it, and removing the votes from it's points.


Vocabulary



beginning gif up gif contents news

Next: Segmentation: Thresholding
Up: Lectures
Previous: Segmentation: Edge Linking and Boundary Detection

© Bryan S. Morse, 1995