Next: Descriptors
Up: Lectures
Previous: Boundary Representation
Last updated on Wednesday, November 8, 1995 at 9:00 AM.
Gonzalez and Woods, 8.1.5
All of the methods we've discussed so far have involved encoding the
boundary in some fashion.
This gives us ready access to certain parameters of the shape, but
not to others.
Suppose we wanted to know how wide the object was, or how many
branches it made, etc.
These are things not readily available from the boundary
description.
In a sense these are structural properties of the shape that go
beyond simple boundaries.
To infer larger structure from a shape requieres that we be able to
group together parts of the boundaries that aren't necessarily
adjacent.
For example, there is an immediacy to the way we see the two sides of
an object.
If we look at telephone poll, for example, we can make immediate
judgments of how thick or thin it is.
We do so by pairing up the two opposide sides and comparing them, even
without having to trace completely around the object to ensure that
they are indeed part of the same contour.
Once we pair sides in this fashion, it's reasonable to start talking
about object middles.
These middles give structure to the shape, allowing us to identify
trunks, arms and branches, narrowing or widening sections, etc.
Such middle-related strctures are called medial representations.
The most common form of medial representation is Blum's medial
axis. (In later years, Harry Blum wrote that he prefered to call it
the symmetric axis, but others continue to call it the medial
axis as part of the larger class of medial representations.)
Blum's medial axis can be defined in a few different intuitive ways,
all equivalent:
- Grassfire Analogy.
-
Consider a fire started at the boundary that spreads with uniform
speed and burns everything in its path. As the fire spreads into the
middle, each part eventually meets other parts of the fire started
at other parts of the boundary.
When these two wavefronts meet, they quench each other because there
is nothing left for either to burn.
Since the fire spreads at uniform speed, these "quench points" must
be equidistant from two different parts of the boundary.
The locus of these quench points is the medial axis.
- Maximal Discs.
-
Suppose that at each point of the boundary you placed a circle on the
inside of the boundary such that it was tangent to it.
If you start with a very small circle and expand it outwards, it will
eventually "bump into" other parts of the boundary. When you've
grown it as large as possible, you'll have a circle that has the
following properties:
- It is entirely within the object, and
- It is tangent to the object at more than one point
The locus of centers of all such circles is the medial axis.
- Ridges in a Distance Transform.
-
A distance transform assigns to each point in the image a
minimal distance from some locus of points.
If we build a distance transform from the boundary of the object, we
end up with zero at the boundary, increasing values just inside it,
yet higher values farther inside, etc.
As we progress from away from a particular point on the boundary, we
find increasing values in the distance map until we reach a point as
which we are now closer to some other part of the boundary than the
place that we left.
So, the distance increases as you progress inwards until it forms a
ridge, a locus of points with values higher than those just
off the ridge on either side.
The locus of these ridges in the distance transform is the medial
axis.
There are, of course, various mathematical definitions in the
literature as well. They come in different forms, but they typically
correspond to one of these three intuitive definitions.
The medial axis is a continuous structure that may branch in places.
Each branch corresponds to a different structure of the object.
It these branches that give the structural decomposition of the
medial axis.
These branches also form a hierarchy. Each bump on an arm of the
shape is a branch of a branch.
At each point of the medial axis, we also have a corresponding r
value: the radius of the maximal disc, the distance from the
grassfire, or the distance from the distance map.
If we let s be arc length along any given segment of the axis,
we have a one-dimensional function
along the segment.
If
is positive, this part of the object widens as we
progress along the axis segment; if it's negative the part narrows.
The behavior of r is limited by

The easiest definition for which to consider the various properties
of the medial axis is the maximal-disk one.
Consider one such disk.
As mentioned earlier, it touches the boundary in more than one
place.
These points can come in different flavors:
- Two disjoint sets of points.
-
Most of the axis points fall into this category.
- Three disjoint sets of points.
-
These are branch points.
- Exactly one contiguous set of points.
-
These are end points.
Notice that its possible for the medial disc to touch the boundary in
a set of contiguous points rather than a single one. In other words,
the boundary is locally circular.
These sets of points don't really change the properties of the
resulting axis--it's the number of disjoint sets of points, not the
actual number of points themselves, that are important.
Sounds great so far, right?
Well, the medial axis has one problem--it is extremely sensitive to
noise in the boundary.
Since the axis has a branch for each structural piece of the object,
it has to spawn a new branch for each little bump and wiggle.
The length of the new branch isn't necessarily reflective of the
significance of the variation. It must extend all the way out to the
base of the bump itself.
Because of this sensitivity to noise, many researchers have studied
ways to prune the noisy tree-like structure produced by the medial
axis.
Ways to do this include
- Smoothing the boundary before computing the axis,
- Pairing the boundary points to the corresponding axis segments
and pruning based on length of the boundary (Ogniewicz, 1992).
By ordering the segments according to the order in which they would
be pruned, you can organize them into a top-down hierarchy.
Pruning away segments or arranging them in order of significance can
(to a limited degree) let you compare similar but slightly different
objects.
Let's consider the maximal disc in Blum's definition.
As already discussed, it touches the boundary in at least two
places.
-
The center of the circle lies on the Blum symmetric axis.
-
The midpoint of the chord between the two boundary points lies on
Brady's Smooth Locus of Symmetries or SLS.
-
The midpoint of the chord between the two boundary points lies on
Leyton's Process-Induced Symmetric Axis of PISA.
Brady's SLS definition seems to be more intuitive than Blum's because
the axis really does lie directly between the related points.
It is more complex to compute, however.
Leyton's PISA definition isn't quqite as intuitive, but the resulting
representation seems to describe the way you have to push in or push
out the boundary to produce the shape (hence the "process-induced"
in the name).
Another method, which approaches the Blum definition when taken to
the continuous limit, is the Voronoi Skeleton. I'll come back
to this later.
One can use the definitions of the axis to precisely determine the
medial axis.
There are, however, two problems:
- Computational expense.
- The discrete pixel grid.
For this reason, various algorithms have been produced that produce
medial axes that are approximations to Blum's.
One way to find the middle of something is through a
process called thinning.
Suppose that we successively peal away thin layers from the object,
but only if they don't get rid of single-pixel thick parts of the
remaining object.
Your text gives a detailed algorithm for this, so I won't go into
detail. (We'll discuss thinning methods in more detail later anyway.)
This thinning method is analogous to the grassfire definition of the
medial axis.
What is left from this thinning is a structure called a
skeleton, similar in principle to the medial axis.
Another way to compute medial axes is to use the distance-map
definition. We can explicitly compute a distance map and then
analyze it to find ridges.
This requires two steps:
- Generate the distance map, and
- Find its ridges.
Let's talk about computing distance maps.
The obvious way is to explicitly find the minimum distance from each
pixel to each of the boundary pixels. If there are
pixels and M boundary points, this is
complexity--not
cheap.
One can compute less accurate distance maps using more efficient
methods. (There are also more efficient methods for precise maps.)
For these methods, we simply use a different measure of distance.
Remember the city block and chessboard distances we discussed
earlier?
If we use a city block distance measure, we can propogate the
minimum distance from the boundary by simply labeling each pixel i
with

That is, we simply find the 4-connected neighbor that is closest to a
neighhbor and add 1.
We can generalize this by using

and expanding the "neighborhood" of i.
If we expand to 8-connected neighbors, we could use the chessboard
distance or we could more closely approximate the Euclidean distance
by letting

Since we don't really care about the actual distance, just something
proportional to the distance, we can approximate these distances with
integers that have approximately the same ratio as
.
For example, values of 7 and 5 are commonly used.
We can extend this idea yet further by extending the neighborhood to
pixels that are farther away.
Instead of looking at a
area, we could look at a
one or larger.
This decreases the computational advantage but also decreases the
error between the distance measure and a Euclidean one.
Remember from an earlier lecture that distance measures have the
following properties:
The first property ensures that our recursive distance-computing
method produces a monotonically increasing result.
The second property ensures that the result is lower-bounded by the
Euclidean distance.
There are many other uses for distance maps in Computer Vision.
We probably won't talk about others this semester, but be aware of
them.
We won't talk right now about finding ridges in functions, but we'll
come back to it later when we discuss geometric methods for local
analysis of functions (images).
If we distribute discrete points along the boundary and compute the
Voronoi diagram for the points (the locus of all points that are no
nearer to one point than another), contained within the Vornoi
diagram is a skeletal structure for the shape. If we take this to
the limit by adding more and more points along a continous contour,
this Voronoi skeleton approaches the Blum medial axis.
[Make sure to convince yourself why.]
We don't have to go to this limit, though, to get a usable skeletal
structure.
The medial structure of a three-dimensional shape isn't composed of
one-dimensional curve segments but two-dimensional manifolds within the
three-dimensional space.
These medial manifolds have useful applications for
three-dimensional vision.
- Medial Axis
- Symmetric Axis
- Smooth Locus of Symmetries
- Process-Induced Symmetric Axis
- Distance Map
- Voronoi Skeleton
Next: Descriptors
Up: Lectures
Previous: Boundary Representation
© Bryan S. Morse, 1995