Compression Research
Automatic Line Extraction
Figure 1: Noisy image
Figure 2: Lines automatically extracted from the image. These raw lines can
subsequently be approximated and stored as Bezier curves, for example.
Many images, including digital documents, are based on lines. Standard
compression algorithms, however, are based on pixels and/or pixel regions.
If the lines of a digital document are extracted, the amount of space necessary
to store the line information is much smaller than the space necessary to
store the pixel information.
Digital scans of physical documents, however, are rarely perfect: noise is
usually present, and gaps can appear in the salient lines.
Our algorithm, based on the cost expansion algorithm developed for "intelligent
scissors," automatically extracts lines from noisy images, bridging
arbitrarily large gaps in image data.
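As a rough illustration of how a cost-expansion search can bridge a gap, the sketch below runs Dijkstra's algorithm, the expansion used by intelligent scissors, over a tiny bi-tonal grid. The cost values (1 for a line pixel, 5 for a background pixel), the 4-connected neighborhood, and the name `min_cost_path` are assumptions made for this example only. They are not the costs or the implementation used in our algorithm.

```python
import heapq

LINE_COST = 1        # assumed cost of stepping onto a line (ink) pixel
BACKGROUND_COST = 5  # assumed cost of stepping onto a background pixel

def min_cost_path(grid, start, goal):
    """Return (path, cost) for the cheapest 4-connected path from start to goal.

    grid is a list of rows; 1 marks a line pixel, 0 marks background.
    start and goal are (row, col) tuples.
    """
    rows, cols = len(grid), len(grid[0])
    best = {start: 0}
    parent = {}
    frontier = [(0, start)]
    while frontier:
        cost, node = heapq.heappop(frontier)
        if node == goal:
            break
        if cost > best[node]:
            continue  # stale queue entry
        r, c = node
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                step = LINE_COST if grid[nr][nc] else BACKGROUND_COST
                new_cost = cost + step
                if new_cost < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = new_cost
                    parent[(nr, nc)] = node
                    heapq.heappush(frontier, (new_cost, (nr, nc)))
    # Walk back from the goal to recover the path.
    path = [goal]
    while path[-1] != start:
        path.append(parent[path[-1]])
    path.reverse()
    return path, best[goal]

# A horizontal line on row 1 with a two-pixel gap at columns 3 and 4.
grid = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 1, 1, 1],
    [0, 0, 0, 0, 0, 0, 0, 0],
]
path, cost = min_cost_path(grid, (1, 0), (1, 7))
```

Because background pixels are merely expensive rather than forbidden, the cheapest path crosses the gap directly instead of stopping at it. In this toy grid the path stays on row 1 from end to end.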
Once the lines have been extracted, they may be approximated by Bezier
curves or straight line segments and then stored in this compact form.
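The text above does not say which approximation method is used. As one possibility for the straight-line case, the sketch below applies the Ramer-Douglas-Peucker simplification, a standard technique that is swapped in here purely for illustration. The sample points and the tolerance of 1.5 pixels are hypothetical.

```python
import math

def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def simplify(points, tolerance):
    """Ramer-Douglas-Peucker: keep only the points needed to stay within tolerance."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    # Find the interior point farthest from the chord first-last.
    index, worst = 0, 0.0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], first, last)
        if d > worst:
            index, worst = i, d
    if worst <= tolerance:
        return [first, last]
    left = simplify(points[: index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right  # drop the duplicated split point

# Hypothetical extracted line: a slightly noisy L-shaped stroke, one point per pixel.
raw = [(0, 0), (1, 0), (2, 1), (3, 0), (4, 0), (5, 0),
       (5, 1), (5, 2), (6, 3), (5, 4), (5, 5)]
segments = simplify(raw, tolerance=1.5)
```

Here eleven raw points reduce to the three corner points of the stroke, which is the kind of saving that makes the line representation compact.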
JBIG Compression
(ITU-T Recommendation T.82)
In addition to the above compression research, we have compared existing
technologies for the compression of bi-tonal and grayscale images.
For the purpose of browsing document images, JBIG compression was found
to be the most applicable current technology, and it is the compression
currently used in the Just-in-time Browsing research. JBIG allows
hierarchical encoding of bi-tonal images, and encodes the images in horizontal
regions called stripes. By applying the JBIG algorithm to multiple
bit planes of grayscale images, we will be able to preserve the grayscale
information that is important for browsing some images.
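To make the bit-plane idea concrete, the sketch below splits an 8-bit grayscale image into eight bi-tonal planes and rebuilds it from however many planes have arrived. Each plane would then be handed to a bi-tonal coder such as JBIG, which is not shown. The function names and the plain binary weighting are assumptions for this example; a Gray-coded mapping of pixel values is a common alternative.

```python
def to_bit_planes(pixels, bits=8):
    """Split rows of grayscale values into bi-tonal planes, most significant first."""
    return [
        [[(value >> bit) & 1 for value in row] for row in pixels]
        for bit in range(bits - 1, -1, -1)
    ]

def from_bit_planes(planes, bits=8):
    """Rebuild grayscale values from the planes received so far.

    Planes that have not arrived yet (the low-order ones) count as 0.
    """
    rows, cols = len(planes[0]), len(planes[0][0])
    image = [[0] * cols for _ in range(rows)]
    for index, plane in enumerate(planes):
        weight = 1 << (bits - 1 - index)
        for r in range(rows):
            for c in range(cols):
                image[r][c] += plane[r][c] * weight
    return image

gray = [[0, 64, 128], [200, 255, 37]]
planes = to_bit_planes(gray)
exact = from_bit_planes(planes)        # all eight planes: identical to gray
coarse = from_bit_planes(planes[:2])   # only the two most significant planes
```

Sending the most significant planes first gives a coarse four-level image early, and the remaining planes refine it.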
Progress of Just-in-time Browsing Research
- Preliminary Image Preparation - The JBIG algorithm is used to compress
the images in a hierarchical manner. To allow random access to the parts
of the image that the user interactively requests, we encode the image
with an explicit reset of the arithmetic encoder's per-context probability
state after each stripe, and we record where the data for each stripe is
located within the encoded data for the entire image.
- Changes to JBIG Compression - Because the arithmetic encoder must be
reset after each stripe, there is a slight loss in compression when
encoding images. In an attempt to offset this loss, we looked into saving
non-default probability state values for certain types of document images;
these values could be stored locally and used as reset values. This
approach did not produce positive results, so these changes were
abandoned.
- Preliminary Server - The server responds to network requests from the
browser by sending the encoded data for the requested image stripes.
- Preliminary Just-in-time Browser application - The preliminary version
of the browser allows interactive hierarchical browsing of bi-tonal images
of relatively small sizes (about the size of the screen or smaller) over
the Internet or locally.
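The stripe bookkeeping described above can be sketched as follows. The Python standard library has no JBIG coder, so `zlib` stands in for it here; compressing each stripe with a fresh compressor plays the role of resetting the coder state per stripe. The function names, the (offset, length) index layout, and the sample image are all assumptions for illustration, not the actual server code.

```python
import zlib

def encode_stripes(rows, rows_per_stripe):
    """Compress each stripe independently and record where its bytes start.

    zlib is only a stand-in for the JBIG coder. Starting a fresh compressor
    for every stripe mimics resetting the coder state after each stripe.
    """
    data = bytearray()
    index = []  # one (offset, length) pair per stripe
    for top in range(0, len(rows), rows_per_stripe):
        stripe = b"".join(rows[top : top + rows_per_stripe])
        encoded = zlib.compress(stripe)
        index.append((len(data), len(encoded)))
        data += encoded
    return bytes(data), index

def fetch_stripe(data, index, stripe_number):
    """Return the encoded bytes a server would send for one requested stripe."""
    offset, length = index[stripe_number]
    return data[offset : offset + length]

# Hypothetical 8-row image, 4 bytes (32 bi-tonal pixels) per row.
rows = [bytes([r, r, 255 - r, 0]) for r in range(8)]
data, index = encode_stripes(rows, rows_per_stripe=2)

# Decode stripe 2 (rows 4 and 5) without touching stripes 0, 1, or 3.
stripe = zlib.decompress(fetch_stripe(data, index, 2))
```

Because no stripe depends on coder state left over from an earlier stripe, any stripe can be decoded on its own. That independence is what the per-stripe reset buys, at the cost of the slight compression loss noted above.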


Figure 3: An 864 x 956 pixel document image after receiving
510 bytes, 947 bytes, and 11,276 bytes.
While a user would normally have to download all 15,251 bytes of a GIF
image (96,023 bytes if JPEG) to see if it is the image they want, the Just-in-time
Browser makes it apparent that the document is a letter after receiving
only 510 bytes of data (including response header information and information
about the image dimensions, number of bit planes, etc.). After a
total of 947 bytes, the user is also aware that the letter is to Pete,
and that it is signed by Phil. By this point, most users would have
a good idea whether this image is the document they want.
The entire high-resolution image is displayable after receiving 11,276 bytes
from the server.
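The byte counts quoted above can be put in proportion with a few lines of arithmetic. The stage labels and the helper name `percent_of` are ours, added for this example; the byte counts are the ones given in the text.

```python
GIF_BYTES = 15_251
JPEG_BYTES = 96_023
STAGES = {
    "letter recognizable": 510,
    "names readable": 947,
    "full resolution": 11_276,
}

def percent_of(part, whole):
    """Share of `whole` taken by `part`, as a percentage rounded to one decimal."""
    return round(100 * part / whole, 1)

for label, received in STAGES.items():
    print(f"{label}: {received} bytes, "
          f"{percent_of(received, GIF_BYTES)}% of the GIF, "
          f"{percent_of(received, JPEG_BYTES)}% of the JPEG")
```

With these figures, the document type is recognizable after about 3.3% of the GIF's byte count, the names after about 6.2%, and the full image after about 73.9%.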
- Some work was done on browsing grayscale images, as well as images
pre-processed to be interpreted as images with multiple columns instead
of just stripes; however, more remains to be done in both areas.
- Digital Microfilm Server - A preliminary server has been written to
support the Digital Microfilm paradigm for quickly scanning through document
images, which is described below.
Future plans for Just-in-time Browsing
- Generalize Just-in-time Browsing to easily browse arbitrarily large
images.
- Generalize Just-in-time Browsing to handle an arbitrary number of grayscale
or color bit-planes.
- Decrease the time users spend waiting by requesting and receiving network
data in the background, allowing continued user interaction.
- Digital Microfilm - A user interface modeled after microfilm/microfiche
viewers has been planned. It will allow users to browse through
hundreds or even thousands of related document images much more quickly
by automatically loading the low-resolution images as the user scrolls
through the "film" and by allowing the user to interactively
browse whichever images they think may be of interest. The
general look of the interface will be as follows:
Figure 4: The Digital Microfilm interface will allow users to quickly scan
through large collections of document images and look more closely at any
image with Just-in-time Browsing.
- Research local storage of specialized private deterministic prediction
tables for more efficient JBIG encoding of specialized document types,
as well as possible modifications to the JBIG algorithm to improve
compression efficiency.
- Improve browsing productivity for images that are similar in appearance
by using browsing templates, which define the parts of the images that contain
the best identifying features. These portions of the image can then
be automatically loaded at a higher resolution to minimize the user interaction
necessary for browsing.
- Experiment with incorporating the Automatic Line Extraction compression
currently being researched to improve compression rates, thus further
reducing network bandwidth and server load.
- Research the possibility of using pattern recognition techniques to
automatically determine the most important distinguishing parts of similar
images to use as adaptable browsing templates.
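One way a browsing template might drive stripe requests is sketched below. It is a planning aid only: the template format (a list of inclusive row ranges), the stripe height of 64 rows, the image height of 956 rows, and the function name are all hypothetical and are not taken from the existing browser.

```python
def stripes_for_template(regions, rows_per_stripe, image_height):
    """Return the sorted stripe numbers that overlap any template region.

    Each region is a (top_row, bottom_row) pair, inclusive, in image rows.
    Regions are clipped to the image before being mapped to stripes.
    """
    wanted = set()
    for top, bottom in regions:
        first = max(0, top) // rows_per_stripe
        last = min(bottom, image_height - 1) // rows_per_stripe
        wanted.update(range(first, last + 1))
    return sorted(wanted)

# Hypothetical template for letters: salutation near the top, signature near the bottom.
letter_template = [(120, 160), (800, 860)]
stripes = stripes_for_template(letter_template, rows_per_stripe=64, image_height=956)
```

A browser could request just these stripes at a higher resolution as soon as the low-resolution overview arrives, so that the identifying parts of each letter sharpen without any user interaction.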