Compression Research
 

Automatic Line Extraction


Figure 1: Noisy image


Figure 2: Lines automatically extracted from the image. These raw lines can
subsequently be approximated and stored as Bezier curves, for example.


Many images, including digital documents, are based on lines. Standard compression algorithms, however, operate on pixels or pixel regions. If the lines of a digital document are extracted, the space needed to store the line information is much smaller than the space needed to store the pixel information.
 

Digital scans of physical documents, however, are rarely perfect. Noise is usually present, and gaps can appear in the salient lines.
 

Our algorithm, based on the cost expansion algorithm developed for "intelligent scissors," automatically extracts lines from noisy images, bridging arbitrarily large gaps in image data.
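The idea of bridging a gap by cost expansion can be illustrated with a small sketch. It is not the algorithm described above, only a generic Dijkstra-style expansion over an 8-connected pixel grid, assuming a hypothetical cost map in which ink pixels are cheap and background pixels are expensive. The function name `min_cost_path` and the cost values are assumptions made for illustration.

```python
import heapq
import math

def min_cost_path(cost, start, goal):
    """Expand outward from `start` in order of accumulated cost (Dijkstra)
    and return the cheapest 8-connected path to `goal` as (row, col) pairs.

    `cost[r][c]` is the (assumed) price of stepping onto pixel (r, c):
    low on ink, high on background, so the path prefers to follow lines
    but can still cross a gap when no cheaper route exists.
    """
    rows, cols = len(cost), len(cost[0])
    best = {start: 0.0}      # cheapest known cost to reach each pixel
    parent = {}              # back-pointers used to rebuild the path
    heap = [(0.0, start)]
    while heap:
        total, (r, c) = heapq.heappop(heap)
        if (r, c) == goal:
            break
        if total > best[(r, c)]:
            continue         # stale heap entry
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                if dr == 0 and dc == 0:
                    continue
                nr, nc = r + dr, c + dc
                if not (0 <= nr < rows and 0 <= nc < cols):
                    continue
                # Diagonal steps are longer, so scale their cost.
                scale = math.sqrt(2.0) if dr != 0 and dc != 0 else 1.0
                new_total = total + cost[nr][nc] * scale
                if new_total < best.get((nr, nc), math.inf):
                    best[(nr, nc)] = new_total
                    parent[(nr, nc)] = (r, c)
                    heapq.heappush(heap, (new_total, (nr, nc)))
    # Walk the back-pointers from goal to start.
    path = [goal]
    while path[-1] != start:
        path.append(parent[path[-1]])
    path.reverse()
    return path
```

With a cost map whose middle row is ink except for a one-pixel gap, the returned path runs straight along that row and crosses the gap, because any detour has to cross background pixels as well.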
 

Once the lines have been extracted, they may be approximated by Bezier curves or straight line segments and then stored in this compact form.
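As one possible illustration of the straight-line-segment case, the sketch below reduces a chain of extracted pixel coordinates to a few vertices using the Ramer-Douglas-Peucker simplification. The source does not say which approximation method is used, so this technique, the function names, and the `tolerance` parameter are assumptions.

```python
import math

def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b (all are (x, y) tuples)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def simplify(points, tolerance):
    """Ramer-Douglas-Peucker: keep only the vertices needed so that every
    dropped point lies within `tolerance` of the simplified polyline."""
    if len(points) < 3:
        return list(points)
    worst_index, worst_distance = 0, -1.0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], points[0], points[-1])
        if d > worst_distance:
            worst_index, worst_distance = i, d
    if worst_distance <= tolerance:
        return [points[0], points[-1]]
    # Split at the farthest point and simplify each half.
    left = simplify(points[:worst_index + 1], tolerance)
    right = simplify(points[worst_index:], tolerance)
    return left[:-1] + right

raw_line = [(0, 0), (1, 0.1), (2, -0.1), (3, 0), (4, 0), (4, 1), (4, 2), (4, 3)]
print(simplify(raw_line, tolerance=0.5))
```

Storing three vertices instead of eight raw points is the kind of saving the compact form is meant to give; a Bezier fit would follow the same pattern with control points in place of vertices.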
 
 

JBIG Compression
(ITU-T Recommendation T.82)


In addition to the compression research above, we have compared existing technologies for compressing bi-tonal and grayscale images. For browsing document images, JBIG compression was found to be the most applicable current technology, and it is the compression currently used in the Just-in-time Browsing research. JBIG allows hierarchical encoding of bi-tonal images and encodes each image in horizontal regions called stripes. By applying the JBIG algorithm to multiple bit planes of grayscale images, we will be able to preserve the grayscale information that is important for browsing some images.
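To make the bit-plane idea concrete, the following sketch splits a grayscale image into bi-tonal planes, each of which could then be handed to a bi-tonal coder such as JBIG. It does not call any JBIG library, and the helper names and the 8-bit depth are assumptions for illustration.

```python
def split_bit_planes(pixels, bits=8):
    """Split a 2D list of gray values into `bits` bi-tonal planes,
    most significant plane first. Each plane is a 2D list of 0/1."""
    return [
        [[(value >> b) & 1 for value in row] for row in pixels]
        for b in range(bits - 1, -1, -1)
    ]

def merge_bit_planes(planes, bits=8):
    """Rebuild gray values from the planes received so far.
    Planes that are missing (not yet received) count as zero."""
    rows, cols = len(planes[0]), len(planes[0][0])
    out = [[0] * cols for _ in range(rows)]
    for i, plane in enumerate(planes):
        shift = bits - 1 - i
        for r in range(rows):
            for c in range(cols):
                out[r][c] |= plane[r][c] << shift
    return out

image = [[0, 255], [128, 37]]
planes = split_bit_planes(image)
print(merge_bit_planes(planes))       # all eight planes
print(merge_bit_planes(planes[:2]))   # only the two most significant planes
```

Merging all planes reproduces the original values, while merging only the leading planes gives a coarser grayscale approximation, which is why sending the most significant planes first suits browsing.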
 
 



Progress of Just-in-time Browsing Research


Figure 3: An 864 x 956 pixel document image after receiving 510 bytes, 947 bytes, and 11,276 bytes.

While a user would normally have to download all 15,251 bytes of a GIF image (96,023 bytes if JPEG) to see whether it is the image they want, the Just-in-time Browser makes it apparent that the document is a letter after receiving only 510 bytes of data (including response header information and information about the image dimensions, number of bit planes, etc.).  After a total of 947 bytes, the user is also aware that the letter is to Pete and that it is signed by Phil.  By this point, most users would have a good idea of whether this image is the document they want.  The entire high-resolution image is displayable after receiving 11,276 bytes from the server.
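The byte counts above can be put in proportion with a few lines of arithmetic. The sketch below uses only the figures quoted in the text; the stage labels and the helper `percent_of` are illustrative names.

```python
GIF_BYTES = 15_251
JPEG_BYTES = 96_023

# Bytes received at each stage described in the text.
STAGES = [
    ("recognizable as a letter", 510),
    ("addressee and signature readable", 947),
    ("full resolution", 11_276),
]

def percent_of(received, total):
    """Share of `total` represented by `received`, in percent."""
    return 100.0 * received / total

for label, received in STAGES:
    print(
        f"{label}: {received} bytes = "
        f"{percent_of(received, GIF_BYTES):.1f}% of the GIF, "
        f"{percent_of(received, JPEG_BYTES):.1f}% of the JPEG"
    )
```

The first two stages each amount to well under a tenth of the GIF download, which is the point at which most users can already decide whether to keep going.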

Future Plans for Just-in-time Browsing
 


Figure 4: The Digital Microfilm interface will allow users to quickly scan through
large collections of document images and look more closely at any image with Just-in-time Browsing.