Color documents on the web with DjVu

Patrick Haffner, Yann LeCun, Leon Bottou, Paul Howard, Pascal Vincent, Bill Riemers

Research output: Contribution to conference › Paper › peer-review


We present a new image compression technique called "DjVu" that is specifically geared towards the compression of scanned documents in color at high resolution. With DjVu, a magazine page in color at 300 dpi typically occupies between 40 KB and 80 KB, approximately 5 to 10 times better than JPEG for a similar level of readability. Using a combination of Hidden Markov Model techniques and MDL-driven heuristics, DjVu first classifies each pixel in the image as either foreground (text, drawings) or background (pictures, photos, paper texture). The pixel categories form a bitonal image which is compressed using a pattern matching technique that takes advantage of the similarities between character shapes. A progressive, wavelet-based compression technique, combined with a masking algorithm, is then used to compress the foreground and background images at lower resolutions while minimizing the number of bits spent on the pixels that are not visible in the foreground and background planes. Encoders, decoders, and real-time, memory-efficient plug-ins for various web browsers are available for all the major platforms.
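The layered idea in the abstract (a bitonal mask plus separate foreground and background images stored at lower resolution) can be illustrated with a toy sketch. This is not the DjVu algorithm: a fixed gray-level threshold stands in for the HMM/MDL segmentation, and masked block averaging stands in for the wavelet coder and its masking step. The function names `split_mask` and `masked_downsample`, the threshold of 128, and the 4 x 4 sample page are all assumptions made for illustration.

```python
# Toy illustration of the mask / foreground / background decomposition.
# NOT the DjVu algorithm: a fixed threshold replaces the HMM/MDL
# segmentation, and block averaging replaces the wavelet coder.

def split_mask(gray, threshold=128):
    """Return a bitonal mask: 1 = foreground (dark ink), 0 = background."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def masked_downsample(gray, mask, layer, factor=2):
    """Average each factor x factor block using only the pixels whose mask
    value equals `layer`. A block with no such pixel is never visible in
    that layer, so it is marked as "don't care" (None)."""
    h, w = len(gray), len(gray[0])
    out = []
    for by in range(0, h, factor):
        row = []
        for bx in range(0, w, factor):
            vals = [gray[y][x]
                    for y in range(by, min(by + factor, h))
                    for x in range(bx, min(bx + factor, w))
                    if mask[y][x] == layer]
            row.append(sum(vals) // len(vals) if vals else None)
        out.append(row)
    return out


# Hypothetical 4 x 4 grayscale "page": light paper with a dark stroke.
page = [
    [250, 250, 250, 250],
    [250,  10,  20, 250],
    [250,  10,  20, 250],
    [240, 240, 240, 240],
]
mask = split_mask(page)
background = masked_downsample(page, mask, layer=0)
foreground = masked_downsample(page, mask, layer=1)
```

In this sketch the dark stroke never pollutes the background averages and the paper never pollutes the foreground averages, which is the intuition behind not spending bits on pixels that are hidden by the other layer.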

Original language: English (US)
Number of pages: 5
State: Published - 1999
Event: International Conference on Image Processing (ICIP'99) - Kobe, Japan
Duration: Oct 24 1999 → Oct 28 1999



ASJC Scopus subject areas

  • Hardware and Architecture
  • Computer Vision and Pattern Recognition
  • Electrical and Electronic Engineering

