Notes From The Comp.Compression FAQ

Home
Introduction
Policy
FAQ
DHPC
SAKDC
MMDC
LZRW1
LZRW1-A
LZRW2
LZRW3
LZRW3-A
LZRW4
LZRW5
Support Code
Thesis
Interface
DCC91
Patents
Printing
Links
Link Here

This page is derived from the notes on patents in the comp.compression FAQ document (as it was at 21-Mar-1996).

Run Length Encoding Patents

Tsukiyama has two patents on run length encoding: US Patent 4586027 and US Patent 4872009. These were granted in 1986 and 1989 respectively. The first one covers run length encoding in its most primitive form: a length byte followed by the repeated byte. The second patent covers limiting the run length to 16 bytes and thus the encoding of the length on 4 bits.

O'Brien has patented run length encoding followed by LZ77 (US Patent 4988998).

LZ77 Patents

Waterworth patented a LZ77 variant (US Patent 4701745). This algorithm is generally referred to as as LZRW1, because Ross Williams reinvented it later and posted it on comp.compression on April 22, 1991. The same algorithm has later been patented by Gibson & Graybill (US Patent 5049881). The patent office failed to recognize that the same algorithm was patented twice, even though the wording used in the two patents is very similar.

The Waterworth patent is now owned by Stac, which won a lawsuit against Microsoft, concerning the compression feature of MS-DOS 6.0. Damages awarded were $120 million. (Microsoft and Stac later settled out of court.)

Fiala and Greene obtained in 1990 a patent (US Patent 4906991) on all implementations of LZ77 using a tree data structure. Claim 1 of the patent is much broader than the algorithms published by Fiala and Greene in Comm. ACM, April 1989. The patent covers the algorithm published by Rodeh and Pratt in 1981 (J. of the ACM, vol 28, no 1, pp 16-24). It also covers the algorithms used in LHARC, LHA and ZOO.

Notenboom (from Microsoft US Patent 4955066) uses three levels of compression, starting with run length encoding.

The Gibson & Graybill patent (US Patent 5049881) covers the LZRW1 algorithm previously patented by Waterworth and reinvented by Ross Williams. Claims 4 and 12 are very general and could be interpreted as applying to any LZ algorithm using hashing (including all variants of LZ78):

     4. A compression method for compressing a stream of input data into
     a compressed stream of output data based on a minimum number of
     characters in each input data string to be compressed, said
     compression method comprising the creation of a hash table, hashing
     each occurrence of a string of input data and subsequently searching
     for identical strings of input data and if such an identical string
     of input data is located whose string size is at least equal to the
     minimum compression size selected, compressing the second and all
     subsequent occurrences of such identical string of data, if a string
     of data is located which does not match to a previously compressed
     string of data, storing such data as uncompressed data, and for each
     input strings after each hash is used to find a possible previous
     match location of the string, the location of the string is stored
     in the hash table, thereby using the previously processed data to
     act as a compression dictionary.

Claim 12 is identical, with 'method' replaced with 'apparatus'. Since the 'minimal compression size' can be as small as 2, the claim could cover any dictionary technique of the LZ family. However the text of the patent and the other claims make clear that the patent should cover the LZRW1 algorithm only. (In any case the Gibson & Graybill patent is likely to be invalid because of the prior art in the Waterworth patent.)

Phil Katz, author of PKZIP, also has a patent on LZ77 (US Patent 5051745) but the claims only apply to sorted hash tables, and when the hash table is substantially smaller than the window size.

IBM patented (US Patent 5001478) the idea of combining a history buffer (the LZ77 technique) and a lexicon (as in LZ78).

Stac patented (US Patent 5016009 and US Patent 5126739) yet another variation of LZ77 with hashing. The '009 patent was used in the lawsuit against Microsoft (see above). Stac also has a patent on LZ77 with parallel lookup in hardware (US Patent 5003307).

Robert Jung, author of ARJ, has been granted patent US Patent 5140321 for one variation of LZ77 with hashing. This patent covers the LZRW3-A algorithm, also previously discovered by Ross Williams. LZRW3-A was posted on comp.compression on July 15, 1991. The patent was filed two months later on Sept 4, 1991. (The US patent system allowed this because of the "invention date" rule.)

Chambers US Patent 5155484 is yet another variation of LZ77 with hashing. The hash function is just the juxtaposition of two input bytes. The hash table is named 'direct lookup table'.

LZ78 Patents

One form of the original LZ78 algorithm was patented (US Patent 4464650) by its authors Lempel, Ziv, Cohn and Eastman. This patent is owned by Unisys.

The LZW algorithm used in 'compress' is patented by IBM (US Patent 4814746) and Unisys (US Patent 4558302). It is also used in the V.42bis compression standard (see question 11 on V.42bis below), in Postscript Level 2, in GIF and TIFF. Unisys sells the license to modem manufacturers for a onetime fee (contact: Welch Patent Desk, Unisys Corp., P.O. Box 500, Bluebell, PA 19424 Mailcode C SW 19). CompuServe is licensing the usage of LZW in GIF products for 1.5% of the product price, of which 1% goes to Unisys; usage of LZW in non-GIF products must be licensed directly from Unisys. For more information, see http://www.unisys.com/ or email to mailto:lzw_info@unisys.com. The IBM patent application was first filed three weeks before that of Unisys, but the US patent office failed to recognize that they covered the same algorithm. (The IBM patent is more general, but its claim 7 is exactly LZW.)

Klaus Holtz also claims that US Patent 4366551 for his "autosophy" data compression method covers LZ78 and LZW. According to Holtz, most of the largest V.42bis modem manufacturers have paid for patent licenses.

AP coding is patented by Storer (US Patent 4876541). (Get the yabba package for source code, see question 2 above, file type .Y) Storer also claims that his patent covers V.42bis.

Arithmetic Coding

IBM holds many patents on arithmetic coding (US Patent 4286256, US Patent 4295125, US Patent 4463342, US Patent 4467317, US Patent 4633490, US Patent 4652856, US Patent 4891643, US Patent 4905297, US Patent 4935882). It has patented in particular the Q-coder implementation of arithmetic coding. The JBIG standard, and the arithmetic coding option of the JPEG standard requires use of the patented algorithm. No JPEG-compatible method is possible without infringing the patent, because what IBM actually claims rights to is the underlying probability model (the heart of an arithmetic coder).

AT&T has 3 patents on arithmetic coding (US Patent 4973961, US Patent 5023611, US Patent 5025258).

Predictor Algorithm

The "predictor" algorithm was first described in the paper:

    Raita, T. and Teuhola, J. (1987), "Predictive text compression by hashing",
    ACM Conference on Information Retrieval.

This algorithm was patented (US Patent 5229768) by K. Thomas in 1993. It is used in the Internet Draft "PPP Predictor Compression Protocol" (see ftp://venera.isi.edu/internet-drafts/draft-ietf-pppext-predictor-00.txt).

General Comments

As can be seen from the above list, some of the most popular compression programs (COMPRESS, PKZIP, ZOO, LHZ, ASJ) are now covered by patents. (This says nothing about the validity of these patents.)