
State of The Art Lossless Image Compression Algorithms

S. Sahni, B. C. Vemuri, F. Chen, C. Kapoor, C. Leonard, and J. Fitzsimmons

Sahni, Vemuri, Chen and Kapoor are with the Dept. of Computer & Info. Science & Engineering, Univ. of Florida, Gainesville, FL 32611. Email: sahni|vemuri|[email protected]. Leonard is with the Dept. of Neuroscience, Univ. of Florida, Gainesville, FL 32610. Email: [email protected]. Fitzsimmons is with the Dept. of Radiology, Univ. of Florida, Gainesville, FL 32610. Email: [email protected]. This research was supported, in part, by the National Institutes of Health under grant R01LM05944-03. Document submitted to the IEEE Transactions on Image Processing for review. October 30, 1997


Abstract

There are numerous applications of image processing, such as satellite imaging, medical imaging, and video, where the image size or image stream size is too large and requires a large amount of storage space or high bandwidth for communication in its original form. Image compression techniques can be used effectively in such applications. Lossless (reversible) image compression techniques preserve the information so that exact reconstruction of the image is possible from the compressed data. In this paper we survey existing lossless compression schemes and also present some new techniques such as grayscale mapping and variable bit length coding. Grayscale mapping is a simple technique that can be used to improve the compression ratios achievable by other image compression methods. The variable bit length coding technique developed by us provides greater compression than coding techniques such as Huffman coding and arithmetic coding when applied to several raw images. We also develop a general lossy+residual interpolation technique for image compression. The compression techniques surveyed and developed in this paper are compared using a variety of images.

Keywords: Lossless compression, Linearization, Spatial domain, Transform domain, Coding, Interpolation, S+P transform, Wavelet transform

I. Introduction

Image compression plays a very important role in applications like tele-videoconferencing, remote sensing, document and medical imaging, and facsimile transmission, which depend on the efficient manipulation, storage, and transmission of binary, gray scale, or color images. Image compression techniques can be classified into two categories: lossless and lossy schemes. In lossless methods, the exact original data can be recovered, while in lossy schemes, only a close approximation of the original data can be obtained.


Image compression can be achieved via coding methods, spatial domain compression methods, and transform domain compression methods; a combination of these methods can also be used.

Coding methods are directly applied to the raw images, treating them as a sequence of discrete numbers. Common coding methods include arithmetic, Huffman, Ziv-Lempel, and run-length coding.

Spatial domain methods, which are a combination of spatial domain algorithms and coding methods, not only operate directly on the gray values in an image but also try to eliminate the spatial redundancy. In transform domain compression, the image is represented using an appropriate basis set, and the goal is to achieve a sparse coefficient matrix. Discrete Cosine Transform (DCT) based compression and wavelet transform based compression are two examples of transform domain methods. Recently, there has been a growing interest in reversible/lossless image compression of medical images. The sheer amount of data involved in the field of picture archival and communication systems (PACS), coupled with the fact that no loss of fidelity can be tolerated in compression and decompression in most medical applications, has motivated this interest. There are numerous reversible image compression schemes in the literature; each has its own advantages and disadvantages [1], [2]. We survey some of the existing lossless compression schemes and discuss their pros and cons. Three new contributions to image compression are also made. We demonstrate that a simple and new technique, mapping of grayscale values into a contiguous range, often reduces the size of the compressed image obtained by other compression techniques. This method appears not to have been reported in the literature. Although Huffman coding and arithmetic coding are well studied coding methods, a new method developed by us, variable bit length coding, often outperforms these methods when applied to raw images. Finally, we develop a new lossy+residual scheme that outperforms previously known lossy+residual schemes as well as all other spatial domain compression methods. This paper is organized as follows. In section II we present performance measures for lossless compression.


In section III we describe a preprocessing method which maps the grayscale values in an image into a contiguous range. This mapping scheme, when used as a preprocessing step, leads to better performance of most existing compression algorithms. Section IV describes different linearization schemes. These schemes can be used to convert a two-dimensional image into a one-dimensional grayscale stream. This conversion is necessary if we are to use standard coding schemes to compress images. Huffman coding, Lempel-Ziv coding, arithmetic coding, and variable bit length coding are discussed in section V. Various spatial domain compression methods, including variable block size compression, JPEG's lossless compression, and a generalized lossy+residual approach developed by us, are presented in section VI. In section VII we describe a transform domain compression scheme called the S+P transform, which is based on the wavelet transform. Although experimental results are presented throughout the paper, section VIII presents a comparison of the best compression methods from each category. The conclusions and scope for future work are discussed in section IX.

II. Performance Measurement of Lossless Image Compression

The performance of a lossless image compression algorithm can be specified in terms of compression efficiency and complexity. Compression efficiency is measured by the compression ratio or by the bit rate. The compression ratio is the ratio of the size of the original image to the size of the compressed image, and the bit rate is the number of bits per pixel required by the compressed image.

For example, a 256 x 256, 8-bits-per-pixel image requires 256 x 256 x 8 bits = 65,536 x 8 bits = 65,536 bytes when stored in uncompressed form. If the compressed image requires 32,768 bytes, then the compression ratio is 65536/32768 = 2.0. Since the image has 256 x 256 = 65,536 pixels, the compressed file needs 32768 x 8/65536 = 4 bits per pixel, on average. Hence the bit rate is 4.


The compression ratio and bit rate are related. Let b be the number of bits per pixel of the uncompressed image, CR the compression ratio, and BR the bit rate. The following equality is readily obtained:

CR = b / BR.

The effectiveness of a compression method can be compared to the entropy of the source image. The source entropy is defined as the amount of information contained in a source. Suppose that the pixel gray values range from 0 to M - 1. Let p_i be the probability (i.e., frequency divided by the total number of pixels) of the gray value i. The information content of the image is given by its entropy [6]:

H(I) = -\sum_{i=0}^{M-1} p_i \log p_i.

The units for entropy are bits per pixel. The effectiveness of a lossless compression method is measured by determining how closely its bit rate approximates the source entropy, as the source entropy is a lower bound on the bit rate any lossless compression method can achieve [6]. Therefore, if the source entropy of an image is 4 bits/pixel and our lossless compressor has a bit rate of 4 bits per pixel, our lossless compressor has done the best job possible. The complexity of an image compression algorithm is measured by the number of arithmetic operations required to perform both the encoding and decoding processes. This is an important factor for applications involving online image compression and decompression, where speed is crucial.
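To make these definitions concrete, the short Python sketch below computes the compression ratio, bit rate, and first-order entropy for a grayscale image; the function name and the randomly generated test image are our own illustration, not part of the paper.

```python
import numpy as np

def compression_stats(image: np.ndarray, compressed_size_bytes: int, bits_per_pixel: int = 8):
    """Return (compression ratio, bit rate, first-order entropy) for a grayscale image."""
    num_pixels = image.size
    original_size_bytes = num_pixels * bits_per_pixel / 8
    cr = original_size_bytes / compressed_size_bytes        # compression ratio
    br = compressed_size_bytes * 8 / num_pixels             # bits per pixel
    _, counts = np.unique(image, return_counts=True)        # frequency of each gray value
    p = counts / num_pixels
    entropy = -np.sum(p * np.log2(p))                       # H(I) = -sum p_i log2 p_i
    return cr, br, entropy

# Hypothetical 256 x 256, 8-bit image compressed to 32768 bytes: CR = 2.0, bit rate = 4
img = np.random.randint(0, 256, size=(256, 256), dtype=np.uint8)
print(compression_stats(img, compressed_size_bytes=32768))
```

For the 256 x 256 example in the text, a 32,768-byte compressed file gives CR = 2.0 and a bit rate of 4 bits/pixel, consistent with CR = b / BR.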


Fig. 1. Natural images: (a) lenna, (b) man, (c) chall, (d) coral, (e) shuttle, (f) sphere.


Fig. 2. Medical images: (a) brain1, (b) brain2, (c) slice15, (d) sag0.


III. Preprocessing: Gray Level Mapping

Generally, images do not have all the gray values within the dynamic range of gray level values determined by the resolution (bits/pixel). For example, the pixels of an 8 bit per pixel image may use only 170 of the 256 possible gray values. In gray level mapping the gray values of an image are mapped to the range 0 to n, where n + 1 is the number of gray values actually present. For example, suppose we have an image whose pixels have the gray levels 1, 3, 5, 7. These values are mapped to the range 0-3 using a mapping table map[0 : 3] = [1, 3, 5, 7]. map[i] can be used to remap the gray value i to its original value. From the mapped image and the mapping table, we can reconstruct the original image. Gray level mapping can help in improving compression by increasing the correlation between adjacent pixel values. Table I gives the compression ratios achieved by the popular compression program gzip on the suite of 10 images shown in Figs. 1 and 2. Since gzip works by replacing common sequences with the same code, and since mapping does not affect these sequences, the combination of mapping and gzip is actually inferior to using gzip alone, because when mapping is used we incur the overhead of storing the mapping table.

TABLE I
Comparison of Gzip on row-major images with/without mapping

image     raw    mapped
lenna     1.18   1.17
man       1.39   1.39
chall     1.43   1.43
coral     1.32   1.32
shuttle   1.44   1.44
sphere    1.30   1.30
brain1    1.75   1.75
slice15   1.73   1.72
head      1.80   1.80
sag1      1.68   1.67
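Gray level mapping and its inverse can be written in a few lines; the following Python sketch is our own illustration of the scheme (the function names are ours, not from the paper).

```python
import numpy as np

def gray_level_map(image: np.ndarray):
    """Map the gray values actually present in the image onto the contiguous range 0..n."""
    table = np.unique(image)                   # sorted distinct gray values: map[0..n]
    mapped = np.searchsorted(table, image)     # index of each pixel's value in the table
    return mapped.astype(image.dtype), table

def gray_level_unmap(mapped: np.ndarray, table: np.ndarray) -> np.ndarray:
    """Recover the original image from the mapped image and the mapping table."""
    return table[mapped]

img = np.array([[1, 3], [5, 7]], dtype=np.uint8)
m, table = gray_level_map(img)                 # m uses 0..3, table = [1, 3, 5, 7]
assert np.array_equal(gray_level_unmap(m, table), img)
```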


IV. Linearization Schemes

When coding schemes such as Huffman coding, arithmetic coding, and Lempel-Ziv coding are used to compress a two-dimensional image, the image must first be converted into a one-dimensional sequence. This conversion is referred to as linearization. Coding schemes such as Huffman coding depend only on the frequency of occurrence of different gray values. Since linearization does not affect this frequency, coding schemes in this category are unaffected by the particular method used to linearize a two-dimensional image. On the other hand, coding schemes such as arithmetic coding and Lempel-Ziv coding depend on the relative order of gray scale values and so are sensitive to the linearization method used. Natural images have local and global redundancy. Local redundancy causes a given neighborhood in the image to exhibit coherence or correlation (referred to as smoothness of data). Some linearization schemes are more effective at keeping pixels that are close in the two-dimensional image close in the one-dimensional sequence. Therefore, these schemes are more effective in preserving the local redundancy of the image and are expected to yield better compression when coupled with a coding scheme that can take advantage of local redundancy. Some of the more popular linearization schemes are given below; a code sketch for two of them follows the list. Each scans the image pixels in some order to produce the one-dimensional sequence.
1. Row-Major Scan: The image is scanned row by row from top to bottom, and from left to right within each row.
2. Column-Major Scan: The image is scanned column by column from left to right, and from top to bottom within each column.
3. Diagonal Scan: The image is scanned along the antidiagonals (i.e., lines with constant row plus column value), beginning with the top-most antidiagonal. Each antidiagonal is scanned from the left bottom corner to the right top corner.


4. Snake-like Row-Major Scan: This is a variant of the row-major scan method described above. In this method, the image is scanned row by row from top to bottom, and the rows are alternately scanned from left to right and from right to left. The top-most row is scanned from left to right (as in Fig. 3 (a)). Snake-like variants of column-major and diagonal scans can be defined in a similar manner (see Fig. 3 (b)).


Fig. 3. (a) Snake-like row-major scan path, (b) snake-like diagonal scan path, (c) spiral scan path, (d) Peano-Hilbert scan path.

5. Spiral Scan: In this method, the image is scanned from the outside to the inside, tracing out a spiral curve starting from the top left corner of the image and proceeding clockwise (see Fig. 3 (c)).
6. Peano-Hilbert Scan: This scan method is due to Peano and Hilbert, and is best described recursively as in Fig. 3 (d). This method requires the image to be a 2^k x 2^k image. When k is odd, the scan path starts at the leftmost pixel of the first row and ends at the leftmost pixel of the bottom row. When k is even, the path starts at the leftmost pixel of the first row and ends at the rightmost pixel of this row. In a Peano-Hilbert scan, the image is scanned quadrant by quadrant. The scan path for a 2^k x 2^k image for k = 1, 2, and 3 is shown in Fig. 3 (d).
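As an illustration of linearization (our own sketch, not code from the paper), the snake-like row-major and spiral scans can be written as follows; the other scans are defined analogously.

```python
import numpy as np

def snake_row_major(image: np.ndarray) -> np.ndarray:
    """Rows top to bottom, alternating left-to-right and right-to-left (top row left-to-right)."""
    rows = [row if i % 2 == 0 else row[::-1] for i, row in enumerate(image)]
    return np.concatenate(rows)

def spiral_scan(image: np.ndarray) -> np.ndarray:
    """Clockwise spiral from the top left corner toward the center."""
    a = np.array(image)
    out = []
    while a.size:
        out.extend(a[0].tolist())   # emit the current top edge, left to right
        a = np.rot90(a[1:])         # rotate the remainder so the next clockwise edge is on top
    return np.array(out)

demo = np.arange(1, 10).reshape(3, 3)
print(spiral_scan(demo))            # [1 2 3 6 9 8 7 4 5]
```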


Table II gives the compression ratios achieved by gzip on our sample suite of 10 images. As noted in Section III, the mapping technique does not improve compression when compression is done using gzip. Different linearization schemes result in different compression ratios when gzip is used for compression. For example, the compression ratios for man range from 1.3 to 1.4, and those for brain1 range from 1.63 to 1.75. Although no linearization method provided the highest compression for all images, the Peano-Hilbert scan did best most often.

TABLE II
Comparison of different linearization schemes with/without mapping using Gzip

          row major     diagonal      snake         spiral        peano
Image     raw   mapped  raw   mapped  raw   mapped  raw   mapped  raw   mapped
lenna     1.18  1.17    1.18  1.17    1.17  1.17    1.19  1.19    1.20  1.20
man       1.39  1.39    1.31  1.30    1.39  1.38    1.40  1.39    1.39  1.39
chall     1.43  1.43    1.39  1.39    1.43  1.43    1.43  1.43    1.47  1.47
coral     1.32  1.32    1.26  1.26    1.32  1.32    1.28  1.28    1.29  1.29
shuttle   1.44  1.44    1.43  1.42    1.44  1.44    1.42  1.41    1.47  1.46
sphere    1.30  1.30    1.29  1.29    1.30  1.30    1.33  1.32    1.31  1.31
brain1    1.75  1.75    1.63  1.63    1.75  1.75    1.69  1.69    1.67  1.66
slice15   1.73  1.72    1.63  1.63    1.73  1.73    1.68  1.68    1.66  1.65
head      1.80  1.80    1.75  1.74    1.80  1.80    1.78  1.78    1.77  1.76
sag1      1.68  1.67    1.66  1.64    1.69  1.67    1.74  1.72    1.71  1.69

Table III gives the compression ratios achieved by the Unix compression utility compress on our sample suite of 10 images. Compress did not provide better compression than gzip on any of our test images. Consequently, we shall not report further results using compress.

V. Coding Methods

Coding schemes that are designed for sequential data compression can be used to compress two-dimensional images provided we linearize these images. In this section, we first describe coding schemes for sequential data and then compare their performance when used to compress two-dimensional images.


TABLE III
Comparison of different linearization schemes with/without mapping using Compress

          row major     diagonal      snake         spiral        peano
Image     raw   mapped  raw   mapped  raw   mapped  raw   mapped  raw   mapped
lenna     1.07  1.07    1.08  1.07    1.07  1.07    1.11  1.11    1.10  1.10
man       1.30  1.30    1.23  1.22    1.31  1.30    1.32  1.31    1.32  1.31
chall     1.32  1.32    1.29  1.29    1.33  1.32    1.37  1.37    1.38  1.38
coral     1.21  1.20    1.15  1.15    1.21  1.20    1.21  1.20    1.20  1.20
shuttle   1.37  1.37    1.29  1.29    1.37  1.37    1.33  1.33    1.36  1.35
sphere    1.17  1.17    1.17  1.17    1.18  1.17    1.22  1.21    1.21  1.20
brain1    1.70  1.70    1.57  1.57    1.70  1.70    1.65  1.65    1.63  1.63
slice15   1.67  1.66    1.56  1.55    1.67  1.66    1.64  1.63    1.62  1.61
head      1.79  1.78    1.71  1.70    1.78  1.78    1.77  1.76    1.75  1.75
sag1      1.58  1.57    1.55  1.53    1.59  1.58    1.61  1.60    1.60  1.59

A. Huffman Coding

In Huffman coding each symbol of the uncompressed data is replaced by a code. The symbol codes are of variable length, and symbols that occur more frequently in the uncompressed data have codes of smaller length than symbols that occur less frequently. The symbol codes satisfy a prefix property: no code is a proper prefix of another code. By encoding frequently occurring symbols by short codes and infrequently occurring symbols by longer codes, an overall reduction in the space needed by the data is obtained. For the case of images, individual pixel values are considered to represent individual symbols and the symbol set consists of all gray values. For an 8 bits per pixel image, the size of the symbol set is 2^8 = 256. Assume that the data to be compressed has five symbols {s1, s2, s3, s4, s5} with the following probabilities: {0.1, 0.3, 0.25, 0.15, 0.2}. The Huffman codes for n symbols can be computed in O(n log n) time using a greedy algorithm (see [36], for example). Using the greedy code construction algorithm, we obtain the following symbol codes: {100, 11, 01, 101, 00}. Notice that no code is a prefix of another code and that the symbol s2 with higher probability has the 2-bit code 11, while the symbol s1 with lower probability has the 3-bit code 100.


In Huffman coding we use a fixed preset coding table which is based on an estimated frequency distribution of the values to be coded. In adaptive Huffman coding the coding table is generated individually (adaptively) from the symbol probabilities (or frequencies) of the data to be coded. Although adaptive coding results in higher compression ratios, all the data to be compressed must be available before compression can start, because the symbol frequencies have to be determined before we can determine the symbol codes. To decompress the data, we need the code table. Using this, the original uncompressed data can be reconstructed by examining the compressed data from beginning to end. Since adaptive Huffman coding relies only on the frequency of occurrence of different symbols in the source file, and since neither mapping nor linearization affects these frequencies, the compression provided by Huffman coding is insensitive to the use of mapping as well as to the type of linearization scheme used. Table V-A gives the compression ratios obtained on our 10 image suite. The results are inferior to those obtained using gzip on all 10 images. This is not surprising as gzip is essentially LZW coding followed by Huffman coding. For the head image, Huffman coding is almost as good as gzip, suggesting that on this image gzip gets most of its compression from its postprocessing Huffman coding step.

Image   lenna  man   chall  coral  shuttle  sphere  brain1  slice15  head  sag1
CR      1.09   1.09  1.17   1.16   1.18     1.25    1.60    1.57     1.79  1.60
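A minimal sketch of the greedy Huffman construction discussed above, using Python's heapq module (our own illustration; with this particular tie-breaking it happens to reproduce the codes listed in the text, but other valid Huffman codes with the same code lengths are possible).

```python
import heapq
from itertools import count

def huffman_codes(probs: dict) -> dict:
    """Greedy Huffman construction: repeatedly merge the two least probable nodes.
    Each heap entry carries the symbols under that node; on a merge, '0' is prepended
    to the codes on one side and '1' to the codes on the other."""
    codes = {sym: "" for sym in probs}
    tick = count()                                    # tie-breaker for equal probabilities
    heap = [(p, next(tick), [sym]) for sym, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, left = heapq.heappop(heap)
        p2, _, right = heapq.heappop(heap)
        for sym in left:
            codes[sym] = "0" + codes[sym]
        for sym in right:
            codes[sym] = "1" + codes[sym]
        heapq.heappush(heap, (p1 + p2, next(tick), left + right))
    return codes

print(huffman_codes({"s1": 0.1, "s2": 0.3, "s3": 0.25, "s4": 0.15, "s5": 0.2}))
# {'s1': '100', 's2': '11', 's3': '01', 's4': '101', 's5': '00'}
```

As noted above, the construction takes O(n log n) time for n symbols.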

B. LZW Coding

Lempel and Ziv [34] and Welch [37] have proposed an adaptive coding method that does not require all the data to be compressed to be available at the start. Rather, their technique generates codes as it examines the source file from beginning to end. In Huffman coding a variable length code is constructed for each symbol in the source file.


length code is constructed for each symbol in the source le. In Lempel-Ziv coding xed-length codes are constructed, on the y, for variable-length sequences of symbols. Assume that the symbols that occur in the source le are a, b, c and that the string

First, we initialize a code-to-string dictionary to contain the single-symbol strings that are possible. For our example, these strings are a, b, and c. The string code is given by its position in the dictionary, so the code for a is 0, that for b is 1, and that for c is 2. Then, starting from the first symbol in the file to be compressed, we find the longest prefix of the input which is in the dictionary. The longest prefix of our source file ababcabc that is in the dictionary is a. Its code 0 is output as part of the compressed file, and the prefix plus the next input symbol is entered into the dictionary. In our case, the string ab is entered into the dictionary; its code is 3. The portion of the source file that remains to be compressed is babcabc. Next we find the longest prefix of this remaining file that is in the dictionary. The prefix is b. Its code 1 is output to the compressed file, and the prefix b plus the next symbol a of the input is entered into the dictionary as the string ba with code 4. Following this, we again determine the longest prefix of the remaining input file that is in the dictionary. This time the longest prefix is the string ab.

The code 3 for ab is output, and we put the string abc (c is the symbol following the prefix) into the dictionary. The code for abc is 5. The next longest prefix found is c. Its code 2 is output and ca is entered into the dictionary. The remaining input string is abc. It is in the dictionary, so its code 5 is output. Therefore, the coding of the string ababcabc is 01325. Given the encoded string, we can decode it by reversing the process. We do not report any experimental results for the LZW method, as gzip is a combination of LZW and Huffman coding, and results for gzip were presented in Section IV.
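A minimal LZW encoder along the lines of the walkthrough above (our own sketch; dictionary size limits and code packing are omitted):

```python
def lzw_encode(text, alphabet="abc"):
    """LZW: emit the code of the longest dictionary prefix, then extend the dictionary."""
    dictionary = {ch: i for i, ch in enumerate(alphabet)}   # single-symbol strings first
    output, prefix = [], ""
    for ch in text:
        if prefix + ch in dictionary:
            prefix += ch                                    # keep growing the match
        else:
            output.append(dictionary[prefix])               # emit code of longest prefix
            dictionary[prefix + ch] = len(dictionary)       # add prefix + next symbol
            prefix = ch
    if prefix:
        output.append(dictionary[prefix])
    return output

print(lzw_encode("ababcabc"))   # [0, 1, 3, 2, 5], i.e. the code sequence 01325
```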


C. Arithmetic Coding

In this method, a message is coded as a subinterval of the interval [0, 1), where [x, y) denotes a half-open interval which includes x but excludes y [5]. There are two fundamental concepts in arithmetic coding: the probability of a symbol, and the encoding interval range of a symbol. The occurrence probabilities of source symbols determine the compression efficiency as well as the interval ranges of the source symbols for the encoding process. These interval ranges are contained within the interval from zero to one and determine the compression output. We use a simple example to explain the process of arithmetic coding. Let us assume that the source symbols are {a, b} and the probabilities of these symbols are 0.4 and 0.6, respectively. Then, based on these probabilities, the interval [0, 1) can be divided into the two sub-intervals [0, 0.4) and [0.4, 1). The range [0, 0.4) is used to further subdivide an interval when the symbol a is encountered in the input file, and the range [0.4, 1) is used when the symbol b is encountered. To encode the string baab as a subinterval of [0, 1), we begin with [0, 1) as the working subinterval and use the encoding range of the first symbol b from the input string to narrow the working subinterval to the last 6/10 of the current working subinterval.

Therefore, the interval [0, 1) is narrowed to [0.4, 1). Since the range of the second symbol a is [0, 0.4), the working subinterval [0.4, 1) is reduced to its first 4/10 portion to obtain the new working subinterval [0.4, 0.64). The third input symbol is a, and its encoding reduces the current interval [0.4, 0.64) to its first 4/10 portion, that is, to [0.4, 0.496). Encoding the fourth symbol b produces the final interval [0.4384, 0.496). The input string baab is thus encoded as the interval [0.4384, 0.496). From the symbol probabilities and the encoded interval, we can reconstruct the input file. For example, the interval [0.4384, 0.496) is a subinterval of b's interval [0.4, 1), so the first symbol of the source string is b and the working subinterval becomes [0.4, 1).


Next we see that [0.4384, 0.496) is in the left 4/10 of the interval [0.4, 1), so the next symbol in the source file is a and the working subinterval becomes [0.4, 0.64). Since [0.4384, 0.496) lies wholly in a's portion of the working subinterval, the next source symbol is also a and the working subinterval becomes [0.4, 0.496). Now we see that [0.4384, 0.496) lies in the right 6/10 of the working subinterval, so the next source symbol is b and the working subinterval becomes [0.4384, 0.496). At this time decoding terminates, because the working subinterval and the coded subinterval are the same. If each source file is assumed to end with a unique symbol (such as end-of-file), the coded file can be represented by any number in the final working subinterval rather than by the endpoints of this interval. Although our preceding description of arithmetic coding uses high-precision floating point arithmetic, the scheme can be adapted to use integers alone [5]. Furthermore, the coding scheme can be made adaptive, that is, the probabilities and hence the subranges of [0, 1) associated with each symbol can be made to depend on the symbol frequencies in the source file that is being coded. For this we maintain the symbol frequencies as symbols are encountered during the coding process and adjust the ranges assigned to each symbol according to the current cumulative symbol frequencies. Since decoding generates the symbols in the same order as used in coding, the cumulative frequencies can be computed during decoding and the symbol ranges adjusted accordingly. The exact method used to adjust the symbol ranges is called a model. Table IV shows the compression ratios achieved by arithmetic coding. The mapping technique improves the compression ratio very slightly. As was the case for gzip, the performance of arithmetic coding is sensitive to the linearization scheme used. Once again, no linearization scheme provided maximum compression on all images, though the Peano-Hilbert scan came close to doing this. Gzip provided more compression on the first six images.
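The interval-narrowing step of the example can be sketched as follows (an idealized floating-point illustration of non-adaptive arithmetic coding, not the integer implementation of [5]; the fixed symbol ranges below are those of the example).

```python
def arithmetic_encode(message: str, ranges: dict):
    """Narrow [low, high) by each symbol's subrange; return the final interval."""
    low, high = 0.0, 1.0
    for sym in message:
        r_low, r_high = ranges[sym]
        width = high - low
        low, high = low + width * r_low, low + width * r_high
    return low, high

# Symbols a and b with probabilities 0.4 and 0.6, as in the example
ranges = {"a": (0.0, 0.4), "b": (0.4, 1.0)}
print(arithmetic_encode("baab", ranges))   # approximately (0.4384, 0.496)
```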


TABLE IV
Comparison of different linearization schemes with/without mapping using Arithmetic Coding

          row major     diagonal      snake         spiral        peano
Image     raw   mapped  raw   mapped  raw   mapped  raw   mapped  raw   mapped
lenna     1.13  1.13    1.16  1.16    1.13  1.13    1.12  1.12    1.17  1.17
man       1.23  1.22    1.23  1.19    1.23  1.22    1.19  1.18    1.25  1.25
chall     1.33  1.34    1.30  1.31    1.33  1.34    1.23  1.24    1.33  1.34
coral     1.27  1.28    1.23  1.24    1.27  1.28    1.19  1.19    1.27  1.27
shuttle   1.32  1.32    1.35  1.35    1.32  1.32    1.25  1.25    1.38  1.38
sphere    1.29  1.29    1.29  1.29    1.29  1.29    1.29  1.29    1.30  1.29
brain1    1.71  1.74    1.71  1.73    1.71  1.74    1.77  1.80    1.73  1.76
slice15   1.70  1.70    1.69  1.69    1.70  1.70    1.79  1.79    1.72  1.72
head      1.93  1.93    1.91  1.91    1.93  1.92    2.01  2.01    1.94  1.94
sag1      1.95  1.96    1.98  1.99    1.97  1.98    2.07  2.08    2.02  2.03

D. Run-Length Coding

In run-length coding we decompose the source file into segments of identical symbols; each segment is replaced by a pair of the form (symbol, number of occurrences). For example, the source file aaaabaaabbb is coded as (a, 4), (b, 1), (a, 3), (b, 3). This coding scheme works well when we have many long segments and works poorly when we have many short segments. Since our test images do not have long segments, we did not experiment with run-length coding. More details about this coding method can be found in [6].
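For completeness, a one-function run-length coder matching the example above (our own sketch):

```python
from itertools import groupby

def run_length_encode(data):
    """Replace each run of identical symbols by a (symbol, run length) pair."""
    return [(sym, sum(1 for _ in run)) for sym, run in groupby(data)]

print(run_length_encode("aaaabaaabbb"))   # [('a', 4), ('b', 1), ('a', 3), ('b', 3)]
```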

E. Variable Bit Length Coding

In a traditionally stored source file, each symbol is stored using the same number of bits (usually 8). In variable bit length coding (VBL), different symbols are stored using a different number of bits. Since the VBL method does not seem to have appeared in the literature before, we describe it in greater detail than we described the other coding methods of this section. Suppose that our source file is a linearized image in which each pixel has a gray value between 0 and 255; the source file is stored using 8 bits per pixel.


However, the gray values 0 and 1 need only one bit each; values 2 and 3 need two bits each; values 4, 5, 6, and 7 need only three bits each; and so forth. In the VBL representation, each gray value is stored using the minimum number of bits it requires. To decode the compacted bit representation, we need to know the number of bits used for each pixel. Rather than store this number with each pixel, the run-length coding method is used and the pixels are divided into segments such that the gray values in each segment require the same number of bits. Unfortunately, for typical images the space needed to store the segment lengths, the bits per pixel of each segment, and the compacted gray values often exceeds the space needed to store the gray values in fixed length format. However, we can combine adjacent segments using the strategy given below. First we summarize the steps in VBL coding.

Create Segments. The source symbols are divided into segments such that the symbols in each segment require the same number of bits. Each segment is a contiguous chunk of symbols, and segments are limited to 2^k symbols (best results were obtained with k = 8). If there are more than 2^k contiguous symbols with the same bit requirement, they are represented by two or more segments.

Create Files. Three files, SegmentLength, BitsPerPixel, and Symbols, are created. The first of these files contains the length (minus one) of each segment created in step 1. Each entry in this file is k bits long. The file BitsPerPixel gives the number of bits (minus one) used to store each pixel of the segment. Each entry in this file is d bits long (for gray values in the range 0 through 255, d is 3). The file Symbols is a binary string of the symbols stored in the variable bit format.

Compress Files. Each of the three files created in step 2 is compressed/coded to reduce its space requirements. The compression ratio that we can achieve using VBL coding depends very much on the presence of long segments that require a small number of bits.


Suppose that, following step 1, we have n segments. The length of a segment and the bits per pixel of that segment are referred to as the segment header. Each segment header needs k + d bits of space. Let l_i and b_i, respectively, denote the length and bits per symbol of segment i. The space needed to store the symbols of segment i is l_i * b_i.

The total space required for the three files created in step 2 is (k + d) \cdot n + \sum_{i=1}^{n} l_i b_i. The space requirements can be reduced by combining some pairs of adjacent segments into one. If segments i and i + 1 are combined, then the combined segment has length l_i + l_{i+1}, and each pixel now has to be stored using max{b_i, b_{i+1}} bits.

Although this technique increases the space needed by the file Symbols, it reduces the number of headers by one. Let s_q be the space requirement of an optimal combining of the first q segments. Define s_0 = 0. For an instance with i > 0 segments, suppose that, in an optimal combining C, segment i is combined with segments i - 1, i - 2, ..., and i - r + 1, but not with segment i - r.

The space s_i needed by the optimal combining C is

s_i = (space needed by segments 1 through i - r) + lsum(i-r+1, i) \cdot bmax(i-r+1, i) + 11,

where lsum(a, b) = \sum_{j=a}^{b} l_j, bmax(a, b) = max{b_a, ..., b_b}, and 11 = k + d is the header size in bits for k = 8 and d = 3. If segments 1 through i - r are not combined optimally in C, then we can change their combining to one with a smaller space requirement and hence reduce the space requirement of C. So in an optimal combining C, segments 1 through i - r must also be combined optimally. With this observation, the space requirement of C becomes

s_i = s_{i-r} + lsum(i-r+1, i) \cdot bmax(i-r+1, i) + 11.

The only possibilities for r are the numbers 1 through i for which lsum does not exceed 2^k


(recall that segment lengths are limited to 2^k symbols). Although we do not know the value of r, we do know that since C has minimum space requirement, r must yield the minimum space requirement over all choices. So we get the recurrence

s_i = \min_{1 \le r \le i,\; lsum(i-r+1,\, i) \le 2^k} \{ s_{i-r} + lsum(i-r+1, i) \cdot bmax(i-r+1, i) \} + k + d.

Using this dynamic programming formulation, we can determine the optimal way to combine segments; a direct sketch is given below.
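The following Python sketch is a direct transcription of this recurrence (our own code; the segment lengths, bits per symbol, and the header size k + d are taken as inputs, and the example data are hypothetical):

```python
def optimal_combining(lengths, bits, k=8, d=3):
    """Dynamic program for VBL segment combining.
    lengths[i], bits[i]: length and bits-per-symbol of segment i (0-based).
    Returns the minimum total space s_n (in bits) for the combined segments."""
    n = len(lengths)
    max_len = 2 ** k                      # a combined segment may not exceed 2^k symbols
    s = [0] * (n + 1)                     # s[q] = optimal space for the first q segments
    for i in range(1, n + 1):
        best = None
        lsum, bmax = 0, 0
        for r in range(1, i + 1):         # try combining segments i-r+1 .. i
            lsum += lengths[i - r]
            bmax = max(bmax, bits[i - r])
            if lsum > max_len:
                break
            cand = s[i - r] + lsum * bmax
            best = cand if best is None else min(best, cand)
        s[i] = best + k + d               # one header of k + d bits per combined segment
    return s[n]

# Hypothetical example: segments of 100, 5, and 100 symbols needing 2, 8, and 2 bits
print(optimal_combining([100, 5, 100], [2, 8, 2]))   # 473: keeping all three separate is best
```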

Once the optimal combining has been determined, the segments created in step 1 are combined and the three files of step 2 are created. Decoding is quite straightforward. Table V gives the compression ratios achieved by the VBL coding method. For this method, using mapping as a preprocessor can have a dramatic impact on the achieved compression. For example, the compression ratio achieved for brain1 linearized using a Peano-Hilbert scan is 1.42 without mapping and 1.86 with mapping. Surprisingly, VBL coding with mapping and a Peano-Hilbert scan outperforms arithmetic coding with a Peano-Hilbert scan on 9 of our 10 images, and outperforms gzip with a Peano-Hilbert scan on 6 (it is tied with gzip on a seventh image).

Table VI compares the highest compression ratio achieved by each coding scheme. VBL coding did best on 6 of our 10 images and gzip did best on the remaining 4.

VI. Spatial Domain Algorithms

Spatial domain techniques involve methods for reducing the number of bits required to represent the information contained in the image by directly operating on the raw image. Such methods usually include two stages. The first stage involves the use of techniques such as image segmentation, subsampling, and interpolation. This is followed by a second stage involving efficiently encoding the results from the first stage.


TABLE V
Comparison of different linearization schemes with/without mapping using VBL Coding

          row major     diagonal      snake         spiral        peano
Image     raw   mapped  raw   mapped  raw   mapped  raw   mapped  raw   mapped
lenna     1.10  1.16    1.10  1.16    1.10  1.16    1.12  1.19    1.13  1.21
man       1.32  1.35    1.28  1.31    1.32  1.35    1.33  1.37    1.37  1.39
chall     1.24  1.39    1.21  1.35    1.24  1.39    1.26  1.41    1.30  1.45
coral     1.23  1.33    1.19  1.29    1.23  1.33    1.22  1.32    1.26  1.37
shuttle   1.32  1.37    1.28  1.32    1.33  1.37    1.30  1.34    1.35  1.39
sphere    1.06  1.19    1.05  1.18    1.06  1.20    1.09  1.21    1.10  1.22
brain1    1.41  1.83    1.36  1.77    1.41  1.83    1.41  1.83    1.42  1.86
slice15   1.73  1.84    1.66  1.78    1.73  1.84    1.72  1.84    1.76  1.87
head      1.87  2.01    1.82  1.96    1.87  2.01    1.88  2.02    1.90  2.05
sag1      2.11  2.09    2.07  2.05    2.11  2.09    2.17  2.15    2.20  2.17

TABLE VI
Comparison of the best results of the coding schemes

Image     Gzip   Compress  Huffman  Arithmetic  VBL
lenna     1.20   1.11      1.09     1.17        1.21
man       1.40   1.32      1.09     1.25        1.39
chall     1.47   1.38      1.17     1.34        1.45
coral     1.32   1.21      1.16     1.28        1.37
shuttle   1.47   1.37      1.18     1.38        1.39
sphere    1.33   1.22      1.25     1.30        1.22
brain1    1.75   1.70      1.60     1.80        1.86
slice15   1.73   1.67      1.57     1.79        1.87
head      1.80   1.79      1.79     2.01        2.05
sag1      1.74   1.61      1.60     2.08        2.20

In the first stage, image segmentation schemes that have been used include partitioning the image into regular shapes (rectangles) or into irregular shapes. The former has the advantage that it takes less storage to code the shape [3], due to the simplicity of the primitive used, while the latter is comparatively storage intensive because of the generality and complexity of the shapes resulting from the segmentation algorithm. Simpler shape primitives require very few bits to code the shape but are restricted in their generality of representation, and hence a large number of primitives may be required to represent the entire image.


On the other hand, allowing generality in shape primitives would increase the complexity of coding the primitive itself, even though a small number of these may suffice to represent the entire image. We will not focus on the latter techniques, since most of the existing techniques in the literature in this category are lossy compression algorithms [39]. In the following, we describe various state of the art image compression algorithms that primarily use spatial domain information either directly or in processed form (e.g., segmented data).

A. Variable Block Size Compression

In [3], Ranganathan et al. developed a lossless image compression algorithm which exploits the local and global redundancy present in most images. Their algorithm segments the image into variable size blocks and encodes them based on the properties exhibited by the pixels within a block. This encoding is achieved via a run-length coding scheme when all the pixels within a block have the same gray value, and via a base-offset coding scheme when the degree of variation of the gray values within a block is small. The encoding schemes in this case exploit the local redundancy present in the image. For blocks of size (2,2), a block matching scheme is used to determine if an identical block exists in the neighborhood of the reference block, thereby exploiting the global redundancy present in the image. When the variation of the gray level values within a block exceeds a specific threshold, the data in the block are transmitted as is. This algorithm performed better than the Huffman, arithmetic, Lempel-Ziv, and JPEG lossless schemes on all our test images, but worse than our VBL coding in conjunction with the interpolation technique which is introduced in a subsequent section.


B. Lossy+Residual Approaches

The lossy+residual approach includes many methods whose basic idea is to first send a lossy image, followed by the residual with respect to the original image. Using the lossy image plus the residual, the original image can be reconstructed exactly. The lossy image can be transmitted in many different forms. One way is to transmit it as the compressed data output of a lossy compression algorithm. There are numerous lossy compression algorithms that can be used to effectively compress images with high compression ratios. The reconstructed image in this case is an approximation of the original and can be used as an estimate of the original. If the estimate is a good approximation of the original, then the residual (the difference between the original and reconstructed images) will contain only small values. This residual image can be linearized to a sequence and effectively compressed using schemes like arithmetic coding and the variable bit length coding described in section V-E. The main advantage of this kind of lossless scheme is that it exploits the high compression ratios obtained from lossy schemes. The lossless JPEG compression standard is a simple version of the lossy+residual approach. We will first describe the lossless JPEG compression standard and then give a more general approach based on the use of interpolation.

B.1 The JPEG Compression Standard

There are four modes of operation in JPEG, namely sequential DCT based mode, sequential lossless mode, progressive DCT based mode, and hierarchical mode. Even though the JPEG baseline algorithm is the simplest and most widely used compression algorithm in the JPEG family, it is a lossy compression algorithm, since quantization and finite precision in the computation of the DCT and inverse DCT introduce distortion.

Since the focus of this paper is on lossless compression algorithms, we will only be concerned with the JPEG standard's predictive coding based lossless compression algorithm.


The JPEG standard specifies the following two coding methods for the lossless mode of operation:
- Lossless method with Huffman coding.
- Lossless method with arithmetic coding.

The JPEG lossless mode uses a predictive coding technique. The lossless mode in JPEG is fully independent of the transform-based coding; instead it applies differential coding to form the residuals, which are then coded using either the Huffman or arithmetic coding methods. The prediction residual is formed using previously encoded pixels in the current row and/or previous row of the image. The prediction residual at a pixel location x is defined as R = Px - x, where Px is the predicted value and can be any of the values defined in Table VII.

The pixel values at locations a, b, and c are available at the encoder as well as the decoder prior to the processing of location x (see Fig. 4). The chosen prediction function is encoded into the header of the compressed stream and thus is easily readable at the decoder end. When prediction methods (4) through (7) are used, values in the first row are predicted using method (1) and those in the first column are predicted using method (2). For more details on the JPEG lossless mode of operation, we refer the reader to [42].


Fig. 4. Image samples used in JPEG lossless prediction (a is the pixel to the left of x; b is the pixel above x; c is the pixel above a).

The JPEG lossless coding algorithm has several advantages. It involves very simple computations and hence is easy to implement. It can easily and efficiently be mapped to hardware because of the simplicity of the algorithm.


TABLE VII
Predictors for JPEG Lossless

Selection value   Prediction equation
0                 no prediction
1                 Px = a
2                 Px = b
3                 Px = c
4                 Px = a + b - c
5                 Px = a + (b - c)/2
6                 Px = b + (a - c)/2
7                 Px = (a + b)/2
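As an illustration of residual formation with these predictors, the sketch below (our own code, not part of the JPEG standard or the paper) computes R = Px - x for an 8-bit image. For simplicity it applies the first-row/first-column fallback for every selection value, leaves the very first pixel unpredicted, and uses integer floor division for predictors 5-7.

```python
import numpy as np

def jpeg_lossless_residuals(image: np.ndarray, selection: int = 4) -> np.ndarray:
    """Form prediction residuals R = Px - x using JPEG lossless predictors 1-7."""
    img = image.astype(np.int32)
    res = np.empty_like(img)
    rows, cols = img.shape
    for i in range(rows):
        for j in range(cols):
            a = img[i, j - 1] if j > 0 else 0                 # pixel to the left
            b = img[i - 1, j] if i > 0 else 0                 # pixel above
            c = img[i - 1, j - 1] if i > 0 and j > 0 else 0   # pixel above-left
            if i == 0 and j == 0:
                px = 0                                        # first pixel left unpredicted here
            elif i == 0:
                px = a                                        # first row: predictor 1
            elif j == 0:
                px = b                                        # first column: predictor 2
            else:
                px = {1: a, 2: b, 3: c, 4: a + b - c,
                      5: a + (b - c) // 2, 6: b + (a - c) // 2,
                      7: (a + b) // 2}[selection]
            res[i, j] = px - img[i, j]
    return res
```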

B.2 A General Lossy+Residual Method

In this section, we present a general interpolation based method for lossless compression of gray level images. The input to our algorithm is a mapped gray level image obtained using the mapping method described in Section III. The basic idea is that the image is estimated using interpolation from a subsampled set of points derived from the original image. We now describe the key ingredients of our algorithm.
1. Subsampling Strategy. As an initial approach, a regular grid is chosen to subsample the image. The grid is uniquely defined by the number of pixels between two adjacent subsampled pixels. Using this parameter, the compressor and the decompressor can reproduce the grid and obtain a subsampling of the original image, or fill in the subsampled (transmitted) values at the appropriate locations in the image. There is a tradeoff between the sparseness of the grid and the accuracy of the estimate obtained via interpolation, which leads to better compression of the difference data file. A sparser grid offers the advantage of sending less data for the control point file, a file containing the sample points. In order to combine both advantages, the concept of interpolation with reuse is used, which involves using the original image values at all locations when performing an interpolation on sparse grids.


This has the advantage of using a sparser grid, while points other than the control points are interpolated using a denser grid, which yields a better estimate. Exact reconstruction is feasible at the decompressor end using the same concept of reuse. Better subsampling of points can be obtained by incorporating homogeneity constraints in the image via a quadtree decomposition or rectangle coverings [30], [31], [35]. We experimented with subsampling via a quadtree decomposition followed by different quadtree traversal schemes and picked the one that yielded better compression results.
2. Interpolation Strategies. A good estimate of the original image is contingent on the kind of interpolation used. For example, in a locally smooth image, interpolating a pixel from its immediate neighbors is, in general, better than interpolating it from far away pixels, because of the assumption of local smoothness. Hence, the interpolation scheme is a crucial factor in the effectiveness of this compression scheme. In a later section, we present an overview of the different interpolation schemes that were used. One of the interpolation schemes used was gray level averaging, wherein the gray value of a pixel is estimated by numerically averaging the values of the neighboring pixels. If the majority of pixels in a neighborhood have similar values, and if the pixels to be used for interpolation, that is, the subsampled points, are picked from this majority, the averaging scheme can be expected to yield a good solution. However, there may be certain outliers in the neighborhood, i.e., points whose values differ greatly from the majority of the points in the neighborhood. These outliers generally represent noise pixels and will cause an error in the approximation obtained by interpolation. In order to avoid using these noise pixels in averaging, the outliers need to be detected [28]. Several heuristics may be used for this purpose.


Many different heuristics were experimented with, and the following (illustrated in Fig. 5) was found to yield the best result. First, the points used for interpolation are sorted by gray value in ascending order. If the number of points being used is greater than 4, then, during interpolation by averaging, we exclude from the computation the points corresponding to the minimum and maximum gray values; if the number of points is 3, we exclude either the minimum or the maximum, depending on which one is farthest from the median value.

Fig. 5. Heuristic used for disregarding outliers during averaging: neighbors A = 125, B = 204, C = 127, and D = 120 are averaged to interpolate at E; B and D are disregarded, giving an interpolated value of 126 at E.
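A small sketch of this outlier-rejecting average (our own code; since the example of Fig. 5 excludes both extremes with four neighbors, the sketch applies that rule to four or more points, and the neighbor-gathering logic is omitted):

```python
def robust_average(neighbors):
    """Average neighboring gray values, excluding likely outliers as described above."""
    vals = sorted(neighbors)
    if len(vals) >= 4:
        vals = vals[1:-1]                          # drop both the minimum and the maximum
    elif len(vals) == 3:
        median = vals[1]
        if median - vals[0] > vals[-1] - median:   # drop whichever extreme is farther
            vals = vals[1:]                        # from the median
        else:
            vals = vals[:-1]
    return round(sum(vals) / len(vals))

print(robust_average([125, 204, 127, 120]))   # 126, matching the Fig. 5 example
```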

The interpolation results are affected by the number and location of the neighboring pixels used in the interpolation based on averaging. Experiments were carried out using four and eight neighbors in performing the interpolation. More sophisticated interpolation schemes such as tensor product splines [23] and membrane or thin-plate splines [21] were also experimented with. The concept of interpolation with reuse can once again be applied in these cases. In addition to the subsampling and interpolation techniques described above, we also developed a hierarchical subsampling and interpolation scheme that, in conjunction with arithmetic coding, yielded the best compression results. In this hierarchical scheme, we first use the four corner pixels of the image and linearly interpolate at five locations, four of which are the midpoints of the segments joining the four chosen corners and one is the midpoint of the square/rectangle.


This process of midpoint interpolation is repeated within each of the four quadrants obtained in the first step. The hierarchical subdivision of the domain yields a quadtree, which can be traversed in several ways for linearization. We choose the traversal scheme that yields the best compression ratio after application of the arithmetic coding scheme. In some cases, when the image contains large local variations in gray scale, the difference between the interpolated value and the true value at an interpolated location may require more bits for coding than the true value. In such cases, we keep the true value and discard the interpolated value. We use one bit per quadtree decomposition level to indicate whether or not interpolated values are being considered.
3. The Actual Compression. The subsampling and interpolation steps lead to a representation of the data which is well suited for compression. The information to be transmitted to the decompression end includes the subsampled pixel values (which will be called the control points) and the difference values. The differences may comprise both positive and negative values and can be represented as an absolute value with a sign bit. The overall compression ratio achieved depends on how well this information, the control points and the difference values, can be further compressed. The primary compression is achieved by compressing the linearized absolute difference image. Techniques like the VBL coding described in section V-E and arithmetic coding were used for this purpose. The control point file can be regarded as a smaller image and is subjected to another level of subsampling and interpolation to increase its compression. The sign file consists of the sign bits (for the differences) in a packed format. It is further compressed using gzip. Experiments were carried out to run-length code the sign bits and use Huffman coding on the run lengths. However, this scheme did no better than simply using gzip.


A critical factor here is the type of linearization of the control points and the absolute differences (in 2-D). Any natural image, as already described, has local as well as global redundancies. The local redundancy property causes a given neighborhood in the image to exhibit coherence or correlation (referred to as smoothness of data). The effectiveness of a coding scheme partly depends on how similar/close adjacent values in the 1-D sequence are. The type of ordered scan has much to do with this factor and hence will influence the compression obtained on a linearized image. Different types of linearization schemes were used in our experiments, and the one that yielded the best compression was adopted. The linearization schemes experimented with included row major, column major, spiral, snake-like row major, and Peano-Hilbert [26], all of which were described earlier.
4. The Compression/Decompression Algorithms. The following pseudo-algorithms incorporate the ideas discussed in this section and give an outline of the interpolation-based scheme.

Compression:
(a) Subsample the image I to obtain the control point image C1.
(b) Use an interpolation scheme (with reuse) on C1 to form an estimate I' of the original image.
(c) Form a difference image D = I - I'.
(d) Subsample the first level control point image C1 to obtain a second level control point image C2.
(e) Use an interpolation scheme (with reuse) on C2 to form an estimate C1' of C1.
(f) Update the difference image D to incorporate the differences due to the second level interpolation.


(g) Peano-Hilbertize C2 and D (without the zero differences at the control point locations) and further compress the sequences so obtained using a compression scheme for a sequence of numbers.

Decompression:
(a) Decompress the compressed data to obtain the difference image D and the second level control point image C2.
(b) Use the same interpolation scheme as in the compressor on C2 to obtain an estimate C1' of the first level control point image C1.
(c) Reconstruct C1 = C1' + Dcontrol, where Dcontrol are the differences corresponding to the first level control points.
(d) Use the same interpolation scheme as in the compressor on C1 to obtain an estimate I' of the original image I.
(e) Reconstruct the original image I = I' + Drest, where Drest are the differences corresponding to the points other than the first level control points.

In Table VIII, we present the results of applying three spatial domain compression algorithms to our 10 image suite. The methods chosen for comparison are JPEG lossless, the variable block size compression scheme of Ranganathan et al. [3], and our proposed interpolation scheme. Compression results for each algorithm are tabulated with and without the use of the mapping preprocessor. As is evident from the table, the proposed interpolation algorithm gives the best compression for all images in our test suite. Furthermore, the mapping preprocessor coupled with interpolation resulted in enhanced compression for 6 of the 10 images.


TABLE VIII
Comparison of spatial domain schemes

          JPEG          Ranga         Interpolation
Image     raw   mapped  raw   mapped  raw   mapped
lenna     1.56  1.56    1.56  1.58    1.65  1.64
man       2.11  2.03    2.18  2.18    2.33  2.31
chall     1.60  1.93    1.66  1.98    1.66  2.05
coral     1.49  1.67    1.46  1.66    1.56  1.77
shuttle   1.80  1.89    1.82  1.92    1.90  2.01
sphere    1.59  1.58    1.61  1.62    1.70  1.70
brain1    1.49  1.86    1.53  1.89    1.56  2.01
slice15   1.88  1.87    1.92  1.92    1.99  2.02
head      1.92  1.91    1.91  1.91    2.03  2.07
sag1      2.49  2.46    2.56  2.42    2.56  2.53

VII. Transform Domain Algorithms

Transform domain algorithms exploit the spatial frequency information contained in the image to achieve compression. For lossless compression, the most popular multiresolution transform-based scheme in the field of medical imaging has been the S-transform (sequential transform) [38], [39], [40]. This transform is quite efficient; however, the study by Kuduvalli and Rangayyan [41] shows that it may not be as effective as a predictive coding scheme. More recently, Said and Pearlman developed a novel multiresolution transform based scheme well suited for lossless and lossy compression of images. The transform is called the S+P transform, where P denotes a prediction stage. Calderbank et al. [15] have recently developed wavelet transforms that map integers to integers, which prove to be quite useful for lossless image compression. In the following, we will briefly describe these two transform based compression methods.


A. S + P Transform based Compression

Said and Pearlman [12] developed an image multiresolution transform that is suited for both lossless (reversible) and lossy compression. This transform requires only integer addition and bit-shift operations. By using careful scaling and truncation, the number of bits used to represent the transformed image is kept low. Experimental results show that this transform yields far superior compression ratios than single resolution linear predictive coding based schemes like lossless JPEG and other spatial domain methods. The S transform, which is similar to the multiresolution transform using the Haar basis [10], can be represented by the following set of equations for a given input sequence of integers:

l[n] = \lfloor (c[2n] + c[2n+1]) / 2 \rfloor,   n = 0, ..., N/2 - 1,     (1)

h[n] = c[2n] - c[2n+1],   n = 0, ..., N/2 - 1,     (2)

where c[n], n = 0, ..., N - 1, is a sequence of integers with N even, and \lfloor \cdot \rfloor denotes the floor function, i.e., downward truncation. Note that the truncation removes the redundancy in the least significant bit, because the sum and difference of two integers are either both odd or both even. Also, the division and floor operations can be achieved very efficiently via a single bit shift. The inverse transform is given by

c[2n] = l[n] + \lfloor (h[n] + 1) / 2 \rfloor,
c[2n+1] = c[2n] - h[n].     (3)

The 2D transform simply involves a sequential application of the 1D transform to the rows and columns of the 2D image.
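A direct sketch of the one-dimensional S transform and its inverse, equations (1)-(3), in Python (our own illustration; the 2D transform would apply it first to the rows and then to the columns):

```python
import numpy as np

def s_transform(c: np.ndarray):
    """Forward S transform of an even-length integer sequence (equations 1 and 2)."""
    c = c.astype(np.int64)
    low = (c[0::2] + c[1::2]) // 2        # l[n] = floor((c[2n] + c[2n+1]) / 2)
    high = c[0::2] - c[1::2]              # h[n] = c[2n] - c[2n+1]
    return low, high

def inverse_s_transform(low: np.ndarray, high: np.ndarray) -> np.ndarray:
    """Inverse S transform (equation 3)."""
    even = low + (high + 1) // 2          # c[2n]   = l[n] + floor((h[n] + 1) / 2)
    odd = even - high                     # c[2n+1] = c[2n] - h[n]
    c = np.empty(2 * low.size, dtype=np.int64)
    c[0::2], c[1::2] = even, odd
    return c

x = np.array([10, 13, 7, 7, 200, 3])
assert np.array_equal(inverse_s_transform(*s_transform(x)), x)   # exact reconstruction
```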


The performance of the S-transform can be improved by combining it with nonlinear predictive coding [12]. Instead of performing the prediction as a post-transform operation, Said and Pearlman developed a scheme which predicts the high frequency components from a combination of computed low and high frequency components. Denoting the estimates by \hat{h}[n], we can replace h[n] by the differences h_d[n], which are given by

h_d[n] = h[n] - \lfloor \hat{h}[n] + 1/2 \rfloor,   n = 0, 1, ..., N/2 - 1,     (4)

to form a new transformed image. The estimation/prediction equation is then given by

\hat{h}[n] = \sum_{i=-L_0}^{L_1} a_i \, \Delta l[n+i] \; - \; \sum_{j=1}^{H} b_j \, h[n+j],     (5)

where \Delta l[n] = l[n-1] - l[n] is used to obtain zero-mean estimated terms. The inverse transform can be achieved with the prediction following the reverse order, as given below:

h[n] = h_d[n] + \lfloor \hat{h}[n] + 1/2 \rfloor,   n = N/2 - 1, N/2 - 2, ..., 0.     (6)

The inverse S-transform can be easily computed after recovering the values of h[n]. The coefficients a_i, b_j in the above prediction equations can be obtained using filter design techniques; we refer the interested reader to [12].

B. Wavelet Transform based Compression

Traditionally, the wavelet transform has been very popular for lossy compression [43], [44]. More recently, wavelet transforms that map integers to integers were developed by several researchers [45], [17], [15]. We will briefly describe the technique developed by Calderbank et al. [15], since it is the most general of all the integer wavelet transforms.


Calderbank et al. describe two methods for achieving the mapping of integers to integers using wavelet transforms. One method is a modification of the precoder developed by Laroia et al. [46] for information transmission. The second is a method which is a combination of the lifting scheme [47] and a reversible way of rounding. In this section, we will briefly discuss the latter scheme via an example involving a rewrite of the S-transform using the lifting scheme. In the lifting scheme, we simply rewrite the S-transform equations as follows:

d[n] = c[2n+1] - c[2n],     (7)

l[n] = c[2n] + \lfloor d[n]/2 \rfloor.     (8)

These equations are identical to the S-transform because c[2n] + \lfloor d[n]/2 \rfloor = c[2n] + \lfloor c[2n+1]/2 - c[2n]/2 \rfloor = \lfloor (c[2n] + c[2n+1])/2 \rfloor. The inverse transform can be immediately written as

c[2n] = l[n] - \lfloor d[n]/2 \rfloor,     (9)
c[2n+1] = d[n] + c[2n].     (10)

After simple algebraic manipulation, and using the fact that k - \lfloor k/2 \rfloor = \lfloor (k+1)/2 \rfloor, we observe that this is the same as the inverse of the S transform. This example shows that lifting makes it possible to obtain an integer transform simply by using truncation, without sacrificing invertibility. In general, the concept of lifting from a transform point of view consists of first partitioning an input sequence into even and odd indexed samples, called the lazy wavelet transform, followed by a repeated alternate application of lifting and dual lifting steps. The lifting step involves applying a filter to the odd samples and subtracting the result from the even samples.


The dual lifting step does the opposite: it involves applying a filter to the even samples and subtracting the result from the odd samples. After N such applications of dual and primal lifting operations, the even and odd samples produce the low pass and high pass coefficients (up to a scaling factor), respectively. The inverse transform can be obtained by simply reversing the operations and flipping the signs.

flipping the signs. The lifting scheme implementation of the wavelet transform has numerous advantages over the traditional wavelet transform, and we refer the reader to [15] for a detailed discussion of lifting schemes.
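The following sketch shows one such lifting/dual-lifting pair with integer rounding; it is our own illustration (the names predict and update, the single pair, and the floor-based rounding are assumptions, not the construction of [15]). Both filters are assumed to return arrays of the same length as their inputs.

    import numpy as np

    def lift_forward(x, predict, update):
        # Lazy wavelet transform: split into even- and odd-indexed samples.
        even = np.asarray(x[0::2], dtype=np.int64)
        odd = np.asarray(x[1::2], dtype=np.int64)
        # Dual lifting step: filter the even samples, subtract from the odd ones.
        odd = odd - np.floor(predict(even)).astype(np.int64)
        # Lifting step: filter the (new) odd samples, subtract from the even ones.
        even = even - np.floor(update(odd)).astype(np.int64)
        return even, odd   # low-pass and high-pass parts (up to a scaling factor)

    def lift_inverse(even, odd, predict, update):
        # Invert by reversing the order of the steps and flipping the signs.
        even = even + np.floor(update(odd)).astype(np.int64)
        odd = odd + np.floor(predict(even)).astype(np.int64)
        x = np.empty(2 * len(even), dtype=np.int64)
        x[0::2], x[1::2] = even, odd
        return x

For example, taking predict to be the identity filter on the even samples makes the dual lifting step produce the differences d[n] of equation (7); the inverse simply applies the same filters in the reverse order with the signs flipped.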

In Table IX, we present a comparison of the compression results obtained using the integer wavelet transform [15] and the S+P technique [12] on the same 10 images used earlier for comparing the spatial domain methods. Once again, the results are presented with and without the use of the mapping preprocessor. From this table, we see that the S+P algorithm provided the best compression for 7 of our 10 images and tied with the wavelet method on another 2. The mapping preprocessor enhanced the performance of the S+P method on 5 of the 10 images.

TABLE IX
Comparison of transform domain schemes

Image      2,2             4,2             2+2,2           S+P
           raw    mapped   raw    mapped   raw    mapped   raw    mapped
lenna      1.74   1.74     1.74   1.75     1.74   1.75     1.77   1.76
man        2.55   2.19     2.64   2.63     2.58   2.57     2.65   2.63
chall.     1.77   1.85     1.77   2.20     1.77   2.20     1.79   2.21
coral      1.62   1.85     1.63   1.87     1.63   1.86     1.64   1.87
shuttle    2.01   2.13     2.00   2.11     2.01   2.13     2.01   2.13
sphere     1.80   1.80     1.84   1.84     1.82   1.82     1.85   1.84
brain1     1.65   2.11     1.68   2.15     1.66   2.13     1.68   2.16
slice15    2.19   2.20     2.24   2.24     2.21   2.21     2.25   2.25
head       2.17   2.18     2.18   2.18     2.17   2.17     2.20   2.20
sag1       2.82   2.78     2.94   2.91     2.88   2.84     2.92   2.88


VIII. Summary Results

In this section, we present a comparison of the various lossless compression algorithms discussed in the previous sections. The effectiveness of lossless compression is measured using the compression ratio CR, which is the ratio of the size of the original image to the size of the compressed data stream. For the evaluation, selected images from the ISO test images and slices from different MRI volume image data sets were used. All the images are 256 × 256 with 8 bits/pixel, except the images labeled sag0, sag1, and sag2, which use 12 bits/pixel. We present a summary of comparison results in Table X. In this summary, we have chosen the best compression ratios obtained for each image in the image set, using spatial as well as transform domain techniques, for comparison. Results using the mapping preprocessor are indicated by *. Compression ratios for Ranga are not given for four of the 27 images because these four images are rectangular and the available code for Ranga is unable to handle rectangular images. As is evident from the table, the S+P transform algorithm, with or without mapping, yields compression ratios that are the best or comparable to those of the wavelet transform for almost all the images in the data set used in the experiments. The mapping preprocessor yielded an improvement for most methods on most images.

IX. CONCLUSIONS

We have reviewed existing methods for lossless image compression and also proposed new schemes. Specifically, we have proposed the use of a mapping preprocessor, a new coding method called VBL coding, and a spatial domain interpolation method. Experiments conducted by us show that the mapping preprocessor often results in greater compression when used with the compression methods studied in this paper; the VBL coding scheme often outperforms the Huffman and arithmetic coding methods when applied to raw linearized images; and our interpolation-based compression is the best spatial domain compressor. Best overall compression is obtained using the S+P method of [12] coupled with the mapping preprocessor proposed here.


TABLE X

Comparison of the Best Results of Different Schemes (* means mapping used)

Image      JPEG     Ranga    Interp   Wavelet  S+P
lenna      1.56     1.58*    1.65     1.75*    1.77
man        2.11     2.18     2.33     2.63     2.65
chall      1.93*    1.98*    2.05*    2.20*    2.21*
coral      1.67*    1.66*    1.77*    1.87     1.87
shuttle    1.89*    1.92*    2.01*    2.13*    2.13*
sphere     1.59     1.62*    1.70     1.84     1.85
ocean1     1.69*    -        3.82*    5.50*    5.21*
ocean2     1.87*    -        4.18*    5.42*    5.21*
finger     1.34     -        1.39     1.47     1.45
cmpnd1     2.91*    -        3.07*    3.37*    3.33*
brain1     1.86*    1.89*    2.01*    2.15*    2.16*
brain2     1.84*    1.85*    1.96*    2.15*    2.14*
brain3     1.83*    1.82*    1.96*    2.11*    2.11*
brain4     1.84*    1.86*    1.97*    2.12*    2.13*
brain5     1.81*    1.81*    1.94*    2.07*    2.08*
brain6     1.81*    1.81*    1.92*    2.08*    2.08*
slice0     2.17     2.19     2.35*    2.51*    2.54
slice15    1.88     1.92     2.02*    2.24     2.25
slice30    1.81     1.84     1.92*    2.16*    2.16
slice45    1.80     1.81     1.90*    2.13*    2.12
slice60    1.80     1.82*    1.90     2.14*    2.13
slice90    1.93     1.96*    2.08*    2.29*    2.29
head       1.92     1.91     2.07*    2.18     2.20
sag0       2.48     2.56     2.53*    2.94     2.93
sag1       2.49     2.56     2.56     2.94     2.92
sag2       2.47     2.55     2.55     2.92     2.91

References

[1] A. Rosenfeld and A. C. Kak, Digital Picture Processing, Academic Press, 1976.
[2] K. R. Castleman, Digital Image Processing, Prentice Hall, 1996.
[3] N. Ranganathan, S. G. Romaniuk, and K. R. Namuduri, "A Lossless Image Compression Algorithm Using Variable Block Size Segmentation", IEEE Transactions on Image Processing, vol. 4, no. 10, pp. 1396-1406, October 1995.
[4] T. H. Cormen, C. E. Leiserson, and R. L. Rivest, Introduction to Algorithms, McGraw Hill, 1990.
[5] I. H. Witten, R. M. Neal, and J. G. Cleary, "Arithmetic coding for data compression", Commun. ACM, vol. 30, pp. 520-540, June 1987.
[6] W. Kou, Digital Image Compression - Algorithms and Standards, Kluwer Academic Publishers, 1995.
[7] R. J. Clarke, Digital Compression of Still Images and Video, New York, Academic Press, 1995.
[8] G. K. Wallace, "The JPEG still picture compression standard", Communications of the ACM, vol. 34, pp. 30-44, April 1991.
[9] S. Takamura and M. Takagi, "Lossless image compression with lossy image using adaptive prediction and arithmetic coding", Proc. Data Compress. Conf., Snowbird, UT, Mar. 1994, pp. 166-174.
[10] E. H. Adelson, E. P. Simoncelli, and R. Hingorani, "Orthogonal pyramid transforms for image coding", Proc. SPIE, Cambridge, MA, Oct. 1987, vol. 845, pp. 50-58.

[11] G. Strang and T. Nguyen, Wavelets and Filter Banks, Wellesley-Cambridge Press, 1996.
[12] A. Said and W. A. Pearlman, "An Image Multiresolution Representation for Lossless and Lossy Compression", IEEE Trans. Image Processing, vol. 5, no. 9, Sept. 1996.
[13] A. Said and W. A. Pearlman, "Reversible Image Compression via Multiresolution Representation and Predictive Coding", Proc. SPIE, vol. 2094, pp. 664-674, Nov. 1993.
[14] "Hierarchical Image Decomposition and Filtering Using the S-Transform", Proc. SPIE, Medical Imaging, vol. 914, pp. 799-814, 1988.
[15] A. R. Calderbank, I. Daubechies, W. Sweldens, and B.-L. Yeo, "Wavelet transforms that map integers to integers", Department of Mathematics, Princeton University, August 1996.
[16] E. Majani, "Biorthogonal Wavelets for Image Compression", Proc. SPIE, VCIP 1994, vol. 2308, pp. 478-488, Sept. 1994.
[17] A. Zandi, M. Boliek, E. L. Schwartz, and M. J. Gormish, "CREW lossless/lossy medical image compression", Technical Report CRC-TR-9526, RICOH California Research Center, 1995.
[18] G. R. Kuduvalli, "Performance Analysis of Reversible Image Compression Techniques for High Resolution Digital Teleradiology", IEEE Trans. on Medical Imaging, vol. 11, no. 3, Sept. 1992.
[19] "MPEG, CCITT, H.261, JPEG: Image and Image Sequence Compression/Decompression C Software Engines", Portable Video Research Group, Stanford, 1991.


[20] A. K. Cline, "Curve Fitting Using Splines Under Tension", Technical Report Atmos. Tech 3, 1973.
[21] D. Terzopoulos, "The Computation of Visible-Surface Representations", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 10, no. 4, July 1988.
[22] A. K. Cline, "A Software Package for Curve and Surface Fitting Employing Splines Under Tension", Univ. of Texas at Austin, August 1981.
[23] P. Dierckx, Curve and Surface Fitting with Splines, Oxford; New York: Clarendon, 1993.
[24] R. Szeliski, "Fast Surface Interpolation Using Hierarchical Basis Functions", IEEE Trans. Pattern Analysis and Machine Intelligence, 1990.
[25] R. C. Gonzalez and R. E. Woods, Digital Image Processing, Addison-Wesley, 1992.
[26] L. Portoni, C. Combi, G. Pozzi, F. Pinciroli, H. P. Fritsch, and R. Brennecke, "Some irreversible compression methods and their application to angiocardiographic digital still images", May 1996.
[27] R. Sedgewick, Algorithms, Addison-Wesley, 1988.
[28] M. J. Black and A. Rangarajan, "The Outlier Process: Unifying Line Processes and Robust Statistics", IEEE, 1994.
[29] V. K. Heer and H.-E. Reinfelder, "A comparison of reversible methods for data compression", Proc. SPIE Medical Imaging IV, vol. 1233, pp. 354-365, 1990.
[30] E. Shusterman and M. Feder, "Image Compression via Improved Quadtree Decomposition Algorithms", IEEE Trans. Image Processing, 1994.
[31] C. H. Chien and J. K. Aggarwal, "A Normalized Quadtree Representation", Computer Vision, Graphics and Image Processing, vol. 26, pp. 331-346, 1984.
[32] H. H. Chen and T. S. Huang, "Representation, Construction and Manipulation of Octrees", Survey, Coordinated Science Laboratory, University of Illinois, October 1985.
[33] D. Huffman, "A Method for the Construction of Minimum Redundancy Codes", Proc. IRE, vol. 40, pp. 1098-1101, 1952.
[34] J. Ziv and A. Lempel, "A Universal Algorithm for Sequential Data Compression", IEEE Trans. on Information Theory, vol. 24, no. 5, pp. 530-536, 1978.
[35] S.-Y. Wu and S. Sahni, "Covering Rectilinear Polygons by Rectangles", IEEE Trans. on Computer-Aided Design, vol. 9, no. 4, April 1990.
[36] E. Horowitz, S. Sahni, and D. Mehta, Fundamentals of Data Structures in C++, W. H. Freeman, 1994.
[37] T. Welch, "A Technique for High-Performance Data Compression", IEEE Computer, pp. 8-19, June 1984.


[38] M. Rabbani and P. W. Jones, Digital Image Compression Techniques, Bellingham, WA: SPIE, 1991.
[39] V. K. Heer and H.-E. Reinfelder, "A comparison of reversible methods for data compression", Proc. SPIE Medical Imaging IV, vol. 1233, pp. 354-365, 1990.
[40] S. Takamura and M. Takagi, "Lossless image compression with lossy image using adaptive prediction and arithmetic coding", Proc. Data Compress. Conf., Snowbird, UT, Mar. 1994, pp. 155-174.
[41] G. R. Kuduvalli and R. M. Rangayyan, "Performance analysis of reversible image compression techniques for high-resolution digital teleradiology", IEEE Trans. Med. Imaging, vol. 11, pp. 430-445, Sept. 1992.
[42] V. Bhaskaran and K. Konstantinides, Image and Video Compression Standards: Algorithms and Architectures, pp. 47-49, Kluwer Academic Publishers, 1995.
[43] R. A. DeVore, B. Jawerth, and B. J. Lucier, "Image compression through wavelet transform coding", IEEE Trans. Inform. Theory, vol. 38, no. 2, pp. 719-746, 1992.
[44] M. V. Wickerhauser, "High-Resolution Still Picture Compression", Digital Signal Processing, vol. 2, pp. 204-226, 1992.
[45] S. Dewitte and J. Cornelis, "Lossless integer wavelet transform", Technical Report IRIS-TR-0041, Royal Meteorological Institute of Belgium, 1996.
[46] R. Laroia, S. A. Tretter, and N. Farvardin, "A simple and effective precoding scheme for noise whitening on intersymbol interference channels", IEEE International Symposium on Circuits and Systems, vol. 2, pp. 309-312, May 1995.
[47] W. Sweldens, "The lifting scheme: A custom-design construction of biorthogonal wavelets", Journal of Appl. and Comput. Harmonic Analysis, vol. 3, no. 2, pp. 186-200, 1996.
