Image Processing, IEEE Transactions on

IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 12, NO. 2, FEBRUARY 2003


Bit Vector Architecture for Computational Mathematical Morphology

John C. Handley, Member, IEEE

Abstract—A real-time, compact architecture is presented for translation-invariant windowed nonlinear discrete operators represented in computational mathematical morphology. The architecture enables output values to be computed in a fixed number of operations and thus can be pipelined. Memory requirements for an operator are proportional to its basis size. An operator is implemented by three steps: 1) each component of a vector observation is used as an index into a table of bit vectors; 2) all retrieved bit vectors are “ANDed” together; and 3) the position of the first nonzero bit is used as an index to a table of output values. Computational mathematical morphology is described, the new architecture is illustrated through examples, and formal proofs are given. A modification of the basic architecture provides for increasing operators.

Index Terms—Electronic printing, image processing hardware, increasing operators, lattice operators, nonlinear filters.

I. INTRODUCTION

Image processing in printers and digital copiers is done in hardware due to the immense flow of image data. High-resolution color and gray-scale image data and increased printing speeds have rendered many current image processing architectures obsolete. Shape-based operations, such as detecting and enhancing lines and corners of intersecting lines, are well suited to nonlinear operators. Due to speed and cost constraints, hardware architectures must be simple and compact. Computational mathematical morphology (CMM), as developed by Dougherty and Sinha in a series of papers [3]–[5] and by Dougherty and Barrera [6], offers a representation of windowed nonlinear functions that is particularly amenable to hardware implementation and statistical estimation. We deal solely with the implementation and leave motivation, historical context, and the theory of CMM to the primary literature. There is scant information on CMM implementation and, to the author’s knowledge, this is the first report (although a brief description of this work with discussion appears in a conference proceedings [8]).

CMM decomposes a translation-invariant, windowed image operator into a finite set of simple operations: testing whether a windowed observation is in an interval. If the result is affirmative, an associated value is returned. This representation results in a search problem: find the interval to which an observation belongs. The comparator-based architecture of [5] performs these operations in parallel. While simple, there could be many intervals to test and this architecture is too expensive for our application. For increasing operators, one must tally the intervals to which an observation belongs and take the maximum associated value. While easily conceived as parallel operations, implementation is cumbersome. By pre-computing interval membership information, we can produce the proper interval in a small number of simple operations. As with the comparator-based architecture, this method executes in a fixed number of operations, making it suitable for pipelining.

After a motivating example application, we give a brief overview of CMM representations of increasing and nonincreasing operators. Next, we describe the architecture for nonincreasing operators, followed by an example. Finally, formal proofs of the representations underlying the architecture and its extension to increasing operators are given.

II. ELECTRONIC PRINTING EXAMPLE

Mathematical morphology plays a prominent role in document image processing [11]. Traditionally, binary morphology in the form of template-matching algorithms has been employed to tune image content for printing. For example, the physics of electrophotography can cause thin image structures such as lines or serifs of characters to disappear upon printing. Such image content must be detected and thickened so that it prints correctly without affecting other image areas. A similar problem exists for gray-scale and color printing, but the problem is much more difficult to solve. The traditional lookup table approach to template matching is not practical owing to the enormous number of possible gray templates. To make matters worse, the document image is at a high resolution (600 dots per inch), which requires a larger window area to observe image structures, and for color documents there are four eight-bit-per-pixel channels to process.

A typical color image path begins with a digital front end (DFE), which converts a document in a page description language (PDL) to four channels of raster data in cyan, magenta, yellow, and black (the device-dependent CMYK color space). In our image path, these channels are eight bits per pixel and must undergo line-width tuning prior to halftoning and printing. It is too difficult to calibrate a print engine to print this image exactly and far more effective to process the image so that it prints correctly. An image processing module must inspect each image (tens of megabytes of data at tens of pages per minute), modify only required pixels, and pass through the rest.

Fig. 1 shows sample images. In this example, 5 × 5 pixel windowed observations are processed with an operator represented in CMM and implemented in the bit-vector architecture. The first column contains image data as they are rendered in system memory. The second column contains data as it should be processed to yield

Manuscript received April 4, 2002; revised October 22, 2002. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Robert D. Nowak. The author is with the Xerox Corporation, Webster, NY 14580-9701 USA (e-mail: [email protected]). Digital Object Identifier 10.1109/TIP.2002.807362

1057-7149/03$17.00 © 2003 IEEE



the proper look when printed. The last column shows the effect of the operator. Only lines as in the first row should be affected; contone data as in the second row should appear untouched. In this example, each bit vector contains 51 entries, demonstrating that useful operations can have relatively low complexity.

Fig. 1. Sample processed images using a CMM-represented operator. The left column shows image data as it is rendered in memory, the center column shows the image as it should be modified to print correctly (lines thickened, the remainder unaltered), and the right column shows the data in the left column processed with a filter.

III. REAL-TIME MORPHOLOGICAL IMAGE PROCESSING

The literature on hardware approaches to morphological image processing is extensive, covering a number of architectures to implement classical binary and gray-scale mathematical morphology. We consider only two recent contributions here to differentiate our method. Ikenaga and Ogura describe a 2-D cellular automata architecture and chip design [10]. This hardware can perform gray-scale erosions and dilations and can process pixels in parallel. It is targeted toward video processing, where the entire image can be loaded into memory and processed many times in succession. Our images are too large (potentially 600 dpi × 600 dpi × 11 in × 17 in) to store completely in memory, nor can we afford the time for multiple processing passes. An ASIC that implements classical mathematical morphological gray-scale operations as well as more sophisticated operators such as soft and fuzzy erosions and dilations is described in [7]. This approach, while much more sophisticated than our proposal, is limited to a single 3 × 3 structuring element. The highly parallel representation of CMM would be quite expensive with this architecture. In general, hardware implementations of classical morphological operators are plagued by complexity that grows superlinearly with structuring element size, image resolution, or bit depth.

The strategy of CMM is to represent operators with many simple, parallel operations. The bit vector architecture we propose implements these operators as a series of lookups and logical operations (AND).

IV. COMPUTATIONAL MATHEMATICAL MORPHOLOGY

Put simply, CMM decomposes mappings between complete lattices by partitioning pre-images or level slices of mappings into intervals, endpoints serving as a basis. For increasing mappings, only the lower endpoints are in the basis. But before describing CMM in detail, we put it into context with a brief review of some concepts from mathematical morphology.

A. Classical Mathematical Morphology

Classical mathematical morphology concerns the properties of set mappings. Let $\mathcal{P}(E)$ be the set of all subsets of $E$. A set mapping $\Psi : \mathcal{P}(E) \to \mathcal{P}(E)$ is translation invariant if and only if $\Psi(X + h) = \Psi(X) + h$ for all $X \in \mathcal{P}(E)$ and $h \in E$. The kernel of $\Psi$ is defined by

$$\mathcal{K}(\Psi) = \{X \in \mathcal{P}(E) : o \in \Psi(X)\},$$

where $o$ denotes the origin of $E$.

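An erosion of a set $X$ by another set (structuring element) $B$ is defined by

$$\varepsilon_B(X) = \{h \in E : B + h \subseteq X\}.$$

Essentially, an erosion marks the places where a translation of one set fits inside another. A mapping $\Psi$ is increasing if $X \subseteq Y$ implies $\Psi(X) \subseteq \Psi(Y)$ for all $X, Y \in \mathcal{P}(E)$. There are two fundamental representation theorems that have analogs in CMM. The first is due to Matheron [12] and states that any increasing, translation-invariant set mapping $\Psi$ can be decomposed into a union of erosions

$$\Psi(X) = \bigcup_{B \in \mathcal{K}(\Psi)} \varepsilon_B(X),$$

where $\mathcal{K}(\Psi)$ is the kernel of $\Psi$.

A hit-or-miss transform is a set mapping

$$X \circledast (A, B) = \{h \in E : A + h \subseteq X,\ B + h \subseteq X^c\},$$

where $X^c$ is the complement of $X$. Here, one marks the places where a first structuring element fits into a set and where a second structuring element fits into its complement. A closed interval $[A, B]$ of two sets $A$ and $B$ is defined as all sets $X$ such that $A \subseteq X \subseteq B$. The second fundamental representation theorem, due to Banon and Barrera [1], is that any translation-invariant set mapping can be represented as a union of hit-or-miss transforms

$$\Psi(X) = \bigcup_{[A, B] \subseteq \mathcal{K}(\Psi)} X \circledast (A, B^c).$$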

These are critical theorems from a filtering perspective: one can design a nonlinear filter by choosing erosions and implement it with simple building blocks. These two theorems and their constituent concepts have analogs in the lattice-theoretic setting of CMM.

B. Increasing Operators

Let $\Psi$ be an increasing function of windowed observations of a gray-scale image, each represented as a vector $\mathbf{x} = (x_1, \ldots, x_N)$ of length $N$. $N$-dimensional vectors are partially ordered: $(x_1, \ldots, x_N) \le (y_1, \ldots, y_N)$ if and only if $x_i \le y_i$ for $i = 1, \ldots, N$. An increasing function has the property that $\mathbf{x} \le \mathbf{y}$ implies $\Psi(\mathbf{x}) \le \Psi(\mathbf{y})$. The simplest increasing operation is an $N$-elemental gray-scale erosion

$$\varepsilon_{\mathbf{b}}(\mathbf{x}) = \begin{cases} 1, & \text{if } \mathbf{b} \le \mathbf{x} \\ 0, & \text{otherwise.} \end{cases}$$

If $A$ is a nonempty subset of a complete (here, finite) lattice, let $A^{-}$ denote its set of least elements. Define kernel sets $K_y(\Psi) = \{\mathbf{x} : \Psi(\mathbf{x}) \ge y\}$ for each output value $y$. These sets are decreasing in $y$.

Fig. 2. Geometric illustration of a two-pixel window increasing gray-scale filter. Pixel values range from 0 to 255 and filter output values are 1, 2, 3, 4, and 7. The filter value at $(x_1, x_2)$ is 4.

An increasing operator $\Psi$ has the decomposition [4]

$$\Psi(\mathbf{x}) = \max\{y : \varepsilon_{\mathbf{b}}(\mathbf{x}) = 1 \text{ for some } \mathbf{b} \in K_y(\Psi)^{-}\}.$$
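The decomposition can be illustrated with a minimal Python sketch. The names `erosion`, `increasing_operator`, and `BASIS`, as well as the basis vectors themselves, are hypothetical and chosen only for illustration; they are not taken from [4]. The sketch evaluates each elemental erosion directly and returns the largest output value whose basis contains an element lying below the observation.

```python
from typing import Dict, List, Sequence, Tuple

Vector = Tuple[int, ...]


def erosion(b: Sequence[int], x: Sequence[int]) -> int:
    """Elemental gray-scale erosion: 1 if b <= x component-wise, else 0."""
    return int(all(bi <= xi for bi, xi in zip(b, x)))


def increasing_operator(
    basis: Dict[int, List[Vector]], x: Sequence[int], default: int = 0
) -> int:
    """Largest output value y whose basis contains some b with b <= x."""
    hits = [
        y
        for y, elements in basis.items()
        if any(erosion(b, x) for b in elements)
    ]
    return max(hits, default=default)


# Hypothetical basis for a two-pixel window (illustrative values only):
# output value -> least elements of the corresponding kernel set.
BASIS: Dict[int, List[Vector]] = {
    1: [(0, 0)],
    2: [(50, 100), (120, 40)],
    4: [(200, 200)],
}

if __name__ == "__main__":
    for obs in [(10, 10), (130, 50), (60, 90), (255, 255)]:
        print(obs, increasing_operator(BASIS, obs))
```

This direct evaluation tests every basis element for each observation, which is the cost the bit vector architecture is meant to avoid.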

Fig. 2 shows an increasing operator on a two-pixel observation. In Fig. 3, the set $K_1(\Psi)$ is the entire region and sets $K_2(\Psi)$, $K_3(\Psi)$, $K_4(\Psi)$, and $K_7(\Psi)$ are the respective shaded regions.

Fig. 3. Shaded regions are kernels of an increasing operator. Clockwise from upper left, we have $K_2(\Psi)$, $K_3(\Psi)$, $K_4(\Psi)$, and $K_5(\Psi) = K_6(\Psi) = K_7(\Psi)$.

C. Nonincreasing Operators (Computational Hit-or-Miss Transform)

In CMM, when erosion and dilation are combined to form a hit-or-miss transform, the result is simply outputting a value when a windowed observation fits into an interval [4]. Let $\Psi$ be a translation-invariant mapping from a finite lattice $L$ to a finite set $Y$. Let $\{[\mathbf{a}_{y,i}, \mathbf{b}_{y,i}]\}_i$ be a partition of $\Psi^{-1}(y)$ into intervals, for each $y \in Y$. Then

$$\Psi(\mathbf{x}) = y \text{ such that } \mathbf{x} \in [\mathbf{a}_{y,i}, \mathbf{b}_{y,i}] \text{ for some } i. \qquad (1)$$

The gist of (1) is that sets of points of $L$ that map to $y$ are partitioned into intervals, nonempty subsets of the lattice having lower and upper bounds. Given an observation $\mathbf{x}$, find the interval to which it belongs and report the value corresponding to that interval. Any nonempty subset of $L$ can be partitioned into intervals and there are many possible partitions of $\Psi^{-1}(y)$, the pre-image of $y$. Finding index $i$ in (1) is usually difficult. The architecture of [5] is a straightforward arrangement of parallel comparators: each interval is tested in parallel. This is not practical for expansions of many useful operators. An alternative bit vector approach is given next.

V. ARCHITECTURE FOR HIT-OR-MISS TRANSFORM

An interval is an ordered pair of vectors $\mathbf{a}$ and $\mathbf{b}$ with $\mathbf{a} \le \mathbf{b}$. A windowed observation $\mathbf{x}$ fits into interval $[\mathbf{a}, \mathbf{b}]$ if and only if $\mathbf{a} \le \mathbf{x} \le \mathbf{b}$. In CMM, when an observation fits into one interval in a set $I_y$, the value $y$ is returned. By the hit-or-miss representation theorem, a set of interval sets $\{I_y\}$ fully determines the actions of a filter: 1) in a window, observe $\mathbf{x}$; 2) check each set $I_y$ to see whether $\mathbf{x}$ belongs to an interval in the set; 3) if there is a fit, output the value $y$; otherwise output a default value.

A sequential way to implement this algorithm is to search through the interval sets for a fit. At worst, it will require checking all intervals and, within each interval, $N$ checks of the form $a_i \le x_i \le b_i$. At best, a fit could occur on the very first interval. This approach is not suitable for hardware because the number of operations per observation varies with the data and because it requires storage for each interval.

To gain intuition we view the gray-scale problem geometrically. Each windowed observation of an eight-bit gray-scale image is a vector in an $N$-dimensional space $\{0, \ldots, 255\}^N$. Each interval is a (hyper)rectangle in that space. A set of intervals $I_y$ is a union of disjoint hyper-rectangles. A fit is the determination of which interval set contains $\mathbf{x}$. In the sequential algorithm, $\mathbf{x}$ is compared to the boundaries of each hyper-rectangle until one is found to contain it or it is in no hyper-rectangle (default). The proposed architecture performs a set of logical tests in parallel by pre-computing interval membership data for each possible observation.

TABLE I
CMM DECOMPOSITION OF A SIMPLE NON-INCREASING FILTER

For example, let $N = 2$ and consider the list of interval indices and endpoints shown in Table I. Since $N = 2$, it is possible to visualize intervals as rectangles as illustrated in Fig. 4. Each grouping of intervals is represented by a different style of hash marks. Were $N > 2$ (and typically $N$ might be 25 or 49 or more), the intervals would be part of a high-dimensional space and impossible to visualize. Observation (100, 91) fits into one of these intervals, and so the output associated with that interval would be returned when (100, 91) is observed.

Fig. 4 depicts eight intervals associated with a two-pixel windowed observation $(x_1, x_2)$. The intervals are grouped together according to actions, indicated here by different hashing. Intervals 0, 3, 4, 7 correspond to the output 1, intervals 1, 6 correspond to output 2, intervals 2, 5 correspond to output 3, and any other observation yields a default output. Considering a specific observation $(x_1, x_2)$ in Fig. 4, all observations with $x_1$ as first coordinate can only be members of intervals 2, 3, or 6. All observations with $x_2$ as second coordinate can only be members of intervals 1 or 3. Thus $(x_1, x_2)$ fits into some interval in the set {2, 3, 6} and some interval in the set {1, 3}. The sole common interval is interval 3. Were this interval membership information pre-computed and stored, given an observation, one need only look up the sets and perform a set intersection to find the number of the fitting interval. And from the number of the fitting interval, one could obtain the value or action associated with this interval, thereby computing the effect of this operation on an observation.
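For comparison, the sequential search described above can be sketched in a few lines of Python. This is only an illustrative baseline: the interval endpoints in `INTERVALS` are hypothetical (they are not the entries of Table I), and `fits` and `sequential_filter` are names introduced here. The function also reports how many intervals were tested, to make the data-dependent operation count visible.

```python
from typing import List, Sequence, Tuple

# (lower endpoint a, upper endpoint b, output value y)
Interval = Tuple[Tuple[int, ...], Tuple[int, ...], int]


def fits(a: Sequence[int], b: Sequence[int], x: Sequence[int]) -> bool:
    """True if a <= x <= b component-wise, i.e., x fits into [a, b]."""
    return all(ai <= xi <= bi for ai, xi, bi in zip(a, x, b))


def sequential_filter(
    intervals: List[Interval], x: Sequence[int], default: int = 0
) -> Tuple[int, int]:
    """Return (output value, number of intervals tested)."""
    for tested, (a, b, y) in enumerate(intervals, start=1):
        if fits(a, b, x):
            return y, tested
    return default, len(intervals)


# Hypothetical disjoint intervals for a two-pixel window.
INTERVALS: List[Interval] = [
    ((0, 0), (49, 49), 1),
    ((50, 0), (99, 49), 2),
    ((0, 50), (49, 99), 3),
]

if __name__ == "__main__":
    for obs in [(10, 10), (10, 60), (200, 200)]:
        print(obs, sequential_filter(INTERVALS, obs))
```

An observation in the first interval is resolved after one test, while an observation outside every interval costs a test of the whole list; this variability is what makes the approach awkward to pipeline.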
Set intersections can be implemented by “ANDing” bit vectors where each bit vector represents a set. Put a “1” in bit vector entry $l$ if $l$ is in the set and “0” otherwise. For $x_1$, bit vector 00110010 is constructed (the bit positions are 0 through 7 from left to right) and bit vector 01010000 is constructed for $x_2$. To determine the interval into which $(x_1, x_2)$ fits, the two bit vectors are “ANDed” together to produce 00010000. The only nonzero entry is in position 3, which corresponds to interval 3.

Fig. 4. Illustration of intervals for a two-pixel window hit-or-miss transform (filter). Each hashed region represents a labeled interval, and the white background represents a default output value. Observations with first coordinate $x_1$ can be members of intervals 2, 3, or 6; observations with second coordinate $x_2$ can be members of intervals 1 or 3.

Table entries such as these are computed for each component in an observation vector. To determine whether a particular observation satisfies the logical test, look up each component value in its table and “AND” the results. If the result contains a “1,” the logical test is affirmative and the observation is in the corresponding interval; otherwise, the logical test fails.

Any windowed shift-invariant operator can be expressed as a set of intervals and corresponding output values. To instantiate the data for the architecture requires two steps. Step one is to enumerate the intervals, say from 1 to $n$, and build an array or lookup table of interval numbers and output values. Step two builds the bit vector table. For each interval $l$, $l = 1, \ldots, n$, build a column in a bit vector table. Let the signal or image at a sample take values in $G$. An eight-bit signal may be quantized to four high-order bits, in which case $G = \{0, \ldots, 15\}$. The rows of the bit vector table consist of all possible pixel values for each sample in the window and a bit vector indicating to which intervals it can belong. Fig. 5 illustrates the general architecture. Although the example was gray-scale, it is appropriate for computational mathematical morphology, i.e., any union of intervals in a lattice. A further advantage of this method is that the architecture is programmable. One can load indicator bit vectors that are appropriate for different applications, or even modify them dynamically, if desired. This architecture is adapted to increasing filters in a later section.
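The two instantiation steps and the lookup, AND, and first-nonzero-bit sequence can be sketched in software as follows. This is a minimal illustration under stated assumptions, not the hardware design: the intervals, output values, and the 16-level quantization in `INTERVALS`, `OUTPUTS`, and `LEVELS` are hypothetical, and each bit vector is stored as a Python integer in which bit $l$ (counted from the least significant end, unlike the left-to-right convention of the text) marks interval $l$.

```python
from typing import List, Sequence, Tuple

# An interval is a pair (lower endpoint a, upper endpoint b).
Interval = Tuple[Tuple[int, ...], Tuple[int, ...]]


def build_tables(intervals: List[Interval], levels: int) -> List[List[int]]:
    """tables[i][g] has bit l set iff a_l[i] <= g <= b_l[i]."""
    n_components = len(intervals[0][0])
    tables: List[List[int]] = []
    for i in range(n_components):
        row = []
        for g in range(levels):
            bits = 0
            for label, (a, b) in enumerate(intervals):
                if a[i] <= g <= b[i]:
                    bits |= 1 << label
            row.append(bits)
        tables.append(row)
    return tables


def apply_filter(
    tables: List[List[int]],
    outputs: List[int],
    x: Sequence[int],
    default: int = 0,
) -> int:
    """One table lookup per component, AND the results, decode the set bit."""
    acc = -1  # all ones
    for table, xi in zip(tables, x):
        acc &= table[xi]
    if acc == 0:
        return default  # observation lies in no labeled interval
    label = (acc & -acc).bit_length() - 1  # position of the lowest set bit
    return outputs[label]


# Hypothetical example: two-pixel window, 16 gray levels, disjoint intervals.
LEVELS = 16
INTERVALS: List[Interval] = [
    ((0, 0), (3, 3)),      # label 0
    ((4, 0), (7, 3)),      # label 1
    ((0, 4), (3, 7)),      # label 2
    ((8, 8), (15, 15)),    # label 3
]
OUTPUTS = [1, 2, 3, 1]     # output value for each label
TABLES = build_tables(INTERVALS, LEVELS)

if __name__ == "__main__":
    for obs in [(1, 2), (5, 1), (2, 6), (10, 12), (5, 5)]:
        print(obs, apply_filter(TABLES, OUTPUTS, obs))
```

Every observation costs exactly one lookup per window component, the same number of AND operations, and one decode, regardless of which interval (if any) it falls into. Storage grows with the number of intervals times the number of table rows.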


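Fig. 5. Architecture for nonincreasing filters. Each component of an observation $(x_1, \ldots, x_N)$ is used to look up a bit vector in a table. The resulting $N$ bit vectors are ANDed together. If there is a nonzero bit in the result, it is unique, and its position is used to look up an output value; otherwise a default value is returned.

VI. BIT VECTOR ARCHITECTURE THEORY

Now that we see how the architecture works, let us investigate it formally. Let $G$ be a discrete, finite totally ordered set (e.g., binary or gray-scale values) with order relation “$\le$.” Consider the product $G^N$, which is partially ordered by “$\le$” operating component-wise. Let $L = (G^N, \le)$ be this finite, distributive lattice, where it is understood that “$\le$” now refers to vectors in $G^N$. Consider a mapping $\Psi : L \to Y$, where $Y$ is a set of labels (natural numbers, without loss of generality). The range could be gray levels or labels. Lattice $L$ is partitioned by $\Psi$ into a union of pre-images, $L = \bigcup_{y \in Y} \Psi^{-1}(y)$. Further, each pre-image can be partitioned into a finite set of intervals, $\Psi^{-1}(y) = \bigcup_i [\mathbf{a}_{y,i}, \mathbf{b}_{y,i}]$. These intervals correspond to hyper-rectangles in the previous example. Interval endpoints are vectors in $L$: $\mathbf{a} = (a_1, \ldots, a_N)$ and $\mathbf{b} = (b_1, \ldots, b_N)$. For an observation $\mathbf{x}$, $\Psi(\mathbf{x}) = y$ such that $\mathbf{x} \in [\mathbf{a}_{y,i}, \mathbf{b}_{y,i}]$ for some $i$ (the CMM hit-or-miss transform). Let $C_i(g) = \{\mathbf{z} \in L : z_i = g\}$ ($C_i(g)$ is a generalized “cylinder”). If $\mathbf{x} = (x_1, \ldots, x_N)$, then $\{\mathbf{x}\} = \bigcap_{i=1}^{N} C_i(x_i)$.

We link intervals to output values by uniquely labeling each interval. Let $\lambda$ be a bijection from the set of disjoint intervals to a set of natural numbers $\{1, \ldots, n\}$. We abuse notation by, for a nonempty subset $A$ of $L$, letting $\lambda(A)$ be the set of labels of intervals to which points in $A$ belong. The “ANDing” operation in the architecture computes the intersection of cylinder labels because the label of the intersection of cylinders is the intersection of labels of cylinders.

Proposition 1: $\lambda\left(\bigcap_{i} C_i(x_i)\right) = \bigcap_{i} \lambda(C_i(x_i))$.

Proof: Consider a label $l$ and let $[\mathbf{a}, \mathbf{b}] = \lambda^{-1}(l)$. For each $i$, $l \in \lambda(C_i(x_i))$ if and only if $a_i \le x_i \le b_i$. Therefore, $l \in \bigcap_{i} \lambda(C_i(x_i))$ if and only if $\mathbf{a} \le \mathbf{x} \le \mathbf{b}$, that is, if and only if $l \in \lambda(\{\mathbf{x}\}) = \lambda\left(\bigcap_{i} C_i(x_i)\right)$.

The essence of the bit vector method is to pre-compute $\lambda(C_i(g))$ for each $i$ and $g$. Sets are represented as bit vectors: there is a “1” in position $l$ of bit vector table entry $T_i(g)$ if and only if $l \in \lambda(C_i(g))$. If $\phi$ is a function that returns the first index of a nonzero component of a binary vector, then $\lambda(\{\mathbf{x}\}) = \phi\left(\bigwedge_{i} T_i(x_i)\right)$. The sole remaining task is to associate a filter output value with a label $l$. This can be done simply with a function $\omega$ where $\omega(l) = y$ where $\lambda^{-1}(l) \subseteq \Psi^{-1}(y)$ for some $y$. Putting it all together, we have the following representation:

$$\Psi(\mathbf{x}) = \omega\left(\phi\left(\bigwedge_{i=1}^{N} T_i(x_i)\right)\right),$$

where $T_i(x_i)$ is the bit vector representing $\lambda(C_i(x_i))$. The key to efficient filter implementation is storing $T_i(g)$ for each $i$ and $g$. Also, it is worthwhile to minimize the size of the basis. Practical methods for “reduction” remain to be investigated.

Let $\Psi$ be an increasing operator with the previous setup except that the label set $Y$ now has a total order. Recall $K_y(\Psi) = \{\mathbf{x} : \Psi(\mathbf{x}) \ge y\}$ and $\Psi$ has representation [4]

$$\Psi(\mathbf{x}) = \max\{y : \mathbf{b} \le \mathbf{x} \text{ for some } \mathbf{b} \in K_y(\Psi)^{-}\}, \qquad (2)$$

where $K_y(\Psi)^{-}$ is the set of least elements of $K_y(\Psi)$. From (2), it is clear that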

Let . Proposition 2: Proof:

.
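A rough software sketch of the increasing variant of the architecture (Fig. 6) is given here. It rests on assumptions that are not spelled out in the text: basis elements are labeled in order of nondecreasing output value, bit $l$ of a table entry is set when the corresponding component of basis element $l$ does not exceed the pixel value, and the highest set bit of the ANDed result selects the output. The basis `BASIS`, the outputs `OUTPUTS`, and the function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Vector = Tuple[int, ...]


def build_increasing_tables(basis: List[Vector], levels: int) -> List[List[int]]:
    """tables[i][g] has bit l set iff basis[l][i] <= g (lower endpoints only)."""
    n_components = len(basis[0])
    return [
        [
            sum(1 << label for label, b in enumerate(basis) if b[i] <= g)
            for g in range(levels)
        ]
        for i in range(n_components)
    ]


def apply_increasing(
    tables: List[List[int]],
    outputs: List[int],
    x: Sequence[int],
    default: int = 0,
) -> int:
    """AND the looked-up bit vectors; the highest set bit gives the output."""
    acc = -1  # all ones
    for table, xi in zip(tables, x):
        acc &= table[xi]
    if acc == 0:
        return default  # no basis element lies below the observation
    return outputs[acc.bit_length() - 1]


# Hypothetical basis for a two-pixel window with 16 gray levels.
# Assumption: labels are sorted so that outputs are nondecreasing in the label.
LEVELS = 16
BASIS: List[Vector] = [(0, 0), (4, 8), (8, 4), (12, 12)]
OUTPUTS = [1, 2, 2, 4]
TABLES = build_increasing_tables(BASIS, LEVELS)

if __name__ == "__main__":
    for obs in [(1, 1), (9, 5), (5, 9), (15, 15)]:
        print(obs, apply_increasing(TABLES, OUTPUTS, obs))
```

After the AND, the set bits mark every basis element lying below the observation, so taking the highest set bit corresponds to taking the maximum associated output, in the spirit of representation (2).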

We thus have a new representation for the operator in terms of labels of cylinders. For hardware implementation, construct a bit vector for each cylinder, with a “1” in position $l$ if label $l$ belongs to the label set of that cylinder and “0” otherwise. If $\phi_{\max}$ is a function returning the maximum nonzero position of a bit vector, then the operator output is obtained by applying $\phi_{\max}$ to the AND of the bit vectors retrieved for the components of the observation. This is the representation that can be implemented in hardware as illustrated in Fig. 6.

Fig. 6. Architecture for increasing filters. Each component of an observation $(x_1, \ldots, x_N)$ is used to look up a bit vector in a table. The resulting $N$ bit vectors are ANDed together. The position of the first nonzero bit is used to look up an output value.

Memory requirements for these architectures (increasing and nonincreasing) are proportional to the size of the filter basis, which is application dependent. Not every filter benefits from this representation. In the worst case, each observation could form its own interval, in which case the number of intervals is $|G|^N$, where $|G|$ is the number of gray levels. A convolution is an example of a common filter that would be poorly implemented this way. But what is often required is a filter that operates locally, as hit-or-miss operators are designed to do, and these may be amenable to the bit vector architecture.

VII. APPLICATION TO APERTURE FILTERS

Hirata et al. describe a filtering approach that is amenable to the bit vector architecture [9]. Briefly, an aperture filter is a windowed operator that constrains observation values and the range in order to increase estimation precision. Consider an operator $\Psi : G^N \to G$ (where $G$ is the set of gray levels, $G = \{0, \ldots, 255\}$ without loss of generality). In a simple version of the approach, designate $x_c$ as the observation in the “center” of the observation window and define a transformation that offsets each observation value by $x_c$ and clips the result to a range $[-k, k]$. An aperture filter maps the transformed observation to a similarly constrained output. When estimating from signal realizations, this has the effect of increasing estimation precision at the cost of estimation accuracy (and resulting suboptimality). However, under conditions analyzed in [9] this tradeoff is worthwhile. In [9], aperture filters are estimated by transforming observations in this way ($k$ is a design parameter) and estimating using recursive partitioning [2]. In its simplest form (the axis-parallel case of [9]), recursive partitioning tessellates the $N$-dimensional space of transformed windowed observations into intervals in exactly the manner required for the bit vector architecture. A straightforward application of the bit vector architecture can implement aperture filters.

VIII. SUMMARY

CMM is an appropriate representation for nonlinear windowed operators in our printing image processing application. A new bit vector architecture was developed to reduce implementation cost and allow new operators to be programmed. In this architecture, each component of a windowed observation is used to index a lookup table of bit vectors. The retrieved bit vectors are “ANDed” together, and the position of the first nonzero bit of the result is used to look up the operator output value. This architecture is fully compatible with CMM and its complexity (memory) is a linear function of the product of the basis size (the number of bits in each bit vector) and the number of lookup tables (the number of pixels in the observation window). In our application, basis sizes up to a few hundred are economical in hardware. This work underscores the utility of CMM as a representation theory and provides a novel and useful improvement for its application.

REFERENCES

[1] G. J. F. Banon and J. Barrera, “Minimal representation of translation invariant set mappings by mathematical morphology,” SIAM J. Appl. Math., vol. 51, no. 6, 1991.
[2] L. Breiman, J. H. Friedman, R. A. Olshen, and C. J. Stone, Classification and Regression Trees. Belmont, MA: Wadsworth, 1984.
[3] E. R. Dougherty and D. Sinha, “Computational mathematical morphology,” Signal Process., vol. 38, pp. 21–29, 1994.
[4] E. R. Dougherty and D. Sinha, “Computational gray-scale mathematical morphology on lattices (a comparator-based image algebra) Part I: Architecture,” Real Time Imag., vol. 1, pp. 69–85, 1995.
[5] E. R. Dougherty and D. Sinha, “Computational gray-scale mathematical morphology on lattices (a comparator-based image algebra) Part II: Image operators,” Real Time Imag., vol. 1, pp. 283–295, 1995.
[6] E. R. Dougherty and J. Barrera, “Computational gray-scale operators,” in Nonlinear Filters for Image Processing, E. R. Dougherty and J. T. Astola, Eds. New York: SPIE/IEEE Press, 1999.
[7] A. Gasteratos and I. Andreadis, “Non-linear image processing in hardware,” Pattern Recognit., vol. 33, pp. 1013–1021, 2000.
[8] J. C. Handley, “Architecture for computational mathematical morphology,” Proc. SPIE, pp. 67–74, 2001.
[9] R. Hirata, Jr., E. R. Dougherty, and J. Barrera, “Aperture filters,” Signal Process., vol. 80, pp. 697–721, 2000.
[10] T. Ikenaga and T. Ogura, “Real-time morphology processing using highly parallel 2-D cellular automata CAM,” IEEE Trans. Image Processing, vol. 9, pp. 2018–2025, Dec. 2000.
[11] R. P. Loce and E. R. Dougherty, Enhancement and Restoration of Digital Documents: Statistical Design of Nonlinear Algorithms. Bellingham, WA: SPIE, 1997.
[12] G. Matheron, Random Sets and Integral Geometry. New York: Wiley, 1975.

John C. Handley (S’92–M’95) received the B.S. and M.S. degrees in mathematics from Ohio State University, Columbus, in 1978 and 1981, respectively, and the Ph.D. degree in imaging science from Rochester Institute of Technology, Rochester, NY, in 1996. He has been a Member of the Research and Technical Staff at Xerox Corporation, Webster, NY, since 1995. From 1984 to 1991, he was a Research Scientist and Programmer at Online Computer Library Center in Dublin, OH. He is the author or co-author of 38 conference and journal papers in applied statistics, document recognition, and random sets. He holds five U.S. patents. He was a co-editor of a special section on psychometric statistical procedures for the Journal of Electronic Imaging. His research interests include nonlinear image processing, pattern recognition, statistical analysis of ranked data, document understanding, and morphometrics. Dr. Handley is a member of SPIE and ASA.