Comparison of color demosaicing methods


Olivier Losson∗, Ludovic Macaire, Yanqin Yang

Laboratoire LAGIS UMR CNRS 8146 – Bâtiment P2, Université Lille 1 – Sciences et Technologies, 59655 Villeneuve d'Ascq Cedex, France

∗Corresponding author. Email addresses: [email protected] (O. Losson), [email protected] (L. Macaire), [email protected] (Y. Yang)

Published in Advances in Imaging and Electron Physics, 162, pp. 173–265, Elsevier, 2010 (HAL Id: hal-00683233, https://hal.archives-ouvertes.fr/hal-00683233).

Keywords: Demosaicing, Color image, Quality evaluation, Comparison criteria

1. Introduction

Today, the majority of color cameras are equipped with a single CCD (Charge-Coupled Device) sensor. The surface of such a sensor is covered with a color filter array (CFA), which consists of a mosaic of spectrally selective filters, so that each CCD element samples only one of the three color components Red (R), Green (G) or Blue (B). The Bayer CFA is the most widely used one; it provides a CFA image in which each pixel is characterized by a single color component. To estimate the color (R,G,B) of each pixel in a true color image, one has to determine the values of the two missing color components at each pixel of the CFA image. This process is commonly referred to as CFA demosaicing, and its result as the demosaiced image. In this paper, we propose to compare the performances reached by demosaicing methods by means of specific quality criteria.

An introduction to the demosaicing issue is given in section 2. Besides explaining why this process is required, we propose a general formalism for it. Then, two basic schemes are presented, from which the main principles that should be fulfilled in demosaicing are derived. In section 3, we detail the recently published demosaicing schemes, which fall into two main groups: the spatial methods, which analyze the image plane, and the methods which examine the frequency domain. The spatial methods exploit assumptions about either spatial or spectral correlation between the colors of neighboring pixels. The frequency-selection methods apply specific filters to the CFA image to retrieve the color image. Since these methods intend to produce "perceptually satisfying" demosaiced images, the most widely used evaluation criteria, detailed in section 4, are based on the fidelity to the original images. Generally, the Mean Square Error (MSE) and the Peak Signal-to-Noise Ratio (PSNR) are used to measure the fidelity between the demosaiced image and the original one. The PSNR criterion cannot distinguish the case where a high number of pixels bear slight estimation errors from the case where only a few pixels have been interpolated with severe demosaicing artifacts.



However, the latter case would more significantly affect the quality of a low-level analysis applied to the estimated image. We therefore propose new criteria especially designed to determine the most effective demosaicing method for subsequent feature extraction. The performances of the demosaicing methods are compared in section 5 using the presented measurements. For this purpose, the demosaicing schemes are applied to twelve images of the benchmark Kodak database.


2. Color Demosaicing

Digital images and videos are nowadays a preeminent medium for environment perception. They are today almost always captured directly by a digital (still) camera, rather than digitized from a video signal provided by an analog camera as they used to be several years ago. Acquisition techniques of color images in particular have motivated much research work and undergone many changes. Despite major advances, mass-market color cameras still often use a single sensor and require subsequent processing to deliver color images. This procedure, named demosaicing, is the key point of our study and is introduced in the present section. The demosaicing issue is first presented in detail, and a formalism is introduced for it.

2.1. Introduction to the Demosaicing Issue

The demosaicing issue is introduced here from technological considerations. Two main types of color digital cameras are found on the market, depending on whether they embed three sensors or a single one. Usually known as mono-CCD cameras, the latter are equipped with spectrally sensitive filters arranged according to a particular pattern. From such color filter arrays (CFA), an intermediate gray-scale image is formed, which then has to be demosaiced into a true color image. The first subsection compares the major implementations of the three-CCD and mono-CCD technologies. The main types of color filter arrays released by the various manufacturers are then presented. The most widespread CFA, proposed by Bayer from Kodak in 1976, is considered in the following, not only to formalize demosaicing but also to introduce a pioneer method using bilinear interpolation. This basic scheme generates many color artifacts, which are analyzed to derive two main demosaicing rules. Spectral correlation is one of them, and will be detailed in the last subsection. The second one, spatial correlation, is at the heart of the edge-adaptive demosaicing methods presented in the next section.

2.1.1. Mono-CCD vs. Three-CCD Color Cameras

Digital area scan cameras are devices able to convert color stimuli from the observed scene into a color digital image (or image sequence) thanks to photosensors. Such an output image is spatially digitized, being formed of picture elements (pixels). With each pixel is generally associated a single photosensor element, which captures the incident light intensity of the color stimulus.

A digital color image I can be represented as a matrix of pixels, each of them being denoted as P(x,y), where x and y are the spatial coordinates of pixel P within the image plane of size X × Y, hence (x,y) ∈ ℕ² with 0 ≤ x ≤ X − 1 and 0 ≤ y ≤ Y − 1. With each pixel P is associated a color point, denoted as I(x,y) or I_{x,y}. This color point is defined in the RGB three-dimensional color space by its three coordinates I^k_{x,y}, k ∈ {R,G,B}, which represent the levels of the trichromatic components of the corresponding color stimulus. The color image I may also be split into three component planes or images I^k, k ∈ {R,G,B}. In each component image I^k, the pixel P is characterized by the level I^k(P) of the single color component k. Thus, three component images I^R, I^G and I^B must be acquired in order to form any digital color image.
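For the sketches included in this paper, an array convention can make these notations concrete. The following lines are an illustrative assumption of ours, not part of the original formalism:

```python
import numpy as np

# Illustrative convention (ours): a color image I of size X x Y is stored as
# a (Y, X, 3) array, so that I[y, x] is the color point I_{x,y} and
# I[..., k] is the component plane I^k, k in {R, G, B}.
I = np.zeros((480, 640, 3), dtype=np.uint8)   # Y = 480 rows, X = 640 columns
I_R, I_G, I_B = I[..., 0], I[..., 1], I[..., 2]
```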

FIG. 1: Three-CCD technology. (a) Beam splitting by a trichroic prism assembly. (b) Relative spectral sensitivity of the Kodak KLI2113 sensor. (c) CIE 1931 RGB color matching functions; [Rc], [Gc] and [Bc] are the monochromatic primary colors.

The two main technology families available for the design of digital camera photosensors are CCD (Charge-Coupled Device) and CMOS (Complementary Metal-Oxide Semiconductor), the former being the most widespread today. The CCD technology uses the photoelectric effect of the silicon substrate, whereas CMOS is based on a photodetector and an active amplifier. Both photosensors convert the intensity of the light reaching each pixel into a proportional voltage. Additional circuits then convert this analog voltage signal into digital data. For illustration and explanation purposes, the following text refers to the CCD technology.

The various digital color cameras available on the market may also be distinguished according to whether they incorporate a single sensor or three. In accordance with the trichromatic theory, three-CCD technology incorporates three CCD sensors, each one dedicated to a specific primary color. In most devices, the color stimulus from the observed scene is split onto the three sensors by means of a trichroic prism assembly, made of two dichroic prisms (see figure 1a) (Lyon, 2000). Alternately, the incident beam may be dispatched onto three sensors, each one covered with a spectrally selective filter. The three component images I^R, I^G and I^B are simultaneously acquired by the three CCD sensors, and their combination leads to the final color image. Each digital three-CCD camera is characterized by its own spectral sensitivity functions R(λ), G(λ) and B(λ) (see figure 1b for an example), which differ from the CIE color matching functions of the standard observer (see figure 1c).

FIG. 2: Foveon X3 technology. (a) Wavelength absorption within the Foveon X3 sensor. (b) Relative spectral sensitivity of the Foveon X3 sensor endowed with an infrared filter (Lyon and Hubel, 2002).

Since 2005, Foveon Inc. has been developing the X3 sensor, which uses a multilayer CMOS technology. This sensor is based on three superimposed layers of photosites embedded in a silicon substrate. It takes advantage of the fact that light of different wavelengths penetrates silicon to different depths (see figure 2a) (Lyon and Hubel, 2002). Each layer hence captures one of the three primary colors, namely blue, green and red, in the light incidence order. The three photosites associated with each pixel thus provide signals from which the three component values are derived.

Any camera equipped with this sensor is able to form a true color image from three full component images, as three-CCD cameras do. This sensor was first used commercially in 2007, in the Sigma SD14 digital still camera. According to its manufacturer, its spectral sensitivity (see figure 2b) better fits the CIE color matching functions than that of three-CCD cameras, providing images that are more consistent with human perception.

Although the three-CCD and Foveon technologies yield high-quality images, the manufacturing costs of the sensor itself and of the optical device are high. As a consequence, cameras equipped with such sensors have so far been neither affordable to everyone nor widely distributed. In order to overcome these cost constraints, a technology using a single sensor has been developed. The solution suggested by Bayer from the Kodak company in 1976 (Bayer, 1976) is still the most widely used in commercial digital cameras today. It uses a CCD or CMOS sensor covered with a filter (Color Filter Array, or CFA) designed as a mosaic of spectrally selective color filters, each of them sensitive to a specific wavelength range. At each element of the CCD sensor, only one out of the three color components is sampled: Red (R), Green (G) or Blue (B) (see figure 3a). Consequently, only one color component is available at each pixel of the image provided by the CCD charge transfer circuitry. This image is often referred to as the raw image, but the term CFA image is preferred hereafter in our specific context. In order to obtain a color image from the latter, the two missing levels must be estimated at each pixel thanks to a demosaicing algorithm (sometimes spelled demosaicking).

As shown in figure 3b, many other processing tasks are classically achieved within a mono-CCD color camera (Lukac and Plataniotis, 2007). They consist, for instance, in raw sensor data correction or, after demosaicing, in color improvement, image sharpening and noise reduction, so as to provide a "visually pleasing" color image to the user. These processing tasks are essential to the quality of the provided image and, as a matter of fact, discriminate the various models of digital cameras, since manufacturers and models of sensors are not so numerous. The underlying algorithms share common features or bases, and parameter tuning is often a key step leading to more or fewer residual errors. Together with the noise characteristics of the imaging sensor, such artifacts may incidentally be used to typify each camera model (Bayram et al., 2008).

2.1.2. Color Filter Arrays

Several configurations may be considered for the CFA, and figure 4 shows some examples found in the literature. A few mono-CCD cameras use a CFA based on complementary color components (Cyan, Magenta and Yellow), with a 2 × 2 pattern which sometimes also includes a filter sensitive to green light. But the very large majority of cameras are equipped with filter arrays based on the R, G and B primary color components. Regardless of their arrangement and design, these arrays often include twice as many filters sensitive to the green primary as filters sensitive to blue or red light. This stems from Bayer's observation that the human eye has a greater resolving power for green light. Moreover, the photopic luminous efficiency function of the human retina – also known as the luminosity function – is similar to the CIE 1931 green matching function Gc(λ), with a maximum reached in the same spectral domain. Bayer therefore makes the assumption that green photosensors capture luminance, whereas red and blue ones capture chrominance, and suggests filling the CFA with more luminance-sensitive (green) elements than chrominance-sensitive (red and blue) elements (see figure 4b).

FIG. 3: Internal structure of a mono-CCD color camera. (a) Mono-CCD technology outline, using the Bayer Color Filter Array (CFA). (b) Image acquisition within a mono-CCD color camera (detailed schema): pre-processing of the CFA data (defective pixel correction, linearization, dark current compensation, white balance), demosaicing, and post-processing of the estimated image (color correction, sharpening and noise reduction, digital zoom, image compression, EXIF file formatting) before storage; dotted steps are optional.


FIG. 4: Configuration examples for the mosaic of color filters: (a) vertical stripes; (b) Bayer; (c) pseudo-random; (d) complementary colors; (e) "panchromatic" CFA2.0 (Kodak); (f) "Burtoni" CFA. Each square depicts a pixel in the CFA image, and its color is that of the monochromatic filter covering the associated photosite.

The CFA using alternating vertical stripes of the RGB primaries (see figure 4a) was released first, since it is well suited to the interlaced television video signal. Nevertheless, considering the Nyquist limits for the green component plane, Parulski (1985) shows that the Bayer CFA has a larger bandwidth than the latter for horizontal spatial frequencies. The pseudo-random filter array (see figure 4c) has been inspired by the physiology of the human eye, in an attempt to reproduce the spatial repartition of the three cone cell types on the retina surface (Lukac and Plataniotis, 2005a). Its irregularity achieves a compromise between the sensitivity to spatial variations of luminance in the observed scene (visual acuity) and the ability to perceive thin objects with different colors (Roorda et al., 2001). Indeed, optimal visual acuity would require photosensors with identical spectral sensitivities that are constant over the spectrum, whereas the perception of thin color objects is better ensured by a sufficient local density of the different cone types. Although pseudo-random color filter arrays show interesting properties (Alleysson et al., 2008), their design and exploitation have not been much investigated so far; for some discussions, see e.g. Condat (2009) or Savard (2007) about CFA design, and Zapryanov and Nikolova (2009) about the demosaicing of "pseudo-random" variations of the Bayer CFA. Among other studies drawing their inspiration from natural physiology for CFA design, Kröger's work (2004) yields a new mosaic which mimics the retina of a cichlid fish, Astatotilapia burtoni (Günther, 1894). It is shown that this particular arrangement (see figure 4f), which includes many spatial frequencies and different geometries for the color components, generates weak aliasing artifacts. This complex mosaic configuration efficiently enhances the simulated image quality (Medjeldi et al., 2009), but the effective implementation of such a sensor, and the demosaicing of the corresponding CFA images, are open and challenging problems.

Color filter arrays based on complementary primary colors have also been designed and used, with two main advantages. First, they have a higher spectral sensitivity and a wider bandwidth than RGB filters, which is of particular interest in noisy environments and/or when the frame rate imposes a low integration period (Hirakawa, 2008). Figure 5 shows the spectral sensitivity of the JAI CV-S3300P camera sensor, equipped with the CFA of figure 4d. A few years ago, some professional still cameras used complementary color filter arrays to ensure a high ISO sensitivity, such as the Kodak DCS-620x model equipped with a CMY filter (Noble, 2000). As a second advantage, these CFAs make the generation of the television luminance/chroma video signal almost immediate, and they are sometimes embedded in PAL or NTSC color video cameras (Sony Corporation, 2000). Their usage is however largely restricted to television, since the strong mutual overlapping of the C, M and Y spectral sensitivity functions makes the conversion into R, G and B primaries unsatisfactory.

FIG. 5: Relative spectral sensitivity of the JAI CV-S3300P camera sensor (Jai Corporation, 2000).

New types of CFA have recently been released and used in camera models from two major manufacturers. Since 1999, Fuji has been developing the so-called Super CCD sensor, based on photosites arranged in a 45-degree oriented honeycomb lattice (see figure 6). The HR version of 2003 (see figure 6a) optimizes the occupancy of the CCD surface, hence potentially captures more light. "Square" pixels are obtained from the octagonal photosites by partly combining the four neighbors, so that new pixels are created and the resolution is doubled. An alternative version of this sensor (SR, see figure 6b) has an expanded dynamic range, incorporating both high-sensitivity large photodiodes ("S-pixels") used to capture normal and dark details, and smaller "R-pixels" sensitive to bright details. The EXR version (see figure 6d) takes advantage of the same idea, but extra efforts have been devoted to noise reduction thanks to pixel binning, resulting in a new CFA arrangement and its exploitation by pixel coupling. As this is a proprietary technology, little technical detail is available on how Super CCD sensors turn the image into a horizontal/vertical grid without interpolating, or on how the demosaicing associated with such sensors is achieved. A few hints may however be found in a patent using a similar imaging device (Kuno and Sugiura, 2006).

FIG. 6: Super CCD technology: (a) Super CCD HR (2003); (b) Super CCD SR (2003); (c) Super CCD SRII (2004); (d) Super CCD EXR (2008). For clarity's sake, photosites are represented further apart from each other than at their actual location.

In 2007, Kodak developed new filter arrays (Hamilton and Compton, 2007) as another alternative to the widely used Bayer CFA. The basic principle of this so-called CFA2.0 family of color filters is to incorporate transparent filter elements (represented as white squares in figure 4e), these filters being hence also known as RGBW or "panchromatic" ones. This property makes the underlying photosites sensitive to all wavelengths of visible light. As a whole, the sensors associated with CFA2.0 are therefore more sensitive to low-energy stimuli than those using the Bayer CFA. Such an increase of global sensitivity leads to a better luminance estimation, but at the expense of the chromatic information estimation. Figure 7 shows the processing steps required to estimate a full color image from the data provided by a CFA2.0-based sensor.

By modifying the CFA arrangement, manufacturers primarily aim at increasing the spectral sensitivity of the sensor. Lukac and Plataniotis (2005a) tackled the CFA design issue by studying the influence of the CFA configuration on demosaicing results. They considered ten different RGB color filter arrays, three of them being shown in figures 4a to 4c. A CFA image is first simulated by sampling one out of the three color components at each pixel of an original color image, according to the considered CFA pattern. A universal demosaicing framework is then applied to obtain a full-color image. The quality of the demosaiced image is finally evaluated by comparing it to the original image according to several objective error criteria. The authors conclude that the CFA design is critical to the demosaicing quality, but cannot recommend a CFA that would yield the best results in all cases. Indeed, the relative performance of the filters is highly dependent on the tested image.



FIG. 7: Processing steps of the raw image provided by a CFA2.0-based sensor. "Panchromatic pixels" are those associated with photosites covered with transparent filters.

All in all, the Bayer CFA achieves a good compromise between horizontal and vertical resolutions, and between luminance and chrominance sensitivities; it therefore remains the favorite CFA in industrial applications. As this CFA is the most commonly used and has inspired some more recent ones, it is considered first and foremost in the following text. The demosaicing methods presented hereafter notably assume the Bayer CFA.

2.1.3. Demosaicing Formalization

Estimated colors are less faithful to the color stimuli of the observed scene than those provided by a three-CCD camera. Improving the quality of color images acquired by mono-CCD cameras is still a highly relevant topic, investigated by researchers and engineers (Lukac, 2008). In this paper, we focus on the demosaicing step and examine its influence on the estimated image quality.

In order to set a formalism for the demosaicing process, let us compare the acquisition of a color image in a three-CCD camera and in a mono-CCD camera. Figure 8a outlines a three-CCD camera architecture, in which the color image of a scene is formed by combining the data from three sensors. The resulting color image I is composed of three color component planes I^k, k ∈ {R,G,B}. In each plane I^k, a given pixel P is characterized by the level of the color component k. A three-component vector defined as I_{x,y} ≜ (R_{x,y}, G_{x,y}, B_{x,y}) is therefore associated with each pixel located at spatial coordinates (x,y) in image I. In a color mono-CCD camera, the color image generation is quite different, as shown in figure 8b: the single sensor delivers a raw image, hereafter called the CFA image and denoted I^{CFA}.



FIG. 8: Color image acquisition outline, according to the camera type: (a) three-CCD camera; (b) mono-CCD color camera.

If the Bayer CFA is considered, each pixel with coordinates (x,y) in image I^{CFA} is associated with a single color component R, G or B (see figure 9):

$$I^{CFA}_{x,y} = \begin{cases} R_{x,y} & \text{if } x \text{ is odd and } y \text{ is even,} \quad \text{(1a)}\\ B_{x,y} & \text{if } x \text{ is even and } y \text{ is odd,} \quad \text{(1b)}\\ G_{x,y} & \text{otherwise.} \quad \text{(1c)} \end{cases}$$

The color component levels range from 0 to 255 when they are quantized with 8 bits. The demosaicing scheme F, most often implemented as an interpolation procedure, consists in estimating a color image Î from I^{CFA}. At each pixel of the estimated image, the color component available in I^{CFA} at the same location is picked up, whereas the other two components are estimated:

$$I^{CFA}_{x,y} \overset{F}{\longrightarrow} \hat{I}_{x,y} = \begin{cases} (R_{x,y},\,\hat{G}_{x,y},\,\hat{B}_{x,y}) & \text{if } x \text{ is odd and } y \text{ is even,} \quad \text{(2a)}\\ (\hat{R}_{x,y},\,\hat{G}_{x,y},\,B_{x,y}) & \text{if } x \text{ is even and } y \text{ is odd,} \quad \text{(2b)}\\ (\hat{R}_{x,y},\,G_{x,y},\,\hat{B}_{x,y}) & \text{otherwise.} \quad \text{(2c)} \end{cases}$$

Each triplet in equations (2) stands for a color, whose component available at pixel P(x,y) in I^{CFA} is denoted R_{x,y}, G_{x,y} or B_{x,y}, and whose other two components among R̂_{x,y}, Ĝ_{x,y} and B̂_{x,y} are estimated for Î_{x,y}. Before we get to the heart of the matter, let us specify a few notations that will be most useful later in this section. In the CFA image (see figure 9), four different structures are encountered for the 3 × 3 spatial neighborhood, as shown in figure 10.
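As an illustration, the sampling of equation (1) is easy to simulate in NumPy. The function below is a minimal sketch under our own array convention (G and R on the first row, as in figure 9); its name is an assumption of ours:

```python
import numpy as np

def simulate_bayer_cfa(image):
    """Sample one color component per pixel according to equation (1).

    `image` is a (Y, X, 3) array holding the R, G and B planes; the result is
    the (Y, X) CFA image. As in the paper, x is the column index and y the
    row index, with G and R levels available on the first row (figure 9).
    """
    Y, X, _ = image.shape
    x = np.arange(X)[np.newaxis, :]          # column coordinates
    y = np.arange(Y)[:, np.newaxis]          # row coordinates
    r_mask = (x % 2 == 1) & (y % 2 == 0)     # (1a): x odd, y even -> R
    b_mask = (x % 2 == 0) & (y % 2 == 1)     # (1b): x even, y odd -> B
    cfa = image[..., 1].astype(float)        # (1c): G otherwise
    cfa[r_mask] = image[..., 0][r_mask]
    cfa[b_mask] = image[..., 2][b_mask]
    return cfa
```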

FIG. 9: CFA image from the Bayer filter. Each pixel is artificially colorized with the corresponding filter's main spectral sensitivity; the presented arrangement is the most frequently encountered in the literature (i.e., G and R levels available for the first two row pixels).

FIG. 10: 3 × 3 neighborhood structures of pixels in the CFA image: (a) {GRG}; (b) {GBG}; (c) {RGR}; (d) {BGB}.

For each of these structures, the pixel under consideration for demosaicing is the central one, at which the two missing color components should be estimated thanks to the available components and their levels at the neighboring pixels. Let us denote the aforementioned structures by the color components available on the middle row, namely {GRG}, {GBG}, {RGR} and {BGB}. Notice that {GRG} and {GBG} are structurally similar, apart from the slight difference that components R and B are exchanged. They can therefore be analyzed in the same way, as can the {RGR} and {BGB} structures. A generic notation is hence used in the following: the center pixel is considered to have (0,0) spatial coordinates, and its neighbors are referred to by their relative coordinates (δx,δy). Whenever this notation bears no ambiguity, the (0,0) coordinates are omitted. Moreover, we also sometimes use a letter (e.g. P) to generically refer to a pixel, its color components being then denoted as R(P), G(P) and B(P). The notation P(δx,δy) refers to a pixel by its relative coordinates, its color components being then denoted R_{δx,δy}, G_{δx,δy} and B_{δx,δy}, as in figure 10.

2.1.4. Demosaicing Evaluation Outline

The objective of demosaicing is to generate an estimated color image Î as close as possible to the original image I. Even though this image is effectively unavailable, I is generally used as a reference to evaluate the demosaicing quality. One then strives to obtain either as low a value as possible for an error criterion, or as high a value as possible for a quality criterion comparing the estimated image with the original one. A classical evaluation procedure for the demosaicing result quality consists in (see figure 11):

1. simulating a CFA image, as provided by a mono-CCD camera, from an original color image provided by a three-CCD camera; this is achieved by sampling a single color component R, G or B at each pixel, according to the considered CFA arrangement (the Bayer CFA of figure 9, in our case);
2. demosaicing this CFA image to obtain an estimated color image;
3. comparing the original and estimated color images, so as to highlight the artifacts affecting the latter.

There is no general agreement on the definition of demosaicing quality, which highly depends on how the estimated color image is exploited – as will be detailed in the next sections. At first, we will rely on visual examination, or else on the most widely used quantitative criterion (the signal-to-noise ratio), for quality evaluation; both require a reference image. As in most works related to demosaicing, we use the Kodak image database (Kodak, 1991) as a benchmark for the performance comparison of the various methods, as well as for illustration purposes. More precisely, to avoid overloaded results, a representative subset of twelve of these images – the set most used in the literature – has been selected. These natural images contain rich colors and textured regions, and are fully reproduced in figure 37 so that they can be referred to in the text.



FIG. 11: Classical evaluation procedure for the demosaicing result quality (example of bilinear interpolation on an extract from the Kodak benchmark image "Lighthouse"): 1. color sampling; 2. demosaicing; 3. comparison according to criteria.

2.2. Basic Schemes and Demosaicing Rules

2.2.1. Bilinear Interpolation

The first solutions for demosaicing were proposed in the early eighties. They process each component plane separately and find the missing levels by applying linear interpolation to the available ones, in both main directions of the image plane. Such bilinear interpolation is traditionally used to resize gray-level images (Gribbon and Bailey, 2004). Considering the {GRG} structure, the missing blue and green values at the center pixel are respectively estimated by bilinear interpolation through the following equations:

$$\hat{B} = \frac{1}{4}\left(B_{-1,-1} + B_{1,-1} + B_{-1,1} + B_{1,1}\right), \quad (3)$$

$$\hat{G} = \frac{1}{4}\left(G_{0,-1} + G_{-1,0} + G_{1,0} + G_{0,1}\right). \quad (4)$$

As for the {RGR} structure, the missing red and blue component levels are estimated as follows:

$$\hat{R} = \frac{1}{2}\left(R_{-1,0} + R_{1,0}\right), \quad (5)$$

$$\hat{B} = \frac{1}{2}\left(B_{0,-1} + B_{0,1}\right). \quad (6)$$

Alleysson et al. (2008) notice that such interpolation is achievable by convolution. For that purpose, consider the three planes formed of the sole levels of component k, k ∈ {R,G,B}, available in the CFA image, the other component levels being set to zero. Let us denote ϕ^k(I) the function sampling a gray-level image I according to the locations where color component k is available in the CFA:

$$\varphi^k(I)(x,y) = \begin{cases} I(x,y) & \text{if component } k \text{ is available at pixel } P(x,y) \text{ in } I^{CFA},\\ 0 & \text{otherwise.} \end{cases} \quad (7)$$

FIG. 12: Definition of the planes ϕ^k(I^{CFA}) by sampling the CFA image according to each color component k, k ∈ {R,G,B}. The CFA image and the planes ϕ^k(I^{CFA}) are here colorized for illustration's sake.

Figure 12 illustrates the special cases of the planes ϕ^k(I^{CFA}) obtained by applying the functions ϕ^k to I^{CFA}. Let us also consider the convolution filters defined by the following kernels:

$$H^R = H^B = \frac{1}{4}\begin{bmatrix} 1 & 2 & 1\\ 2 & 4 & 2\\ 1 & 2 & 1 \end{bmatrix} \quad (8) \qquad \text{and} \qquad H^G = \frac{1}{4}\begin{bmatrix} 0 & 1 & 0\\ 1 & 4 & 1\\ 0 & 1 & 0 \end{bmatrix}. \quad (9)$$

In order to determine the color image Î, each color component plane Î^k can now be estimated by applying the convolution filter of kernel H^k to the plane ϕ^k(I^{CFA}):

$$\hat{I}^k = H^k \ast \varphi^k\!\left(I^{CFA}\right), \quad k \in \{R,G,B\}. \quad (10)$$
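Equations (7)–(10) translate almost directly into code. The following sketch assumes SciPy's ndimage.convolve for the filtering (mirror border handling is our arbitrary choice) and reuses the array convention of the simulate_bayer_cfa sketch above:

```python
import numpy as np
from scipy.ndimage import convolve

# Kernels of equations (8) and (9).
H_RB = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]]) / 4.0
H_G = np.array([[0, 1, 0], [1, 4, 1], [0, 1, 0]]) / 4.0

def bilinear_demosaice(cfa):
    """Bilinear demosaicing as the convolution of equation (10).

    `cfa` is a (Y, X) CFA image laid out as in figure 9 (G and R on the first
    row). Each plane phi^k of equation (7) keeps the available k samples and
    zeroes the others; convolving it with H^k fills in the missing levels.
    """
    Y, X = cfa.shape
    x = np.arange(X)[np.newaxis, :]
    y = np.arange(Y)[:, np.newaxis]
    masks = {"R": (x % 2 == 1) & (y % 2 == 0),
             "B": (x % 2 == 0) & (y % 2 == 1)}
    masks["G"] = ~(masks["R"] | masks["B"])
    kernels = {"R": H_RB, "G": H_G, "B": H_RB}
    planes = [convolve(np.where(masks[k], cfa.astype(float), 0.0),
                       kernels[k], mode="mirror")          # equation (10)
              for k in "RGB"]
    return np.stack(planes, axis=-1)
```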

Bilinear interpolation is easy to implement and not time-consuming, but it generates severe visible artifacts, as also shown in figure 11. The above scheme provides satisfying results in image areas with homogeneous colors, but many false colors in areas with high spatial frequencies – as for the fence bars in this extract. Following Chang and Tan (2006), a deeper study of the causes of these artifacts can be achieved by simulating their generation on a synthetic image (see figure 13a). In this original image, two homogeneous areas are separated by a vertical transition, which recreates the boundary between two real objects with different gray levels. At each pixel, the levels of all three color components are then equal. The levels of the pixels depicting the darker left object (labeled b) are lower than those of the pixels depicting the lighter right object (labeled h). Figure 13b shows the CFA image I^{CFA} yielded by sampling a single color component per pixel according to the Bayer CFA.
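This synthetic experiment can be reproduced with the helpers sketched above (the values of b and h are arbitrary here):

```python
import numpy as np

# Synthetic vertical transition of figure 13a: all three components equal the
# dark level b on the left and the light level h on the right.
b, h = 60.0, 200.0
img = np.full((6, 10, 3), h)
img[:, :5, :] = b
estimated = bilinear_demosaice(simulate_bayer_cfa(img))
# As analyzed below, the R and B planes get a column of intermediate levels
# (b + h)/2 near the transition, while the G plane alternates between
# (3b + h)/4 and (3h + b)/4: false colors plus the "zipper effect".
```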



FIG. 13: Demosaicing by bilinear interpolation of a gray-level image with a vertical transition: (a) original image; (b) CFA image; (c) estimated image; (d) R̂ plane; (e) Ĝ plane; (f) B̂ plane. The CFA image and the R̂, Ĝ and B̂ planes are here colorized for illustration's sake.

The result of the bilinear interpolation demosaicing applied to this image is given in figure 13c. Figures 13d to 13f give details on the three estimated color planes R̂, Ĝ and B̂. On the R̂ and B̂ planes, this demosaicing algorithm generates a column of intermediate-level pixels, whose value is the average of the two object levels. On the green plane, it produces a jagged pattern on both sides of the edge, formed of pixels alternating between two intermediate levels – a low one (3b + h)/4 and a high one (3h + b)/4. As a whole, the edge area is formed of a square 2 × 2 pattern of four different colors repeated along the transition (see the estimated image in figure 13c). This demosaicing procedure has hence generated two types of artifacts: erroneously estimated colors (hereafter referred to as "false colors"), and an artificial jagged pattern (the so-called "zipper effect"), which are both studied in section 4.2. Depending on the horizontal location of the transition relative to the CFA mosaic, the generated pattern may be either orange-colored as in figure 13c or bluish as in figure 14c. These two dominant-color patterns may actually be observed in the estimated image of figure 11.

2.2.2. Main Demosaicing Rules

Let us examine the component-wise profiles of the middle pixel row in the original image 13a and its corresponding estimated image 13c. Dissimilarities between these profiles on the R, G and B planes are underlined in figure 15: the transition occurs at identical horizontal locations on the three original image planes, but this is no longer the case for the estimated image. Such inconsistency among the demosaicing results for the different components generates false colors in the estimated image formed from their combination.


FIG. 14: Variant version of image 13a, likewise demosaiced by bilinear interpolation: (a) reference image; (b) CFA image; (c) estimated image.

It can also be noticed that the transition corresponds, in each color plane of the original image, to a local change of homogeneity along the horizontal direction. Bilinear interpolation averages the levels of pixels located on both sides of the transition, which makes the latter less sharp. In accordance with the previous observations, we can state that two main rules have to be enforced to improve demosaicing results: spectral correlation and spatial correlation.

– Spectral correlation. The transition profiles plotted in figure 15 are identical for the original image component planes, which conveys a strict correlation between components. For a natural image, Gunturk et al. (2002) show that the three color components are also strongly correlated. The authors apply a bidimensional filter built on a low-pass filter h0 = [1 2 1]/4 and a high-pass one h1 = [1 −2 1]/4, so as to split each color component plane into four subbands resulting from row and column filtering: (LL) both rows and columns are low-pass filtered; (LH) rows are low-pass and columns high-pass filtered; (HL) rows are high-pass and columns low-pass filtered; (HH) both rows and columns are high-pass filtered. For each color component, four subband planes are obtained in this way, respectively representing data in rather homogeneous areas (low-frequency information), horizontal detail (high-frequency information in the horizontal direction), vertical detail (high-frequency information in the vertical direction) and diagonal detail (high-frequency information in both main directions). The authors then compute a correlation coefficient r^{R,G} between the red and green components over each subband, according to the following formula:

$$r^{R,G} = \frac{\sum_{x=0}^{X-1}\sum_{y=0}^{Y-1}\left(R_{x,y}-\mu^R\right)\left(G_{x,y}-\mu^G\right)}{\sqrt{\sum_{x=0}^{X-1}\sum_{y=0}^{Y-1}\left(R_{x,y}-\mu^R\right)^2}\,\sqrt{\sum_{x=0}^{X-1}\sum_{y=0}^{Y-1}\left(G_{x,y}-\mu^G\right)^2}}, \quad (11)$$

in which R_{x,y} (respectively G_{x,y}) is the level of the (x,y) pixel in the red (respectively green) component plane within the same subband, µ^R and µ^G being the averages of the R_{x,y} and G_{x,y} levels over the same subband planes. The correlation coefficient between the blue and green components is computed similarly.
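As an indication, the subband decomposition and the coefficient of equation (11) can be sketched as follows; SciPy's convolve1d is assumed, and the function names are ours:

```python
import numpy as np
from scipy.ndimage import convolve1d

h0 = np.array([1, 2, 1]) / 4.0    # low-pass filter
h1 = np.array([1, -2, 1]) / 4.0   # high-pass filter

def subbands(plane):
    """Split a plane into the LL, LH, HL and HH subbands described above."""
    plane = plane.astype(float)
    out = {}
    for name, (h_row, h_col) in {"LL": (h0, h0), "LH": (h0, h1),
                                 "HL": (h1, h0), "HH": (h1, h1)}.items():
        rows = convolve1d(plane, h_row, axis=1, mode="mirror")      # rows
        out[name] = convolve1d(rows, h_col, axis=0, mode="mirror")  # columns
    return out

def correlation(plane_r, plane_g):
    """Correlation coefficient of equation (11) between two subband planes."""
    r = plane_r - plane_r.mean()
    g = plane_g - plane_g.mean()
    return float((r * g).sum() / np.sqrt((r ** 2).sum() * (g ** 2).sum()))
```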

FIG. 15: Component-wise profiles of the middle pixel row levels A-A in (a) the original and (b) the estimated images. Black dots stand for available levels, and white dots for estimated levels.

Test results on twenty natural images show that those coefficients are always greater than 0.9 in the subbands carrying spatial high frequencies in at least one direction (i.e. LH, HL and HH). As for the subband carrying low frequencies (LL), the coefficients are lower but always greater than 0.8. This reveals a very strong correlation between the levels of different color components in a natural image, especially in areas with high spatial frequencies. Lian et al. (2006) confirm, using a wavelet coefficient analysis, that high-frequency information is not only strongly correlated between the three component planes, but almost identical. Such spectral correlation between components should be taken into account to retrieve the missing components at a given pixel.

– Spatial correlation. A color image can be viewed as a set of adjacent homogeneous regions whose pixels have similar levels for each color component. In order to estimate the missing levels at each considered pixel, one should therefore exploit the levels of neighboring pixels. However, this task is difficult at pixels near the border between two distinct regions, due to the high local variation of the color components. As far as demosaicing is concerned, the spatial correlation property warns against interpolating the missing components at a given pixel from the levels of neighbors which do not belong to the same homogeneous region.

These two principles are generally taken into account sequentially by the demosaicing procedure. In a first step, demosaicing often consists in estimating the green component using spatial correlation. According to Bayer's assumption, the green component has denser available data within the CFA image, and represents the luminance of the image to be estimated. The estimation of the red and blue components (assimilated to chrominance) is only achieved in a second step, thanks to the previously interpolated luminance and using the spectral correlation property. This way of using both correlations is shared by a large number of methods in the literature. Also notice that, although the red and blue component interpolation is achieved after the green plane has been fully populated, spectral correlation is also often used in the first demosaicing step to improve the green plane estimation quality.

2.2.3. Spectral Correlation Rules

In order to take into account the strong spectral correlation between color components at each pixel, two main hypotheses are proposed in the literature. The first one assumes color ratio constancy, and the second one is based on color difference constancy. Let us examine the underlying principles of each of these assumptions before comparing both.

Interpolation based on color hue constancy, suggested by Cok (1987), is historically the first one based on spectral correlation. According to Cok, hue is understood as the ratio between chrominance and luminance, i.e. R/G or B/G. His method proceeds in two steps. In the first step, the missing green values are estimated by bilinear interpolation. The red (and blue) levels are then estimated by weighting the green level at the given pixel with the hue average of the neighboring pixels. For instance, the interpolation of the blue level at the center pixel of the {GRG} CFA structure (see figure 10a) uses the four diagonal neighbors where this blue component is available:

$$\hat{B} = \hat{G}\cdot\frac{1}{4}\left(\frac{B_{-1,-1}}{\hat{G}_{-1,-1}} + \frac{B_{1,-1}}{\hat{G}_{1,-1}} + \frac{B_{-1,1}}{\hat{G}_{-1,1}} + \frac{B_{1,1}}{\hat{G}_{1,1}}\right). \quad (12)$$

This bilinear interpolation between color component ratios is based on the local constancy of this ratio within a homogeneous region. Kimmel (1999) justifies the color ratio constancy assumption through a simplified approach that models any color image as the observation of a Lambertian object surface. According to the Lambertian model, such a surface reflects the incident light in all directions with equal energy. The intensity I(P) received by the photosensor element associated with each pixel P is therefore independent of the camera position, and can be represented as:

$$I(P) = \rho\,\left\langle \vec{N}(P), \vec{l}\,\right\rangle, \quad (13)$$

where ρ is the albedo (or reflection coefficient), N(P) is the normal vector to the surface element which is projected onto pixel P, and l is the incident light vector.

As the albedo ρ characterizes the object material, this quantity differs for each color component (ρ^R ≠ ρ^G ≠ ρ^B), and the three color components may be written as:

$$I^R(P) = \rho^R \left\langle \vec{N}(P), \vec{l}\,\right\rangle, \quad (14)$$

$$I^G(P) = \rho^G \left\langle \vec{N}(P), \vec{l}\,\right\rangle, \quad (15)$$

$$I^B(P) = \rho^B \left\langle \vec{N}(P), \vec{l}\,\right\rangle. \quad (16)$$

Assuming that any object is composed of one single material, the coefficients ρ^R, ρ^G and ρ^B are constant at all pixels representing this object. The ratio between two color components is then also constant:

$$K^{k,k'} = \frac{I^k(P)}{I^{k'}(P)} = \frac{\rho^k \left\langle \vec{N}(P), \vec{l}\,\right\rangle}{\rho^{k'} \left\langle \vec{N}(P), \vec{l}\,\right\rangle} = \frac{\rho^k}{\rho^{k'}} = \text{constant}, \quad (17)$$

where (k,k′) ∈ {R,G,B}².

Although this assumption is simplistic, it is locally valid and can be used within the neighborhood of the considered pixel. Another simplified and widely used model of the correlation between components relies on the color difference constancy assumption. At a given pixel, this can be written as:

$$D^{k,k'} = I^k(P) - I^{k'}(P) = \rho^k \left\langle \vec{N}(P), \vec{l}\,\right\rangle - \rho^{k'} \left\langle \vec{N}(P), \vec{l}\,\right\rangle = \text{constant}, \quad (18)$$

where (k,k′) ∈ {R,G,B}². As the incident light direction and amplitude are assumed to be locally constant, the color component difference is also constant within the considered pixel neighborhood.


As a consequence, the chrominance interpolation step in Cok's method may be rewritten using component difference averages, for instance:

$$\hat{B} = \hat{G} + \frac{1}{4}\left((B_{-1,-1}-\hat{G}_{-1,-1}) + (B_{1,-1}-\hat{G}_{1,-1}) + (B_{-1,1}-\hat{G}_{-1,1}) + (B_{1,1}-\hat{G}_{1,1})\right), \quad (19)$$

instead of equation (12). The validity of this approach is also justified by Lian et al. (2007), on the ground of the spatial high-frequency similarity between color components. The color difference constancy assumption is globally consistent with the ratio rule used in formula (12): by considering the logarithmic non-linear transformation, the difference D2^{k,k'}, (k,k′) ∈ {R,G,B}², can be expressed as:

$$D2^{k,k'} = \log_{10}\!\left(\frac{I^k(P)}{I^{k'}(P)}\right) = \log_{10}\!\left(I^k(P)\right) - \log_{10}\!\left(I^{k'}(P)\right). \quad (20)$$
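The two constancy rules compare directly in code; the following hypothetical helper applies either equation (12) or equation (19) at a {GRG} center:

```python
import numpy as np

def interpolate_blue_grg(B_diag, G_hat_diag, G_hat, rule="difference"):
    """Estimate B at the center of a {GRG} structure (figure 10a).

    B_diag: the four diagonal blue samples; G_hat_diag: the green levels
    already estimated at those pixels; G_hat: the estimated green level at
    the center. "ratio" applies equation (12), "difference" equation (19).
    Hypothetical helper, for illustration only.
    """
    B = np.asarray(B_diag, dtype=float)
    G = np.asarray(G_hat_diag, dtype=float)
    if rule == "ratio":                      # hue constancy, equation (12)
        return G_hat * np.mean(B / G)
    return G_hat + np.mean(B - G)            # difference rule, equation (19)
```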

Furthermore, we propose to compare the two assumptions expressed by equations (17) and (18). When taking spectral correlation into account for demosaicing, it turns out that the difference of color components presents some benefits in comparison with their ratio. The latter is indeed error-prone when its denominator takes low values. This happens for instance when saturated red and/or blue components lead to comparatively low values of green, making the ratios in equation (12) very sensitive to small variations of red and/or blue. Figure 16a is a natural image example which is highly saturated in red. Figures 16c and 16d show the images where each pixel value is, respectively, the component ratio R/G and the difference R − G (pixel levels being normalized by linear dynamic range stretching). It can be noticed that these two images actually carry less high-frequency information than the green component plane shown in figure 16b. A Sobel filter is then applied to these two images, so as to highlight the location of the high-frequency information. The Sobel filter output module is shown in figures 16e and 16f. In the right-hand parrot plumage area where red is saturated, the component ratio plane contains more high-frequency information than the component difference plane, which makes it more artifact-prone when demosaiced by interpolation. Moreover, high color ratio values may yield estimated component levels beyond the data bounds, which is undesirable for the demosaicing result quality. To overcome these drawbacks, a linear translation model applied to all three color components is suggested by Lukac and Plataniotis (2004a, 2004b). Instead of equation (17), the authors reformulate the color ratio rule by adding a predefined constant value β to each component. The new constancy assumption, which is consistent with equation (17) in homogeneous areas, now relies on the ratio:

$$K2^{k,k'} = \frac{I^k + \beta}{I^{k'} + \beta}, \quad (21)$$

where (k,k′) ∈ {R,G,B}², and where β ∈ ℕ is a ratio normalization parameter.
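A sketch of this normalized-ratio rule follows (its full interpolation form is given as equation (22) below); the helper name is ours:

```python
import numpy as np

def interpolate_blue_grg_normalized(B_diag, G_hat_diag, G_hat, beta=256):
    """Blue estimation at a {GRG} center under the normalized-ratio rule.

    Implements the interpolation given as equation (22) below, with beta set
    to 256 as advised by Lukac and Plataniotis. Hypothetical helper.
    """
    B = np.asarray(B_diag, dtype=float)
    G = np.asarray(G_hat_diag, dtype=float)
    return -beta + (G_hat + beta) * np.mean((B + beta) / (G + beta))
```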

FIG. 16: Component ratio and difference planes for a same image ("Parrots" from the Kodak database): (a) original image; (b) G plane; (c) R/G ratio plane; (d) R − G difference plane; (e) Sobel filter output on the R/G plane; (f) Sobel filter output on the R − G plane.

Under this new assumption on the normalized ratio, the blue level interpolation formulated in equation (12) under the ratio rule now becomes¹:

$$\hat{B} = -\beta + (\hat{G}+\beta)\cdot\frac{1}{4}\left(\frac{B_{-1,-1}+\beta}{\hat{G}_{-1,-1}+\beta} + \frac{B_{1,-1}+\beta}{\hat{G}_{1,-1}+\beta} + \frac{B_{-1,1}+\beta}{\hat{G}_{-1,1}+\beta} + \frac{B_{1,1}+\beta}{\hat{G}_{1,1}+\beta}\right). \quad (22)$$

In order to avoid too different values for the numerator and the denominator, Lukac and Plataniotis advise setting β = 256, so that the normalized ratios R/G and B/G range from 0.5 to 2. They claim that this assumption improves the interpolation quality in areas of transitions between objects and of thin details.

In our investigation of the two main assumptions used for demosaicing, we finally compare the estimated image quality in both cases. The procedure depicted in figure 11 is applied to twelve natural images selected from the Kodak database: the demosaicing schemes presented above, respectively using the component ratio and difference, are applied to the simulated CFA image. To evaluate the estimated color image quality in comparison with the original image, we then compute an objective criterion, namely the peak signal-to-noise ratio (PSNR), derived from the mean square error (MSE) between the two images. On the red plane for instance, these quantities are defined as:

$$MSE^R = \frac{1}{XY}\sum_{x=0}^{X-1}\sum_{y=0}^{Y-1}\left(I^R_{x,y} - \hat{I}^R_{x,y}\right)^2, \quad (23)$$

$$PSNR^R = 10\cdot\log_{10}\!\left(\frac{255^2}{MSE^R}\right). \quad (24)$$
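These criteria are straightforward to compute; the sketch below (our function name) also recalls how they fit into the three-step evaluation of figure 11:

```python
import numpy as np

def psnr_plane(original, estimated):
    """PSNR of one color plane, following equations (23) and (24)."""
    mse = np.mean((original.astype(float) - estimated.astype(float)) ** 2)
    return 10.0 * np.log10(255.0 ** 2 / mse)

# Three-step evaluation of figure 11, reusing the earlier sketches:
#   cfa = simulate_bayer_cfa(original)               # 1. color sampling
#   estimated = bilinear_demosaice(cfa)              # 2. demosaicing
#   psnr_plane(original[..., 0], estimated[..., 0])  # 3. comparison (R plane)
```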

As the green component is bilinearly interpolated without using spectral correlation, only the red and blue estimated levels vary according to the considered assumption. The PSNR is hence computed on these two planes. The results displayed in table 1 show that using the color difference assumption yields better results than using the simple ratio rule K, which is particularly noticeable for the image "Parrots" of figure 16a. The normalized ratio K2, which is less prone to large variations than K in areas with high spatial frequencies, leads to higher values of PSNR^R and PSNR^B than K. Overall, the color difference assumption generally outperforms the ratio-based rules according to the PSNR criterion, and is the one most often used to exploit spectral correlation in demosaicing schemes.

¹The authors use, in this interpolation formula, extra weighting factors depending on the local pattern, which are dropped here for conciseness.


Image                    PSNR^R                       PSNR^B
                         D        K        K2        D        K        K2
1  ("Parrots")           38.922   36.850   38.673    38.931   38.678   38.936
2  ("Sailboats")         31.321   31.152   31.311    31.154   30.959   31.129
3  ("Windows")           37.453   36.598   37.348    37.093   36.333   36.676
4  ("Houses")            27.118   26.985   27.146    27.007   26.889   27.008
5  ("Race")              36.085   35.838   36.073    35.999   35.819   35.836
6  ("Pier")              32.597   31.911   32.563    32.570   32.178   32.217
7  ("Island")            34.481   34.376   34.470    34.402   34.208   34.399
8  ("Lighthouse")        31.740   31.415   31.696    31.569   31.093   31.289
9  ("Plane")             35.382   35.058   35.347    34.750   34.324   34.411
10 ("Cape")              32.137   31.863   32.118    31.842   31.532   31.693
11 ("Barn")              34.182   33.669   34.143    33.474   33.193   33.363
12 ("Chalet")            30.581   30.413   30.565    29.517   29.263   29.364
Average                  33.500   33.011   33.454    33.192   32.872   33.027

TAB. 1: Peak signal-to-noise ratios (in decibels) for the red (PSNR^R) and blue (PSNR^B) planes of twelve Kodak images (Eastman Kodak and various photographers, 1991), demosaiced under the color difference constancy rule D (see equation (18) and interpolation formula (19)), the color ratio rule K (see equation (17) and interpolation formula (12)), and the normalized ratio rule K2 with β = 256 (see equation (21) and interpolation formula (22)).


3. Demosaicing Schemes

In this section, the main demosaicing schemes proposed in the literature are described. We distinguish two main families of procedures, according to whether they scan the image plane or chiefly work in the frequency domain.

3.1. Edge-adaptive Demosaicing Methods

Estimating the green plane before the R and B ones is mainly motivated by the double amount of G samples in the CFA image. A fully populated G component plane subsequently makes the R and B plane estimation more accurate. As a consequence, the quality of the G component estimation becomes critical to the overall demosaicing performance, since any error in the G plane estimation is propagated in the following chrominance estimation step. Important efforts are therefore devoted to improving the estimation quality of the green component plane – usually assimilated to luminance –, especially in high-frequency areas. Practically, when the considered pixel lies on an edge between two homogeneous areas, the missing components should be estimated along the edge rather than across it. In other words, the neighboring pixels taken into account for interpolation should not belong to distinct objects. When exploiting spatial correlation, a key issue is to determine the edge direction from the CFA samples. As the demosaicing methods presented in the following text generally use specific directions and neighborhoods in the image plane, some useful notations are introduced in figure 17.

3.1.1. Gradient-based Methods

Gradient computation is a general solution to edge direction selection. Hibbard's method (1995) uses horizontal and vertical gradients, computed at each pixel where the G component has to be estimated, in order to select the direction which provides the best green level estimation. Let us consider the {GRG} CFA structure for instance (see figure 10a). Estimating the green level Ĝ at the center pixel is achieved in two successive steps:

1. Approximate the gradient module (hereafter simply referred to as the gradient) along the horizontal and vertical directions, as:

$$\Delta^x = |G_{-1,0} - G_{1,0}|, \quad (25)$$

$$\Delta^y = |G_{0,-1} - G_{0,1}|. \quad (26)$$

2. Interpolate the green level as (see the sketch below):

$$\hat{G} = \begin{cases} (G_{0,-1} + G_{0,1})/2 & \text{if } \Delta^x > \Delta^y, \quad \text{(27b)}\\ (G_{-1,0} + G_{1,0})/2 & \text{if } \Delta^x < \Delta^y, \quad \text{(27a)}\\ (G_{0,-1} + G_{-1,0} + G_{1,0} + G_{0,1})/4 & \text{if } \Delta^x = \Delta^y. \quad \text{(27c)} \end{cases}$$
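A direct transcription of this two-step algorithm (hypothetical helper; border handling is omitted for brevity):

```python
def hibbard_green(cfa, x, y):
    """Estimate the missing G level at CFA pixel (x, y), after Hibbard (1995).

    `cfa` is a NumPy CFA image; x is the column and y the row index of a
    {GRG} or {GBG} center.
    """
    g_w, g_e = float(cfa[y, x - 1]), float(cfa[y, x + 1])  # horizontal G
    g_n, g_s = float(cfa[y - 1, x]), float(cfa[y + 1, x])  # vertical G
    dx = abs(g_w - g_e)                                    # equation (25)
    dy = abs(g_n - g_s)                                    # equation (26)
    if dx < dy:
        return (g_w + g_e) / 2.0                           # (27a)
    if dx > dy:
        return (g_n + g_s) / 2.0                           # (27b)
    return (g_w + g_e + g_n + g_s) / 4.0                   # (27c)
```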

Laroche and Prescott (1993) suggest considering a 5 × 5 neighborhood for the partial derivative approximations, using the available surrounding levels – for instance ∆^x = |2R − R_{−2,0} − R_{2,0}|. Moreover, Hamilton and Adams (1997) combine both approaches.

FIG. 17: Notations for the main spatial directions and the considered pixel neighborhoods: (a) directions; (b) N4 neighborhood, (δx,δy) ∈ {(0,−1), (−1,0), (1,0), (0,1)}; (c) N4′ neighborhood, (δx,δy) ∈ {(−1,−1), (1,−1), (−1,1), (1,1)}; (d) N8 neighborhood (N8 ≜ N4 ∪ N4′); (e) N9 pixel set (N9 ≜ N8 ∪ {P(0,0)}).

FIG. 18: 5 × 5 neighborhood with central {GRG} structure in the CFA image.

To select the interpolation direction, these authors take into account both gradient and Laplacian second-order values, using the green levels available at nearby pixels and the red (or blue) samples located two pixels apart. For instance, to estimate the green level at the {GRG} CFA structure (see figure 18), Hamilton and Adams use the following algorithm:

1. Approximate the horizontal ∆^x and vertical ∆^y gradients through absolute differences, as:

$$\Delta^x = |G_{-1,0} - G_{1,0}| + |2R - R_{-2,0} - R_{2,0}|, \quad (28)$$

$$\Delta^y = |G_{0,-1} - G_{0,1}| + |2R - R_{0,-2} - R_{0,2}|. \quad (29)$$

2. Interpolate the green level as (see the sketch below):

$$\hat{G} = \begin{cases} (G_{-1,0} + G_{1,0})/2 + (2R - R_{-2,0} - R_{2,0})/4 & \text{if } \Delta^x < \Delta^y, \quad \text{(30a)}\\ (G_{0,-1} + G_{0,1})/2 + (2R - R_{0,-2} - R_{0,2})/4 & \text{if } \Delta^x > \Delta^y, \quad \text{(30b)}\\ (G_{0,-1} + G_{-1,0} + G_{1,0} + G_{0,1})/4 + (4R - R_{0,-2} - R_{-2,0} - R_{2,0} - R_{0,2})/8 & \text{if } \Delta^x = \Delta^y. \quad \text{(30c)} \end{cases}$$
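Again, the algorithm transcribes directly; the following sketch (our function name, borders omitted) implements equations (28)–(30) at a {GRG} center:

```python
def hamilton_adams_green(cfa, x, y):
    """Estimate G at a {GRG} center, after Hamilton and Adams (1997).

    `cfa` is a NumPy CFA image; x is the column and y the row index of the
    central R pixel.
    """
    c = cfa.astype(float)
    R = c[y, x]
    dx = abs(c[y, x - 1] - c[y, x + 1]) + abs(2 * R - c[y, x - 2] - c[y, x + 2])  # (28)
    dy = abs(c[y - 1, x] - c[y + 1, x]) + abs(2 * R - c[y - 2, x] - c[y + 2, x])  # (29)
    if dx < dy:   # (30a): horizontal interpolation, red Laplacian correction
        return (c[y, x - 1] + c[y, x + 1]) / 2 + (2 * R - c[y, x - 2] - c[y, x + 2]) / 4
    if dx > dy:   # (30b): vertical interpolation
        return (c[y - 1, x] + c[y + 1, x]) / 2 + (2 * R - c[y - 2, x] - c[y + 2, x]) / 4
    # (30c): no preferred direction, combine the four neighbors
    return ((c[y, x - 1] + c[y, x + 1] + c[y - 1, x] + c[y + 1, x]) / 4
            + (4 * R - c[y, x - 2] - c[y, x + 2] - c[y - 2, x] - c[y + 2, x]) / 8)
```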

This proposal outperforms Hibbard's method. Indeed, precision is gained not only by combining data from two color components in the partial derivative approximations, but also by exploiting spectral correlation in the green plane estimation. It may be noticed that formula (30a) for the horizontal interpolation of the green component may be split into a left-side part Ĝ^g and a right-side part Ĝ^d:

$$\hat{G}^g = G_{-1,0} + (R - R_{-2,0})/2, \quad (31)$$

$$\hat{G}^d = G_{1,0} + (R - R_{2,0})/2, \quad (32)$$

$$\hat{G} = \left(\hat{G}^g + \hat{G}^d\right)/2. \quad (33)$$

Such interpolation is derived from the color difference constancy assumption, and hence exploits spectral correlation for the green component estimation. Also notice that, in these equations, the horizontal gradients are assumed to be similar for both the red and blue components. A complete formulation has been given by Li and Randhawa (2005). As these authors show besides, the green component may more generally be estimated by a Taylor series, as long as the green levels are considered as a continuous function g which is differentiable in both main directions. The above equations (31) and (32) may then be seen as first-order approximations of this series. Indeed, in the Ĝ^g case for instance, the horizontal approximation is written as g(x) = g(x − 1) + g′(x − 1) ≈ g(x − 1) + (g(x) − g(x − 2))/2. Using the local constancy property of the color component difference yields Ĝ_x − Ĝ_{x−2} = R_x − R_{x−2}, from which expression (31) is derived. Li and Randhawa suggest an approximation based on the second-order derivative, the Ĝ^g estimation becoming:

$$\hat{G}^g = G_{-1,0} + (R - R_{-2,0})/2 + (R - R_{-2,0})/4 - (G_{-1,0} - G_{-3,0})/4, \quad (34)$$

for which a neighborhood size of 7 × 7 pixels is required.

for which a neighborhood size of 7 × 7 pixels is required. The additional term compared to (31) enables the green component estimation to be refined. Similar reasoning may be used to select the interpolation direction. According to the authors, increasing the approximation order in such a way improves the estimation results under the mean square error (MSE) criterion.

Another proposal comes from Su (2006), namely to interpolate the green level as a weighted sum of the values defined by equations (30a) and (30b). Naming the latter respectively Ĝx = (G−1,0 + G1,0)/2 + (2R − R−2,0 − R2,0)/4 and Ĝy = (G0,−1 + G0,1)/2 + (2R − R0,−2 − R0,2)/4, the horizontal and vertical interpolations are combined as:

Ĝ = w1·Ĝx + w2·Ĝy   if ∆x < ∆y,   (35a)
Ĝ = w1·Ĝy + w2·Ĝx   if ∆x > ∆y,   (35b)

where w1 and w2 are the weighting factors. Expression (30c) remains unchanged (i.e. Ĝ = (Ĝx + Ĝy)/2 if ∆x = ∆y). The term with the smallest level variation must be weighted by the highest factor (i.e. w1 > w2); expressions (30a) and (30b) incidentally correspond to the special case w1 = 1, w2 = 0. Incorporating the terms associated with high level variations allows high-frequency information to be taken into account in the green component interpolation expression itself. Su sets w1 to 0.87 and w2 to 0.13, since these weighting factor values yield the minimal average MSE (over the three color planes) on a large series of demosaiced images.

Other researchers, like Hirakawa and Parks (2005) or Menon et al. (2007), use the filterbank approach in order to estimate the missing green levels, before selecting the horizontal or vertical interpolation direction at {GRG} and {GBG} CFA structures. This enables the design of five-element mono-dimensional filters which are optimal towards criteria specifically designed to avoid interpolation artifacts. The proposed optimal filters (e.g. hopt = [−0.2569 0.4339 0.5138 0.4339 −0.2569] for Hirakawa and Parks' scheme) are close to the formulation of Hamilton and Adams. (No detail is given here about how the R and B components are estimated by the above methods, for their originality mainly lies in the G component estimation.)

3.1.2. Component-consistent Demosaicing
Hamilton and Adams' method selects the interpolation direction on the basis of horizontal and vertical gradient approximations. But this may be inappropriate, and unsatisfying results may be obtained in areas with textures or thin objects. Figure 19 shows an example where the horizontal ∆x and vertical ∆y gradient approximations do not allow the right interpolation direction to be selected. Wu and Zhang (2004) propose a more reliable way to select this direction, still by using a local neighborhood. Two candidate levels are computed to interpolate the missing green value at a given pixel: one using horizontal neighbors, the second using vertical neighboring pixels. Then, the missing R or B value is estimated in both horizontal and vertical directions with each of these G candidates. A final step consists in selecting the most appropriate interpolation direction, namely the one minimizing the gradient sum on the color difference planes (R − G and B − G) in the considered pixel neighborhood. This interpolation direction allows the levels – computed beforehand – to be selected for the missing component estimation.

[Figure 19 not reproduced here.]
FIG. 19: Direction selection issue in Hamilton and Adams' interpolation scheme (1997), on an extract of the original image "Lighthouse" containing thin details. Plots highlight the R and G component values used for the horizontal and vertical gradient computations: color dots represent available levels in the CFA image, whereas white dots are levels to be estimated. Here ∆x = |G−1,0 − G1,0| + |2R − R−2,0 − R2,0| = 15 and ∆y = |G0,−1 − G0,1| + |2R − R0,−2 − R0,2| = 17. As ∆x < ∆y, horizontal neighboring pixels are wrongly used in the Ĝ estimation, which yields an erroneous demosaicing result at the center pixel.

More precisely, Wu and Zhang's approach proceeds in the following steps:
1. At each pixel where the green component is missing, compute two candidate levels: one denoted as Ĝx by using the horizontal direction (according to equation (30a)), and another Ĝy by using the vertical direction (according to (30b)). For other pixels, set Ĝx = Ĝy = G.
2. At each pixel where the green component is available, compute two candidate levels (one horizontal and one vertical) for each of the missing red and blue components. At the {RGR} CFA structure these levels are expressed as (see figure 10c):

R̂x = G + (R−1,0 − Ĝx−1,0 + R1,0 − Ĝx1,0)/2,   (36)
R̂y = G + (R−1,0 − Ĝy−1,0 + R1,0 − Ĝy1,0)/2,   (37)
B̂x = G + (B0,−1 − Ĝx0,−1 + B0,1 − Ĝx0,1)/2,   (38)
B̂y = G + (B0,−1 − Ĝy0,−1 + B0,1 − Ĝy0,1)/2.   (39)

3. At each pixel with a missing green component, compute two candidate levels for the missing chrominance component (i.e. B̂ at R samples, and conversely). At the {GRG} CFA structure, the blue levels are estimated as (see figure 10a):

B̂x = Ĝx + (1/4) ∑_{P∈N4′} (B(P) − Ĝx(P)),   (40)
B̂y = Ĝy + (1/4) ∑_{P∈N4′} (B(P) − Ĝy(P)),   (41)

where N4′ is composed of the four diagonal neighbors (see figure 17c).
4. Achieve the final estimation at each pixel P by selecting one component triplet out of the two candidates computed beforehand in the horizontal and vertical directions. So as to use the direction for which the variations of the (R − G) and (B − G) component differences are minimal, the authors suggest the following selection criterion:

(R̂, Ĝ, B̂) = (R̂x, Ĝx, B̂x)   if ∆x < ∆y,   (42a)
(R̂, Ĝ, B̂) = (R̂y, Ĝy, B̂y)   if ∆x > ∆y,   (42b)

where ∆x and ∆y are, respectively, the horizontal and vertical gradients on the difference planes of estimated colors. More precisely, these gradients are computed by considering all distinct (Q,Q′) pixel pairs, respectively row-wise and column-wise, within the 3 × 3 window centered at P (see figure 17e):

∆x = ∑_{(Q,Q′)∈N9×N9, y(Q)=y(Q′)} |(R̂x(Q) − Ĝx(Q)) − (R̂x(Q′) − Ĝx(Q′))| + |(B̂x(Q) − Ĝx(Q)) − (B̂x(Q′) − Ĝx(Q′))|,   (43)
∆y = ∑_{(Q,Q′)∈N9×N9, x(Q)=x(Q′)} |(R̂y(Q) − Ĝy(Q)) − (R̂y(Q′) − Ĝy(Q′))| + |(B̂y(Q) − Ĝy(Q)) − (B̂y(Q′) − Ĝy(Q′))|.   (44)
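The selection criterion of step 4 may be sketched as follows in Python; the candidate planes are assumed to have been computed beforehand (steps 1-3), and the helper names are ours.

    import numpy as np

    def pairwise_sum(K, axis):
        """Sum |K(Q) - K(Q')| over all distinct pixel pairs sharing a row
        (axis=1) or a column (axis=0) of a small window K."""
        lines = K if axis == 1 else K.T
        total = 0.0
        for line in lines:
            for i in range(len(line)):
                for j in range(i + 1, len(line)):
                    total += abs(line[i] - line[j])
        return total

    def select_triplet(Rx, Gx, Bx, Ry, Gy, By, x, y):
        """Pick the horizontal or vertical candidate triplet at pixel (x, y),
        whichever minimizes the color difference gradients of equations
        (43)-(44) over the 3x3 window (equations (42a)-(42b))."""
        win = np.s_[y-1:y+2, x-1:x+2]
        dx = (pairwise_sum(Rx[win] - Gx[win], axis=1)
              + pairwise_sum(Bx[win] - Gx[win], axis=1))
        dy = (pairwise_sum(Ry[win] - Gy[win], axis=0)
              + pairwise_sum(By[win] - Gy[win], axis=0))
        if dx < dy:
            return Rx[y, x], Gx[y, x], Bx[y, x]
        return Ry[y, x], Gy[y, x], By[y, x]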

This method uses the same expressions as Hamilton and Adams' in order to estimate the missing color components, but improves the interpolation direction decision by using a 3 × 3 window – rather than a single row or column – in which the gradient of the color differences (R − G and B − G) is evaluated so as to minimize its local variation.

Among other attempts to refine the interpolation direction selection, Hirakawa and Parks (2005) propose a selection criterion which uses the number of pixels with homogeneous colors in a local neighborhood. The authors compute the distances between the color point of the considered pixel and those of its neighbors in the CIE L*a*b* color space (defined in section 4.3.2), which better fits the human perception of colors than the RGB space. They design a homogeneity criterion with adaptive thresholding which reduces the color artifacts due to an incorrect selection of the interpolation direction. Chung and Chan (2006) nicely demonstrate that the green plane interpolation is critical to the estimated image quality, and suggest evaluating the local variance of the color difference as a homogeneity criterion. The selected direction corresponds to the minimal variance, which yields a green component refinement especially in textured areas. Omer and Werman (2004) use a similar way to select the interpolation direction, except that the local color ratio variance is used. These authors also propose a criterion based on a local corner score. Under the assumption that demosaicing generates artificial corners in the estimated image, they apply the Harris corner detection filter (Harris and Stephens, 1988), and select the interpolation direction which provides the fewest detected corners.

3.1.3. Template Matching-based Methods
This family of methods aims at identifying a template-based feature in each pixel neighborhood, in order to interpolate according to the locally encountered feature. Such a strategy was first implemented by Cok in a patent dating back to 1986 (Cok, 1986; Cok, 1994), in which the author classifies 3 × 3 neighborhoods into edge, stripe or corner features (see figure 20). The original part of the algorithm lies in the green component interpolation at each pixel P where it is missing (i.e. at the center pixel of {GRG} or {GBG} CFA structures):
1. Compute the average of the green levels available at the four nearest neighbor pixels of P (i.e. belonging to N4, as defined on figure 17b). Examine whether each of these four green levels is lower (b), higher (h), or equal to their average. Sort these four values in descending order, let G1 > G2 > G3 > G4, and compute their median M = (G2 + G3)/2.
2. Classify the neighborhood of P as:
(a) edge if 3 h and 1 b are present, or 1 h and 3 b (see figure 20a);
(b) stripe if 2 h and 2 b are present and opposite by pairs (see figure 20b);
(c) corner if 2 h and 2 b are present and adjacent by pairs (see figure 20c).
In the special case when two values are equal to the average, the encountered feature is taken as:
(a) a stripe if the other two pixels b and h are opposite;
(b) an edge otherwise.
3. Interpolate the missing green level according to the previously identified feature (a sketch of this classification and of the CLIP function is given below):
(a) for an edge, Ĝ = M;
(b) for a stripe, Ĝ = CLIP_{G3}^{G2}(M − (S − M)), where S is the average green level over the eight neighboring pixels labeled Q in figure 20d;
(c) for a corner, Ĝ = CLIP_{G3}^{G2}(M − (S′ − M)), where S′ is the average green level over the four neighboring pixels labeled Q in figure 20e, which are located on both sides of the borderline between b and h pixels.
The function CLIP_{G3}^{G2} simply limits the interpolated value to the range [G3, G2]:

∀α ∈ ℝ, CLIP_{G3}^{G2}(α) = α if G3 ≤ α ≤ G2;  G2 if α > G2;  G3 if α < G3.   (45)
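The classification and clipping steps may be sketched as follows; the averages S and S′ are assumed to have been computed from the appropriate Q pixels of figures 20d and 20e, and the handling of ties beyond the documented special case is our own choice, not Cok's.

    def cok_feature(labels):
        """Classify the N4 labels, given in (top, left, right, bottom) order."""
        nh, nb = labels.count('h'), labels.count('b')
        if nh == 2 and nb == 2:
            # stripe if equal labels face each other, corner if they are adjacent
            return 'stripe' if labels[0] == labels[3] else 'corner'
        if nh == 1 and nb == 1:
            # special case: two levels equal the average; stripe if h and b
            # occupy opposite positions (other ties are treated as edges here)
            ih, ib = labels.index('h'), labels.index('b')
            return 'stripe' if {ih, ib} in ({0, 3}, {1, 2}) else 'edge'
        return 'edge'

    def clip(alpha, g2, g3):
        """CLIP function of equation (45): limit alpha to the range [G3, G2]."""
        return min(max(alpha, g3), g2)

    def cok_green(g_n4, s_stripe, s_corner):
        """Cok's template-matching green interpolation at pixel P (a sketch).
        g_n4     : the four N4 green levels in (top, left, right, bottom) order;
        s_stripe : mean green level over the eight Q pixels of figure 20d;
        s_corner : mean green level over the four Q pixels of figure 20e."""
        avg = sum(g_n4) / 4.0
        labels = ['h' if g > avg else 'b' if g < avg else '=' for g in g_n4]
        _, g2, g3, _ = sorted(g_n4, reverse=True)   # G1 > G2 > G3 > G4
        m = (g2 + g3) / 2.0                         # median M = (G2 + G3)/2
        feature = cok_feature(labels)
        if feature == 'edge':
            return m
        s = s_stripe if feature == 'stripe' else s_corner
        return clip(m - (s - m), g2, g3)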

This method, which classifies neighborhood features into three groups, encompasses three possible cases in an image. But the criterion used to distinguish the three features is rather simple, and comparing green levels with their average may not be sufficient to determine the existing feature adequately. Moreover, in the case of a stripe feature, the interpolation does not take the stripe direction into account.

Chang and Tan (2006) also implement a demosaicing method based on template matching, but apply it on the color difference planes (R − G and B − G) in order to interpolate the R and B color components, G being estimated beforehand thanks to Hamilton and Adams' scheme described above. The underlying strategy consists in simultaneously exploiting the spatial and spectral correlations, and relies on local edge information which causes fewer color artifacts than Cok's scheme. Although the color difference planes carry less high-frequency information than the color component planes (see figure 16), they can provide relevant edge information in areas with high spatial frequencies.

[Figure 20 not reproduced here.]
FIG. 20: Feature templates proposed by Cok to interpolate the green component at pixel P: (a) edge, (b) stripe, (c) corner; (d) stripe neighborhood, (e) corner neighborhood. These templates, which are defined modulo π/2, provide four possible edge and corner features, and two possible stripe features.

3.1.4. Adaptive Weighted-Edge Method
The methods described above, whether template-based or gradient-based, achieve interpolation according to the local context. They hence require a prior neighborhood classification. The adaptive weighted-edge linear interpolation, first proposed by Kimmel (1999), merges these two steps into a single one.

It consists in weighting each locally available level by a normalized factor which is a function of a directional gradient. For instance, interpolating the green level at the center pixel of {GRG} or {GBG} CFA structures is achieved as:

Ĝ = (w0,−1·G0,−1 + w−1,0·G−1,0 + w1,0·G1,0 + w0,1·G0,1) / (w0,−1 + w−1,0 + w1,0 + w0,1),   (46)

where the wδx,δy coefficients are the weighting factors. In order to exploit spatial correlation, these weights are adjusted according to the locally encountered pattern. Kimmel suggests using local gradients to compute the weights. In a first step, directional gradients are approximated at a CFA image pixel P by using the levels of its neighbors. Gradients are respectively defined in the horizontal, vertical, x′-diagonal (top-right to bottom-left) and y′-diagonal (top-left to bottom-right) directions (see figure 17a) over a 3 × 3 neighborhood by the following generic expressions:

∆x(P) = (P1,0 − P−1,0)/2,   (47)
∆y(P) = (P0,−1 − P0,1)/2,   (48)
∆x′(P) = max(|G1,−1 − G|/√2, |G−1,1 − G|/√2)   at G locations,   (49a)
∆x′(P) = (P1,−1 − P−1,1)/(2√2)   elsewhere,   (49b)
∆y′(P) = max(|G−1,−1 − G|/√2, |G1,1 − G|/√2)   at G locations,   (50a)
∆y′(P) = (P−1,−1 − P1,1)/(2√2)   elsewhere,   (50b)

where Pδx,δy stands for the neighboring pixel of P, with relative coordinates (δx,δy), in the CFA image. Here, R, G or B is not specified, since these generic expressions apply to all CFA image pixels, whatever the considered available component. Notice, however, that all the differences involved in equations (47) and (48) imply levels of a same color component. The weight wδx,δy in direction d, d ∈ {x, y, x′, y′}, is then computed from the directional gradients as:

wδx,δy = 1 / √(1 + ∆d(P)² + ∆d(Pδx,δy)²),   (51)

where the direction d used to compute the gradient ∆d is defined by the center pixel P and its neighbor Pδx,δy. At the right-hand pixel (δx,δy) = (1,0) as an example, the horizontal direction x is used for d; ∆d(P) and ∆d(P1,0) are therefore both computed by expression (47) defining ∆x, and the weight is expressed as:

w1,0 = 1 / √(1 + (P−1,0 − P1,0)²/4 + (P2,0 − P)²/4).   (52)
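A minimal sketch of this weighted interpolation (equations (46)-(48) and (51)) is given below, assuming a float 2-D numpy array of CFA samples with two pixels of margin; only the horizontal and vertical neighbors and gradients are involved, as in equation (46).

    import numpy as np

    def kimmel_green(cfa, x, y):
        """Kimmel's adaptive weighted-edge green interpolation at a pixel
        where the green component is missing."""
        c = cfa
        def grad_x(px, py):    # horizontal gradient, equation (47)
            return (c[py, px + 1] - c[py, px - 1]) / 2.0
        def grad_y(px, py):    # vertical gradient, equation (48)
            return (c[py - 1, px] - c[py + 1, px]) / 2.0
        num = den = 0.0
        for (dx, dy), grad in (((0, -1), grad_y), ((0, 1), grad_y),
                               ((-1, 0), grad_x), ((1, 0), grad_x)):
            # weight of equation (51): gradient at P and at the neighbor
            w = 1.0 / np.sqrt(1.0 + grad(x, y) ** 2 + grad(x + dx, y + dy) ** 2)
            num += w * c[y + dy, x + dx]
            den += w
        return num / den    # equation (46)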

The definition of the weight wδx,δy is built so that a local transition in a given direction yields a high gradient value in the same direction. Consequently, the weight wδx,δy is close to 0 for the neighbor Pδx,δy, which then does not contribute much to the final estimated green level according to equation (46). Conversely, the weight wδx,δy is equal to 1 when the directional gradients are equal to 0. Adjustments in the weight computation are proposed by Lu and Tan (2003), who use a Sobel filter to approximate the directional gradient, and the absolute – instead of squared – value of the gradients in order to boost computation speed. Such a strategy is also implemented by Lukac and Plataniotis (2005b).

Once the green plane has been fully populated thanks to equation (46), red and blue levels are estimated by using the component ratios R/G and B/G among neighboring pixels. Interpolating the blue component is for instance achieved according to two steps (the red one being processed in a similar way):
1. Interpolation at red locations (i.e. for the {GRG} CFA structure):

B̂ = Ĝ · (∑_{P∈N4′} w(P)·B(P)/Ĝ(P)) / (∑_{P∈N4′} w(P))
  = Ĝ · (w−1,−1·B−1,−1/Ĝ−1,−1 + w1,−1·B1,−1/Ĝ1,−1 + w−1,1·B−1,1/Ĝ−1,1 + w1,1·B1,1/Ĝ1,1) / (w−1,−1 + w1,−1 + w−1,1 + w1,1).   (53)

2. Interpolation at the other CFA locations with a missing blue level (i.e. at {RGR} and {BGB} structures):

B̂ = Ĝ · (∑_{P∈N4} w(P)·B̂(P)/Ĝ(P)) / (∑_{P∈N4} w(P))
  = Ĝ · (w0,−1·B̂0,−1/Ĝ0,−1 + w−1,0·B̂−1,0/Ĝ−1,0 + w1,0·B̂1,0/Ĝ1,0 + w0,1·B̂0,1/Ĝ0,1) / (w0,−1 + w−1,0 + w1,0 + w0,1).   (54)

Once all missing levels have been estimated, Kimmel's algorithm (1999) achieves a green plane refinement by using the color ratio constancy rule. This iterative refinement procedure is taken up by Muresan et al. (2000) with a slight modification: instead of using all N8 neighboring pixels in step 2 below, only the neighboring pixels with an available green component are considered. The following steps describe this refinement scheme:
1. Correct the estimated green levels with the average of two estimations (one on the blue plane, the other on the red one), so that the constancy rule is locally enforced for the color ratios G/R and G/B:

Ĝ = (ĜR + ĜB)/2,   (55)

where

ĜR ≜ R̆ · (∑_{P∈N4} w(P)·G(P)/R̂(P)) / (∑_{P∈N4} w(P))   and   ĜB ≜ B̆ · (∑_{P∈N4} w(P)·G(P)/B̂(P)) / (∑_{P∈N4} w(P)),

B̆ and R̆ standing either for an estimated level or an available CFA value, according to the considered CFA structure ({GRG} or {GBG}).
2. Then correct the red and blue estimated levels at green locations, by using the weighted R/G and B/G ratios at the eight neighboring pixels:

R̂ = G · (∑_{P∈N8} w(P)·R̆(P)/Ğ(P)) / (∑_{P∈N8} w(P))   (56)

and

B̂ = G · (∑_{P∈N8} w(P)·B̆(P)/Ğ(P)) / (∑_{P∈N8} w(P)),   (57)

where the breve similarly denotes a level that is either available in the CFA image or previously estimated.

3. Repeat the two previous steps twice.
This iterative correction procedure gradually enforces more and more homogeneous G/R and G/B color ratios, whereas the green component is estimated by using spectral correlation. Its convergence is however not always guaranteed, which may produce irrelevant estimated values. When a level occurring in a color ratio denominator is very close or equal to zero, the associated weight may not cancel the resulting bias. Figure 21c shows some color artifacts which are generated in this case. In pure yellow areas, quasi-zero blue levels cause a saturation of the estimated green component at R and B locations, which then alternates with the original green levels. Smith (2005) suggests computing the adaptive weights as wδx,δy = 1/(1 + 4|∆d(P)| + 4|∆d(Pδx,δy)|), in order to reduce the division bias and the contribution of pixels lying on both sides of an edge.
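As an illustration of the corrective step 2 above, here is a sketch of equation (56) at a green CFA pixel; the per-pixel weight map W and the guard value eps are our own simplifications (the scheme actually uses the directional weights wδx,δy, and eps addresses the near-zero-denominator issue just discussed).

    N8 = [(-1, -1), (0, -1), (1, -1), (-1, 0), (1, 0), (-1, 1), (0, 1), (1, 1)]

    def correct_red_at_green(R, G, W, x, y, eps=1e-6):
        """Equation (56): re-estimate the red level at a green CFA pixel from
        the weighted R/G ratios of its eight neighbors; R and G hold the
        current float estimates."""
        num = sum(W[y + dy, x + dx] * R[y + dy, x + dx]
                  / max(G[y + dy, x + dx], eps) for dx, dy in N8)
        den = sum(W[y + dy, x + dx] for dx, dy in N8)
        return G[y, x] * num / den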

[Figure 21 not reproduced here.]
FIG. 21: Demosaicing result achieved by Kimmel's method (1999), before and after the iterative correction steps: (a) original image; (b) estimated image before correction; (c) estimated image after correction, on which the generated artifacts are pointed out.

Lukac et al. (2006) choose to apply adaptive weighting on the color difference planes for the R and B component estimations, which avoids the above-mentioned artifacts during the iterative correction step. Tsai and Song (2007) take up the latter idea, but enhance the green plane interpolation procedure: the weights are adapted to the local topology thanks to a preliminary distinction between homogeneous and edge areas.

3.1.5. Local Covariance-based Methods
In his PhD dissertation, Li (2000) presents an interpolation scheme to increase the resolution of a gray-level image. Classical interpolation methods (bilinear and bicubic), based on spatially invariant models, tend to blur transitions and to generate artifacts in high-frequency areas. Li's approach exploits spatial correlation by computing a local level covariance, without relying on directional gradients as the above-mentioned methods in this section do. Beyond resolution enhancement, the author applies this approach to demosaicing (Li and Orchard, 2001). In the CFA image, each of the R, G and B color component planes may be viewed as a sub-sampled version of its respective, fully-populated estimated color plane. According to this consideration, a missing level in a given color plane is interpolated by using a local covariance, preliminarily estimated from the neighboring levels available in the same plane.

The underlying principle of this method may be better understood by considering the resolution enhancement problem first. More precisely, figure 22 illustrates how the resolution of a gray-level image can be doubled thanks to geometric duality, in a two-step procedure. The first step consists in interpolating the level P2i+1,2j+1 (represented by a white dot in figure 22a) from the available levels P2(i+k),2(j+l) (black dots). The following linear combination of the N4′ neighbors is used here:

P̂2i+1,2j+1 = ∑_{k=0}^{1} ∑_{l=0}^{1} α2k+l P2(i+k),2(j+l),   (58)

in which the αm coefficients, 0 ≤ m ≤ 3, of the vector α are computed as follows (see justification and details in Li and Orchard, 2001):

α = A⁻¹ a.   (59)

[Figures 22 and 23 not reproduced here.]
FIG. 22: Geometric duality between the low-resolution covariance and the high-resolution covariance: (a) interpolating lattice P2i+1,2j+1 from lattice P2i,2j; (b) interpolating lattice Pi,j (i + j odd) from lattice Pi,j (i + j even). Black dots are the available levels at low resolution, and the white dot is the considered pixel to be interpolated. In subfigure (b), diamonds represent pixels estimated in the previous step.
FIG. 23: Geometric duality between the covariances used in demosaicing: (a) interpolating G at R or B locations; (b) interpolating R at B locations; (c) interpolating R at G locations. Color dots are the available components in the CFA image, and the white dot is the considered pixel to be interpolated. In subfigures (b) and (c), diamonds represent pixels estimated in the previous step, and spatial coordinates are shifted one pixel right.

This expression incorporates the local covariance matrix A ≜ [Am,n], 0 ≤ m,n ≤ 3, between the four neighboring levels considered pair-wise (e.g. A03 in figure 22a), and the covariance vector a ≜ [am], 0 ≤ m ≤ 3, between the level of the pixel to be estimated and those of its four available neighbors (see figure 22a). (The notations used here differ from those in the original publication – i.e. R and r for the covariances – in order to avoid any confusion.) The main issue is to get these covariances for the high-resolution image from the levels which are available at low resolution. This is achievable by using the geometric duality principle: once the covariance is computed in a local neighborhood of the low-resolution image, the equivalent covariance at high resolution is estimated by geometric duality, which considers pixel pairs in the same direction at both resolutions. Under this duality principle, a0 is for instance estimated by â0, A03 being replaced by Â03 (see figure 22). The underlying assumption in approximating am by âm and Am,n by Âm,n is that the local edge direction is invariant to the image resolution. The second step consists in estimating the remaining unavailable levels, as for the white dot on figure 22b. The interpolation then relies on exactly the same principle as above, except that the available pixel lattice is now the previous one rotated by π/4.

Applying this method to demosaicing is rather straightforward:
1. Fill out the green plane at R and B locations by using:

Ĝ = ∑_{P∈N4} α(P) G(P),   (60)

where the α coefficients are computed according to expression (59) and figure 23a.
2. Fill out the two other color planes, by exploiting the assumption of color difference (R − G and B − G) constancy. For the red plane as an example:
(a) At B locations, interpolate the missing red level as:

R̂ = Ĝ + ∑_{P∈N4′} α(P) (R(P) − Ĝ(P)),   (61)

where the α coefficients are computed according to figure 23b.
(b) At G locations, interpolate the missing red level as:

R̂ = G + ∑_{P∈N4} α(P) (R̆(P) − Ĝ(P)),   (62)

where the α coefficients are computed according to figure 23c, R̆ being a value either available in ICFA or previously estimated.
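The following sketch shows how the α coefficients of equation (59) may be estimated from a low-resolution training window; gathering the samples from the diagonal (N4′) configuration at every interior pixel amounts to the classical least-squares normal equations A = XᵀX, a = Xᵀt, and is our reading of the scheme rather than the authors' exact implementation.

    import numpy as np

    def covariance_coefficients(window):
        """Estimate the four interpolation coefficients alpha = A^{-1} a of
        equation (59) from a float 2-D training window."""
        h, w = window.shape
        X, t = [], []
        for i in range(1, h - 1):
            for j in range(1, w - 1):
                # four diagonal neighbors = low-resolution lattice samples;
                # the center pixel plays the role of the level to estimate
                X.append([window[i - 1, j - 1], window[i - 1, j + 1],
                          window[i + 1, j - 1], window[i + 1, j + 1]])
                t.append(window[i, j])
        X, t = np.asarray(X), np.asarray(t)
        A = X.T @ X                      # local covariance matrix (4 x 4)
        a = X.T @ t                      # covariance vector
        return np.linalg.solve(A, a)     # may fail in homogeneous areas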

Although this method yields satisfying results (see the next subsection), some limits may be pointed out. First, it requires the covariance matrix A to be invertible so that the α coefficients can be computed. Li shows that this condition may not be verified in homogeneous areas of the image. Second, computing the covariance matrices is a computationally expensive task. To overcome these drawbacks, the author proposes a hybrid approach which uses covariance-based interpolation only in edge areas, and a simple method (like bilinear interpolation) in homogeneous areas. This scheme avoids the covariance matrix invertibility issue, while decreasing the computation time – since edge areas generally take up a small part of the whole image. Leitão et al. (2003) observe that this method performs worse in textured areas than in edge areas. They advise, for the covariance estimation, to avoid considering pixels which are too far from the pixel to be interpolated. Asuni and Giachetti (2008) refine the detection of the areas in which the covariance estimation is appropriate for interpolation. These authors also improve the covariance matrix conditioning by adding a constant to pixel levels where they reach very low values. Tam et al. (2009) raise the covariance mismatch problem, which occurs when the geometric duality property is not satisfied, and solve it by extending the covariance matching to multiple directions. Multiple low-resolution training windows are considered, and the one that yields the highest covariance energy is retained to apply the linear interpolation according to the generic equation (58). Lukin and Kubasov (2004) incorporate covariance-based interpolation for the green plane estimation in a demosaicing algorithm combining several other techniques – notably Kimmel's. In addition, they suggest splitting non-homogeneous areas into textured and edge ones. The interpolation step is then achieved specifically for each kind of high-frequency content.

3.1.6. Comparison Between Edge-adaptive Methods
Finally, it is relevant to compare the results achieved by the main propositions exposed above which exploit spatial correlation. The key objective of these methods is to achieve the best possible estimation of the green plane, on which the subsequent estimation of the red and blue planes relies. Hence, we propose to examine the peak signal-to-noise ratio PSNRG (see expression (24)) of the estimated green plane, according to the experimental procedure described on figure 11. Table 2 shows the corresponding results, together with those achieved by bilinear interpolation for comparison. It can be noticed that all the methods based on spatial correlation provide a significant improvement over bilinear interpolation. Among the six tested methods, Cok's (1986) and Li's (2001) estimate the missing green levels by using only the available green CFA samples, like bilinear interpolation; all three generally provide the worst results. The green plane estimation may therefore be improved by using information from the R and B components. In Kimmel's algorithm for instance (1999), the green plane quality is noticeably enhanced, for 10 images out of 12, thanks to corrective iterations based on spectral correlation (see the results of columns Kimmel0 and Kimmel1). From these results, it may be asserted that any efficient demosaicing method should take advantage of both spatial and spectral correlations, simultaneously and for each color plane interpolation.

Both methods proposed by Hamilton and Adams (1997) and by Wu and Zhang (2004) use the same expressions to interpolate green levels, but different rules to select the interpolation direction. A comparison of their respective results (table 2) shows that a careful selection of the interpolation direction is important for the overall performance.

Image     Bilinear   Hamilton   Kimmel0   Kimmel1   Wu        Cok       Li
1         38.982     44.451     40.932    28.244    44.985*   39.320    39.999
2         32.129     37.179     33.991    37.947    39.374*   32.984    34.305
3         37.477     43.161     39.870    38.207    43.419*   38.161    38.780
4         28.279     34.360     31.643    34.673    35.352*   30.420    30.705
5         36.709     42.603     39.291    41.477    43.515*   38.103    38.849
6         33.168     38.148     34.913    38.659    39.176*   33.762    34.354
7         35.682     40.650     37.605    40.978    43.121*   36.734    38.356
8         32.804     39.434     36.261    39.514    40.193*   35.073    35.747
9         35.477     40.544     37.470    39.603    41.013*   36.219    36.656
10        32.512     37.367     34.224    38.342*   38.125    33.117    36.656
11        34.308     38.979     35.934    38.321    39.194*   34.837    35.107
12        30.251     34.451     31.248    35.145    35.943*   30.150    30.173
Average   33.981     39.277     36.115    37.592    40.284*   34.907    35.807

TAB. 2: Peak Signal-to-Noise Ratio (in decibels) of the green plane (PSNRG), estimated by various interpolation methods. For each image, the best result is marked with an asterisk. Tested methods are referred to chiefly by their first author's name: 1. Bilinear interpolation – 2. Hamilton and Adams' gradient-based method (1997) – 3 and 4. Kimmel's adaptive weighted-edge method (1999), before (Kimmel0) and after (Kimmel1) corrective iterations – 5. Wu and Zhang's component-consistent scheme (2004) – 6. Cok's method based on template matching (1986) – 7. Li's covariance-based method (2001).

This is all the more noticeable as, compared to the other algorithms, the computational complexity of both Hamilton and Adams' and Wu and Zhang's methods is rather low. Indeed, they require neither a corrective iteration step nor a covariance matrix estimation step, which are computationally expensive operations.

3.2. Estimated Color Correction
Once the two missing components have been estimated at each pixel, a post-processing step of color correction is often applied to remove artifacts in the demosaiced image. To remove false colors in particular, a classical approach consists in strengthening the spectral correlation between the three estimated color components. Such a goal may be reached first by median filtering, as described below. An iterative update of the initially interpolated colors is also sometimes achieved, as in Kimmel's corrective step (1999) presented in subsection 3.1.4. A still more sophisticated algorithm proposed by Gunturk et al. (2002) is described in detail in the second part of this section. Among other correction techniques for estimated colors, Li (2005) builds a demosaicing scheme by using an iterative approximation strategy with a spatially-adaptive stopping criterion; he also studies the influence of the number of corrective iteration steps on the estimated image quality. Let us also mention here regularization schemes based on the Bayesian framework, such as Markov Random Fields (see e.g. Mukherjee et al., 2001), which are however poorly adapted to real-time implementation.


3.2.1. Median Filtering
One of the most widespread techniques in demosaiced image post-processing is median filtering. Such a filter has been used for years to remove impulse noise in gray-level images, but it also efficiently removes color artifacts without damaging local color variations. Freeman (1988) was the first to take advantage of the median filter to remove demosaicing artifacts. Applied to the estimated planes of the color differences R − G and B − G, this filter noticeably improves the estimation provided by bilinear interpolation. As shown on figure 16d, these planes contain little high-frequency information. False estimated colors, which result from an inconsistency between the local interpolation and those achieved in a neighborhood, may hence be corrected more efficiently on these planes while preserving object edges. Median filtering is implemented in several works of the demosaicing literature. For instance, Hirakawa and Parks (2005) propose to iterate the following correction – without giving more details about the number of iteration steps or the filter kernel size – defined at each pixel as:

R̂′ = Ĝ + M^RG,   (63)
Ĝ′ = ((R̂ + M^GR) + (B̂ + M^GB))/2,   (64)
B̂′ = Ĝ + M^BG,   (65)

where R̂′, Ĝ′ and B̂′ denote the filtered estimated components, and M^kk′ is the output value of the median filter applied on the estimated plane of color differences Îk − Îk′, (k,k′) ∈ {R,G,B}². Lu and Tan (2003) use a slight variant of the latter, but advise to apply it selectively, since median filtering tends to attenuate the color saturation in the estimated image. An appropriate strategy is proposed for the pre-detection of artifact-prone areas, to which median filtering is then solely applied. However, Chang and Tan (2006) notice that median filtering applied to the color difference planes, which still bear some texture around edges, tends to induce "zipper" artifacts in these areas. In order to avoid filtering across edges in the color difference planes, edge areas are preliminarily detected thanks to a Laplacian filter. Some artifacts may however remain in the median-filtered image, which is mainly due to the separate filtering of the color difference planes (R − G and B − G). An alternative may be to apply a vector median filter on the estimated color image while exploiting spectral correlation. The local output of such a filter is the color vector which minimizes the sum of distances to all the other color vectors in the considered neighborhood. But according to Lu and Tan (2003), the vector filter brings little superiority – if any – in artifact removal, compared with the median filter applied to each color difference plane. The authors' justification is that the estimation errors may be considered as additive noise which corrupts each color plane. These noise vector components are loosely correlated. In such conditions, Astola et al. (1990) show that vector median filtering does not achieve better results than marginal filtering on the color difference planes.
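A sketch of one pass of the correction of equations (63)-(65) is given below, using scipy's median filter; the 3 × 3 kernel size is an assumption, since the authors do not specify it.

    import numpy as np
    from scipy.ndimage import median_filter

    def median_correction(R, G, B, size=3):
        """One pass of equations (63)-(65) on the float estimated planes."""
        m_rg = median_filter(R - G, size)    # median of the R - G plane
        m_gr = median_filter(G - R, size)
        m_gb = median_filter(G - B, size)
        m_bg = median_filter(B - G, size)
        r_new = G + m_rg                           # equation (63)
        g_new = 0.5 * ((R + m_gr) + (B + m_gb))    # equation (64)
        b_new = G + m_bg                           # equation (65)
        return r_new, g_new, b_new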


3.2.2. Alternating Projection Method
As previously mentioned in section 2.2.3, pixel levels bear a strong spectral correlation in the high spatial frequency areas of a natural color image. From this observation, Gunturk et al. (2002) aim at increasing the correlation of the high-frequency information between the estimated R̂, Ĝ and B̂ component planes, while keeping the CFA image data. These two objectives are enforced by using two convex constraint sets, onto which the algorithm alternately projects the estimated data. The first set is named "Observation" and ensures that the interpolated data are consistent with those available in the CFA image. The second set, named "Detail", is based on a decomposition of each R, G and B plane into four frequency subbands thanks to a filterbank approach. A filterbank is a set of passband filters which decompose (analyze) the input signal into several subbands, each one carrying the original signal information in a particular frequency subband. Conversely, a signal may be reconstructed (synthesized) in a filterbank by recombination of its subbands.

The algorithm uses an initially estimated image as its starting point; it may hence be considered as a – sophisticated – refinement scheme. To get the initial estimation Î0, any demosaicing method is suitable. The authors suggest using Hamilton and Adams' scheme to estimate the green plane Î0^G, and a bilinear interpolation to get the red Î0^R and blue Î0^B planes. Two main steps are then achieved, as illustrated on figure 24a:
1. Update the green plane by exploiting the high-frequency information of the red and blue planes. This enhances the initial green component estimation.
(a) Use the available red levels of the CFA image (or Î0^R) to form a downsampled plane I0^R of size X/2 × Y/2, as illustrated on figure 24b.
(b) Sample, at the same R locations, green levels from the initial estimation Î0^G to form a downsampled plane Î0^G(R), also of size X/2 × Y/2.
(c) Decompose the downsampled plane I0^R into four subbands:

I0^R,LL(x,y) = h0(x) ∗ [h0(y) ∗ I0^R](x,y),   (66)
I0^R,LH(x,y) = h0(x) ∗ [h1(y) ∗ I0^R](x,y),   (67)
I0^R,HL(x,y) = h1(x) ∗ [h0(y) ∗ I0^R](x,y),   (68)
I0^R,HH(x,y) = h1(x) ∗ [h1(y) ∗ I0^R](x,y),   (69)

and do the same with the plane Î0^G(R). In their proposition, Gunturk et al. use a low-pass filter H0(z) and a high-pass filter H1(z) to analyze each plane in low and high frequencies, respectively, as described above in subsection 2.2.1.

(d) Use the low-frequency subband (LL) of Î0^G(R) and the three subbands of I0^R with high frequencies (LH, HL and HH) to synthesize a re-estimated downsampled green plane Ĩ0^G(R):

Ĩ0^G(R)(x,y) = g0(x) ∗ [g0(y) ∗ Î0^G(R),LL](x,y) + g0(x) ∗ [g1(y) ∗ I0^R,LH](x,y) + g1(x) ∗ [g0(y) ∗ I0^R,HL](x,y) + g1(x) ∗ [g1(y) ∗ I0^R,HH](x,y).   (70)

The filters G1(z) and G0(z) used for this synthesis have impulse responses g1 = [1 2 −6 2 1]/8 and g0 = [−1 2 6 2 −1]/8, respectively.
(e) Apply the above instructions (a)-(d) similarly on the blue plane Î0^B, which yields a second re-estimated downsampled green plane Ĩ0^G(B).
(f) Insert these two re-estimated downsampled green planes at their respective locations in the plane Î0^G (i.e. Ĩ0^G(R) at R locations, and Ĩ0^G(B) at B locations, as illustrated on figure 24c). A new full-resolution green plane Î1^G is obtained, which forms an intermediate estimated color image Î1 together with the planes Î0^R and Î0^B from the initial estimation.
2. Update the red and blue planes by alternating projections.
(a) Projection onto the "Detail" set: this step ensures that the high-frequency information is consistent between the three color planes, while preserving as many details as possible in the green plane. To achieve this, a) analyze the three color planes Î1^R, Î1^G and Î1^B of the intermediate image Î1 into four subbands by using the same filterbank as previously (composed of H0(z) and H1(z)); b) use the low-frequency subband of the red plane and the three high-frequency subbands of the green plane to synthesize a re-estimated red plane Ĩ1^R, similarly to equation (70); at last, c) repeat the same operations on the blue plane to estimate Ĩ1^B.
(b) Projection onto the "Observation" set: this step ensures that the estimated values are consistent with those available ("observed") in the CFA. The latter are simply inserted in the re-estimated planes Ĩ1^R and Ĩ1^B at the corresponding locations, as illustrated on figure 24d.
(c) Repeat the above instructions (a) and (b) several times (the authors suggest eight iterations).

[Figure 24 not reproduced here.]
FIG. 24: Demosaicing procedure proposed by Gunturk et al. (2002) from an initial estimation Î0. (a) Procedure outline. G channel update: ① extraction of the downsampled X/2 × Y/2 planes (see details on figure 24b) – ② subband analysis of each downsampled plane – ③ synthesis of the re-estimated downsampled green planes Ĩ0^G(R) and Ĩ0^G(B) at R and B locations – ④ insertion of these planes into Î0^G (see details on figure 24c). Alternating projection of the R and B components: ⑤ subband analysis of the intermediate estimation Î1 planes – ⑥ synthesis of the re-estimated red and blue planes – ⑦ projection of these planes onto the "Observation" constraint set (see details on figure 24d). (b) Extraction of the downsampled planes from the initial estimation. (c) Insertion of the re-estimated downsampled green planes into Î0^G. (d) Projection of the re-estimated red and blue planes onto the "Observation" set.

In short, the high-frequency subbands at red and blue CFA locations are used first to refine the initial estimation of the green color plane. The high-frequency information of the red and blue planes is then determined by using the green plane details, so as to remove color artifacts. This method achieves excellent results, and is often considered as a reference in demosaicing benchmarks. However, its computation cost is rather high, and its performance depends on the quality of the initial estimation Î0. A non-iterative implementation of this algorithm has recently been proposed (Lu et al., 2009), which achieves the same results as alternating projection at convergence, but about eight times faster.

(f) Insert these two re-estimated downsampled estimations of the green plane G(R) G(B) at their respective locations in plane Iˆ0G (i.e. I˜0 at R locations, and I˜0 at B locations, as illustrated on figure 24c). A new full-resolution green plane Iˆ1G is obtained, which forms an intermediate estimated color image Iˆ1 together with planes Iˆ0R and Iˆ0B from the initial estimation. 2. Update red and blue planes by alternating projections. (a) Projection onto the “Detail” set : this step insures that high-frequency information is consistent between the three color planes, while preserving as much details as possible in the green plane. To achieve this, a) analyze the three color planes Iˆ1R , Iˆ1G and Iˆ1B of the intermediate image Iˆ1 into four subbands by using the same filterbank as previously (composed of H0 (z) and H1 (z)) ; b) use the low-frequency subband of the red plane and the three high-frequency subbands of the green plane to synthesize a re-estimated red plane I˜1R , similarly to equation (70). At last, c) repeat the same operations on the blue plane to estimate I˜1B . (b) Projection onto the “Observation” set : this step insures that estimated values are consistent with the ones available (“observed”) in the CFA. The latter are simply inserted in re-estimated planes I˜1R and I˜1B at corresponding locations, as illustrated on figure 24d. (c) Repeat above instructions (a) and (b) several times (the authors suggest to use eight iterations). In short, high-frequency subbands at red and blue CFA locations are used first to refine the initial estimation of green color plane. The high-frequency information of red and blue planes is then determined by using green plane details so as to remove color artifacts. This method achieves excellent results, and is often considered as a reference in demosaicing benchmarks. However, its computation cost is rather high, and its performance depends on the quality of initial estimation Iˆ0 . A non-iterative implementation of this algorithm has been recently proposed (Lu et al., 2009), which achieves the same results as alternating projection at convergence, but at about height times faster speed. Chen et al. (2008) exploit both subband channel decomposition and median filtering : a median filter is applied on the difference planes IˆR,LL − IˆG,LL and IˆB,LL − IˆG,LL 45

of the low-frequency subbands. The components are updated thanks to the formulas proposed by Hirakawa and Parks (see equations (63) to (65)), but on each low-frequency subband. The high-frequency subbands are not filtered, in order to preserve spectral correlation. The final estimated image is synthesized from the four frequency subbands, as in the alternating projection scheme of Gunturk et al. Compared to the latter, median filtering mainly improves the demosaicing result on the chrominance planes.

Menon et al. (2006) notice that Gunturk et al.'s method tends to generate zipper effect along object boundaries. To avoid such an artifact, a corrective technique is proposed, which uses the same subband decomposition principle but pre-determines the local edge direction (horizontal or vertical) on the estimated green plane. The authors suggest using this particular direction to correct the green levels, by replacing their high-frequency components with those of the available component (R or B) at the considered pixel. As the same direction is used to correct the estimated R̂ and B̂ levels at G locations on the color difference planes, this technique ensures the interpolation direction consistency between color components, which has been shown to be important in subsection 3.1.2.

3.3. Demosaicing using the Frequency Domain
Some recent demosaicing schemes rely on a frequency analysis, following an approach originated by Alleysson et al. (2005). The fundamental principle is to use a frequency representation of the Bayer CFA image. (Let us make clear here that frequency – i.e. spatial frequency, expressed in cycles per pixel – corresponds to the inverse number of adjacent pixels representing a given level series along a particular direction in the image, classically the horizontal or vertical direction.) In the spatial frequency domain, such a CFA image may be represented as a combination of a luminance signal and two chrominance signals, all three being well localized. An appropriate frequency selection therefore allows each of these signals to be estimated, from which the demosaiced image can be retrieved. Notice that frequency-based approaches do not use Bayer's assumption that assimilates green levels to luminance, and blue and red levels to chrominance components.



f k (x,y)mk (x,y) ,

(71)

k=R,G,B

4 Let

us make here clear that frequency (i.e. spatial frequency), expressed in cycles per pixel, corresponds to the inverse number of adjacent pixels representing a given level series according to a particular direction in the image (classically , the horizontal or vertical direction).

46

where m^k(x,y) is the sampling function for the color component k, k ∈ {R,G,B}. For the Bayer CFA of figure 9, this set of functions is defined as:

m^R(x,y) = (1 − (−1)^x)(1 + (−1)^y)/4,   (72)
m^G(x,y) = (1 + (−1)^{x+y})/2,   (73)
m^B(x,y) = (1 + (−1)^x)(1 − (−1)^y)/4.   (74)

With the definition

[f^L, f^C1, f^C2]ᵀ ≜ M [f^R, f^G, f^B]ᵀ,  where  M = [ 1/4  1/2  1/4 ; −1/4  1/2  −1/4 ; −1/4  0  1/4 ],

the expression of f^CFA becomes:

f^CFA(x,y) = f^L(x,y) + f^C1(x,y)(−1)^{x+y} + f^C2(x,y)((−1)^x − (−1)^y)
           = f^L(x,y) + f^C1(x,y)e^{j2π(x+y)/2} + f^C2(x,y)(e^{j2πx/2} − e^{j2πy/2}).   (75)

F CFA (u,v) = F L (u,v) + F C1 (u − 0.5,v − 0.5) + F C2 (u − 0.5,v) − F C2 (u,v − 0.5), (77) expression in which terms are, respectively, the Fourier transforms of f L (x,y), of f C1 (x,y)(−1)x+y , △



and of the two signals defined as f^C2a(x,y) ≜ f^C2(x,y)(−1)^x and f^C2b(x,y) ≜ −f^C2(x,y)(−1)^y. It turns out that the energy of a CFA image is concentrated in nine zones of the frequency domain (see the example of figure 25), centered on spatial frequencies according to equation (77): the energy of the luminance F^L(u,v) is mainly concentrated at the center of this domain (i.e. at low frequencies), whereas that of the chrominance is located on its border (i.e. at high frequencies). More precisely, the energy of F^C1(u − 0.5, v − 0.5) is located around the diagonal zones (the "corners" of the domain), that of F^C2(u − 0.5, v) along the u axis of horizontal frequencies, and that of F^C2(u, v − 0.5) along the v axis of vertical frequencies.

[Figure 25 not reproduced here.]
FIG. 25: Localization of the energy (Fourier transform modulus) of a CFA signal in the frequency domain (Alleysson et al., 2005): (a) "Lighthouse" CFA image; (b) normalized energy (frequencies in cycles per pixel).

These zones are quite distinct, so that isolating the corresponding frequency components is possible by means of appropriately designed filters. But their bandwidths should be carefully selected, since the spectra of the three functions mutually overlap. In the frequency zones where luminance and chrominance cannot be properly separated, the aliasing phenomenon might occur and color artifacts may be generated. In order to design filter bandwidths which achieve the best possible separation of the luminance (L) and the chrominance (C1, C2), Dubois (2005) proposes an adaptive algorithm that mainly handles the spectral overlap between the chrominance and the high-frequency luminance components. The author observes that the spectral overlap between luminance and chrominance chiefly occurs along either the horizontal or the vertical axis. Hence he suggests estimating f^C2 by giving more weight to the sub-component of C2 (C2a or C2b) that is least prone to spectral overlap with the luminance. The implemented weight values are based on an estimation of the average directional energies, for which Gaussian filters (with standard deviation σ = 3.5 pixels, modulated at the spatial frequencies (0, 0.375) and (0.375, 0) cycles per pixel) are applied to the CFA image.

3.3.2. Demosaicing by Joint Frequency and Spatial Analyses
Frequency selection is also a key feature used by Lian et al. (2007), who propose a hybrid method based on an analysis of both the frequency and spatial domains. They state that the filter used by Alleysson et al. for the luminance estimation may not be optimal. Moreover, since the parameters defining its bandwidth (see figure 26a) depend on the image content, they are difficult to adjust (Lian et al., 2005). Although low-pass filtering the CFA image allows the luminance component to be extracted, it removes

the high-frequency information along the horizontal and vertical directions. As the human eye is highly sensitive to the latter, such a loss is prejudicial to the estimation quality. Lian et al. then notice that the F^C2 components in the horizontal and vertical directions have the same amplitudes but opposite signs (we keep here the notations used by Alleysson et al. for C1 and C2, although they are switched by Lian et al.). Consequently, the luminance spectrum F^L at G locations is obtained as the CFA image spectrum from which the C1 ("corner") component has been removed (see details in Lian et al., 2007). A low-pass filter is proposed for this purpose, which cancels C1 while preserving the high-frequency information along the horizontal and vertical axes. This filter is inspired from Alleysson et al.'s, reproduced on figure 26a, but its bandpass is designed to remove the C1 component only (see figure 26b). The main advantage of this approach is that the luminance spectrum L bears less overlap with the spectrum of C1 than with that of C2 (see the example of figure 25b), which makes the filter design easier.

[Figure 26 not reproduced here.]
FIG. 26: Filters (bandwidth and spectrum) used to estimate the luminance, as proposed by (a) Alleysson et al. (2005) and (b) Lian et al. (2007) (filter used at G locations).

From these observations, Lian et al. propose a demosaicing scheme with three main steps (see figure 27):
1. Estimate the luminance (denoted as L̂) at G locations, by applying a low-pass filter on the CFA image to remove C1. Practically, the authors suggest using the following 5 × 5 kernel, which gives very good results at a low computational cost:

H = (1/64) × [  0   1  −2   1   0 ;
                1  −4   6  −4   1 ;
               −2   6  56   6  −2 ;   (78)
                1  −4   6  −4   1 ;
                0   1  −2   1   0 ]

2. Estimate the luminance at R and B locations by a spatial analysis. As isolating the spectrum of the component C2 is rather difficult, the authors suggest an adaptive algorithm based on the color difference constancy (exploiting spectral correlation) and on an adaptive weighted-edge linear interpolation (exploiting spatial correlation):

(a) Pre-estimate the R and B components at G locations, by simply averaging the levels of the two neighboring pixels at which the considered component is available.
(b) Estimate the luminance at R and B locations by applying, on the component difference plane L − R or L − B, a weighted interpolation adapted to the local level transition. For instance, the luminance L̂ at R locations is estimated as follows:

L̂ = R + (∑_{P∈N4} w(P)·(L̂(P) − R̂(P))) / (∑_{P∈N4} w(P)).   (79)

For the same {GRG} CFA structure, the weights w(P) ≡ wδx,δy are expressed by using the relative coordinates Pδx,δy of the neighboring pixel as:

wδx,δy = 1 / (1 + |R0,0 − R2δx,2δy| + |L̂δx,δy − L̂−δx,−δy|),   (80)

which achieves an adaptive weighted-edge interpolation, as in Kimmel's method (see section 3.1.4).
(c) Repeat the previous steps to refine the estimation: a) re-estimate the R component (then B similarly) at G locations, by averaging the L − R levels at the neighboring R locations; b) re-estimate L at R (then B) locations according to equation (79) (the weights w(P) remaining unchanged).
3. From the fully-populated luminance plane ÎL, estimate the two missing components at each pixel of the CFA image by using bilinear interpolation:

Î^k(x,y) = ÎL(x,y) + (H^k ∗ ϕ^k(ICFA − ÎL))(x,y),   (81)

where ϕ k (I)(x,y), k ∈ {R,G,B} is the plane defined by expression (7) and shown on figure 12, and where convolution kernels H k which achieve bilinear interpolation are defined by expressions (8) and (9)6 . The above approach does not require to design specific filters in order to estimate C1 and C2 components, as do methods using the frequency domain only (Dubois uses for instance complementary asymmetric filters). Lian et al. show that their method globally outperforms other demosaicing schemes according to MSE (or PSNR) criterion. The key advantage seems to lie in exploiting the frequency domain at G locations only. According to results presented by Lian et al. (2007), luminance estimations are less error-prone than green level estimations provided by methods which chiefly scan the spatial image plane (shown in table 2). 3.4. Conclusion An introduction to the demosaicing issue and to its major solutions has been exposed in the above section. After having described why such a processing task is required in mono-CCD color cameras, the various CFA solutions have been presented. Focusing on the Bayer CFA, we have detailed the formalism in use throughout the paper. The simple bilinear interpolation has allowed us to introduce both artifact generation that demosaicing method have to overcome, and two major rules widely used in the proposed approaches : spatial and spectral correlations. The vast majority of demosaicing methods strive to estimate the green plane first, which bear the most high-frequency information. The quality of this estimation strongly influences that of red and blue planes. When exploiting spatial correlation, we experimentally show that a correct selection of the interpolation direction is crucial to reach a high interpolation quality for green levels. Moreover, component-consistent directions should be enforced in order to avoid color artifact generation. Spectral correlation is often taken into account by interpolating on the difference, rather than ratio, of component planes. An iterative post-processing step of color correction is often achieved, so as to improve the final result quality by reinforcing spectral correlation. Demosaicing methods may exploit spatial and/or frequency domains. The spatial domain has been historically used first, and many studies are based on it. More recently, authors exploit the frequency domain, which opens large perspectives. Such approaches indeed allow to avoid using – at least partially or in a first step – the heuristic rule of color difference constancy to take spectral correlation into account. In all cases where such assumptions are not fulfilled, even locally, exploiting the frequency domain is an interesting solution. Dubois foresaw several years ago (2005) that frequency selection approaches are preeminently promising. This will be corroborated is the next sections, dedicated to the objective quality evaluation of images demosaiced by the numerous presented methods. Already mentioned criteria (MSE and PSNR) will be completed by measures suited to human color perception, and new specific ones dedicated to the local detection of demosaicing artifacts.

6 Notice that ϕ k (I) may equally be expressed as ϕ k (I)(x,y) = I(x,y)mk (x,y), where sampling functions mk are defined by (72) to (74).

51

4. Objective Evaluation Criteria for Demosaiced Images 4.1. Introduction The performances reached by different demosaicing schemes applied to the same CFA image can be very different. Indeed, different kinds of artifacts which alter the image quality, can be generated by demosaicing schemes. A description of these artifacts is given in subsection 4.2. Measuring the performance reached by a demosaicing scheme requires to evaluate the quality of its output image. Indeed, such a measurement helps to compare the performances of the different schemes. For this purpose, we always follow the same experimental procedure (see figure 11). First, we simulate the color sampling by keeping only one out of the three color components at each pixel of the original image I, according to the Bayer CFA mosaic. Then, we apply the considered demosaicing scheme to obtain the estimated color image Iˆ (hereafter called demosaiced image) from the CFA samples. Finally, we measure the demosaicing quality by comparing the original and demosaiced images. The main strategy of objective comparison is based on error estimation between the original and demosaiced images. In subsection 4.3, we present the most used criteria for objective evaluation of the demosaiced image. The objective criteria are generally based on a pixel-wise comparison between the original and the estimated colors. These fidelity criteria are not specifically sensitive to one given artifact. Hence, in subsection 4.5, we present new measurements which quantify the occurrences of demosaicing artifacts. Since demosaicing methods intend to produce “perceptually satisfying” images, the most widely used evaluation criteria are based on the fidelity to the original images. Rather than displaying images, our goal is to apply automatic image analysis procedures to the demosaiced images in order to extract features. These extracted features are mostly derived from either colors or detected edges in the demosaiced images. Since the quality of features is sensitive to the presence of artifacts, we propose to quantify the demosaicing performance by measuring the rates of erroneously detected edge pixels. This evaluation scheme is presented in the last subsection. 4.2. Demosaicing Artifacts The main artifacts caused by demosaicing are blurring, false colors and zipper effect. In this part, we present those artifacts on examples and explain their causes by considering the spatial and frequency domains. 4.2.1. Blurring Artifact Blurring is located in areas where high frequency information, representing precise details or edges, is altered or erased. Figure 28, illustrates different blurring levels according to the used demosaicing scheme. A visual comparison between the original image 28b and image 28c which has been demosaiced by bilinear interpolation, shows that this scheme causes severe blurring. Indeed, some details of the parrot plumage are not retrieved by demosaicing and blurring is generated by low-pass filtering. As stated in section 2.2.1, this interpolation can be achieved by a convolution applied to


each sampled color component plane (see expression (10)). The corresponding filters, whose masks H^k are given by expressions (8) and (9), reduce high frequencies. Hence, fine details may not be properly estimated in the demosaiced image (see figure 28c). This artifact is less visible in image 28d, which has been demosaiced by Hamilton and Adams' scheme (1997). A visual comparison with image 28c shows that this scheme, presented in section 3.1.1, generates a small amount of visible blurring. It first estimates vertical and horizontal gradients, then interpolates the green levels along the direction with the lowest gradient module, i.e. by using levels that are as homogeneous as possible. This selection of the neighbors used to interpolate the missing green level at a given pixel tends to avoid blurring.

FIG. 28: Blurring in the demosaiced image. Image (b) is an extract from the original image (a), located by a black box. Images (c) and (d) are the corresponding extracts of the images demosaiced by bilinear interpolation and by Hamilton and Adams' (1997) scheme, respectively.

4.2.2. Zipper Effect

Let us examine figure 29, and more precisely images 29b and 29d, which are extracted from the original "Lighthouse" image 29a. Images 29c and 29e are the corresponding extracts from the demosaicing result of Hamilton and Adams' scheme (1997). In image 29e, one can notice repetitive patterns in transition areas between homogeneous ones. This phenomenon is called the zipper effect. The main reason for this artifact is the interpolation of levels which belong to homogeneous areas representing different objects. It occurs at each pixel where the interpolation direction (horizontal or vertical) is close to that of the color gradient computed in the original image. Image 29c does not contain any zipper effect, since the interpolation direction is overall orthogonal to that of the color gradient, hence close to the transition direction between homogeneous areas. Conversely, image 29e contains a strong zipper effect. In this area with high spatial frequencies along the horizontal direction, the scheme often fails to determine the correct gradient direction (see section 3.1.2 and figure 19). The other main reason is related to the arrangement, in the CFA image, of the pixels whose green level is not available. Indeed, these pixels, where the green levels can be erroneously estimated, are arranged in staggered locations.

FIG. 29: Zipper effect due to erroneous selection of the interpolation direction. Images (b) and (d) are two extracts from the original image (a), located by black boxes. Images (c) and (e) are the corresponding extracts from the image (f) demosaiced by Hamilton and Adams' scheme (1997).

4.2.3. False Colors

A false color at a pixel corresponds to a large distance between the original color and the estimated one in the RGB acquisition color space. Figures 30c and 31c show that this phenomenon is not characterized by a specific geometrical structure in the image. Incorrect estimation of the color components may cause perceptible false colors, in particular in areas with high spatial frequencies.

FIG. 30: False colors on a diagonal detail. Image (b) is an extract from the original image (a), located by a black box. Image (c), on which artifacts are circled in black, is the corresponding extract from image (d) demosaiced by Hamilton and Adams' scheme (1997).


FIG. 31: False colors generated on a textured area. (b) Extract from the original image (a), located by a black box. (c) Extract demosaiced by Wu and Zhang's scheme (2004), with artifacts circled in black.

4.2.4. Artifacts Described in the Frequency Domain

The representation of the CFA color samples in the frequency domain, proposed by Alleysson et al. (2005), also makes it possible to explain why artifacts are generated by demosaicing schemes. As seen in section 3.3.1, the CFA image signal is made up of a luminance signal, mainly modulated at low spatial frequencies, and of two chrominance signals, mainly modulated at high frequencies (see figure 25, page 48). Therefore, demosaicing can be considered as an estimation of luminance and chrominance components. Several schemes which analyze the frequency domain (Alleysson et al., 2005; Dubois, 2005; Lian et al., 2007) estimate the missing levels by selective filters applied to the CFA image.

The four possible artifacts caused by frequency analysis are shown in figure 32, extracted from (Alleysson et al., 2005): excessive blurring, grid effect, false colors and watercolor. When the bandwidth of the filter applied to the CFA image to estimate the luminance is too narrow, excessive blurring occurs in the demosaiced image (see figure 32a). When the bandwidth of this filter is too wide, it may select high frequencies in chrominance zones. Such a case can result in a grid effect, especially visible in flat (homogeneous) areas of the image (see figure 32b). Moreover, false colors appear when the chrominance filters overlap with the luminance filter in the frequency domain (see figure 32c). Finally, when the chrominance filter is too narrow, a watercolor effect may appear, as colors are "spread beyond" the edges of an object (see figure 32d).

These artifacts are caused by a poor design of the selective filters used to estimate luminance and chrominance. They can also be generated by demosaicing methods which spatially scan the image. Indeed, several spatial demosaicing schemes generate blurring and false colors since they tend to under-estimate luminance and over-estimate chrominance. Kimmel's (1999) and Gunturk et al.'s (2005) schemes also generate grid effect and watercolor.

FIG. 32: Four kinds of artifacts caused by demosaicing (Alleysson et al., 2005): (a) blurring, (b) grid effect, (c) false color, (d) watercolor.


4.3. Classical Objective Criteria

All the described artifacts are due to errors in color component estimation. The classical objective evaluation criteria sum up the errors between levels in the original and demosaiced images. At each pixel, the error between the original and demosaiced images is quantified by a distance between two color points in a three-dimensional color space (Busin et al., 2008). In this subsection, we regroup the most widely used measurements into two categories, namely fidelity and perceptual criteria.

4.3.1. Fidelity Criteria

These criteria use colors coded in the RGB acquisition color space in order to estimate the fidelity of the demosaiced image compared with the original image.

1. Mean Absolute Error. This criterion evaluates the mean absolute error between the original image I and the demosaiced image Î. Denoted by MAE, it is expressed as (Chen et al., 2008; Li and Randhawa, 2005):

MAE(I, Î) = 1/(3XY) · Σ_{k=R,G,B} Σ_{x=0}^{X−1} Σ_{y=0}^{Y−1} |I^k_{x,y} − Î^k_{x,y}|,    (82)

where I^k_{x,y} is the level of the color component k at the pixel whose spatial coordinates are (x,y) in the image I, and X and Y are respectively the numbers of columns and rows of the image. The MAE criterion can also be used to measure the estimation errors of a specific color component. For example, this criterion is evaluated on the red color plane as:

MAE^R(I, Î) = 1/(XY) · Σ_{x=0}^{X−1} Σ_{y=0}^{Y−1} |I^R_{x,y} − Î^R_{x,y}|.    (83)

MAE values range from 0 to 255, and the lower the value, the better the demosaicing quality.

2. Mean Square Error. This criterion measures the mean quadratic error between the original image and the demosaiced image. Denoted by MSE, it is defined as (Alleysson et al., 2005):

MSE(I, Î) = 1/(3XY) · Σ_{k=R,G,B} Σ_{x=0}^{X−1} Σ_{y=0}^{Y−1} (I^k_{x,y} − Î^k_{x,y})².    (84)

The MSE criterion can also measure the error on each color plane, as in equation (23). The optimal demosaicing quality is reached when MSE is equal to 0, whereas the worst is measured when MSE is close to 255².

3. Peak Signal-to-Noise Ratio. The PSNR criterion is a widely used distortion measurement to estimate the quality of image compression. Many authors (e.g. Alleysson et al., 2005; Hirakawa and Parks, 2005; Lian et al., 2007; Wu and Zhang, 2004) use this criterion to quantify the performance reached by demosaicing schemes. The PSNR is expressed in decibels as:

PSNR(I, Î) = 10 · log₁₀( d² / MSE(I, Î) ),    (85)

where d is the maximum color component level. When the color components are quantized with 8 bits, d is set to 255. Like the preceding criteria, the PSNR can be applied to a specific color plane. For the red color component, it is defined as:

PSNR^R(I, Î) = 10 · log₁₀( d² / MSE^R(I, Î) ).    (86)

The higher the PSNR value, the better the demosaicing quality. The PSNR measured on demosaiced images generally ranges from 30 to 40 dB (i.e. MSE ranges from 65.03 down to 6.50).

4. Correlation. A correlation measurement between the original image and the demosaiced image is used by Su and Willis (2003) to quantify the demosaicing performance. The correlation criterion between two gray-level images I and Î is expressed as:

C(I, Î) = ( Σ_{x=0}^{X−1} Σ_{y=0}^{Y−1} I_{x,y} Î_{x,y} − XY µ µ̂ ) / ( [ Σ_{x=0}^{X−1} Σ_{y=0}^{Y−1} I²_{x,y} − XY µ² ]^{1/2} · [ Σ_{x=0}^{X−1} Σ_{y=0}^{Y−1} Î²_{x,y} − XY µ̂² ]^{1/2} ),    (87)

where µ and µ̂ are the mean gray levels of the two images. When a color demosaiced image is considered, one estimates the correlation level C^k(I^k, Î^k), k ∈ {R,G,B}, between the original and demosaiced color planes. The mean of the three correlation levels is used to measure the quality of demosaicing. The correlation levels C range between 0 and 1, and a measurement close to 1 can be considered as indicating a satisfying demosaicing quality.
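These four fidelity criteria translate directly into a few lines of NumPy. The sketch below is a minimal implementation of equations (82), (84), (85) and (87); color images are assumed to be arrays of shape (Y, X, 3) with 8-bit components, and the function names are ours.

```python
import numpy as np

def mae(img, est):
    """Mean absolute error over the three color planes, Eq. (82)."""
    return np.mean(np.abs(img.astype(float) - est.astype(float)))

def mse(img, est):
    """Mean square error, Eq. (84)."""
    return np.mean((img.astype(float) - est.astype(float)) ** 2)

def psnr(img, est, d=255.0):
    """Peak signal-to-noise ratio in decibels, Eq. (85)."""
    return 10.0 * np.log10(d ** 2 / mse(img, est))

def correlation(plane, plane_est):
    """Correlation between two gray-level planes, Eq. (87)."""
    x = plane.astype(float).ravel()
    y = plane_est.astype(float).ravel()
    num = np.sum(x * y) - x.size * x.mean() * y.mean()
    den = (np.sqrt(np.sum(x ** 2) - x.size * x.mean() ** 2)
           * np.sqrt(np.sum(y ** 2) - y.size * y.mean() ** 2))
    return num / den
```

The per-plane variants of equations (83) and (86) are obtained by passing a single color plane, e.g. psnr(img[..., 0], est[..., 0]) for the red component.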

4.3.2. Perceptual Criteria

The preceding criteria are not well consistent with the quality estimation provided by the human visual system. That is the reason why new measurements have been defined, which operate in perceptually uniform color spaces (Chung and Chan, 2006).

1. Estimation Error in the CIE L*a*b* color space. The CIE L*a*b* color space is recommended by the International Commission on Illumination to measure the distance between two colors (Busin et al., 2008). This space is close to a perceptually uniform color space, which has not been completely defined yet. Hence, the Euclidean distance in the CIE L*a*b* color space is a perceptual distance between two colors. The three color components (R,G,B) at a pixel are first transformed into (X,Y,Z) components according to a linear CIE XYZ transform. Then, the CIE L*a*b* color components are expressed as:

L* = 116 × (Y/Yn)^{1/3} − 16    if Y/Yn > 0.008856,    (88a)
L* = 903.3 × Y/Yn    otherwise,    (88b)

a* = 500 × ( f(X/Xn) − f(Y/Yn) ),    (89)
b* = 200 × ( f(Y/Yn) − f(Z/Zn) ),    (90)

with:

f(x) = x^{1/3}    if x > 0.008856,    (91a)
f(x) = 7.787x + 16/116    otherwise,    (91b)

where the reference white used is characterized by the color components (Xn, Yn, Zn). We can notice that L* represents the eye response to a specific luminance level, whereas the a* and b* components correspond to chrominance. The component a* represents an opposition of colors Red–Green, and b* corresponds to an opposition of colors Blue–Yellow. The color difference is defined as the distance between two color points in this color space. The estimation error caused by demosaicing is then the mean error computed over all image pixels:

∆E^{L*a*b*}(I, Î) = 1/(XY) · Σ_{x=0}^{X−1} Σ_{y=0}^{Y−1} √( Σ_{k=L*,a*,b*} (I^k_{x,y} − Î^k_{x,y})² ).    (92)

The lower ∆E^{L*a*b*} is, the lower the perceptual difference between the original and demosaiced images, and the higher the demosaicing quality.

2. Estimation Error in the S-CIE L*a*b* color space. In order to introduce the spatial perception properties of the human visual system, Zhang and Wandell (1997) propose a new perceptual color space, called S-CIE L*a*b*. The color components (R,G,B) are first transformed into the XYZ color space, which does not depend on the acquisition device. Then, these color components are converted into the antagonist color space AC1C2, where A represents the perceived luminance, and C1, C2 the chrominance information in terms of oppositions of colors Red–Green and Blue–Yellow, respectively. The three component planes are then separately filtered by Gaussian filters with specific variances, which approximate the contrast sensitivity functions of the human visual system. The three filtered components A, C1 and C2 are converted back into (X,Y,Z) components, which are then transformed into the CIE L*a*b* color space thanks to equations (88) and (89). Once the color components L*, a* and b* have been computed, the estimation error ∆E in S-CIE L*a*b* is defined by equation (92). This measurement was used by Li (2005), Su (2006) and Hirakawa and Parks (2005) to measure the demosaicing quality.
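A minimal sketch of the ∆E^{L*a*b*} computation of equation (92) follows. The RGB-to-XYZ matrix assumes linear sRGB primaries under the CIE D65 illuminant; as noted above, the actual acquisition transform and reference white may differ, so both are assumptions here.

```python
import numpy as np

# Linear RGB -> XYZ for sRGB primaries under D65 (an assumption).
M_RGB2XYZ = np.array([[0.4124, 0.3576, 0.1805],
                      [0.2126, 0.7152, 0.0722],
                      [0.0193, 0.1192, 0.9505]])
D65_WHITE = (0.9505, 1.0, 1.089)

def f(t):
    """Nonlinearity of Eq. (91)."""
    return np.where(t > 0.008856, np.cbrt(t), 7.787 * t + 16.0 / 116.0)

def rgb_to_lab(rgb, white=D65_WHITE):
    """Eqs. (88)-(91): 8-bit RGB image to CIE L*a*b*."""
    xyz = (rgb.astype(float) / 255.0) @ M_RGB2XYZ.T
    xr, yr, zr = (xyz[..., i] / white[i] for i in range(3))
    L = np.where(yr > 0.008856, 116.0 * np.cbrt(yr) - 16.0, 903.3 * yr)
    a = 500.0 * (f(xr) - f(yr))
    b = 200.0 * (f(yr) - f(zr))
    return np.stack([L, a, b], axis=-1)

def delta_e_lab(img, est):
    """Mean Euclidean CIE L*a*b* distance over all pixels, Eq. (92)."""
    diff = rgb_to_lab(img) - rgb_to_lab(est)
    return np.mean(np.sqrt(np.sum(diff ** 2, axis=-1)))
```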

3. Normalized Color Difference in the CIE L*u*v* color space. The CIE proposes another perceptually uniform color space, called CIE L*u*v*, whose luminance L* is the same as that of the CIE L*a*b* color space. The chrominance components are expressed as:

u* = 13 × L* × (u′ − u′n),    (93)
v* = 13 × L* × (v′ − v′n),    (94)

with:

u′ = 4X / (X + 15Y + 3Z),    (95)
v′ = 9Y / (X + 15Y + 3Z),    (96)

where u′n and v′n are the chrominance components of the reference white. The criterion of normalized color difference (NCD) is expressed as (Li and Randhawa, 2005; Lukac and Plataniotis, 2004b):

NCD(I, Î) = [ Σ_{x=0}^{X−1} Σ_{y=0}^{Y−1} √( Σ_{k=L*,u*,v*} (I^k_{x,y} − Î^k_{x,y})² ) ] / [ Σ_{x=0}^{X−1} Σ_{y=0}^{Y−1} √( Σ_{k=L*,u*,v*} (I^k_{x,y})² ) ],    (97)

where I^k_{x,y} is the level of the color component k, k ∈ {L*, u*, v*}, at the pixel having spatial coordinates (x,y). This normalized measurement ranges from 0 (optimal demosaicing quality) to 1 (worst demosaicing quality).
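The NCD criterion can be sketched in the same spirit. The helper below converts XYZ components (obtained, for instance, with the matrix of the previous sketch) to CIE L*u*v* with equations (93)–(96), and then evaluates equation (97); the D65 white point is again an assumption.

```python
import numpy as np

def xyz_to_luv(xyz, white=(0.9505, 1.0, 1.089)):
    """Eqs. (93)-(96): XYZ to CIE L*u*v*."""
    X, Y, Z = xyz[..., 0], xyz[..., 1], xyz[..., 2]
    denom = X + 15.0 * Y + 3.0 * Z + 1e-12          # guard against division by zero
    u_p, v_p = 4.0 * X / denom, 9.0 * Y / denom
    Xn, Yn, Zn = white
    dn = Xn + 15.0 * Yn + 3.0 * Zn
    un, vn = 4.0 * Xn / dn, 9.0 * Yn / dn           # chrominance of the reference white
    yr = Y / Yn
    L = np.where(yr > 0.008856, 116.0 * np.cbrt(yr) - 16.0, 903.3 * yr)
    return np.stack([L, 13.0 * L * (u_p - un), 13.0 * L * (v_p - vn)], axis=-1)

def ncd(luv, luv_est):
    """Normalized color difference, Eq. (97), on precomputed L*u*v* images."""
    num = np.sum(np.sqrt(np.sum((luv - luv_est) ** 2, axis=-1)))
    den = np.sum(np.sqrt(np.sum(luv ** 2, axis=-1)))
    return num / den
```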

Among other measurements found in the literature, let us also mention Buades et al. (2008). These authors first consider artifacts as noise which corrupts the demosaiced image, and propose an evaluation scheme based on specific characteristics of white noise. Unfortunately, the evaluation is only achieved by subjective appreciation. More interesting is the suggestion to use gray-level images for demosaicing evaluation. Indeed, color artifacts are then not only easily identified visually, but may also be analyzed by considering the chromaticity. The rate of estimated pixels whose chromaticity is higher than a threshold reflects the propensity of a given demosaicing scheme to generate false colors.

4.4. Artifact-sensitive Measurements

The objective measurements presented above are based on an evaluation of the color estimation error. None of these measurements quantifies the specific presence of each kind of artifact within the demosaiced images. Yet it would be interesting to isolate specific artifacts during the evaluation process. In this part, we present measurements which are sensitive to specific kinds of artifacts by taking their properties into account.

FIG. 33: Vertical edge pixels associated with their left and right pixels (luminance level L plotted against the x spatial coordinate). Vertical edge pixels P1, P2, P3 and P4 are represented by solid lines, while the pixels corresponding to extrema are located by dashed lines. The left (resp. right) extremum of a vertical edge pixel Pi is denoted by Pi^l (resp. Pi^r). One single extremum may be associated with two different vertical edge pixels, for example P1^r ≡ P2^l.

4.4.1. Blurring Measurement

The blurring measurement proposed by Marziliano et al. (2004) is sensitive to the decrease of local level variations in transition areas. The authors notice that blurring corresponds to an expansion of these transition areas, and propose to measure the transition widths to quantify this artifact. The evaluation scheme analyzes the luminance planes of the original and demosaiced images, respectively denoted as L and L̂. The transition width increase, evaluated at the same pixel locations in both images, yields an estimation of the blurring caused by demosaicing. This blurring measurement consists of the following successive steps:

1. Apply the Sobel filter to the luminance plane L along the horizontal direction, and threshold its output. The pixels detected in this way are called vertical edge pixels.

2. At each vertical edge pixel P, examine the luminance levels of the pixels located on the same row as P in the luminance plane L. The pixel P^l (resp. P^r) corresponds to the first local luminance extremum located on the left (resp. the right) of P. To each vertical edge pixel P, associate in this way a pair of pixels P^l and P^r, one of them corresponding to a local luminance maximum and the other one to a minimum (see figure 33).

3. The transition width at P is defined as the difference between the x coordinates of the pixels P^l and P^r.

4. Compute the blurring measurement as the mean transition width estimated over all vertical edge pixels in the image.

5. From the spatial locations of the vertical edge pixels in L – which have been detected in step 1 –, perform steps 2 to 4 on the luminance plane L̂ of the demosaiced image. A blurring measurement is then obtained for this plane.

6. Compare the two measurements, obtained respectively for the original and demosaiced images, in order to estimate the blurring caused by the considered demosaicing scheme.
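A compact sketch of this procedure is given below. The Sobel filtering of step 1 and the extremum search of step 2 follow the description above, but the helper names and the simple monotonic walk used to locate the first extremum on each side are our own choices, not the reference implementation of Marziliano et al.

```python
import numpy as np
from scipy.ndimage import sobel

def vertical_edges(lum, thresh):
    """Step 1: threshold the horizontal Sobel response of a luminance plane."""
    return np.abs(sobel(lum.astype(float), axis=1)) > thresh

def mean_transition_width(lum, edge_mask):
    """Steps 2-4: mean width between the first luminance extrema found on
    the left and on the right of each vertical edge pixel."""
    widths = []
    for y, x in zip(*np.nonzero(edge_mask)):
        row = lum[y].astype(float)
        sgn = 1.0 if row[min(x + 1, row.size - 1)] >= row[max(x - 1, 0)] else -1.0
        xl = x
        while xl > 0 and sgn * (row[xl - 1] - row[xl]) <= 0:
            xl -= 1                 # walk left until a local extremum
        xr = x
        while xr < row.size - 1 and sgn * (row[xr + 1] - row[xr]) >= 0:
            xr += 1                 # walk right until the opposite extremum
        widths.append(xr - xl)
    return float(np.mean(widths)) if widths else 0.0
```

Following steps 5 and 6, the measurement is computed on L̂ at the edge locations detected in L, and the two mean widths are compared.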

4.4.2. Zipper Effect Measurements

As far as we know, the single proposition for a zipper effect measurement was given by Lu and Tan (2003). This artifact is characterized at a pixel by an increase of the minimal distance between its color and those of its neighbors. This measurement therefore relates to the original color image. The zipper effect measurement in a demosaiced image Î, compared with the original image I, is computed by these successive steps:

1. At each pixel P in the original image I, identify the neighboring pixel P′ whose color is the closest to that of P in the CIE L*a*b* color space:

P′ = argmin_{Q∈N8} ‖I(P) − I(Q)‖,    (98)

where N8 is the 8-neighborhood of P and ‖·‖ is the Euclidean distance in the CIE L*a*b* color space. The color difference is then computed as:

∆I(P) = ‖I(P) − I(P′)‖.    (99)

2. At the same locations as P and P′, compute their color difference in the demosaiced image Î:

∆Î(P) = ‖Î(P) − Î(P′)‖.    (100)

3. Compute the color difference variation ϕ(P) = ∆Î(P) − ∆I(P).

4. Threshold the color difference variation, in order to detect the pixels P where the zipper effect occurs. If |ϕ(P)| > Tϕ, the pixel P in the demosaiced image presents a high variation of the difference between its color and that of P′. More precisely, when ϕ(P) is lower than −Tϕ, the demosaicing scheme has reduced the color difference between pixels P and P′. On the other hand, when ϕ(P) > Tϕ, the difference between the color of P and that of P′ has been highly increased in Î compared with I; so, the pixel P is considered as affected by the zipper effect. The authors propose to set the threshold Tϕ to 2.3.

5. Compute the rate of pixels affected by the zipper effect in the demosaiced image:

ZE% = (100/XY) · Card{ P(x,y) | ϕ(P) > Tϕ }.    (101)
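The steps above can be sketched as follows, assuming both images have already been converted to CIE L*a*b* (for instance with the conversion sketched in section 4.3.2). The double loop is written for clarity rather than speed.

```python
import numpy as np

N8_OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]

def zipper_rate_lu_tan(lab, lab_est, t_phi=2.3):
    """Steps 1-5: rate of pixels flagged by Lu and Tan's criterion.
    `lab` and `lab_est` are CIE L*a*b* images of identical size."""
    h, w, _ = lab.shape
    count = 0
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # step 1: closest neighbor P' in the ORIGINAL image
            dists = [np.linalg.norm(lab[y, x] - lab[y + dy, x + dx])
                     for dy, dx in N8_OFFSETS]
            k = int(np.argmin(dists))
            dy, dx = N8_OFFSETS[k]
            # step 2: color difference of the same pixel pair after demosaicing
            d_est = np.linalg.norm(lab_est[y, x] - lab_est[y + dy, x + dx])
            # steps 3-4: threshold the color difference variation phi(P)
            if d_est - dists[k] > t_phi:
                count += 1
    return 100.0 * count / (h * w)    # step 5, as a percentage
```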

FIG. 34: Over-detection of the zipper effect by Lu and Tan's measurement (2003) on a synthetic image: (a) original image I, (b) demosaiced image Î, (c) zipper effect map. In the detection map (c), pixels affected by the zipper effect are labeled as ×, and the ground truth (determined by visual examination) is labeled as gray. A pixel labeled both as × and gray corresponds to a correct detection, whereas a pixel labeled only as × corresponds to an over-detection of the zipper effect.

The effectiveness of this measurement was illustrated by its authors on a synthetic image (this image is not available). However, by applying it to images of the Kodak database, we will show in section 5.2 that it tends to over-detect the zipper effect in the demosaiced images. Two reasons explain this over-detection. First, a pixel whose color is correctly estimated and which has neighboring pixels whose colors are erroneously estimated can be considered as being affected by the zipper effect (see figure 34). Second, we notice that not all the pixels detected by Lu and Tan's measurement are located in areas with perceptible alternating patterns which correspond to the zipper effect. Indeed, the artifacts which can increase the minimal difference between the color of a pixel and those of its neighbors do not always bear the geometric properties of the zipper effect. An example of this phenomenon is found in the zipper effect detection result of figure 38c4: almost all pixels are detected as affected by the zipper effect, although the demosaiced image 38b4 does not contain this repetitive and alternating pattern.

To avoid over-detection, we propose a scheme – hereafter referred to as the directional alternation measurement – which quantifies the level variations over three adjacent pixels along the horizontal or vertical direction in the demosaiced image. Two reasons explain why the direction of the zipper effect is mainly horizontal or vertical. First, demosaicing schemes usually estimate the green color component first, then the red and blue ones by using color differences or ratios. However, along a diagonal direction in the CFA image, all the green levels are either available or missing. Since there is no alternating pattern between estimated and available levels along a diagonal direction, there are few alternating estimation errors, which characterize the zipper effect. Second, the edges of objects in a natural scene tend to follow the horizontal and vertical directions.

We propose to modify the selection of the neighboring pixels used to decide, thanks to Lu and Tan's criterion, whether the examined pixel is affected by the zipper effect. We

require the selected adjacent pixels to present a green alternating pattern specific to the zipper effect. Moreover, this series of three adjacent pixels has to be located along transitions between homogeneous areas, so that the level variations associated with the transition are not taken into account. The zipper effect detection scheme based on directional alternation, which provides a measurement for this artifact, consists of the following successive steps:

1. At a given pixel P, determine the local direction (horizontal or vertical) along which the green variations are the lowest in the original image. This direction is selected so that the green level dispersion is the lowest:

σ^x(P) = (1/3) · Σ_{i=−1}^{1} ( I^G_{x+i,y} − µ^x(P) )²    (102)

and

σ^y(P) = (1/3) · Σ_{i=−1}^{1} ( I^G_{x,y+i} − µ^y(P) )²,    (103)

where µ^x(P) (respectively, µ^y(P)) is the mean of the green levels I^G_{x+i,y} (respectively, I^G_{x,y+i}), i ∈ {−1,0,1}, in the original image I. The retained direction δ is that for which the directional variance is the lowest:

δ = argmin_{d∈{x,y}} σ^d(P).    (104)

Thanks to this step, the green levels of the three selected adjacent pixels are locally the most homogeneous.

2. Evaluate the alternation amplitude at pixel P, between the three adjacent pixels along direction δ, in the original and estimated images. When δ is horizontal, the amplitude on a plane I is computed as:

α^x(I,P) = |I_{x−1,y} − I_{x,y}| + |I_{x,y} − I_{x+1,y}| − |I_{x−1,y} − I_{x+1,y}|.    (105)

When δ is vertical, the amplitude is computed as:

α^y(I,P) = |I_{x,y−1} − I_{x,y}| + |I_{x,y} − I_{x,y+1}| − |I_{x,y−1} − I_{x,y+1}|.    (106)

When the three green levels present an alternating "high-low-high" or "low-high-low" pattern, α^δ(I,P) is strictly positive; otherwise it is zero.

3. Compare the alternation amplitudes on the G plane of the original image I and that of the demosaiced image Î. When α^δ(Î^G,P) > α^δ(I^G,P), the alternation amplitude of the green levels has been amplified by demosaicing along the direction δ. The pixel P is retained as a candidate pixel affected by the zipper effect.

4. Apply to these candidate pixels a modified version of the scheme proposed by Lu and Tan, except that the neighboring pixel P′ whose color is the closest to that of P has to be one of the two neighboring pixels along the selected direction δ.
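Steps 1 to 3 can be sketched as below on the green planes of the original and demosaiced images; the final filtering of step 4 (the modified Lu and Tan criterion restricted to the two neighbors along δ) is omitted for brevity, and the function names are ours.

```python
import numpy as np

def alternation(p_prev, p, p_next):
    """Eqs. (105)-(106): strictly positive only for a 'high-low-high'
    or 'low-high-low' pattern of the three levels."""
    return abs(p_prev - p) + abs(p - p_next) - abs(p_prev - p_next)

def zipper_candidates(g, g_est):
    """Steps 1-3: pixels whose green alternation is amplified by demosaicing
    along the locally most homogeneous direction of the ORIGINAL green plane."""
    h, w = g.shape
    cand = np.zeros((h, w), dtype=bool)
    gf, ge = g.astype(float), g_est.astype(float)
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            horiz, vert = gf[y, x - 1:x + 2], gf[y - 1:y + 2, x]
            # Eqs. (102)-(104): keep the direction of lowest green variance
            if np.var(horiz) <= np.var(vert):
                trip, trip_est = horiz, ge[y, x - 1:x + 2]
            else:
                trip, trip_est = vert, ge[y - 1:y + 2, x]
            if alternation(*trip_est) > alternation(*trip):
                cand[y, x] = True
    return cand
```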


4.4.3. False Colors Measurement

We also propose a measurement for the false color artifact (Yang et al., 2007). At a pixel in the demosaiced image, any error in the estimated value of a color component could be considered as a false color. However, the human visual system cannot actually distinguish a subtle color difference which is lower than a specific threshold (Faugeras, 1979). We therefore consider that the estimated color at a pixel is false when the absolute difference between an estimated color component and the original one is higher than a threshold T. The proposed measurement FC% is the ratio between the number of pixels affected by false colors and the image size:

FC% = (100/XY) · Card{ P(x,y) | max_{k=R,G,B} |I^k_{x,y} − Î^k_{x,y}| > T }.    (107)

FC% is easy to implement, and it expresses the rate of pixels affected by false colors as a measurement of the performance reached by a demosaicing scheme. Moreover, this criterion can also be used to locate the pixels affected by false colors. However, like the classical fidelity criteria, it requires the original image in order to compare the efficiency of demosaicing schemes.
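Equation (107) amounts to thresholding the worst per-component error at each pixel. A minimal sketch (the default threshold is only an example within the 10–25 range studied in section 5.2.2):

```python
import numpy as np

def false_color_rate(img, est, t=15):
    """Eq. (107): percentage of pixels whose largest per-component
    absolute error exceeds the visibility threshold t."""
    err = np.abs(img.astype(int) - est.astype(int)).max(axis=-1)
    return 100.0 * np.count_nonzero(err > t) / err.size

# The same comparison also locates the affected pixels:
# mask = np.abs(img.astype(int) - est.astype(int)).max(axis=-1) > t
```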

4.5. Measurements Dedicated to Low-level Image Analysis

Since demosaicing methods intend to produce "perceptually satisfying" demosaiced images, the most widely used evaluation criteria are based on the fidelity to the original images. Rather than displaying images, our long-term goal is pattern recognition by means of feature analysis. The features extracted from the demosaiced images are mostly derived from either colors or detected edges. Artifacts generated by demosaicing (mostly blurring and false colors) may affect the performance of edge detection methods applied to the demosaiced image. Indeed, blurring reduces the sharpness of edges, and false colors can give rise to irrelevant edges. That is why we propose to quantify the demosaicing performance by measuring the rates of erroneously detected edge pixels.

4.5.1. Measurements of Sub- and Over-detected Edges

The edge detection procedure is sensitive to the alteration of high spatial frequencies caused by demosaicing. Indeed, low-pass filtering tends to generate blurring, and hence to smooth edges. Moreover, when the demosaicing scheme generates false colors or zipper effect, it may give rise to abnormally high values of the color gradient module. The respective expected consequences are sub- and over-detection of edges. Notice that the different demosaicing algorithms are more or less efficient at avoiding blurring, false color and zipper effect artifacts. We therefore propose a new evaluation scheme which performs the following successive steps (Yang et al., 2007):

1. Apply a hysteresis thresholding of the module of the color gradient proposed by Di Zenzo (1986), in order to detect edges in the original image I. The same edge detection scheme with the same parameters is applied to the demosaiced image Î. Edge detection is performed as follows:

(a) Compute the square module of the Di Zenzo gradient at each pixel of image I as:

‖∇I‖² = (1/2) · ( a + c + √( (a − c)² + 4b² ) ),    (108)
θ = (1/2) · arctan( 2b / (a − c) ),    (109)

where the coefficients a, b and c are computed by approximating the partial derivatives of the image function I:

a = (∂I/∂x)² ≈ ∆x(I^R)² + ∆x(I^G)² + ∆x(I^B)²,
b = (∂I/∂x)(∂I/∂y) ≈ ∆x(I^R)∆y(I^R) + ∆x(I^G)∆y(I^G) + ∆x(I^B)∆y(I^B),
c = (∂I/∂y)² ≈ ∆y(I^R)² + ∆y(I^G)² + ∆y(I^B)².

Each approximate partial derivative ∆d(I^k), d ∈ {x,y}, k ∈ {R,G,B}, is computed thanks to the Deriche operator (Deriche, 1987).

(b) Find the local maxima of the vector gradient module ‖∇I‖.

(c) Among the pixels associated with local maxima, detect the edge pixels thanks to a hysteresis thresholding, parametrized by a low threshold Tl and a high threshold Th.

2. Store the edge detection result for the original image in a binary edge map B, and similarly store the demosaiced image edges in B̂. Notice that these two maps, in which edge pixels are labeled as white, may be different due to artifacts in the demosaiced image.

3. In order to quantify the influence of demosaicing on the edge detection quality, we propose to follow the strategy developed by Martin et al. (2004). The edge maps B and B̂ are compared by means of two successive operators (see figure 35a):

(a) Apply the XOR logical operator to the edge maps B and B̂, in order to enhance the differences between them in a new binary map J;

(b) Apply the AND logical operator to the maps J and B, which results in the binary sub-detected edge map SD. Similarly, the AND logical operator is applied to the maps J and B̂, which results in the binary over-detected edge map OD.

Pixels labeled as white in the map SD are edge pixels which are detected in the original image I but undetected in the demosaiced image Î. Pixels labeled as white in the map OD are edge pixels erroneously detected in the demosaiced image Î, compared with the edge pixels detected in I.


FIG. 35: Steps to measure the quality of edge detection. (a) General scheme: edge detection on the original image I gives the original edges B, and on the demosaiced image Î the demosaiced edges B̂; the difference map is J = B XOR B̂, the sub-detection map is SD = J AND B, and the over-detection map is OD = J AND B̂. (b) Example showing I, B, J, SD and OD. In subfigure (b), over-detected edge pixels are labeled as × (in bold typeface) in B̂, J and OD, in order to distinguish them from sub-detected edge pixels (labeled as ×).


FIG. 36: Computing S̃D and ÕD from SD and OD, on an example. Pixels labeled as dotted × belong to pairs of shifted edge pixels, and are dropped out in the final detection maps.

4. Compute the rates of sub- and over-detected edge pixels respectively as:

SD% = (100/XY) · Card{ P(x,y) | SD_{x,y} ≠ 0 },    (110)
OD% = (100/XY) · Card{ P(x,y) | OD_{x,y} ≠ 0 }.    (111)

Finally, the rate of erroneously detected edge pixels is expressed as ED% = SD% + OD%.
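Once the two binary edge maps are available (the Di Zenzo gradient and hysteresis thresholding of step 1 being delegated to any edge detector), steps 2 to 4 reduce to a few logical operations. A minimal sketch:

```python
import numpy as np

def detection_rates(b, b_est):
    """Steps 2-4 and Eqs. (110)-(111), from the binary edge maps of the
    original (b) and demosaiced (b_est) images."""
    j = np.logical_xor(b, b_est)      # difference map J
    sd = np.logical_and(j, b)         # sub-detected: edges of B missing in B-hat
    od = np.logical_and(j, b_est)     # over-detected: edges of B-hat absent from B
    scale = 100.0 / b.size
    sd_pct = scale * np.count_nonzero(sd)
    od_pct = scale * np.count_nonzero(od)
    return sd, od, sd_pct, od_pct, sd_pct + od_pct   # last value is ED%
```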

4.5.2. Measurements Based on Shifted Edges

By visually examining the map J in figure 35b, we notice the presence of many pairs of adjacent edge pixels. In such edge pairs, one pixel is detected in B only (i.e. sub-detected), and the other one in B̂ only (i.e. over-detected). For example, the map J of figure 35b presents five pairs of adjacent pixels composed of a sub-detected edge pixel (labeled as ×) and an over-detected edge pixel (labeled as × in bold typeface). These cases do not result from a bad edge detection, but from a spatial shift of edge pixels between the original and demosaiced images. A sub-detected (respectively, over-detected) edge pixel is shifted when at least one of its neighbors is an over-detected (respectively, sub-detected) edge pixel. Such pairs of pixels are hereafter called pairs of shifted (edge) pixels.

In order to characterize the effect of demosaicing on edge detection precisely, we want to distinguish pairs of shifted edge pixels from the other edge pixels. For this purpose, we represent the unshifted sub- and over-detected edge pixels as two binary maps, respectively denoted as S̃D and ÕD, and defined as:

S̃D_{x,y} ≠ 0 ⇔ SD_{x,y} ≠ 0 ∧ ∄ Q(x′,y′) ∈ N8(P(x,y)) | OD_{x′,y′} ≠ 0,    (112)
ÕD_{x,y} ≠ 0 ⇔ OD_{x,y} ≠ 0 ∧ ∄ Q(x′,y′) ∈ N8(P(x,y)) | SD_{x′,y′} ≠ 0,    (113)

where the symbol ∧ represents the logical AND operator. Figure 36 illustrates, on the example of figure 35, how the maps S̃D and ÕD are obtained. In this figure, the maps SD and OD used to build S̃D and ÕD are superimposed in order to highlight the pairs of shifted edge pixels.

From the two binary maps S̃D and ÕD, we compute the rates of sub- and over-detected unshifted edge pixels as:

S̃D% = (100/XY) · Card{ P(x,y) | S̃D_{x,y} ≠ 0 },    (114)
ÕD% = (100/XY) · Card{ P(x,y) | ÕD_{x,y} ≠ 0 }.    (115)
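Equations (112)–(115) can be evaluated with two binary dilations, which test the 8-neighborhood condition for all pixels at once. A minimal sketch, reusing the maps computed in the previous sketch:

```python
import numpy as np
from scipy.ndimage import binary_dilation

def unshifted_rates(sd, od):
    """Eqs. (112)-(115): drop pairs of shifted edge pixels by removing any
    sub-detected pixel with an over-detected 8-neighbor, and conversely."""
    eight = np.ones((3, 3), dtype=bool)               # 8-neighborhood structure
    sd_u = np.logical_and(sd, ~binary_dilation(od, structure=eight))
    od_u = np.logical_and(od, ~binary_dilation(sd, structure=eight))
    scale = 100.0 / sd.size
    return sd_u, od_u, scale * np.count_nonzero(sd_u), scale * np.count_nonzero(od_u)
```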

These rates are used to evaluate precisely the quality of edge detection in demosaiced images.

4.6. Conclusion

In this section, we have presented techniques for the objective evaluation of the demosaicing quality. For this purpose, we have first presented the most frequent artifacts caused by demosaicing. Blurring, false colors and the zipper effect damage the quality of the demosaiced images. Then, we have presented the classical criteria, which sum the errors between the original and estimated colors over the image. These criteria have some limits, since they provide a global estimation of the demosaicing quality and do not reflect the judgment of an observer. Indeed, they do not quantify the occurrences of the artifacts which can be identified by an observer. Therefore, we have described measurements dedicated to three kinds of artifacts. In the computer vision context, most images are acquired by single-sensor color cameras in order to be automatically processed. So, the quality of demosaicing affects the quality of low-level image analysis schemes. That is the reason why we have proposed criteria which are based on the quality of edge detection.


5. Quality Evaluation Results

5.1. Results of Classical Criteria

The quality of the demosaicing results achieved by the ten methods detailed in sections 2 and 3 has first been evaluated thanks to classical criteria. For this purpose, the twelve most widely used images of the Kodak benchmark database are considered (see figure 37; this database is available at http://www.math.purdue.edu/~lucier/PHOTO_CD). These images, all of size 768 × 512 pixels, have been selected so as to present a significant variety of homogeneous regions, colors and textured areas.

Table 3 displays the results obtained with criteria which measure the fidelity of each demosaiced image to its corresponding original image, namely the mean absolute error (MAE, expression (82)), the peak signal-to-noise ratio (PSNR, expression (85)) and the correlation criterion (C, expression (87)). Table 4 shows, for the same images and demosaicing schemes, the results obtained with perceptual criteria, namely the estimation error in the CIE L*a*b* color space (∆E^{L*a*b*}, expression (92)), the estimation error in the S-CIE L*a*b* color space (∆E^{S-L*a*b*}) and the criterion of normalized color difference (NCD, expression (97)) between the demosaiced image and its original image.

These two tables show that, for a given method, the performances measured with a specific criterion vary from one image to another. This confirms that obtaining a good color estimation from the CFA image is all the more difficult as the image is rich in high spatial frequency areas. For instance, the PSNR of images demosaiced by bilinear interpolation ranges from 24.5 dB for image 4 ("Houses"), which contains a lot of high-frequency areas, to 36 dB for image 1 ("Parrots"), which contains a lot of homogeneous regions. It can be noticed that the two methods which chiefly use the frequency domain provide better results than those which only scan the spatial domain. Moreover, the method proposed by Dubois (2005) achieves the best average results over the twelve images, whatever the considered criterion. We also notice that the different criteria provide similar performance rankings for the methods on a given image.

5.2. Results of Artifact-sensitive Measurements

5.2.1. Zipper Effect Measurements

In order to compare the relevance of the results provided by the two zipper effect measurements described in section 4.4.2, we propose the following procedure. First, a ground truth is built for the zipper effect by visually examining the demosaiced image and deciding whether each pixel is affected by the zipper effect or not. Then, the two measurements are applied, in order to provide binary maps where the pixels affected by the zipper effect are labeled as white. A final step compares these binary maps with the ground truth and quantifies the performance of each objective measurement, by counting the pixels where the zipper effect is correctly detected, sub-detected and over-detected.

Figure 38 displays the results on four image extracts of size 10 × 10 pixels. It shows that the directional alternation measurement generally fits the ground truth better than Lu and Tan's measurement does.

FIG. 37: The twelve benchmark images picked from the Kodak database: 1. Parrots, 2. Sailboat, 3. Windows, 4. Houses, 5. Race, 6. Pier, 7. Island, 8. Lighthouse, 9. Plane, 10. Cape, 11. Barn, 12. Chalet. Images 5 and 8 are presented vertically for illustration purposes, but have been analyzed in landscape orientation.

Image  Criterion  Bilinear  Cst. Hue  Hamilton      Wu     Cok   Kimmel      Li  Gunturk  Dubois    Lian
1      MAE           1.542     1.358     0.938   0.949   1.257    1.784   1.379    0.877   0.879   0.796
       PSNR         36.256    38.082    42.868  42.984  39.069   31.883  38.132   43.186  43.259  44.199
       C            0.9966    0.9978    0.9993  0.9993  0.9982   0.9912  0.9978   0.9993  0.9993  0.9995
2      MAE           4.352     3.381     1.829   1.565   2.897    2.241   2.515    1.339   1.154   1.415
       PSNR         28.956    31.396    36.324  37.831  32.561   34.418  33.499   39.951  41.433  39.303
       C            0.9830    0.9905    0.9970  0.9978  0.9928   0.9952  0.9942   0.9987  0.9990  0.9984
3      MAE           1.978     1.578     0.980   0.994   1.407    1.264   1.484    0.907   0.900   0.786
       PSNR         34.454    36.779    41.773  41.641  37.915   38.620  37.111   42.713  43.062  43.832
       C            0.9909    0.9946    0.9983  0.9982  0.9958   0.9965  0.9950   0.9987  0.9987  0.9989
4      MAE           7.329     5.655     2.629   2.607   3.986    3.077   4.130    2.055   2.022   1.975
       PSNR         24.551    27.350    33.409  33.535  29.885   31.858  29.588   36.452  36.479  36.445
       C            0.9596    0.9799    0.9950  0.9951  0.9888   0.9928  0.9881   0.9975  0.9975  0.9975
5      MAE           2.276     1.822     1.112   1.078   1.591    1.230   1.556    0.896   0.895   0.860
       PSNR         33.611    36.120    41.430  41.795  37.701   39.659  37.515   43.237  43.354  43.785
       C            0.9863    0.9926    0.9978  0.9980  0.9949   0.9967  0.9946   0.9985  0.9986  0.9987
6      MAE           3.589     2.857     1.605   1.511   2.404    1.949   2.370    1.247   1.167   1.215
       PSNR         30.191    32.400    37.353  37.748  33.579   35.344  33.372   40.409  40.894  40.399
       C            0.9783    0.9874    0.9960  0.9963  0.9905   0.9935  0.9900   0.9980  0.9982  0.9980
7      MAE           2.880     2.264     1.263   1.084   1.931    1.518   1.652    0.964   0.826   1.022
       PSNR         32.341    34.719    39.713  41.613  36.141   37.788  37.451   42.913  44.680  42.144
       C            0.9861    0.9921    0.9975  0.9984  0.9944   0.9961  0.9958   0.9988  0.9992  0.9986
8      MAE           3.849     3.079     1.571   1.546   2.344    1.874   2.284    1.234   1.164   1.195
       PSNR         29.186    31.716    38.419  38.594  34.663   36.172  34.708   42.913  41.547  41.072
       C            0.9775    0.9875    0.9973  0.9974  0.9936   0.9956  0.9938   0.9985  0.9987  0.9986
9      MAE           2.362     1.929     1.306   1.318   1.769    1.394   1.802    1.043   1.114   0.994
       PSNR         32.565    34.931    39.462  39.347  35.985   38.181  35.601   42.030  41.735  42.353
       C            0.9973    0.9984    0.9995  0.9994  0.9988   0.9993  0.9987   0.9997  0.9997  0.9997
10     MAE           3.772     2.936     1.840   1.801   2.661    1.969   2.739    1.311   1.290   1.319
       PSNR         29.557    31.960    36.542  36.643  32.891   35.202  32.549   40.220  40.172  39.972
       C            0.9769    0.9870    0.9955  0.9955  0.9895   0.9939  0.9887   0.9981  0.9981  0.9980
11     MAE           3.164     2.497     1.701   1.741   2.346    1.971   2.535    1.442   1.368   1.326
       PSNR         31.433    33.718    37.746  37.455  34.560   35.995  33.802   39.217  39.575  39.963
       C            0.9849    0.9909    0.9964  0.9962  0.9925   0.9949  0.9913   0.9975  0.9977  0.9979
12     MAE           4.366     3.317     2.057   1.965   3.091    2.244   3.310    1.530   1.453   1.469
       PSNR         27.564    29.938    33.381  34.237  29.957   32.196  29.333   36.630  37.690  36.687
       C            0.9752    0.9859    0.9936  0.9948  0.9859   0.9915  0.9838   0.9970  0.9976  0.9970
Avg.   MAE           3.455     2.723     1.569   1.513   2.307    1.876   2.313    1.237   1.186   1.198
       PSNR         30.889    33.259    38.202  38.619  34.575   35.610  34.388   40.823  41.157  40.846
       C            0.9827    0.9904    0.9969  0.9972  0.9930   0.9947  0.9926   0.9983  0.9985  0.9984

TAB. 3: Demosaicing quality results, for twelve color images from the Kodak database, according to fidelity criteria: mean absolute error (MAE), peak signal-to-noise ratio (PSNR, in decibels), and correlation (C) between the original image and the demosaiced image. For each image and each criterion, the best result is written in bold typeface in the original table. The tested methods are: 1. Bilinear interpolation – 2. Constant-hue-based interpolation (Cok, 1987) – 3. Gradient-based method (Hamilton and Adams, 1997) – 4. Component-consistent scheme (Wu and Zhang, 2004) – 5. Method based on template matching (Cok, 1986) – 6. Adaptive weighted-edge method (Kimmel, 1999) – 7. Covariance-based method (Li and Orchard, 2001) – 8. Alternating projection method (Gunturk et al., 2002) – 9. Frequency selection method (Dubois, 2005) – 10. Method based on frequency and spatial analyses (Lian et al., 2007).

Image  Criterion      Bilinear  Cst. Hue  Hamilton      Wu     Cok   Kimmel      Li  Gunturk  Dubois    Lian
1      ∆E L*a*b*         1.439     1.289     1.002   1.010   1.229    1.655   1.387    0.969   0.952   0.899
       ∆E S-L*a*b*       2.605     2.605     2.318   2.268   2.682    5.701   3.537    2.193   1.967   2.007
       NCD              0.0098    0.0089    0.0067  0.0068  0.0083   0.0119  0.0094   0.0064  0.0064  0.0060
2      ∆E L*a*b*         3.382     2.562     1.538   1.335   2.275    1.772   2.078    1.196   1.078   1.223
       ∆E S-L*a*b*       6.477     5.965     3.756   2.954   5.360    4.653   4.578    3.079   2.440   3.021
       NCD              0.0251    0.0194    0.0113  0.0099  0.0169   0.0136  0.0152   0.0089  0.0079  0.0091
3      ∆E L*a*b*         2.048     1.663     1.132   1.148   1.492    1.491   1.653    1.108   1.066   0.981
       ∆E S-L*a*b*       3.715     3.483     2.659   2.615   3.339    4.283   3.990    2.594   2.280   2.229
       NCD              0.0140    0.0114    0.0077  0.0078  0.0101   0.0102  0.0112   0.0074  0.0072  0.0066
4      ∆E L*a*b*         5.467     4.246     2.167   2.099   3.138    2.356   3.315    1.735   1.676   1.652
       ∆E S-L*a*b*      11.293    10.635     5.729   5.166   7.918    6.125   7.886    4.850   4.507   4.327
       NCD              0.0441    0.0338    0.0172  0.0169  0.0249   0.0193  0.0261   0.0140  0.0136  0.0132
5      ∆E L*a*b*         1.780     1.474     0.965   0.931   1.273    1.040   1.299    0.861   0.843   0.816
       ∆E S-L*a*b*       3.925     3.824     2.753   2.661   3.401    3.462   3.344    2.437   2.361   2.304
       NCD              0.0139    0.0114    0.0074  0.0072  0.0099   0.0082  0.0100   0.0065  0.0064  0.0062
6      ∆E L*a*b*         3.511     2.762     1.729   1.641   2.419    1.943   2.485    1.393   1.334   1.343
       ∆E S-L*a*b*       6.883     6.417     4.333   3.809   5.781    4.806   5.675    3.589   3.209   3.323
       NCD              0.0261    0.0209    0.0128  0.0122  0.0179   0.0151  0.0183   0.0104  0.0099  0.0100
7      ∆E L*a*b*         2.671     2.047     1.259   1.088   1.789    1.407   1.592    1.021   0.895   1.051
       ∆E S-L*a*b*       5.231     4.808     3.135   2.496   4.254    3.563   3.580    2.597   1.991   2.635
       NCD              0.0206    0.0161    0.0096  0.0083  0.0138   0.0113  0.0121   0.0079  0.0068  0.0081
8      ∆E L*a*b*         3.338     2.629     1.561   1.526   2.170    1.806   2.195    1.260   1.188   1.224
       ∆E S-L*a*b*       6.474     5.984     3.811   3.465   5.039    4.404   4.857    3.208   2.860   2.963
       NCD              0.0243    0.0193    0.0111  0.0110  0.0156   0.0133  0.0157   0.0090  0.0085  0.0087
9      ∆E L*a*b*         2.155     1.725     1.221   1.208   1.613    1.277   1.709    0.996   1.005   0.959
       ∆E S-L*a*b*       4.568     4.136     3.175   2.984   3.909    3.346   4.663    2.791   2.697   2.478
       NCD              0.0150    0.0122    0.0086  0.0086  0.0113   0.0093  0.0119   0.0071  0.0072  0.0068
10     ∆E L*a*b*         3.259     2.524     1.705   1.652   2.356    1.696   2.517    1.273   1.261   1.278
       ∆E S-L*a*b*       6.239     5.839     4.234   3.826   5.555    4.060   5.694    3.321   3.107   3.140
       NCD              0.0251    0.0197    0.0131  0.0128  0.0182   0.0137  0.0192   0.0099  0.0097  0.0098
11     ∆E L*a*b*         2.724     2.152     1.584   1.602   2.065    1.822   2.284    1.416   1.319   1.303
       ∆E S-L*a*b*       5.175     4.747     3.898   3.738   4.852    4.690   5.371    3.631   3.157   3.191
       NCD              0.0195    0.0157    0.0114  0.0116  0.0149   0.0133  0.0165   0.0101  0.0095  0.0093
12     ∆E L*a*b*         3.402     2.620     1.736   1.655   2.482    1.814   2.730    1.380   1.318   1.317
       ∆E S-L*a*b*       6.286     5.870     4.341   3.920   5.965    4.384   6.371    3.564   3.135   3.193
       NCD              0.0258    0.0200    0.0132  0.0127  0.0188   0.0142  0.0206   0.0105  0.0101  0.0100
Avg.   ∆E L*a*b*         2.931     2.308     1.467   1.408   2.025    1.673   2.104    1.217   1.161   1.170
       ∆E S-L*a*b*       5.739     5.359     3.678   3.325   4.838    4.456   4.962    3.154   2.809   2.901
       NCD              0.0219    0.0174    0.0108  0.0105  0.0150   0.0128  0.0155   0.0090  0.0086  0.0086

TAB. 4: Demosaicing quality results, for twelve color images from the Kodak database, according to perceptual criteria: estimation error in the CIE L*a*b* color space (∆E^{L*a*b*}), estimation error in the S-CIE L*a*b* color space (∆E^{S-L*a*b*}), and criterion of normalized color difference (NCD). For each image and each criterion, the best result (i.e. the lowest value) is written in bold typeface in the original table. Images and tested methods are the same as in table 3. The illuminant used for the (X,Y,Z) transform is the standard CIE D65, which corresponds to daylight.

FIG. 38: Zipper effect detection in four Kodak image extracts, according to two measurements. (a1)–(a4): original extracts. (b1)–(b4): demosaiced extracts. Last two columns: pixels affected by the zipper effect, according to Lu and Tan's criterion (c1)–(c4) and to the directional alternation (d1)–(d4). Pixels affected by the zipper effect are labeled as ×; they correspond to the ground truth in images (b1)–(b4). In images (c1)–(d4), the ground truth is reproduced as gray-labeled pixels, so pixels where the zipper effect is well detected are labeled both as × and gray, whereas pixels where it is sub-detected (respectively, over-detected) are labeled only as × (respectively, only as gray). Images (b1) and (b2) are estimated by bilinear interpolation, (b3) and (b4) by Hamilton and Adams' (1997) gradient-based method.

Image     Well-detected            Sub-detected             Over-detected
          Lu and Tan  Dir. alt.    Lu and Tan  Dir. alt.    Lu and Tan  Dir. alt.
(a1)             100        100             0          0             0          0
(a2)              58         83             2          1            40         16
(a3)              72         86             1          9            27          5
(a4)               7         94             0          0            93          6
Total            237        363             3         10           160         27

TAB. 5: Comparison between the measurements quantifying the zipper effect, proposed by Lu and Tan (2003) and based on the directional alternation. Values correspond to the numbers of well-detected, sub-detected and over-detected pixels affected by this artifact in the four image extracts of figure 38.

This remark is confirmed numerically by comparing the numbers of well-detected, sub-detected and over-detected pixels affected by the zipper effect in the four images. The results in table 5 show that the measurement based on directional alternation generally provides higher well-detected pixel rates than the one proposed by Lu and Tan. Indeed, the latter over-detects the zipper effect, whereas the measurement based on directional alternation tends to slightly sub-detect this artifact.

Finally, we have compared the demosaicing schemes according to the measurement based on directional alternation. Table 6 shows that the results are similar to those obtained with the classical criteria presented in tables 3 and 4: bilinear interpolation always generates the highest amount of zipper effect, whereas the scheme proposed by Lian et al. (2007) is overall the most efficient. However, by examining table 6 in detail, we notice that on images with few high spatial frequencies (number 2, "Sailboat", and 7, "Island"), the method proposed by Dubois tends to generate less zipper effect than Lian et al.'s method does. Generally speaking, these results show that the methods which analyze the frequency domain generate less zipper effect than those which scan the image plane (Menon et al., 2006).

5.2.2. False Colors

As described in section 4.4.3, the estimated color at a pixel is taken as false when the absolute difference between an estimated color component and the original one is higher than a threshold T (see equation (107)). Since adjusting this threshold is not easy, we compare the performance reached by the set of ten demosaicing schemes applied to the twelve images of the Kodak database when T varies from 10 to 25, with an incremental step of 5. Figure 39 shows both the evolution of the average rate of false colors with respect to T for a given scheme, and the rates of false colors generated by the considered schemes for a given value of T. As expected, the rate of false colors decreases when T increases. More interestingly, the relative ranking of the demosaicing methods with respect to the number of false colors is consistent with both the rankings provided by the classical fidelity criteria and by the measurements based on the zipper effect.


Method        1       2       3       4       5       6       7       8       9      10      11      12    Avg.
Bilinear  4.317  22.567   8.793  35.932   9.023  19.876  18.483  18.216   9.459  15.425  12.816  18.729  16.136
Cst. Hue  1.939  12.761   4.581  25.164   4.226  10.707  10.124  11.672   5.618   9.976   6.331  10.107   9.434
Hamilton  0.623   2.656   1.257   4.485   0.610   2.955   1.954   2.369   1.695   3.021   1.809   2.735   2.181
Wu        0.822   2.082   1.626   5.393   0.581   3.405   1.213   3.051   2.192   3.473   2.726   3.461   2.502
Cok       0.735   4.903   1.374   7.214   1.110   3.986   3.730   3.811   2.367   4.003   2.840   3.761   3.319
Kimmel    4.408   2.464   1.795   5.023   0.658   2.797   1.990   2.122   1.537   2.475   1.835   2.269   2.448
Li        3.068   7.157   4.093  14.031   2.069   7.868   4.579   7.213   5.335   8.548   7.083   9.256   6.692
Gunturk   0.893   0.682   1.664   2.402   0.664   1.562   0.391   0.850   0.714   0.984   1.166   1.285   1.105
Dubois    0.861   0.487   1.278   2.351   0.482   1.441   0.177   0.727   0.709   0.967   0.962   1.076   0.960
Lian      0.345   0.590   0.546   1.610   0.192   0.826   0.436   0.617   0.422   0.685   0.510   0.803   0.632

TAB. 6: Rates ZE% of pixels affected by the zipper effect, according to the measurement based on directional alternation. The images (columns 1 to 12) and tested methods are the same as in table 3.

FIG. 39: Average rates of false colors FC% (from 0 to 20) with respect to the detection threshold T (T = 10, 15, 20, 25), for the ten tested methods (Bilinear, Cst. Hue, Hamilton, Wu, Cok, Kimmel, Li, Gunturk, Dubois, Lian). The twelve considered images and ten tested methods are the same as in table 3.

5.3. Discussion

The most widely used criteria for the evaluation of the demosaicing quality are the MSE and the PSNR, the latter being a logarithmic form of the MSE criterion. Several reasons explain why most authors use these criteria (Wang and Bovik, 2006). First, these functions are easy to implement and their derivatives can be estimated; they may therefore be integrated into an optimization scheme. Second, the PSNR criterion has a real physical meaning – namely, the maximal energy of the signal with respect to the errors generated by demosaicing – which can also be analyzed in the frequency domain.

However, the PSNR criterion provides a general estimation of the demosaicing quality, but does not really reflect the human judgment. For example, an observer would prefer an image containing a large number of pixels with estimated colors close to the original ones to an image containing a reduced number of pixels affected by visible artifacts. But the MSE and PSNR criteria could provide identical values in both cases, since they do not discriminate the characteristics of the different artifacts in the demosaiced image. These objective measurements have been criticized (Wang and Bovik, 2009) since they cannot evaluate the image alteration as a human observer does (Eskicioglu and Fisher, 1995).

The alternative criteria ∆E of estimation errors in the CIE L*a*b* and S-CIE L*a*b* color spaces are the most widely used perceptual criteria (Zhang and Wandell, 1997). They are based on perceptually uniform color spaces as an attempt to represent human perception, but they require prior knowledge about the illuminant and the reference white used during image acquisition. Since the acquisition conditions are not always known, the quality of these measurements may be biased.

5.4. Experimental Results for Edge Detection

The demosaicing performance has been evaluated with respect to the quality of edge detection thanks to the measurements detailed in section 4.5. Table 7 displays the average rates of sub-detected (SD%), over-detected (OD%) and erroneously detected (ED% = SD% + OD%) edge pixels. These values have been computed over the twelve Kodak images previously considered, and for the ten classical demosaicing schemes. Moreover, this table displays the average rates S̃D%, ÕD% and ẼD%, which take into account only unshifted edge pixels. The lowest values correspond to the best demosaicing quality according to these edge-dedicated measurements.

By examining the average rates ED% and ẼD%, similar conclusions can be drawn about the performances of the demosaicing schemes. The methods which privilege the frequency domain achieve a better edge detection quality than the other methods. Besides, the methods proposed by Dubois and by Lian et al. provide the lowest error rates in both edge and unshifted edge detection. These demosaicing schemes are therefore the most apt to be coupled with edge detection procedures based on the color gradient.

Moreover, we notice that the ranking of the ten tested demosaicing schemes with respect to OD% and SD% is relatively consistent with the ranking obtained with the measurements ÕD% and S̃D%. However, the rate of over-detected unshifted pixels is the lowest for bilinear interpolation. This surprising result can be explained

Meas.   Bilinear  Cst. Hue  Hamilton      Wu     Cok   Kimmel      Li  Gunturk  Dubois    Lian
SD%        3.673     2.090     1.528   1.561   1.882    1.983   2.265    1.422   1.278   1.323
OD%        2.257     1.945     1.504   1.522   1.818    1.802   2.319    1.242   1.199   1.263
ED%        5.930     4.035     3.032   3.083   3.700    3.785   4.584    2.664   2.477   2.586
S̃D%       1.945     1.109     0.881   0.877   1.032    1.077   1.094    0.888   0.774   0.803
ÕD%       0.663     0.979     0.855   0.842   0.974    0.912   1.156    0.713   0.697   0.748
ẼD%       2.608     2.088     1.736   1.719   2.006    1.989   2.250    1.601   1.471   1.551

TAB. 7: Average rates of sub-detected edge pixels (SD%), of over-detected edge pixels (OD%) and of erroneously detected pixels (ED% = SD% + OD%). Average rates of sub-detected unshifted edge pixels (S̃D%), of over-detected unshifted edge pixels (ÕD%) and of unshifted edge pixels that are erroneously detected (ẼD% = S̃D% + ÕD%). The low and high thresholds used for the hysteresis thresholding are set to 1 and 6, respectively. The twelve considered images and ten tested methods are the same as in table 3.

by both the strong blurring and the zipper effect generated by this demosaicing method. Indeed, blurring induces fewer detected edge pixels, and the zipper effect mainly induces pairs of shifted edge pixels. For each of the other methods, the rates of sub- and over-detected edge pixels are overall similar. Moreover, their ranking is almost the same as the one obtained with the previous criteria.

In table 7, we also notice that more than half of the sub- and over-detected edge pixels according to the measurements SD% and OD% are not retrieved with the measurements S̃D% and ÕD%. That means that shifted edges strongly contribute to the dissimilarity between the edges detected in the original and demosaiced images. Edge pixels are sub-detected because the color gradient module used to detect edges decreases with blurring in demosaiced images. The over-detected edge pixels correspond to an increase of the color gradient module in case of zipper effect or false colors. These new rates of sub- and over-detected pixels S̃D% and ÕD% are thus able to reflect the artifacts caused by demosaicing.

From table 7, we can evaluate the influence, on edge detection, of the demosaicing strategies implemented in the tested methods. Both the method using bilinear interpolation and the one based on hue constancy estimate the pixel colors without exploiting spatial correlation. Hence, they generate more artifacts than the other methods, which exploit spatial correlation, and provide higher rates of sub- and over-detected edge pixels.

All in all, sub- and over-detected edge pixels often coincide with artifacts. Figure 40 shows images demosaiced by two different schemes, and the respective maps S̃D and ÕD of sub- and over-detected unshifted edge pixels. We notice that demosaicing influences the edge detection more significantly in areas with high spatial frequencies, and that the artifacts are also mainly located in these areas.

The zipper effect often decreases the variation of levels in transition areas between homogeneous regions. Hence, the zipper effect tends to decrease the gradient module, so that the norm of the local maxima becomes lower than the high threshold Th used by the hysteresis thresholding (see section 4.5.1).

FIG. 40: Sub- and over-detected unshifted edge pixels, for two demosaicing schemes: (a) image demosaiced by bilinear interpolation, (b) image demosaiced by the gradient-based method of Hamilton and Adams (1997), (c)–(d) sub-detected edge pixels S̃D in images (a) and (b), (e)–(f) over-detected edge pixels ÕD in images (a) and (b).

FIG. 41: Example of edge pixels which are not modified by pixels affected by false colors, on an image demosaiced by the scheme proposed by Hamilton and Adams (1997): (a) original image I, (b) demosaiced image Î, (c) comparison between detected edge pixels (green: coinciding; blue: sub-detected; red: over-detected).

This explains why the zipper effect causes edge sub-detection. Since a lot of pixels are affected by the zipper effect, the rate of sub-detected edge pixels is generally higher than that of over-detected ones.

Isolated pixels affected by false colors do not always change the location of the detected edge pixels. Figure 41 shows, on an extract of the image "Houses", that pixels affected by false colors do not necessarily change the quality of edge detection: at these pixels, the gradient module increases, whereas the location of the edge pixels remains unchanged. On the other hand, when the local density of pixels affected by false colors is high, they cause edge over-detection. In textured areas with thin details, most demosaicing schemes generate a lot of neighboring pixels affected by false colors. The gradient module at these pixels increases, since its computation takes several neighboring false colors into account. The gradient module at the local maxima increases, so that it may become higher than the high threshold Th used by the hysteresis thresholding; in that case, new edge pixels are detected. For example, figure 40 shows that edge pixels are over-detected in textured areas which correspond to the shutters and to the tiles of the house roofs.

Finally, we notice that statistics about sub-detected edge pixels can be exploited to measure the blurring caused by demosaicing, and that over-detected pixels are located in areas with a high density of false colors.


6. Conclusion

This paper has dealt with the majority of digital color cameras, which are equipped with a single sensor. The surface of this sensor is covered by a color filter array, which consists of a mosaic of spectrally selective filters, so that each sensor element samples only one of the three color components Red, Green, or Blue. We have focused on the Bayer CFA, which is the most widely used. To estimate the color (R,G,B) of each pixel in a true color image, one has to determine the values of the two missing color components at each pixel of the CFA image. This process is commonly referred to as CFA demosaicing, and its result as the demosaiced image.

Demosaicing methods may exploit the spatial and/or frequency domains. The spatial domain was historically used first, and many methods are based on assumptions about spectral and/or spatial correlation. More recently, methods that exploit the frequency domain have appeared, which opens up wide perspectives.

We have compared the performances reached by ten demosaicing schemes applied to twelve images extracted from the Kodak database, with respect to three kinds of quality measurements: classical fidelity criteria, artifact-sensitive measurements, and measurements dedicated to edge detection. The rankings of the demosaicing schemes established with these measurements are consistent. This detailed evaluation highlights that the methods which primarily analyze the frequency domain outperform those which only scan the spatial domain. More precisely, the methods proposed by Dubois (2005) and by Lian et al. (2007) provide the best demosaicing results whatever the criterion used.

The implementation of demosaicing schemes has to respect real-time constraints: the time required to demosaice an image has to be lower than the image acquisition time. Hence, it would be useful to look for a compromise between the processing time and the performance reached by the examined demosaicing schemes. Such a study would make it possible to select the best-performing methods among the least time-consuming ones.

Thanks to a visual comparison of the results, we have described the relationships between artifacts and edge detection quality. The zipper effect causes edge sub-detection, whereas a high density of pixels affected by false colors tends to cause edge over-detection. These preliminary conclusions are worth generalizing to the relationships between artifacts and the detection quality of other features in demosaiced images.
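As a side note, the real-time constraint mentioned above amounts to a simple check, sketched below; the function demosaice, the list of CFA frames, and the 25 frames-per-second acquisition rate are purely illustrative assumptions, not part of the study.

```python
import time

def respects_realtime(demosaice, cfa_frames, fps=25.0):
    """Rough check of the real-time constraint: the average demosaicing
    time per frame must stay below the acquisition period 1/fps.
    'demosaice' stands for any of the tested schemes (hypothetical)."""
    start = time.perf_counter()
    for frame in cfa_frames:
        demosaice(frame)
    per_frame = (time.perf_counter() - start) / len(cfa_frames)
    return per_frame < 1.0 / fps
```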


Alleysson, D., Chaix de Lavarène, B., Süsstrunk, S., Hérault, J., Sep. 2008. Linear Minimum Mean Square Error Demosaicking. CRC Press, Ch. 8, pp. 213–237.
Alleysson, D., Süsstrunk, S., Hérault, J., Apr. 2005. Linear demosaicing inspired by the human visual system. IEEE Transactions on Image Processing 14 (4), 439–449.
Astola, J., Haavisto, P., Neuvo, Y., Apr. 1990. Vector median filters. Proceedings of the IEEE 78 (4), 678–689.
Asuni, N., Giachetti, A., Jan. 2008. Accuracy improvements and artifacts removal in edge-based image interpolation. In: Ranchordas, A., Araújo, H. (Eds.), Proceedings of the 3rd International Conference on Computer Vision Theory and Application (VISAPP'08). Funchal, Madeira, Portugal, pp. 58–65.
Bayer, B. E., Jul. 1976. Color imaging array. U.S. patent 3,971,065, to Eastman Kodak Co., Patent and Trademark Office, Washington D.C.
Bayram, S., Sencar, H. T., Memon, N., Sep. 2008. Classification of digital camera-models based on demosaicing artifacts. Digital Investigation 5 (1-2), 49–59.
Buades, A., Coll, B., Morel, J.-M., Sbert, C., Aug. 2008. Non local demosaicing. In: Proceedings of the 2008 International Workshop on Local and Non-Local Approximation in Image Processing (LNLA'08). Lausanne, Switzerland.
Busin, L., Vandenbroucke, N., Macaire, L., 2008. Color spaces and image segmentation. Advances in Imaging and Electron Physics 151, 65–168.
Chang, L., Tan, Y.-P., Jan. 2006. Hybrid color filter array demosaicking for effective artifact suppression. Journal of Electronic Imaging 15 (1), 013003, 1–17.
Chen, L., Yap, K.-H., He, Y., Mar. 2008. Subband synthesis for color filter array demosaicking. IEEE Transactions on Systems, Man and Cybernetics 38 (2), 485–492.
Chung, K.-H., Chan, Y.-H., Oct. 2006. Color demosaicing using variance of color differences. IEEE Transactions on Image Processing 15 (10), 2944–2955.
Cok, D. R., Dec. 1986. Signal processing method and apparatus for sampled image signals. U.S. patent 4,630,307, to Eastman Kodak Co., Patent and Trademark Office, Washington D.C.
Cok, D. R., Feb. 1987. Signal processing method and apparatus for producing interpolated chrominance values in a sampled color image signal. U.S. patent 4,642,678, to Eastman Kodak Co., Patent and Trademark Office, Washington D.C.
Cok, D. R., May 1994. Reconstruction of CCD images using template matching. In: Proceedings of the IS&T's 47th Annual Conference, Physics and Chemistry of Imaging Systems (ICPS'94). Vol. 2. Rochester, New York, U.S.A., pp. 380–385.
Condat, L., Nov. 2009. A new random color filter array with good spectral properties. In: Proceedings of the IEEE International Conference on Image Processing (ICIP'09). Cairo, Egypt [to appear].

Deriche, R., May 1987. Using Canny's criteria to derive a recursively implemented optimal edge detector. The International Journal of Computer Vision 1 (2), 167–187.
Di Zenzo, S., Jan. 1986. A note on the gradient of a multi-image. Computer Vision, Graphics, and Image Processing 33 (1), 116–125.
Dubois, E., Dec. 2005. Frequency-domain methods for demosaicking of Bayer-sampled color images. IEEE Signal Processing Letters 12 (12), 847–850.
Eastman Kodak and various photographers, 1991. Kodak Photo CD PCD0992, Access Software & Photo Sampler, Final version 2.0. [CD-ROM, Part No. 15-1132-01].
Eskicioglu, A. M., Fisher, P. S., Dec. 1995. Image quality measures and their performance. IEEE Transactions on Communications 43 (12), 2959–2965.
Faugeras, O. D., Aug. 1979. Digital color image processing within the framework of a human visual model. IEEE Transactions on Acoustics, Speech, and Signal Processing 27 (4), 380–393.
Freeman, W. T., Dec. 1988. Median filter for reconstructing missing color samples. U.S. patent 4,724,395, to Polaroid Co., Patent and Trademark Office, Washington D.C.
Gribbon, K. T., Bailey, D. G., Jan. 2004. A novel approach to real-time bilinear interpolation. In: Proceedings of the 2nd IEEE International Workshop on Electronic Design, Test and Applications (DELTA'04). Perth, Australia, pp. 126–131.
Gunturk, B. K., Altunbasak, Y., Mersereau, R. M., Sep. 2002. Color plane interpolation using alternating projections. IEEE Transactions on Image Processing 11 (9), 997–1013.
Gunturk, B. K., Glotzbach, J., Altunbasak, Y., Schafer, R. W., Mersereau, R. M., Jan. 2005. Demosaicking: Color filter array interpolation. IEEE Signal Processing Magazine 22 (1), 44–54.
Hamilton, J. F., Adams, J. E., May 1997. Adaptive color plan interpolation in single sensor color electronic camera. U.S. patent 5,629,734, to Eastman Kodak Co., Patent and Trademark Office, Washington D.C.
Hamilton, J. F., Compton, J. T., Feb. 2007. Processing color and panchromatic pixels. U.S. Patent 0,024,879 A1, to Eastman Kodak Co., Patent and Trademark Office, Washington D.C.
Harris, C. J., Stephens, M., Aug. 1988. A combined corner and edge detector. In: Proceedings of the 4th Alvey Vision Conference (AVC'88). Manchester, United Kingdom, pp. 147–151.
Hibbard, R. H., 1995. Apparatus and method for adaptively interpolating a full color image utilizing luminance gradients. U.S. patent 5,382,976, to Eastman Kodak Co., Patent and Trademark Office, Washington D.C.


Hirakawa, K., Sep. 2008. Color filter array image analysis for joint denoising and demosaicking. In: Lukac, R. (Ed.), Single-Sensor Imaging: Methods and Applications for Digital Cameras. CRC Press, pp. 239–261.
Hirakawa, K., Parks, T. W., Mar. 2005. Adaptive homogeneity-directed demosaicing algorithm. IEEE Transactions on Image Processing 14 (3), 360–369.
Jai Corporation, 2000. CV-S3200/S3300 series – Super sensitive DSP color camera (JAI CV-S3300P brochure). http://www.graftek.com/pdf/Brochures/JAI/cv-s3200_3300.pdf.
Kimmel, R., Sep. 1999. Demosaicing: image reconstruction from color CCD samples. IEEE Transactions on Image Processing 8 (9), 1221–1228.
Kröger, R. H. H., Aug. 2004. Anti-aliasing in image recording and display hardware: lessons from nature. Journal of Optics A: Pure and Applied Optics 6, 743–748.
Kuno, T., Sugiura, H., Mar. 2006. Imaging apparatus and mobile terminal incorporating same. U.S. Patent 7,019,774 B2, to Mitsubishi Denki Kabushiki Kaisha, Patent and Trademark Office, Washington D.C.
Laroche, C. A., Prescott, M. A., Jun. 1993. Apparatus and method for adaptively interpolating a full color image utilizing chrominance gradients. U.S. patent 5,373,322, to Eastman Kodak Co., Patent and Trademark Office, Washington D.C.
Leitão, J. A., Zhao, M., de Haan, G., Jan. 2003. Content-adaptive video up-scaling for high definition displays. In: Proceedings of the SPIE Conference on Image and Video Communications and Processing (IVCP'03). Santa Clara, California, U.S.A., pp. 612–622.
Li, J. J., Randhawa, S., Sep. 2005. High order extrapolation using Taylor series for color filter array demosaicing. In: Proceedings of the International Conference on Image Analysis and Recognition (ICIAR'05). Vol. 3656 of Lecture Notes in Computer Science. Springer, Berlin-Heidelberg, Toronto, Canada, pp. 703–711.
Li, X., Nov. 2000. Edge directed statistical inference and its applications to image processing. PhD thesis, Princeton University, New Jersey, U.S.A.
Li, X., Mar. 2005. Demosaicing by successive approximation. IEEE Transactions on Image Processing 14 (3), 370–379.
Li, X., Orchard, M. T., Oct. 2001. New edge-directed interpolation. IEEE Transactions on Image Processing 10 (10), 1521–1527.
Lian, N., Chang, L., Tan, Y.-P., Sep. 2005. Improved color filter array demosaicking by accurate luminance estimation. In: Proceedings of the 12th International Conference on Image Processing (ICIP'05). Vol. 1. Genoa, Italy, pp. I-41–4.
Lian, N.-X., Chang, L., Tan, Y.-P., Zagorodnov, V., Oct. 2007. Adaptive filtering for color filter array demosaicking. IEEE Transactions on Image Processing 16 (10), 2515–2525.

Lian, N.-X., Chang, L., Zagorodnov, V., Tan, Y.-P., Nov. 2006. Reversing demosaicking and compression in color filter array image processing: Performance analysis and modeling. IEEE Transactions on Image Processing 15 (11), 3261–3278.
Lu, W., Tan, Y.-P., Oct. 2003. Color filter array demosaicking: New method and performance measures. IEEE Transactions on Image Processing 12 (10), 1194–1210.
Lu, Y. M., Karzand, M., Vetterli, M., Jan. 2009. Iterative demosaicking accelerated: Theory and fast noniterative implementations. In: Bouman, C. A., Miller, E. L., Pollak, I. (Eds.), Proceedings of the 21st IS&T/SPIE Electronic Imaging Annual Symposium (SPIE'09). Vol. 7246 of Computational Imaging VII. San Jose, California, U.S.A., pp. 72460L-1–72460L-12.
Lukac, R., Sep. 2008. Single-Sensor Imaging: Methods and Applications for Digital Cameras. Image Processing Series. CRC Press / Taylor & Francis, Boca Raton, Florida, U.S.A.
Lukac, R., Plataniotis, K. N., May 2004a. Normalized color-ratio modeling for CFA interpolation. IEEE Transactions on Consumer Electronics 50 (2), 737–745.
Lukac, R., Plataniotis, K. N., Oct. 2004b. A normalized model for color-ratio based demosaicking schemes. In: Proceedings of the 11th International Conference on Image Processing (ICIP'04). Singapore, pp. 1657–1660.
Lukac, R., Plataniotis, K. N., Nov. 2005a. Color filter arrays: Design and performance analysis. IEEE Transactions on Consumer Electronics 51 (4), 1260–1267.
Lukac, R., Plataniotis, K. N., Apr. 2005b. Universal demosaicking for imaging pipelines with an RGB color filter array. Pattern Recognition 38, 2208–2212.
Lukac, R., Plataniotis, K. N., 2007. Single-sensor camera image processing. In: Lukac, R., Plataniotis, K. N. (Eds.), Color Image Processing: Methods and Applications. CRC Press / Taylor & Francis, pp. 363–392.
Lukac, R., Plataniotis, K. N., Hatzinakos, D., Aleksic, M., Jul. 2006. A new CFA interpolation framework. Signal Processing 86 (7), 1559–1579.
Lukin, A., Kubasov, D., Sep. 2004. An improved demosaicing algorithm. In: Proceedings of the 14th International Conference on Computer Graphics (GRAPHICON'04). Moscow, Russia, pp. 38–45.
Lyon, R., Hubel, P. M., Nov. 2002. Eyeing the camera: into the next century. In: Proceedings of the 10th Color Imaging Conference (CIC'02): Color Science and Engineering Systems, Technologies, Applications. Scottsdale, Arizona, U.S.A., pp. 349–355.
Lyon, R. F., Mar. 2000. Prism-based color separation for professional digital photography. In: Proceedings of the IS&T Conference on Image Processing, Image Quality, Image Capture, Systems (PICS'00). Vol. 3. Portland, Oregon, U.S.A., pp. 50–54.


Martin, D. R., Fowlkes, C., Malik, J., May 2004. Learning to detect natural image boundaries using local brightness, color, and texture cues. IEEE Transactions on Pattern Analysis and Machine Intelligence 26 (5), 530–549.
Marziliano, P., Dufaux, F., Winkler, S., Ebrahimi, T., 2004. Perceptual blur and ringing metrics: application to JPEG2000. Signal Processing: Image Communication 19, 163–172.
Medjeldi, T., Horé, A., Ziou, D., Jul. 2009. Enhancement of the quality of images through complex mosaic configurations. In: Kamel, M., Campilho, A. (Eds.), Proceedings of the International Conference on Image Analysis and Recognition (ICIAR'09). Vol. 5627. Springer, Berlin-Heidelberg, Halifax, Canada, pp. 43–53.
Menon, D., Andriani, S., Calvagno, G., Sep. 2006. A novel technique for reducing demosaicing artifacts. In: Proceedings of the XIVth European Signal Processing Conference (EUSIPCO'06). Florence, Italy.
Menon, D., Andriani, S., Calvagno, G., Jan. 2007. Demosaicing with directional filtering and a posteriori decision. IEEE Transactions on Image Processing 16 (1), 132–141.
Mukherjee, J., Parthasarathi, R., Goyal, S., Mar. 2001. Markov random field processing for color demosaicing. Pattern Recognition Letters 22 (3-4), 339–351.
Muresan, D. D., Luke, S., Parks, T. W., Aug. 2000. Reconstruction of color images from CCD arrays. In: Proceedings of the Texas Instruments Digital Signal Processing Systems Fest. Houston, Texas, U.S.A., pp. 1–6 [CD-ROM XP002243635].
Noble, S. A., 2000. The technology inside the new Kodak Professional DCS 620x digital camera. http://www.dpreview.com/news/0005/kodak_dcs620x_tech_paper.pdf.
Omer, I., Werman, M., Oct. 2004. Using natural image properties as demosaicing hints. In: Proceedings of the 11th International Conference on Image Processing (ICIP'04). Vol. 3. Singapore, pp. 1665–1670.
Parulski, K. A., Aug. 1985. Color filters and processing alternatives for one-chip cameras. IEEE Transactions on Electron Devices 32 (8), 1381–1389.
Roorda, A., Metha, A. B., Lennie, P., Williams, D. R., Jan. 2001. Packing arrangement of the three cone classes in primate retina. Vision Research 41, 1291–1306.
Savard, J., 2007. Color filter array designs [online]. http://www.quadibloc.com/other/cfaint.htm.
Smith, M., Apr. 2005. Super-resolution. Tech. rep., Carleton University, Ottawa, Canada.


Sony Corporation, 2000. Diagonal 6mm (type 1/3) CCD image sensor for NTSC color video cameras (ICX258AK) (JAI CV-S3300P datasheet). http://www.jai.com/SiteCollectionDocuments/Camera_Solutions_Other_Documents/ICX258AK.pdf.

Su, C.-Y., May 2006. Highly effective iterative demosaicing using weighted-edge and color-difference interpolations. IEEE Transactions on Consumer Electronics 52 (2), 639–645.
Su, D., Willis, P., Jun. 2003. Demosaicing of colour images using pixel level data-dependent triangulation. In: Proceedings of the Theory and Practice of Computer Graphics (TPCG'03). Birmingham, United Kingdom, pp. 16–23.
Tam, W.-S., Kok, C.-W., Siu, W.-C., Aug. 2009. A modified edge directed interpolation for images. In: Proceedings of the 17th European Signal Processing Conference (EUSIPCO'09). Glasgow, Scotland.
Tsai, C.-Y., Song, K.-T., Sep. 2007. A new edge-adaptive demosaicing algorithm for color filter arrays. Image and Vision Computing 25 (9), 1495–1508.
Wang, Z., Bovik, A. C., 2006. Modern Image Quality Assessment. Synthesis Lectures on Image, Video, and Multimedia Processing. Morgan & Claypool Publishers.
Wang, Z., Bovik, A. C., Jan. 2009. Mean squared error: Love it or leave it? A new look at signal fidelity measures. IEEE Signal Processing Magazine 26 (1), 98–117.
Wu, X., Zhang, N., 2004. Primary-consistent soft-decision color demosaicking for digital cameras. IEEE Transactions on Image Processing 13 (9), 1263–1274.
Yang, Y., Losson, O., Duvieubourg, L., Dec. 2007. Quality evaluation of color demosaicing according to image resolution. In: Proceedings of the 3rd International Conference on Signal-Image Technology & Internet-based Systems (SITIS'07). Shanghai Jiaotong University, China, pp. 689–695.
Zapryanov, G. S., Nikolova, I. N., Nov. 2009. Demosaicing methods for pseudo-random Bayer color filter array. In: Proceedings of the 5th International Conference – Computer Science'09 (CS'09). Sofia, Bulgaria, pp. 687–692.
Zhang, X., Wandell, B. A., 1997. A spatial extension of CIELAB for digital color reproduction. Journal of the Society for Information Display 5 (1), 61–63.
