Robust Face Recognition Using Symmetric Shape-from-Shading

W. Zhao and Rama Chellappa
Center for Automation Research and Department of Electrical and Computer Engineering
University of Maryland, College Park, MD 20742-3275
Email: {wyzhao, rama}[email protected]

The support of the Office of Naval Research under Grant N00014-95-1-0521 is gratefully acknowledged.

Abstract

Sensitivity to variations in illumination is a fundamental and challenging problem in face recognition. In this paper, we describe a new method based on symmetric shape-from-shading (SFS) to develop a face recognition system that is robust to changes in illumination. The basic idea of this approach is to use the symmetric SFS algorithm as a tool to obtain a prototype image which is illumination-normalized. Applying traditional SFS algorithms to real images of complex objects (in terms of their shape and albedo variations) such as faces is very challenging. It is shown that the symmetric SFS algorithm has a unique point-wise solution. In practice, given a single real face image with complex shape and varying albedo, even the symmetric SFS algorithm cannot guarantee the recovery of accurate and complete shape information. For the particular problem of face recognition, we utilize the fact that all faces share a similar shape, making the direct computation of the prototype image from a given face image feasible. The symmetry property has been used to develop a new model-based source-from-shading algorithm which is more accurate than existing algorithms. Using the symmetry property, we are also able to gain insight into how changes in illumination affect eigen-subspace based face recognition systems. Finally, to demonstrate the efficacy of our method, we have applied it to several publicly available face databases. We first compare a new illumination-invariant measure to the measure proposed in (Jacobs et al., 1998), and then demonstrate significant performance improvement over existing face recognition systems using PCA and/or LDA for images acquired under variable lighting conditions.

Keywords: illumination change, shape-from-shading (SFS), symmetric SFS, direct image computation, robust face recognition, image synthesis

I. Illumination Problem for Face Recognition

As one of the most successful applications of image analysis and understanding, face recognition has recently gained significant attention. This is evidenced by the emergence of dedicated face recognition conferences such as AFGR (Automatic Face and Gesture Recognition) and the systematic empirical evaluation of face recognition techniques in the FERET program (Phillips et al., 1997). There are at least two reasons for this trend: the first is the wide range of possible real applications, such as face recognition based ATM machines, smart user interfaces, etc. (Chellappa et al., 1995); and the second is the availability of feasible technologies (Wechsler, 1998). Many methods have been proposed for face recognition (Chellappa et al., 1995; Wechsler, 1998); they can be broadly divided into holistic template matching based systems (Turk and Pentland, 1991; Moghaddam and Pentland, 1997; Swets and Weng, 1996; Belhumeur et al., 1997; Etemad and Chellappa, 1997; Zhao et al., 1998a) and geometrical local-feature-based schemes (Wiskott et al., 1997; Manjunath et al., 1992). Even though algorithms of these types have been successfully applied to the task of face recognition/verification, they all have certain drawbacks. One of the major difficulties is the illumination problem (Adini et al., 1997), that is, the fact that system performance drops significantly when illumination variations are present in the input images. This difficulty is clearly revealed in the most recent FERET test report, and solving the illumination problem is suggested as a major research issue (Phillips, 1998).

To handle the illumination problem, researchers have proposed various methods. For example, within the eigen-subspace domain, it has been suggested that by discarding the three most significant principal components, variations due to lighting can be reduced. It was in fact experimentally verified in (Belhumeur et al., 1997) that discarding the first few principal components seems to work reasonably well for images under variable lighting. However, in order to maintain system performance for normally lighted images and improve performance for images acquired under varying illumination, we must assume that the first three principal components capture only the variations due to lighting. With the assumptions of Lambertian surfaces, no shadowing, and the availability of three aligned images/faces acquired under different lighting conditions, a 3D linear illumination subspace per person has been constructed in (Belhumeur et al., 1997; Nayar and Murase, 1994; Hallinan, 1994; Shashua, 1997) for a fixed viewpoint. Thus, under ideal assumptions, recognition based on the 3D linear illumination subspace is illumination-invariant. More recently, an illumination cone was proposed as an effective method of handling illumination variations, including shadowing and multiple lighting sources (Georghiades et al., 1998; Belhumeur and Kriegman, 1997). This method is an extension of the 3D linear subspace method (Hallinan, 1994; Shashua, 1997) and hence needs no fewer than three aligned training images acquired under different lightings. Quite different from these training-based approaches, Jacobs et al. (1998) have suggested a new measure, robust to illumination change, for comparing two images. Their method is based on the observation that the difference between two images of the same object is smaller than the difference between images of different objects.

In this paper, we address the illumination problem specific to face recognition under the following assumption: we do not have enough training images; e.g., just one or two face images (the two images not necessarily aligned) per class are available. We wish to develop methods that maintain the performance of the face recognition system when the input face image is acquired under different illumination conditions. There are several possible approaches to this problem. One approach is to render the prototype image Ip[x, y], which we define as the frontally lighted image of the same face, from any given image I[x, y]. More specifically, if we assume a coordinate system such that the z-axis is parallel to the optical axis and points towards the camera, and define the slant α as the angle between the negative illumination vector −L and the positive z-axis, then the prototype image is the image produced under the special lighting condition α = 0°. After this step, all comparisons/classifications are carried out using prototype images. By its definition, the prototype image for a class is fixed for a given viewing direction, and is not affected by changes in illumination. Another approach is the derivation of an image matching measure that is invariant to illumination changes. This approach is similar to the one proposed in (Jacobs et al., 1998). However, their matching measure is not strictly illumination-invariant, because the measure changes for a pair of images of the same object when the illumination changes. In this paper we use SFS as a tool to implement these two approaches. It is well known that the general SFS problem is an ill-posed problem in most practical cases (Marroquin, 1985; Bertero et al., 1987). However, we demonstrate the feasibility of applying symmetric SFS to symmetric objects such as faces.

In the literature, the work most relevant to our research is (Atick et al., 1996). In that paper, the authors suggested using Principal Component Analysis (PCA) as a tool for solving the parametric SFS problem, i.e., obtaining an eigen-head approximation of a real 3D head after training on about 300 laser-scanned range images of real human heads. Though the ill-posed SFS problem is transformed into a parametric problem, they still assume constant albedo. This assumption does not hold for most real face images, and we believe that it is one of the major reasons why most SFS algorithms fail on real face images.

This paper is organized as follows. The following section briefly discusses existing SFS techniques and the difficulties in applying them to face recognition. Section 3 introduces the application of symmetric SFS to face recognition, including a brief introduction to symmetric SFS and its applications: a new algorithm for light source estimation, a new method for rendering the prototype image, and the definition of an illumination-invariant measure are given. Also in this section, performance degradation due to illumination changes is analyzed for eigen-subspace based face recognition systems. This analysis justifies the need for solving the illumination problem in order to enhance system performance. In Section 4, we first compare performances using different matching measures, and then demonstrate the significantly improved performance of both PCA and subspace LDA face recognition systems (Zhao et al., 1998b; Zhao et al., 1999) implemented using prototype images generated by our symmetric SFS algorithm. All these experiments are based on publicly available face databases. Finally, we conclude the paper with a discussion and suggest future research directions in Section 5.

II. Applying Shape-from-Shading to Face Recognition

A. Shape from Shading: A Brief Review

The basic idea of SFS is to infer the structure (depth map) of an object from the shading information in one image (Horn and Brooks, 1989; Ikeuchi and Horn, 1981). In order to mathematically infer such information, we need to assume a reflectance model under which the image (the only measurement we have) is generated from the 3D depth map. There are many illumination models available, which can be broadly categorized into diffuse reflectance models and specular models (Nayar et al., 1991). Among these models, the Lambertian model is the most popular one for diffuse reflectance and has been used extensively in the computer vision community for the SFS problem. Specifically, most SFS algorithms assume the Lambertian model with known constant albedo. The nature of SFS, inferring the 2.5D structure (depth) from limited observations (image intensities), makes it an ill-posed problem in general (Bertero et al., 1987). This is reflected in the interesting phenomenon that many SFS algorithms can recover a "good" depth map for re-rendering the given image at the same lighting angle, but not one good enough for rendering images under different lighting angles. Most SFS algorithms in the literature have developed different methods of regularizing the ill-posed problem to obtain a reliable solution. For example, imposing the constraint of smoothness of the shape is the most commonly used regularization method (Horn and Brooks, 1989). One drawback of such a method is that it tends to over-smooth surface patches that have complex structure. Hence in (Zheng and Chellappa, 1991) this constraint is dropped and replaced with the physically meaningful constraint known as surface integrability. Theoretical advances have also made the SFS problem well-posed under certain conditions. For example, assuming the existence of singular points, i.e., maximally bright image points (I = 1 and p = Ps, q = Qs), it can be proved that the solution is unique, using various elegant mathematical tools such as dynamical system theory (Bruss, 1982; Saxberg, 1989; Dupuis and Oliensis, 1992; Oliensis, 1991). The availability of boundary conditions also makes SFS a well-posed problem (Horn, 1990). The key equation in the SFS problem is the irradiance equation (Horn and Brooks, 1989):

I[x, y] = R(p[x, y], q[x, y]),   (1)

where I[x, y] is the image of the scene, R is the reflectance map, and p[x, y], q[x, y] are the shape gradients (partial derivatives of the depth map z[x, y]). With the assumption of Lambertian surface reflection and a single, distant light source, the equation can be written as

I = ρ cos θ,   (2)

or

I = ρ (1 + p Ps + q Qs) / (√(1 + p² + q²) √(1 + Ps² + Qs²)),   (3)

where θ is the angle between the surface normal n = (p, q, 1) (p = ∂z/∂x, q = ∂z/∂y) and the negative illumination vector −L = (Ps, Qs, 1), which represents the direction opposite to the light source, and ρ is the composite albedo, including factors such as the reflecting efficiency of the surface and the illumination strength. The light source can also be represented by two angles, the slant α (the angle between −L and the positive z-axis: α ∈ [0°, 180°]) and the tilt τ (the angle between −L and the x-z plane: τ ∈ [−180°, 180°]) (Figure 19), for which the following expressions hold:

Ps = k sin α cos τ,   Qs = k sin α sin τ,   (4)

where k is the length of the vector L. SFS algorithms attempt to recover the shape information z or (p, q) from the shading information I using the basic equations (3) and (4). Some of them include an estimate of the illumination direction as part of the solution.

B. Shape-from-Shading for Face Recognition

The rationale for applying SFS to face recognition is to infer the face shape information, so that the problems of illumination change and rotation out of the image plane can be solved simultaneously. For example, we can solve the illumination problem by rendering the prototype image Ip from a given input image I. This can be achieved in two steps: first we apply an SFS algorithm to obtain the shape information (p, q); then we generate the prototype image Ip under the lighting condition α = 0°. However, it is difficult to successfully use existing SFS techniques for accurate face shape reconstruction. The reason is that the face not only has a complex shape but is also composed of materials with different reflecting properties: cheek, lip, eye, eyelid, etc. Hence, it is impossible to model the face surface effectively with the Lambertian model and constant albedo. However, it may be possible to model the surface using the Lambertian model and a varying albedo ρ[x, y], and we assume this model hereafter. To test how effective some existing SFS algorithms are for face images, we applied several SFS algorithms to synthetic face images generated based on the Lambertian model and uniform albedo, and (more importantly) to real face images. As we will soon see, it is on the real face images that these SFS algorithms fail, in terms of the accuracy of the recovered images.

B.1 Source from Shading

Before we apply an SFS algorithm, we need to reliably determine the light source direction. Many source-from-shading algorithms are available, for example, those of Lee and Rosenfeld (1989), Zheng and Chellappa (1991), and Pentland (1982). We implemented all three algorithms and chose to use the simplest/fastest algorithm, that of Lee and Rosenfeld, to estimate the tilt τ. Lee and Rosenfeld's method seems to produce reasonable results for both simulated and real face images (according to subjective judgment, as no ground truth is available); it also seems to produce better estimates of τ than the other two algorithms. This may be partly due to the relatively small images (96 × 84) we used, noise in the real images, and/or violation of the local spherical patch assumption used in (Lee and Rosenfeld, 1989; Zheng and Chellappa, 1991; Pentland, 1982).

However, unlike the other two algorithms, the simple formula derived in (Lee and Rosenfeld, 1989) for τ seems to be valid even for the face shape:

τ = arctan( E(Ix) / E(Iy) ),   (5)

where Ix and Iy are the partial derivatives of the image intensity with respect to the image coordinates. We also observed that none of the three algorithms produced satisfactory estimates for the slant angle. Again, we suspect that this is due to violation of the statistical assumptions made in these methods. However, this issue is beyond the scope of this paper and we leave it as an open issue. To estimate the slant angle more reliably, we propose a new model-based method which utilizes the prior information that the object is a face. We use a simple 3D face model (the Mozart head used in (Zheng and Chellappa, 1991)) to help determine the slant angle. Currently this is implemented via a minimization procedure defined by

α* = argmin_α (I_MF(α, τ) − I)²,   (6)

where I is the input image, and I_MF is the image generated from the 3D face model MF given a hypothesized α and a constant τ (τ has been solved for prior to this step). Alternatively, if we wish to have a better estimate of the tilt angle, we can obtain it by jointly solving

(α*, τ*) = argmin_{α,τ} (I_MF(α, τ) − I)².   (7)

One advantage of using a 3D face model is that we can take into account both attached-shadow and cast-shadow effects, which are not utilized in the traditional statistics-based methods. In the traditional statistics-based methods, shadow points are either treated as normal points obeying the statistical assumption (Lee and Rosenfeld, 1989; Pentland, 1982) or simply discarded (Zheng and Chellappa, 1991). However, these points contribute significantly and correctly to the computation of the slant and tilt angles. Hence the model-based method can produce a more accurate estimate if the 3D face model is a good approximation to the real 3D face shape. We will discuss the details of these special shadow effects in Section 3. Despite all these advantages, we note that the method has one drawback. Recall that the 3D face model has a constant albedo, while a real face image has varying albedo. So the above formulation may not give a very accurate estimate. We revisit this issue in Section 3.
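To make the model-based estimation concrete, the sketch below implements the tilt formula (5) and the slant search (6). It is an illustration under the stated assumptions, not the authors' implementation; in particular, render_model (a renderer of the 3D face model MF under hypothesized angles) is a hypothetical stand-in.

    import numpy as np

    def estimate_tilt(I):
        # Eq. (5), after Lee and Rosenfeld (1989): tilt from the mean image
        # gradients. np.gradient returns the row (y) derivative first.
        Iy, Ix = np.gradient(I.astype(float))
        return np.degrees(np.arctan2(Ix.mean(), Iy.mean()))

    def estimate_slant(I, render_model, tau, slants=range(0, 91, 2)):
        # Eq. (6): keep the hypothesized slant whose model rendering, with
        # the previously estimated tilt held fixed, best matches the image.
        errors = [((render_model(alpha, tau) - I.astype(float)) ** 2).sum()
                  for alpha in slants]
        return slants[int(np.argmin(errors))]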

B.2 Obtaining Prototype Images: Some Results

We have used three SFS algorithms for prototype face image rendering: (1) Zheng and Chellappa (1991) (an iterative global method based on the variational principle), (2) Wei and Hirzinger (1997) (a global method based on radial basis expansion), and (3) Tsai and Shah (1992) (a local approach based on linearization of the depth map). All these methods have been demonstrated successfully on many synthetic and a few real images. In (Zheng and Chellappa, 1991) the surface smoothness term usually employed in variational approaches was dropped. Instead, the image gradient constraint and surface integrability were imposed. This suggests that the algorithm can handle complex surfaces, and it guarantees that the reconstructed surface is physically meaningful. (Zheng and Chellappa, 1991) minimizes the following energy function:

∫∫ {[R(p, q) − I[x, y]]² + [Rx − Ix]² + [Ry − Iy]² + μ[(p − zx)² + (q − zy)²]} dx dy,   (8)

where Rx, Ry are the partial derivatives of the reflectance map R and μ is a weighting factor. By decomposing the depth map z[x, y] onto radial basis functions φ,

z[x, y] = Σ_{k=1}^{N} wk φ([x, y]; tk, sk),   (9)

where tk and sk are the parameters of the basis functions, Wei and Hirzinger (1997) transformed the problem of estimating (p, q) and z into that of estimating the parameters wk. The estimation is carried out by minimizing the energy function

∫∫ {[R(p, q) − I[x, y]]² + [s1[x, y] zxx² + s2[x, y] zxy² + s3[x, y] zyy²]} dx dy,   (10)

where si[x, y] (i = 1, 2, 3) are empirical quadratic smoothness constraints which allow for integrating prior knowledge. Even in principle, it is not possible to obtain an accurate depth map for a real face image using these methods, mainly because of the variations in albedo. From our experiments, it turns out that the simple local approach (Tsai and Shah, 1992) works best for real face images when the 3D face model MF is given as the initial shape. Possible reasons for the simple local approach being the best are: 1) the Lambertian model with constant albedo is inherently inconsistent with real images, causing systematic errors; 2) the local approach does not propagate errors, while the global approach does, allowing algorithms to walk away from a good solution; and 3) the underlying surface is complex, but a good initial depth map is available. Similar observations have also been reported in (Ferrie and Levine, 1989). We have applied the local SFS algorithm (Tsai and Shah, 1992) to dozens of face images from the Weizmann and Yale face databases. The method is based on the linearization of the reflectance map R in the depth z; hence the iteration at the n-th step is

z^n[x, y] = z^{n−1}[x, y] + (−f(z^{n−1}[x, y])) / ((d/dz[x, y]) f(z^{n−1}[x, y])),   (11)

where f is I[x, y] − R(∂z/∂x, ∂z/∂y) and the partial derivatives are approximated by forward differences ∂z/∂x ≈ z[x, y] − z[x−1, y] and ∂z/∂y ≈ z[x, y] − z[x, y−1]. In Figure 1, one of the best results, using both a synthetic image and a real face image, is shown. In both the synthetic and real image cases, we have an input image under lighting (α = 35°, τ = 120°). In each case, we plot the given image (column 1) along with the rendered image (column 2). In addition we plot the rendered prototype image (setting α to 0°) (column 3) along with the real (approximate) prototype image (column 4). Some "good" results are also plotted in Figure 2, with the input image, recovered original image and rendered prototype image arranged in the same row. However, we should point out that there are some unsatisfactory results; such bad results would dominate if we used image size 48 × 42 instead of 96 × 84, as the results do not render face-like images. Some examples using image size 96 × 84 are shown in Figure 3. From these limited results, we can say that the algorithm works well for synthetic images and that it may even recover reasonably good shape information for real face images. However, because it does not recover varying albedo information, the prototype images generated are not good enough for improving face recognition. This is visually justified in Figures 1 to 3. Hence we conclude that most existing SFS techniques are not good/robust enough to handle real face images with varying albedo and to have a significant impact in improving face recognition.
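For concreteness, the following is a minimal sketch of the linearized update (11) under the stated assumptions (Lambertian model, unit albedo, forward differences); it is an illustration, not the implementation of (Tsai and Shah, 1992).

    import numpy as np

    def tsai_shah(I, Z0, Ps, Qs, n_iter=100, eps=1e-8):
        # Local SFS after Eq. (11): a pointwise Newton-style update of the
        # depth z, with p and q from the forward differences in the text.
        I, Z = I.astype(float), Z0.astype(float).copy()
        ns = np.sqrt(1.0 + Ps ** 2 + Qs ** 2)
        for _ in range(n_iter):
            p = Z - np.roll(Z, 1, axis=1)   # dz/dx ~ z[x,y] - z[x-1,y]
            q = Z - np.roll(Z, 1, axis=0)   # dz/dy ~ z[x,y] - z[x,y-1]
            nn = np.sqrt(1.0 + p ** 2 + q ** 2)
            f = I - (1.0 + p * Ps + q * Qs) / (nn * ns)   # I - R, Eq. (3)
            # df/dz at [x,y]: dp/dz = dq/dz = 1 under forward differences
            df_dz = -((Ps + Qs) / (nn * ns)
                      - (p + q) * (1.0 + p * Ps + q * Qs) / (nn ** 3 * ns))
            Z += -f / (df_dz + eps)   # Eq. (11)
        return Z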

III. Applying Symmetric Shape-from-Shading to Face Recognition

A. Symmetric Shape from Shading

Symmetry is very useful information that can be exploited in SFS algorithms for symmetric objects. However, implicitly introducing this information into existing SFS algorithms does not seem to help very much. We therefore describe a direct method of incorporating this important cue. Before we go into the details, we would like to point out an interesting comparison.

Fig. 1. One of the best results obtained using the local SFS algorithm (top row: synthetic image case; bottom row: real face image case). First column: original input images; second column: recovered original images; third column: recovered prototype images; fourth column: real prototype images.

The uniqueness of SFS was achieved by directly exploiting the maximally bright point (singular point) in (Saxberg, 1989; Oliensis, 1992). By directly imposing the symmetry constraint, we prove that we can also achieve a unique solution. The difference is that our unique solution is obtained point-wise, while existing methods need either to obtain the unique solutions at the singular points first (p = Ps, q = Qs, I = 1 at a singular point) or to have a boundary condition (solutions for p, q on the boundary) available, and then propagate these solutions to other points/locations. Let us assume that we are dealing with a symmetric surface. The background should be excluded, since it need not be symmetric. Our definition of a symmetric surface is based on the following two equations (with an easily understood coordinate system):

z[x, y] = z[−x, y]   (12)

and

ρ[x, y] = ρ[−x, y].   (13)

One immediate property of a symmetric (differentiable) surface is that it has both anti-symmetric and symmetric gradients:

p[x, y] = −p[−x, y],   q[x, y] = q[−x, y].   (14)

Fig. 2. Some of the good results obtained using the local SFS algorithm. First column: input images; second column: recovered images; third column: recovered prototype images.

As suggested by (13) and (14), explicitly using the symmetry property can reduce the number of unknowns by half. Moreover, we can derive a new irradiance equation which contains no albedo information at all. We introduce the concept of the self-ratio image to cancel the effect of the varying albedo. The idea of using two aligned images to construct a ratio has been explored by many researchers (Jacobs et al., 1998; Wolff and Angelopoulou, 1994). Here we extend the idea to a single image. Let us substitute (13) and (14) into the equations for I[x, y] and I[−x, y], and add them, giving

I[x, y] + I[−x, y] = 2ρ (1 + q Qs) / (√(1 + p² + q²) √(1 + Ps² + Qs²)).   (15)

Similarly we have

I[x, y] − I[−x, y] = 2ρ p Ps / (√(1 + p² + q²) √(1 + Ps² + Qs²)).   (16)

To simplify the notation, let us define I+[x, y] = (I[x, y] + I[−x, y])/2 and I−[x, y] = (I[x, y] − I[−x, y])/2.

Fig. 3. Unsatisfactory results obtained using the local SFS algorithm. First column: input images; second column: recovered images; third column: recovered prototype images.

Then the self-ratio image rI can be defined as

rI[x, y] = I−[x, y] / I+[x, y],   (17)

which has the very simple expression

rI[x, y] = p Ps / (1 + q Qs).   (18)

Defining the right-hand side of the above equation as the self-ratio reflectance map rR(p, q), we arrive at the following self-ratio irradiance equation:

rI[x, y] = rR(p[x, y], q[x, y]).   (19)

Solving for shape information using this equation combined with (1) will be called symmetric SFS.
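Computing the self-ratio image from a face image is straightforward; the sketch below assumes the face has been centered so that its symmetry axis is the middle image column (the horizontal flip then stands in for I[−x, y]).

    import numpy as np

    def self_ratio_image(I, eps=1e-6):
        # Eqs. (15)-(17): the varying albedo cancels in the ratio.
        I = I.astype(float)
        I_plus = 0.5 * (I + I[:, ::-1])    # I+ from Eq. (15)
        I_minus = 0.5 * (I - I[:, ::-1])   # I- from Eq. (16)
        return I_minus / (I_plus + eps)    # rI of Eq. (17)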


Fig. 4. Comparison of reflectance maps: ρ = 1, Ps = 0.3 and Qs = 0.7. The plot in (a) is the regular SFS reflectance map R(p, q), while the plot in (b) is the symmetric reflectance map rR(p, q).

Unlike standard SFS, symmetric SFS has two reflectance maps, R(p, q) and rR(p, q). Figure 4 compares the two reflectance maps: one has a quadratic structure, while the other has a linear structure (except at the singular point of a rational function).

A.1 Symmetric SFS with constant albedo

When the albedo is constant across the whole image plane, symmetric SFS is a well-posed problem. More specifically, the shape information can be uniquely recovered at each point locally. In the following discussion, we first assume that the constant albedo value is known, and then discuss how to recover this unknown value. Since symmetric SFS has two equations (Figure 4), it is obvious that the true solution for (p, q) must lie on the intersection of the two reflectance maps (Figure 5). Since there are at most two intersections, there are at most two possible solutions in the absence of additional constraints. Can we achieve a unique solution to the symmetric SFS problem with a known constant albedo? The answer is expressed in the following theorem.


Fig. 5. Two possible solutions for symmetric SFS with constant albedo. This is a direct result of combining plots (a) and (b) in Figure 4.

Theorem 1: With a known constant albedo ρ, symmetric SFS has a unique point-wise solution at each point (x, y) for a symmetric C² surface z, excluding the following special conditions:

• Shadow point (including both attached-shadow and cast-shadow): here only regular SFS can be applied to its symmetric counterpart (−x, y), if that is not a shadow point.
• Slant angle α = 0° (Ps = 0 and Qs = 0): here the image is the prototype image and only regular SFS can be applied.
• Tilt angle τ = 90° (Ps = 0 and Qs ≠ 0): here regular SFS can be applied at all points.

Moreover, the surface z cannot have the following special form:

z(x′, y′) = F(x′) + G(y′),   (20)

where the new coordinate system x′-y′-z′ is obtained by rotating the x-y plane about the z-axis by τ (Figure 19). Before we prove this theorem, we need the following definition and lemmas.

Definition 1: Let us define three sets of image points (excluding shadow points) as follows:

V0 = {(x, y) | T2[x, y] = 0 or q[x, y] = q+ = q−},
V+ = {(x, y) | T2[x, y] > 0 and q[x, y] = q+},
V− = {(x, y) | T2[x, y] > 0 and q[x, y] = q−},   (21)

where T2[x, y] is defined later and q− and q+ are the two possible solutions of symmetric SFS for q.

Lemma 1: Excluding the special conditions listed in Theorem 1, the image plane is divided into distinct connected regions, each containing points with a single label, V0, V− or V+. More specifically, we have the following cases:

• The set V0 is empty, i.e., there is no point in the image satisfying T2 = 0; then there is only one set, V− or V+, for all points.
• The set V0 is not empty, so both sets V+ and V− exist. Moreover, any two regions with labels V− and V+ will be connected by a V0 region.

Lemma 2: Excluding the special conditions listed in Theorem 1, there is usually only one choice of q which can satisfy the equation py = qx in each distinct region V0, V− or V+. The exceptions occur only when the surface has the special form

z(x′, y′) = F(x′) + G(y′),   (22)

where the new coordinate system x′-y′-z′ is obtained by rotating the x-y plane about the z-axis by τ. Based on these two lemmas, we are ready to prove Theorem 1. (For the proofs of the lemmas, see the appendix.)

Proof of Theorem 1: In order to disambiguate the shape recovery, we use the surface integrability py = qx, arriving at a constructive proof. Excluding the listed special conditions, and after some algebraic manipulation, we can write the quadratic equation for q from (18) and (3):

[S(1 + (rI Qs/Ps)²) − (1 + rI)² Qs²] q² + 2Qs [S(rI/Ps)² − (rI + 1)²] q + [S(1 + (rI/Ps)²) − (1 + rI)²] = 0,   (23)

where S is defined as

S = (1 + Ps² + Qs²)(I/ρ)².   (24)

To simplify the notation, we write the coefficients of the second-order term, the first-order term and the constant term as a, b and c, i.e.,

a = S(1 + (rI Qs/Ps)²) − (1 + rI)² Qs²,
b = 2Qs [S(rI/Ps)² − (rI + 1)²],
c = S(1 + (rI/Ps)²) − (1 + rI)².   (25)

Hence (23) can be simplified to

a q² + b q + c = 0.   (26)

So the possible solutions for q are

q− = (−b − √(b² − 4ac)) / (2a)   or   q+ = (−b + √(b² − 4ac)) / (2a).   (27)

p Again, to simplify the notation, let us denote ? 2ba by T1 and b 2?a4ac by T2. Now the above equation can be further simpli ed as 2

q− = T1 − T2   or   q+ = T1 + T2.   (28)

Now let us utilize the two lemmas to finalize our proof. First, by using Lemma 1, we establish that within each distinct connected region (V− or V+) the choice of q− or q+ is the same for all points. Then we obtain a unique solution for q by imposing the surface integrability constraint based on Lemma 2. Noting that p is uniquely determined by a given q based on (18) completes the proof. Based on Theorem 1, we can use the following algorithm to perform shape recovery.

Algorithm I

1. Compute the T2 values at all image points and determine the zero locations by thresholding. This procedure generates the set V0. If the set V0 is empty, then step 2 can be omitted and the whole image plane is denoted by R0.
2. Use connected-component labeling to label the connected regions separated by V0: Ri (i = 1, ..., m).
3. For each labeled region Ri (i = 1, ..., m), choose the correct sign of T2 based on a comparison of the following two values:

∫∫_Ri |∂p+/∂y − ∂q+/∂x| dx dy   ≷   ∫∫_Ri |∂p−/∂y − ∂q−/∂x| dx dy.   (29)
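The point-wise computation at the heart of Algorithm I can be sketched as follows; the region labelling of steps 1-2 and the integrability test of Eq. (29) are omitted, so the routine only returns the two candidate branches.

    import numpy as np

    def symmetric_sfs_candidates(I, rI, Ps, Qs, rho=1.0):
        # Eqs. (23)-(28) for a known constant albedo rho, applied per pixel.
        S = (1.0 + Ps ** 2 + Qs ** 2) * (I / rho) ** 2          # Eq. (24)
        a = S * (1.0 + (rI * Qs / Ps) ** 2) - (1.0 + rI) ** 2 * Qs ** 2
        b = 2.0 * Qs * (S * (rI / Ps) ** 2 - (1.0 + rI) ** 2)
        c = S * (1.0 + (rI / Ps) ** 2) - (1.0 + rI) ** 2        # Eq. (25)
        T1 = -b / (2.0 * a)
        T2 = np.sqrt(np.maximum(b ** 2 - 4.0 * a * c, 0.0)) / (2.0 * a)
        q_minus, q_plus = T1 - T2, T1 + T2                      # Eq. (28)
        p = rI * (1.0 + q_plus * Qs) / Ps                       # via Eq. (18)
        return p, q_minus, q_plus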

Some Simulated Examples

We now illustrate how symmetric SFS can be used to recover the point-wise shape information (p, q). We demonstrate the shape recovery results for the following cases:

• Case I: T2 > 0 and there are no shadow points in the whole image plane.
• Case II: T2 > 0 in the whole image plane, but there are shadow points.
• Case III: V0 is not empty and there are no shadow points.
• Case IV: V0 is not empty and there are shadow points.

Two depth functions are used: z1 = s1 cos(x/100)(1 + 0.5 sin(y/100))² and z2 = s2 cos²(x/100) sin(y/100). Cases I, II and IV correspond to depth z1 with the scalar s1 being 5, 15 and 40 respectively, while case III corresponds to depth z2 with the scalar s2 being 25. The illumination angles are the same in all cases: α = 60° and τ = 135°. In the examples which contain shadow points, we leave the shape information unrecovered at the shadow points and their counterparts. However, it is entirely possible to recover the shape information at those points by applying regular SFS to their symmetric counterparts (if these are not shadow points), with the boundary conditions already solved for uniquely by symmetric SFS.

Determining the albedo value

Up to now, we have assumed that the constant albedo value is already known, as in most existing SFS algorithms. What about an unknown constant albedo value? In some special cases, we cannot recover both shape and albedo. For example, when we have a planar surface, the albedo value and the shape information cannot be uniquely determined if the true angle θt is not zero degrees or the true albedo value is not 1: ρt cos(θt) = ρ cos(θ). (We note that this particular case is already included in Lemma 2.) On the other hand, determining the albedo can be trivial in other cases. For example, if we assume the existence of maximally bright points (I = 1), we immediately have ρ = 1 and p = Ps, q = Qs at these points. Excluding these special cases, we show that we can uniquely determine the albedo value based on the following lemma (see the appendix for the proof):

Lemma 3: Excluding the special conditions listed in Theorem 1, usually there is only one choice of albedo value ρt (the true value) which can satisfy

Ca(ρ) = 0,   (30)

where Ca(ρ) is defined as ∫∫_R |∂p(ρ)/∂y − ∂q(ρ)/∂x| dx dy. The exceptions can occur only when the following configuration holds:

2 ∂rI′/∂x′ |_{x′=0} = (1/S′) ∂S′/∂x′ |_{x′=0},   (31)

or when the surface satisfies

q |_{x′=0} = constant,   (32)

where all the measurements S′, rI′ are in the new coordinate system x′-y′-z′, which is obtained by rotating the x-y plane about the z-axis by τ (Figure 19). Combining this lemma and Theorem 1, we have proved the following theorem:

Theorem 2: For symmetric SFS, we can recover both the constant albedo value and the point-wise (p, q) uniquely, except under the special conditions listed in Theorem 1 and Lemmas 2 and 3.

Based on Theorem 2, we have the following algorithm to recover both the constant albedo and the shape.



Fig. 6. Simulation results for cases I, II and III. The plots are arranged in rows, with each row representing one case. All plots in the first column, i.e., the plots in (a), (c) and (e), are the recovered shape information (p, q). The other plots are explained as follows: the plot in (b) is the underlying depth map of z1, the plot in (d) is the shadow map (dark part), and the plot in (f) is the map of the regions V0, V− and V+. (The strange appearance of V0 is due to the simple shrinking algorithm used to shrink the initial region V0 based on the threshold; in the ideal case, it would be just a curve with no branches. A similar phenomenon occurs in case IV (Figure 7).)



Fig. 7. Simulation results for case IV. The plot in (a) is the recovered shape information, the plot in (b) is the map of the regions V0, V− and V+, and the plot in (c) is the shadow map (dark part).

Algorithm II

1. Hypothesize the value of the constant albedo ρ.
2. Apply Algorithm I with the hypothesized albedo value.
3. Compute Ca(ρ): if Ca(ρ) ≤ threshold (theoretically this should be zero), we are done; otherwise, go to step 1 with a different hypothesis.
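Algorithm II can be read as a one-dimensional search over the hypothesized albedo. A minimal sketch, scoring (for brevity) only the q+ branch from the earlier symmetric_sfs_candidates sketch instead of the full region selection of Algorithm I:

    import numpy as np

    def determine_albedo(I, rI, Ps, Qs, candidates=np.linspace(0.1, 1.0, 19)):
        # Grid-search version of Algorithm II: the score is a discrete form
        # of the integrability residual C_a(rho) of Eq. (30).
        def C_a(rho):
            p, _, q = symmetric_sfs_candidates(I, rI, Ps, Qs, rho)
            p_y = np.diff(p, axis=0)[:, :-1]   # dp/dy
            q_x = np.diff(q, axis=1)[:-1, :]   # dq/dx
            return np.abs(p_y - q_x).sum()
        scores = [C_a(rho) for rho in candidates]
        return float(candidates[int(np.argmin(scores))])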

We verified this algorithm in the following simulations. The simulated data here are exactly the same as in the previous experiments (Figures 6 and 7), except that now the true albedo is 0.5 instead of 1. Figure 8 plots log(Ca(ρ) + 1) versus the hypothesized ρ values in all four cases. As can be seen, the minimum is always attained at the true albedo value. Though ideally the minimum should be zero, in practice this is not the case, due to algorithm implementation and numerical errors.


Fig. 8. Determining the albedo value by checking log(Ca(ρ) + 1) for each hypothesized ρ value. The true albedo value is 0.5 in all cases.

A.2 Symmetric SFS with varying albedo

When the albedo is not constant across the image plane, the situation becomes complicated, since we need to recover both (p, q) and a varying ρ[x, y] from just one image. At first glance, it seems that we could just use the symmetric irradiance equation (19) and the smoothness constraint to recover the shape information, as in ordinary SFS. But using the symmetric irradiance equation alone may not be a very good idea. This is because all the line contours (corresponding to different rI's) pass through the singular point (p = 0, q = −1/Qs in Figure 4(b)), while the true solution may be far away from the singular point. More specifically, enforcing the local smoothness constraint, or equivalently finding the solution (p, q) at a point [x, y] in the linear reflectance map rR which is closest to all the lines corresponding to the local neighborhood of [x, y], may not be stable.

Piece-wise constant albedo field

However, if the albedo field has a special form, that is, if the field can be divided into regions each having a constant albedo, then it is possible to recover both shape and piece-wise albedo information. Extending Theorem 2 and using the assumption that ρ is piece-wise constant, we can prove the following theorem:

Theorem 3: If the depth z is a C² surface and the albedo field is piece-wise constant, then both the point-wise solution for the shape (p, q) and the solution for the albedo ρ are unique, except under the special conditions listed in Theorem 1 and Lemmas 2 and 3.

Proof of Theorem 3: The piece-wise constant albedo field can be fully described in two parts: 1) the partition P of the 2D albedo field, which divides the whole field into connected regions R_Pi, each having a constant albedo value (neighboring regions cannot have the same albedo value), and 2) the albedo value ρi for each region R_Pi. To prove the theorem, we also need the following facts:

Fact 1: p, q, rI and S are continuous across the whole image plane except at shadow points. So are a, b, c and T1, T2.

Fact 2: I is piece-wise continuous except at shadow points, i.e., continuous within each constant-albedo region R_Pi. If the whole image plane has just one albedo value, then I is continuous.

The proof is in two steps: we first show that the partition of the albedo field is unique, and we then prove that we can recover the albedo value in each constant-albedo region uniquely, and hence (p, q) uniquely. Let us suppose that there exists another possible partition P of the albedo field which is different from the true partition Pt. This implies that the partition P consists of regions R_Pi which in turn contain parts from several neighboring regions R_Pj of the true partition Pt. According to partition P, the image intensity field I is continuous in region R_Pi. But this contradicts the fact that I is not continuous across neighboring regions R_Pj. Hence we have established that the partition of the albedo field is unique. To prove the second step, we first need to show that the ratios of the albedo values in different regions are unique, and then prove that the absolute albedo values are unique. Consider any two neighboring regions R_Pi and R_Pj, and denote the corresponding albedo ratio by r_{i/j}. If this ratio is not the true ratio, then the resulting S (24) in regions R_Pi and R_Pj under the hypothesized albedo values must be discontinuous along their border. But this contradicts one of the facts listed above. By repeating this procedure for all combinations of two neighboring regions, we prove that the relative albedo values are unique. Further, if we normalize the image intensity values I in each region according to the relative albedo values, we have a virtual constant-albedo field with a single unknown albedo value. This is the same situation as described in Theorem 2. Hence we have finished our proof without repeating the proof of Theorem 2. Based on Theorem 3, we present the following simple algorithm:

Algorithm III

1. Determine the partition P of the albedo field by finding the discontinuities of the image intensity field.
2. Hypothesize a possible value ρi for each region R_Pi, and apply Algorithm I.
3. Compute Ca(ρ[x, y]) and Cc(ρ[x, y]) to determine whether they are small enough that a different hypothesis is not needed. Here Cc(ρ[x, y]) is the measurement of surface discontinuity, and Ca(ρ[x, y]) is a generalized version of Ca(ρ), since ρ[x, y] is no longer a constant:

Ca(ρ[x, y]) = Σ_i ∫∫_{R_Pi} |∂p(ρ)/∂y − ∂q(ρ)/∂x| dx dy.   (33)

It should be noted that this simple algorithm is not robust, since it depends on the assumption that discontinuities in image intensity occur only along the borders of the albedo partition, and in practice this is not true for digitized images.

Arbitrary albedo field

For an albedo field which is purely continuous, or a mixture of continuous and discrete forms, the problem becomes quite difficult and we leave it as an open issue.

One simulated example

To conclude this subsection, we show a simple example (Figure 9) in which we first recover the simple (piecewise constant) albedo field and then recover the shape information. More specifically, our simple procedure is as follows: (1) we first use an image histogram-based approach to segment the albedo field; (2) we then apply Algorithm II to recover both shape and albedo information. The simulated data here are very similar to the depth function z2, but we have a piecewise constant albedo field with values 0.5, 0.8 and 1.

B. Enhanced Face Recognition

Having observed that most existing SFS techniques do not produce accurate prototype images for real face images, we developed symmetric SFS as a better alternative.


Fig. 9. Simulation result for varying-albedo symmetric SFS. The plot in (a) is the recovered shape information (p, q), the plot in (b) is the recovered albedo field, and the plot in (c) is the true depth map.

By introducing symmetric SFS, we can recover the unique solution in the constant albedo case (Theorems 1 and 2), and a unique solution also exists and may be recovered in the piecewise constant albedo case (Theorem 3). However, we should point out that in practice it is not easy to guarantee the correct solution for real face images. There are many practical issues that require study before we fully implement symmetric SFS: (1) how sensitive is the unique solution to possible violations of the assumptions, such as that of a C² surface; (2) how sensitive is the solution to noise in the measurement I (and hence rI); (3) how sensitive is the solution to the single-light-source assumption and to possible errors in source estimation. One way to resolve these practical difficulties would be to use more than one image, when available. Based on these observations and the assumption that we may have only one image available, we decided to incorporate another important fact: all faces share a similar common shape. With the aid of a generic 3D head model, we can shorten the two-step procedure of obtaining the prototype image from a given image (1. given image to shape via SFS; 2. recovered shape to prototype image) to one direct step: image to prototype image. As mentioned earlier, at least two methods can be used to enhance face recognition: one is to render the prototype image Ip, and the other is to obtain an illumination-invariant measure. In the following subsections, we discuss these two approaches in detail.

B.1 Symmetric Source-from-Shading

In Section 2 we argued that we can improve the estimate of the slant angle by using a generic 3D face model and formulating a minimization problem. However, we also noticed one drawback, i.e., variations in albedo were not taken into account. Hence the error measure is not very accurate, and neither is the estimated slant angle. We fix this problem by using the self-ratio image defined in (17). Notice that the expression in (18) involves only shape and lighting information, so we can compare the image generated by the generic 3D face model and the real face image in this ratio form. In summary, we formulate the following minimization problem to accurately estimate the slant angle:

α* = argmin_α (rI_MF(α, τ) − rI)²,   (34)

or, for both the slant and tilt angles:

(α*, τ*) = argmin_{α,τ} (rI_MF(α, τ) − rI)².   (35)
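A sketch of the symmetric slant search (34), reusing the self_ratio_image sketch given earlier and the same hypothetical render_model for the generic 3D head:

    import numpy as np

    def estimate_slant_symmetric(I, render_model, tau, slants=range(0, 91)):
        # Eq. (34): match self-ratio images instead of raw images, so the
        # unknown varying albedo of the real face drops out.
        rI = self_ratio_image(I)
        errors = [((self_ratio_image(render_model(alpha, tau)) - rI) ** 2).sum()
                  for alpha in slants]
        return slants[int(np.argmin(errors))]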

For a simple comparison of the methods using the original images (6) and the self-ratio images (34), we ran both algorithms on real face images. In Figure 10, we plot one face image along with the error-versus-slant curve for each method. As can be seen, the correct (by subjective judgment) value of the slant (80°) was recovered by the symmetric method (34), while it was missed by the method reported in Section 2 (6).

B.2 Obtaining the Prototype Face Image: A Direct Computation

Let us write the irradiance equation for the prototype image Ip with α = 0°, i.e., Ps = 0, Qs = 0:

Ip[x, y] = ρ / √(1 + p² + q²).   (36)


Fig. 10. Comparison of source-from-shading algorithms. The correct slant value was recovered using the symmetric model-based source-from-shading algorithm (right figure), while it was missed using the model-based algorithm (middle figure).

Comparing (15) and (36), we obtain

Ip[x, y] = (√K / (2(1 + q Qs))) (I[x, y] + I[−x, y]),   (37)

where K is a constant equal to 1 + Ps² + Qs². This simple equation directly relates the prototype image Ip to I[x, y] + I[−x, y], which is already available. It is worthwhile to point out that this direct computation of Ip from I offers the following advantages over the two-step procedure:

• There is no need to recover the varying albedo ρ[x, y].
• There is no need to recover the full shape gradients (p, q). The only parameter that needs to be recovered is the partial shape information q. (The light source (α, τ) has been computed earlier using the symmetric model-based source-from-shading algorithm.) Theoretically, we could use the symmetric SFS algorithm to compute this value. But, as we argued earlier, due to the practical issues of using just one image, we approximate this value with the partial derivative of a 3D face model and use the self-ratio irradiance equation (19) as a consistency-checking tool. More details are given in the implementation section.

Finally, we should point out that using the generic 3D face model may give us another advantage:

• The problem of solving Lambertian-model-based regular/symmetric SFS when the underlying model is not Lambertian is avoided. For example, when ambient lighting is present, the Lambertian model should be modified.

A Special Case: τ = 0°

It is interesting to note that for a special lighting condition, τ = 0° (Qs = 0), we have an extremely simple direct computation:

Ip[x, y] = (√K / 2)(I[x, y] + I[−x, y]).   (38)
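A minimal sketch of the direct computation (37); q_model, the partial shape borrowed from the generic 3D head model, and the estimated light source (Ps, Qs) are inputs.

    import numpy as np

    def prototype_image(I, q_model, Ps, Qs, eps=1e-8):
        # Eq. (37): I_p = sqrt(K) (I[x,y] + I[-x,y]) / (2 (1 + q Qs));
        # with Qs = 0 this reduces to the special case of Eq. (38).
        K = 1.0 + Ps ** 2 + Qs ** 2
        I_sum = I.astype(float) + I[:, ::-1]   # I[x,y] + I[-x,y]
        return np.sqrt(K) * I_sum / (2.0 * (1.0 + q_model * Qs) + eps)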

Fig. 11. Rendering images under different illuminations using the Mozart head. The first (left-most) image is the original prototype image; the following images (from left to right) are rendered using the parameters (α = 40°, τ = 120°), (α = 30°, τ = −110°) and (α = 40°, τ = 160°), respectively.

Equation (38) basically states that when we have a left-right light source, we can fully recover the prototype image, up to a scalar, without computing ρ and (p, q) at all. This also provides theoretical support for the heuristic method proposed for illumination compensation in (Zhao, 1999).

Image Rendering: A Reverse Process

It is natural to think about rendering images under different lighting conditions from a prototype image, which is basically the reverse of what we have just described. Combining (3) and (36) yields

I[x, y] = ((1 + p Ps + q Qs) / √K) Ip[x, y].   (39)

Again, we can render images under different lighting conditions without worrying about the albedo. One requirement for applying this process is that we need the full gradient information (p, q). However, this is possible if we can use a 3D face model MF to obtain a reasonable approximation to the true shape. We demonstrate the idea of using just the prototype image (a real face image from the Yale database) and the Mozart head model (without any modification) to generate differently illuminated images in Figure 11. This rendering method can be extended to synthesize new images under different illuminations and head rotations. The underlying theory is briefly described in Section 5, and an example is shown in Figure 18.
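The reverse process is equally compact; in this sketch the full gradients come from a generic head model (hypothetical p_model, q_model arrays), as in the Figure 11 demonstration.

    import numpy as np

    def render_under_light(Ip, p_model, q_model, Ps, Qs):
        # Eq. (39): relight a prototype image; negative shading values are
        # clipped to zero to mimic attached shadows.
        K = 1.0 + Ps ** 2 + Qs ** 2
        shading = (1.0 + p_model * Ps + q_model * Qs) / np.sqrt(K)
        return np.clip(shading, 0.0, None) * Ip.astype(float)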

B.3 An Illumination-Invariant Measure

The idea of having an illumination-invariant measure is very appealing, since such a measure eliminates the illumination sensitivity problem. The authors in (Jacobs et al., 1998) suggested using the complexity of the ratio of two aligned images as the similarity measure. This measure is not a strictly illumination-invariant measure, but it has proven to be very useful in handling the illumination problem. The basic idea behind this measure is that the ratio of two images is much simpler when they are from the same object. This is reflected in the following two expressions:

I1/I2 = √(K2/K1) (1 + pI Ps,1 + qI Qs,1) / (1 + pI Ps,2 + qI Qs,2)   (40)

for images of the same object, and

I1/J2 = √(K2/K1) (ρI/ρJ) √((1 + pJ² + qJ²)/(1 + pI² + qI²)) (1 + pI Ps,1 + qI Qs,1) / (1 + pJ Ps,2 + qJ Qs,2)   (41)

for images of different objects. It is obvious that (41) is more complex than (40) in general. Picking the integral of the magnitude of the gradient of the ratio image as the measure of complexity, they proposed the following symmetric similarity measure:

dG(I, J) = ∫∫ min(I, J) ‖∇(I/J)‖ ‖∇(J/I)‖ dx dy.   (42)

They also noticed the similarity between this measure and a measure that simply compares the edges. An advantage of this measure is that it can be easily computed from the original image pairs using gradient operators. Striving for a true illumination-invariant measure, we would like to have a measure which is a function of only the shape gradients and albedos of the two images being compared. It seems very hard to find such a measure at first glance; indeed it is very hard if we start from the original images. But we can easily obtain such a measure using the difference of the prototype images (36):

dP(I, J) = ∫∫ ‖SFSp(I) − SFSp(J)‖ dx dy,   (43)

where SFSp is the operator which recovers the prototype image Ip or Jp from a given image I or J using the proposed model-based symmetric SFS algorithm (direct image computation). This symmetric measure is a function of only the shape gradients (p, q) and albedos ρ; hence it is illumination-invariant. To illustrate the different mechanisms of these two measures and the direct image difference measure dD(I, J) = ∫∫ ‖I − J‖ dx dy, we use a real example of comparing two face images of the same person in Figure 12. As can be seen, both the gradient measure dG(I, J) and the illumination-invariant measure dP(I, J) improve our ability to classify images under different illuminations. But they have the following differences:


Fig. 12. Comparison of different measures: a) the image pair to be compared; b) difference using the illumination-invariant measure; c) difference using the gradient measure of Jacobs et al.; d) direct image difference.

• dP(I, J) is a true illumination-invariant measure, while dG(I, J) is not.
• dG(I, J) can be easily computed from images, while dP(I, J) needs the prior computation of

Ip and Jp.

To summarize, dP(I, J) is the better measure in theory, but in practice it is not easy to declare which is better, because of the computation errors in Ip and Jp. This point is reflected in the experimental section, which compares these two measures.
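As a sketch, the measure (43) reduces to a plain difference once the prototype images are computed (here via the earlier prototype_image sketch; light_I and light_J are the estimated (Ps, Qs) pairs for the two inputs):

    import numpy as np

    def d_P(I, J, q_model, light_I, light_J):
        # Discrete form of Eq. (43): compare prototype images, not raw images.
        Ip = prototype_image(I, q_model, *light_I)
        Jp = prototype_image(J, q_model, *light_J)
        return float(np.abs(Ip - Jp).sum())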

C. Shadow and Implementation Issues

One important issue we have not discussed in detail is the attached-shadow and cast-shadow problem. By definition, attached-shadow points are those where the image intensities are set to zero because (1 + p Ps + q Qs) ≤ 0. A cast shadow is the shadow cast by the object itself. First let us assume that there is no shadow of any type in the prototype image. But it is possible for images under variable lighting to have both attached and cast shadows. Let us denote the set of all attached-shadow points by A = {[x, y] | (1 + p Ps + q Qs) ≤ 0}, and the set of all cast-shadow points by C = {[x, y] | I[x, y] = 0 ∩ (1 + p Ps + q Qs) > 0}. We also denote the symmetric counterparts of A and C at points [−x, y] by As and Cs respectively. We assume that points in As and Cs are not shadow points (otherwise there is nothing we can do except use an interpolation/extrapolation scheme to recover the information). In addition to these shadow points, we need to single out the "bad" points, or outliers in statistical terms, for stable source estimation and prototype image rendering. This is because we need to compute the self-ratio image, which may be sensitive to image noise. Let us denote the set of all "bad" points by B; at these points the values cannot be used. From a robust statistics point of view, these "bad" points are outliers (Hampel et al., 1986). Hence our policy for handling these outliers is to reject them and mark their locations. We then use values computed at good points to interpolate/extrapolate at the marked bad points. Many interpolation methods are available (Farin, 1993), such as nearest-neighbor interpolation, polynomial interpolation, spline interpolation (Boor, 1978), etc. Since we may have an irregular structure of good points, we use triangle-based methods.
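Given (approximate) gradients, the two shadow sets are simple masks; a sketch with p, q taken from the generic head model:

    import numpy as np

    def shadow_masks(I, p, q, Ps, Qs):
        # A: attached shadow, where the geometry term of Eq. (3) is <= 0.
        # C: remaining zero-intensity points, i.e., cast shadow.
        geom = 1.0 + p * Ps + q * Qs
        A = geom <= 0.0
        C = (np.asarray(I) == 0) & (geom > 0.0)
        return A, C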

C.1 Source Estimation

In estimating the source angles, we need to compute the self-ratio image rI. At the attached-shadow points and cast-shadow points we have

rI,A = −1,   rI,As = 1,   (44)

and

rI,C = −1,   rI,Cs = 1.   (45)

It is obvious that these values contain useful information and can be used to estimate the slant angle α. To handle the bad points, we can use a robust norm instead of the L2 norm (Black and Jepson, 1996). For the implementation, we use a simpler form: namely, we declare points to be outliers and reject them whenever rI − rI_MF at these points is out of some range (Black and Jepson, 1996):

B = {[x, y] | |rI − rI_MF| > σ/√3}.   (46)

C.2 Prototype Image Rendering

First, at these shadow points we have

IA = 0,   IAs = ρ (1 + p Ps + q Qs) / (√(1 + p² + q²) √K),   (47)

and

IC = 0,   ICs = ρ (1 + p Ps + q Qs) / (√(1 + p² + q²) √K).   (48)

Hence we have

I+,A = ((1 + p Ps + q Qs) / (2√K)) Ip,   I+,C = ((1 + p Ps + q Qs) / (2√K)) Ip.   (49)

So we can still use the points of As and Cs to render the prototype image, if a good approximation to the full gradient is available. Since we are using a generic 3D face model to approximate the individual face shape, we have an inherent approximation error in (p, q). One thing we can do is to perform a consistency check at each point, that is, to check whether (18) is approximately satisfied; otherwise we should declare these points outliers and reject them. Hence the bad points are defined as

B = {[x, y] | |rI − pMF Ps / (1 + qMF Qs)| > ε}.   (50)

D. Performance Analysis of Subspace Methods

Eigensubspace approaches have been very popular in the face recognition literature (Sirovich and Kirby, 1987; Turk and Pentland, 1991). They have been used as classifiers (Turk and Pentland, 1991), and have also been treated as a preprocessing step for enhancing generalized/predictive recognition performance (Zhao et al., 1998a; Zhao et al., 1999). It is therefore of interest to analyze the effect of illumination change on subspace-based face recognition methods using the irradiance equation. We do this by analyzing how illumination variation changes the subspace projection coefficients of the images, causing performance degradation. Consider the basic expression for the subspace decomposition of a face image:

I ≈ I_A + Σ_{i=1}^{m} a_i Φ_i,    (51)

where I_A is the average image, the Φ_i are the eigenimages, and the a_i are the projection coefficients. In practice the eigenimages are obtained by training on all the available training samples, i.e., by solving an eigenproblem. Without loss of generality, let us assume that all the training images are symmetric, for example prototype images. Then it is easy to show that the following is true:

Lemma 4: Given symmetric training images I, the average image I_A and the eigenimages Φ_i are also symmetric.

This can be shown in a constructive way: first, there exist symmetric images that are eigenimages; second, the eigenimages are unique if unit norm and the image sign are enforced. In the following analysis we concentrate on the projection coefficients a_i, since they are the core features used in any subspace-based method. Suppose that we have a prototype image I_p and an image Ĩ under lighting (P_s, Q_s, -1) for a particular class/person; they then have projection coefficient vectors a = [a_1, a_2, ..., a_m]^T for I_p and ã = [ã_1, ã_2, ..., ã_m]^T for Ĩ. These coefficients are computed as follows:

a_i = I_p ⊙ Φ_i - I_A ⊙ Φ_i
ã_i = Ĩ ⊙ Φ_i - I_A ⊙ Φ_i,    (52)

where ⊙ denotes the sum of all element-wise products of two matrices (vectors). If we divide the images and the eigenimages into two halves, i.e., the left and the right, then we have

a_i = I_p^L ⊙ Φ_i^L + I_p^R ⊙ Φ_i^R - I_A ⊙ Φ_i
ã_i = Ĩ^L ⊙ Φ_i^L + Ĩ^R ⊙ Φ_i^R - I_A ⊙ Φ_i.    (53)

Now let us use the image rendering equation (39) and the symmetry properties of the eigenimages (based on Lemma 4) and of the prototype image; we then have

a_i = 2 I_p^L[x,y] ⊙ Φ_i^L[x,y] - I_A ⊙ Φ_i
ã_i = (2/K)(I_p^L[x,y] + I_p^L[x,y] q^L[x,y] Q_s) ⊙ Φ_i^L[x,y] - I_A ⊙ Φ_i,    (54)

thus proving the following theorem:

Theorem 4: Assume that we have symmetric eigenimages; then the subspace projection coefficients of the prototype image and of an image under lighting (P_s, Q_s, -1) are related by the equation

ã = (1/K) a + (Q_s/K) [f_1, f_2, ..., f_m]^T - ((K - 1)/K) a_A,    (55)

where f_i = 2(I_p^L[x,y] q^L[x,y]) ⊙ Φ_i^L[x,y] and a_A = [I_A ⊙ Φ_1, ..., I_A ⊙ Φ_m]^T is the projection coefficient vector of the average image I_A. Before going into any general discussion, let us focus on an interesting special case which has


[Figure 13 graphic: two panels plotting Projection Coefficients against Eigen Basis (indices 0-14); panel (a) is titled "Projection variation due to pure difference in class" and panel (b) "Projection variation due to pure illumination".]

Fig. 13. Change of projection vectors due to (a) class variation, and (b) illumination change.

appeared before: τ = 0°. In this case, the projection coefficient vectors are related as follows:

ã = (1/K) a - ((K - 1)/K) a_A.    (56)

From this simple relation we can see that the performance of any subspace method that does not account for image intensity scaling and constant offsets, including eigenfaces and subspace LDA, is subject to degradation under illumination change. On the other hand, the illumination problem in this special case can be handled simply by intensity scaling and offset correction, so it is not really a problem. In general, (55) suggests that significant illumination change can seriously degrade the performance of subspace-based methods. Hence it is necessary to seek methods that compensate for these changes, as we did in the previous sections. Figure 13(b) plots the projection coefficients for face images synthesized from a real image under different illuminations (α ∈ [0°, 40°], τ ∈ [0°, 180°]). Please refer to Section 3.2.2 for how to synthesize face images under different illuminations. For comparison purposes, we plot the variations in projection coefficient vectors due to difference in class (Figure 13(a)) along with the variations due to illumination change of the same face (Figure 13(b)). This clearly suggests that illumination changes can severely affect system performance. A small numerical check of the special case (56) is sketched below.
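The following Python fragment verifies the affine relation (56) on toy data; the images, eigenimages and the value of K are fabricated for illustration, and the lit image is modelled only through its symmetric part (the odd p·P_s shading term projects to zero against symmetric eigenimages, which is why it can be dropped here).

import numpy as np

def coeffs(img, Phi, I_A):
    """Projection coefficients a_i = (img - I_A) (.) Phi_i, where (.) is the
    sum of all element-wise products, as in (52)."""
    return (Phi * (img - I_A)).reshape(Phi.shape[0], -1).sum(axis=1)

rng = np.random.default_rng(0)
Phi = rng.standard_normal((5, 8, 8))     # five toy "eigenimages"
I_A = rng.standard_normal((8, 8))        # toy average image
I_p = rng.standard_normal((8, 8)) + 3.0  # toy prototype image
K = 1.4                                  # e.g. sqrt(1 + Ps**2) with Qs = 0
a = coeffs(I_p, Phi, I_A)
a_lit = coeffs(I_p / K, Phi, I_A)        # coefficients under the new lighting
a_A = (Phi * I_A).reshape(5, -1).sum(axis=1)
assert np.allclose(a_lit, a / K - (K - 1) / K * a_A)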

IV. Experiments

In this section, we demonstrate the effectiveness of applying the proposed light-source estimation algorithm and the prototype image rendering algorithm based on the symmetry property

to real face images. The algorithms were used to generate prototype images from the given images under various lighting conditions. We then used the prototype images to construct an illumination-invariant measure, and, more importantly, we used these prototype images in existing PCA and subspace LDA face recognition systems. Significant performance improvements were observed in experiments on two face databases. Except in Figure 14, all face images in these experiments have size 48 × 42, obtained by normalizing the original images with the eye locations fixed (a sketch of such a normalization is given below). We used size 96 × 84 in Figure 14 because our experiments indicated that a certain image size is needed for regular SFS algorithms to work well. The faces used in our experiments are from the FERET, Yale and Weizmann databases. The Yale database contains 15 persons, each with four images obtained under different illuminations. The Weizmann database contains 24 persons, also with four images per person obtained under different illuminations.
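For concreteness, here is one way such an eye-based normalization can be implemented; the target eye positions and the use of SciPy's affine_transform are our choices for illustration, not the paper's.

import numpy as np
from scipy.ndimage import affine_transform

def normalize_face(img, eye_l, eye_r, out_shape=(48, 42),
                   out_l=(16.0, 12.0), out_r=(16.0, 30.0)):
    """Similarity-warp a face so the detected eye centres (row, col) land on
    fixed output positions; the targets here are illustrative.
    affine_transform maps *output* coordinates to *input* coordinates."""
    src = np.array([eye_l, eye_r], float)
    dst = np.array([out_l, out_r], float)
    d_src, d_dst = src[1] - src[0], dst[1] - dst[0]
    s = np.hypot(*d_src) / np.hypot(*d_dst)       # scale factor
    t = np.arctan2(*d_dst) - np.arctan2(*d_src)   # rotation angle
    M = s * np.array([[np.cos(t), -np.sin(t)],
                      [np.sin(t),  np.cos(t)]])   # output -> input linear map
    offset = src[0] - M @ dst[0]                  # pins the left eye in place
    return affine_transform(img, M, offset=offset, output_shape=out_shape)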

A. Rendering Prototype Images

We have applied our light-source estimation (Section 3.2.1) and direct prototype image rendering method (Section 3.2.2) to more than 150 face images from the Yale and Weizmann databases. Though the purpose of rendering prototype images is to improve recognition performance, we would like to visualize the quality of the rendered images and compare them to images obtained using a local SFS algorithm (Tsai and Shah, 1992). In Figure 14 we compare the results of rendering the prototype images using 1) local SFS and source-from-shading, as introduced in Section 2.2.2; and 2) direct computation based on symmetric SFS, source-from-shading (Section 3.2.1) and a generic 3D face model. These results clearly indicate the superior quality of the prototype images rendered by direct computation. More rendered prototype images using only direct computation are shown in Figure 15.

B. Enhancing Face Recognition

We have conducted three experiments. The first experiment demonstrates improvements in recognition performance from using the new illumination-invariant measure, and compares this measure with the gradient measure introduced in (Jacobs et al., 1998). The second experiment shows that using the rendered prototype images instead of the original images can significantly improve existing face recognition methods such as PCA and LDA. We think it is appropriate to separate the first two experiments since they are based on different methodologies. For example, the introduction of the gradient measure is meant to alleviate the effect of illumination change, but

Fig. 14. Image rendering comparison. All the original images are listed in the first column. The second column shows the prototype images rendered using the local SFS algorithm. Prototype images rendered with symmetric SFS are plotted in the third column. Finally, the fourth column shows real images which are close to the prototype images.

it does not involve any training, whereas PCA and LDA are training-based methods that do not explicitly handle illumination changes. Finally, in the third experiment we demonstrate that the generalized/predictive recognition rate of subspace LDA can be greatly enhanced.

B.1 Direct Image Difference, Gradient Measure and Illumination-Invariant Measure

In this section, we investigate the effect of using different measures on the recognition of face images under different illuminations. The three measures we compare are 1) the image difference measure d_D(I, J), 2) the gradient measure d_G(I, J), and 3) the illumination-invariant measure d_P(I, J). In the following experiment, we divide each face database (Yale and Weizmann), which contains four images per person, into two disjoint sets: a sample set and a testing set. We conducted two rounds of experiments for each face database: in the first round (Table 1) the sample set contains only one image, while in the second round (Table 2) the sample set contains two images. Figure 16 shows some of these images. In our experiments, we found that applying a zero-mean-unit-variance preprocessing step can considerably improve matching performance based on the image measure, and we report only these results. Some conclusions can be drawn from the results: 1) both the gradient measure and the illumination-invariant measure handle illumination changes better than the image measure,

Fig. 15. Rendering the prototype image. The images with various lighting conditions are shown in the first two columns, while the corresponding prototype images are shown in the last two columns.


Fig. 16. Databases (Yale and Weizmann) used in our experiments. We divide the images into two disjoint sets: a sample set and a testing set. In the first round of the test (Table 1), the sample set contains only one image per class (column 1), and in the second round (Table 2), the sample set contains two images (columns 1 and 2).

and 2) the illumination-invariant measure is a better measure than the gradient measure if the computation/approximation errors in (p, q) are small. However, we also noticed an interesting phenomenon in Table 1 for the Weizmann database: the illumination-invariant measure is worse than the gradient measure, and even worse than the image measure. This is due to the following reasons: 1) a big approximation error in using the generic 3D model to fit individual faces, and 2) the image quality in the Weizmann database is worse than in the Yale database. These reasons also explain a similar phenomenon in Table 4 for the Weizmann database.

Database    Image Measure    Gradient Measure    Illumination-Invariant Measure
Yale        68.3%            78.3%               83.3%
Weizmann    86.5%            97.9%               81.3%

TABLE I. Testing round one: matching performance using three different measures. Only one sample image per class is available.

Database    Image Measure    Gradient Measure    Illumination-Invariant Measure
Yale        78.3%            88.3%               90.0%
Weizmann    72.9%            96.9%               97.9%

TABLE II. Testing round two: matching performance using three different measures. Two sample images per class are available.
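To make the matching protocol concrete, a minimal sketch of the image-difference measure with zero-mean-unit-variance preprocessing and nearest-neighbour classification follows. The exact norms for d_G and d_P are defined earlier in the paper and are omitted here; the L2 form of d_D is our assumption.

import numpy as np

def zmuv(img):
    """Zero-mean-unit-variance normalization applied before matching."""
    v = img.astype(float)
    return (v - v.mean()) / (v.std() + 1e-12)

def d_image(I, J):
    """Direct image-difference measure d_D(I, J) after ZMUV preprocessing."""
    return np.linalg.norm(zmuv(I) - zmuv(J))

def classify(probe, gallery):
    """Nearest-neighbour matching over a {label: image} gallery dict."""
    return min(gallery, key=lambda lbl: d_image(probe, gallery[lbl]))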

B.2 Prototype Images for PCA and LDA

In this section we show that the use of prototype images significantly improves existing PCA and LDA systems. Both PCA and LDA use either the original images or the rendered prototype images as training and testing samples. The testing scenarios here are exactly the same as in the experiment described above; hence PCA and LDA are trained using either one image per class (Table 3) or two images per class (Table 4). LDA was performed on the PCA projection coefficients. The LDA used here is a regularized one, i.e., the within-class scatter matrix S_w is modified by adding a very small diagonal constant (Zhao et al., 1999); for example, in the case of one sample per class, S_w was set to the identity matrix. If no weights are applied in the PCA and LDA transformed spaces, the two classifiers are exactly the same, as is clearly revealed in Table 3. A sketch of this regularized LDA follows.
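A compact Python sketch of the regularized LDA described above; it operates on PCA projection coefficients X (one row per sample, as a NumPy array) with class labels y, and the regularization constant is illustrative rather than the paper's value.

import numpy as np

def regularized_lda(X, y, reg=1e-4):
    """LDA on (already PCA-projected) features with a regularized
    within-class scatter S_w + reg*I; with one sample per class, S_w
    is effectively a scaled identity, matching the text above."""
    classes = np.unique(y)
    mu = X.mean(axis=0)
    d = X.shape[1]
    Sw = reg * np.eye(d)
    Sb = np.zeros((d, d))
    for c in classes:
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        Sb += len(Xc) * np.outer(mc - mu, mc - mu)
    # generalized eigenproblem Sb w = lambda Sw w, solved via Sw^{-1} Sb
    vals, vecs = np.linalg.eig(np.linalg.solve(Sw, Sb))
    order = np.argsort(vals.real)[::-1][: len(classes) - 1]
    return vecs.real[:, order]   # columns: LDA projection directions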

Database    PCA      LDA      P-PCA    P-LDA
Yale        56.7%    56.7%    81.7%    81.7%
Weizmann    38.5%    38.5%    77.1%    77.1%

TABLE III. Testing round one: matching performance based on PCA and LDA. We use the symbols P-PCA and P-LDA to denote PCA and LDA systems using prototype images.

Database    PCA      LDA      P-PCA    P-LDA
Yale        71.7%    88.3%    90.0%    95.0%
Weizmann    97.9%    100%     95.8%    98.9%

TABLE IV. Testing round two: matching performance based on PCA and LDA. The symbols used here are the same as in Table 3.

B.3 Enhanced Predictive/Generalized Recognition: Subspace LDA

We have built a successful subspace LDA face recognition system (Zhao et al., 1998a; Zhao et al., 1999). By carefully choosing the dimension of the (PCA) subspace, this method can handle the overfitting problem, so it has good predictive/generalized recognition performance. This characteristic enables the system to recognize new face images or new classes/persons without retraining the subspace (PCA) and/or LDA. For details of the system and an independent evaluation, see (Zhao et al., 1998a; Phillips et al., 1997). In this experiment, the face subspace was obtained by training on 1038 FERET images from a total of 444 classes, with only 2 to 4 training samples per class; the face subspace dimension was chosen to be 300 based on the characteristics of the eigenimages (Zhao et al., 1998a). The LDA projection matrix W was also trained on these FERET images. There was no retraining of the subspace or of W when new classes from the Yale and Weizmann databases were presented in this experiment. We used a testing protocol similar to the FERET test: we have a gallery set and a probe set. For each image in the probe set, a rank ordering of all the images in the gallery set is produced, and the cumulative match scores are computed in the same way as in the FERET test (Phillips et al., 1997). We conducted two independent experiments, on the Yale and Weizmann databases. For the Yale database, the testing database is composed of a gallery set containing 486 images from several face databases, including 15 (one image per class) from the Yale database, and a probe set containing 60 images, also from the Yale database. For the Weizmann database, the testing database is composed of a gallery set containing 495 images from several face databases, including 24 (one image per class) from the Weizmann database, and a probe set

containing 96 images from the same database. Figure 17 shows the significant improvement in performance obtained by using the prototype images on both databases.

[Figure 17 graphic: two panels plotting Accumulative Match Score against Rank (0-500); panel (a) is titled "Performance Comparison for Yale Dataset" and panel (b) "Performance Comparison for Weizmann Dataset".]

Fig. 17. Enhancing the predictive/generalized recognition of subspace LDA. The thin lines represent the cumulative scores when applying the existing subspace LDA to the original images, while the thick lines represent the scores when applying it to the prototype images. The curves in (a) are for the Yale face database; the curves in (b) are for the Weizmann database.
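The cumulative match scores plotted in Figure 17 follow the FERET protocol; a minimal sketch of their computation is given here, assuming a precomputed probe-to-gallery distance matrix (the names are illustrative).

import numpy as np

def cumulative_match(dist, probe_ids, gallery_ids):
    """Cumulative match scores as in the FERET protocol: dist[i, j] is the
    distance from probe i to gallery item j; the score at rank r is the
    fraction of probes whose correct identity appears in the top r.
    Assumes every probe identity is present in the gallery."""
    order = np.argsort(dist, axis=1)                 # ranked gallery per probe
    ranked = np.asarray(gallery_ids)[order]
    hits = ranked == np.asarray(probe_ids)[:, None]
    first_hit = hits.argmax(axis=1)                  # rank index of the match
    n_probes, n_gallery = dist.shape
    return np.cumsum(np.bincount(first_hit, minlength=n_gallery)) / n_probes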

V. Discussion and Conclusions

We have presented in this paper a new symmetric SFS algorithm and proved that it has a unique point-wise solution. By combining the symmetric SFS algorithm with a generic 3D head model, we can enhance face recognition under illumination changes. The feasibility and efficacy of this method have been demonstrated experimentally using two sets of real face images.

A. Discussion

The new symmetric SFS algorithm presented in this paper has the following advantages over existing SFS algorithms for symmetric objects:
- It not only has a point-wise unique solution for (p, q) but also a unique solution for the albedo. Here the albedo can be either constant or piece-wise constant across the whole image plane.
- By using the self-ratio image, problems due to variations in albedo are avoided. Hence the model-based light-source estimation approach becomes more accurate.
- By using the symmetry property, shadow points can still be utilized for both light-source estimation and solving the SFS problem, if their symmetric counterparts are not shadowed.
- Combining symmetric SFS and regular SFS, a unique solution can be obtained even when shadow points are present. More specifically, after recovering the point-wise solutions for (p, q) at lighted points, regular SFS can be applied to the symmetric counterparts of the shadow points. Since the boundary conditions (values at surrounding lighted points) are given, regular SFS is likely to be well-posed and to have a unique solution (Horn, 1990).
- Compared to photometric stereo algorithms (Woodham, 1980), the registration problem of multiple images is alleviated, and the calibration of multiple-light-source brightness is avoided.

Based on symmetric SFS and a generic 3D model, our method of direct prototype image computation also has the following features in handling the illumination problem in face recognition:
- There is no training; hence only one image is needed.
- The new matching measure is illumination-invariant.
- The illumination problem and the recognition problem based on training images are cleanly separated.
- Since no full symmetric SFS is actually carried out and the computation is image to image, it is fast.
- The problem of solving for complex/arbitrary albedo information is avoided.
- The problem of solving Lambertian-model-based regular/symmetric SFS when the underlying model is not Lambertian is avoided. For example, when ambient lighting is present the Lambertian model should be modified.
- Our method can easily be integrated with other approaches. For example, we can combine symmetric SFS with multiple 3D models, such as laser-scanned range images of real human heads. Using multiple 3D models, we can render much more accurate prototype images by choosing the best model.

As a side product, a new model-based image synthesis method has been developed to render new images of a symmetric object with arbitrary albedo under different rotations and illuminations. Some drawbacks of our scheme:
- The need to apply symmetric light-source estimation for each image, which may be time-consuming.
- The accuracy of the scheme depends on the light-source estimation and on how well the generic 3D head approximates the real 3D shapes of individual faces.

B. Future Directions

We plan to investigate many practical issues that arise in implementing symmetric SFS, its sensitivity to measurement errors, etc. Also, good algorithms for recovering both shape and albedo information in the general case need to be developed. Since our image rendering algorithm uses division, and the given partial shape information may not be accurate, it is likely that we will have some bad reconstruction points. In order to obtain a better individual fit, we may need to deform the generic 3D face model. One simple implementation is first to detect the nose tip, face boundary, mouth corners, etc., and then to deform the shape according to the movement of these key points. For example, the deformation can be done using a B-spline based method which deforms and interpolates the given initial shape (Boor, 1978; Farin, 1993). More advanced physics-based methods can also be explored (Metaxas, 1997; Pentland and Sclaroff, 1991). We are also developing symmetric SFS algorithms for symmetric objects under out-of-plane rotation. The basic idea is to rotate the object back to the frontal view, i.e., to the symmetric configuration, and then apply symmetric SFS. This can be done by first determining the 3D pose of the object and then rotating it back. Several potential issues need to be dealt with before we can apply this in practice: 1) the correspondence problem [x, y] → [x′, y′], and 2) possible reconstruction errors due to occlusions. Again, using a 3D object model may help in solving these problems. In order to facilitate this procedure, we need the following lemma and theorem (the proofs are omitted due to space restrictions):

Lemma 5: Suppose that the partial gradients (p[x,y], q[x,y]) become (p^θ[x′,y′], q^θ[x′,y′]) after the underlying surface is rotated in the x-z plane about the y-axis by θ (anti-clockwise); then they are related by

p^θ[x′,y′] = tan(θ + θ_0)
q^θ[x′,y′] = q[x,y] cos θ_0 / cos(θ + θ_0),    (57)

where tan θ_0 = p[x,y].

Theorem 5: Assume an arbitrary single light source (P_s, Q_s, -1); then the rotated (anti-clockwise in the x-z plane about the y-axis) image I^θ[x′,y′] is related point-wise to the original image

I[x,y] via

I^θ[x′,y′] = 1_{z,θ} I[x,y] (cos θ - p[x,y] sin θ) [tan(θ + θ_0) P_s + (q cos θ_0 / cos(θ + θ_0)) Q_s + 1] / (p P_s + q Q_s + 1),    (58)

and related point-wise to the prototype image I_p via

I^θ[x′,y′] = 1_{z,θ} I_p[x,y] (cos θ - p[x,y] sin θ) [tan(θ + θ_0) P_s + (q cos θ_0 / cos(θ + θ_0)) Q_s + 1] / √(1 + P_s² + Q_s²),    (59)

where tan θ_0 = p[x,y] and 1_{z,θ} is the indicator function marking the possible occlusion determined by the shape and the rotation angle. In Figure 18, we illustrate synthesized images under different rotations and illuminations (with a Lambertian model) based on Theorem 5 and the 3D Mozart head model; a sketch of the photometric part of this synthesis is given below. The original prototype image is the same as that used in Figure 11.
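The following Python sketch evaluates the shading factor of (59) point-wise. The warping of coordinates [x, y] → [x′, y′] and the occlusion indicator 1_{z,θ} are deliberately left out, so this is only the photometric half of the synthesis, under our reading of the reconstructed equation.

import numpy as np

def synthesize_shading(Ip, p, q, Ps, Qs, theta):
    """Point-wise re-lighting of the prototype Ip with gradients (p, q),
    following the reconstructed (59); theta is the rotation about the
    y-axis. Occlusion handling and the forward warp are omitted."""
    theta0 = np.arctan(p)                               # tan(theta0) = p[x,y]
    num = (np.tan(theta + theta0) * Ps
           + (q * np.cos(theta0) / np.cos(theta + theta0)) * Qs + 1.0)
    shading = (np.cos(theta) - p * np.sin(theta)) * num
    return Ip * shading / np.sqrt(1.0 + Ps**2 + Qs**2)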

In summary, we can pursue several major future directions:
- Use more than one face image to obtain better gradient information, i.e., full symmetric SFS.
- Apply simple geometric operations to the generic 3D model to fit individual faces better.
- Extend the Lambertian model to more general models, for example including specular reflections and multiple light sources. Possible ambient lighting effects should also be considered.
- Develop symmetric SFS algorithms that handle arbitrarily varying albedo.
- Develop symmetric SFS algorithms for symmetric objects under a 3D transformation.

Acknowledgments

We would like to thank Dr. QinFen Zheng and Dr. GuoQing Wei for providing their code and for insightful discussions. We thank Dr. Jonathon Phillips for providing us with the FERET images. We also thank Mr. KeXue Liu for useful discussions of the proofs.

Appendix

I. Proof of Lemma 1

Two possible cases need to be discussed:
- The set V_0 is empty, i.e., there is no point in the image satisfying T_2 = 0.
- The set V_0 is not empty.

In the first case, we prove that only one of the sets is possible for all regular image points, either V_- or V_+. We prove this by contradiction. Suppose that both sets V_- and V_+ exist. Pick any two points in the neighboring V_- and V_+ and draw a path L between these two points. Given a

Fig. 18. Rendering images under different rotations and illuminations using the Mozart head. The images are arranged as follows: in the row direction (left to right) the images are rotated by the following angles: 5°, 10°, 25°, and 35°; in the column direction (top to bottom) the images are illuminated under the following conditions: pure texture warping (no illumination imposed); (α = 0°); and (α = 30°, τ = 120°).

small positive number ε, we can always find two points [x1, y1] and [x2, y2] on the path L such that √((x1 - x2)² + (y1 - y2)²) < ε, with [x1, y1] ∈ V_- and [x2, y2] ∈ V_+. Since T_i[x,y] (i = 1, 2) is a continuous function (of the continuous functions r_I, S, and (p, q)), there exist small positive numbers ε_1 and ε_2 such that |T_1[x1, y1] - T_1[x2, y2]| < ε_1 and |T_2[x1, y1] - T_2[x2, y2]| < ε_2. Hence the partial derivative difference satisfies the following inequality:

|q[x1, y1] - q[x2, y2]| = |T_1[x1, y1] - T_2[x1, y1] - T_1[x2, y2] - T_2[x2, y2]|
  ≥ |T_2[x1, y1] + T_2[x2, y2]| - |T_1[x1, y1] - T_1[x2, y2]|
  > 2 min(|T_2[x1, y1]|, |T_2[x2, y2]|) - ε_1.    (60)

[Figure 19 graphic: the x-y-z axes and the rotated x′-y′-z′ axes, the illumination vectors L_1 and L_2 with their intersecting cones, the two candidate normals (p′, q′) and (p′, -q′) at [x0, y0], and the angles α and τ.]

Fig. 19. Rotation of the coordinate system x-y-z to x′-y′-z′ and the geometrical interpretation of symmetric SFS.

Now let ε → 0; then [x1, y1] → [x2, y2] and ε_i → 0. Hence |q[x1, y1] - q[x2, y2]| → 2T_2[x, y] > 0, since V_0 is empty. This contradicts the fact that q[x,y] is a continuous function, i.e., that |q[x1, y1] - q[x2, y2]| → 0. In the second case, we can use a similar argument to prove that the sets V_- and V_+ are separated by the set V_0. That is, for any two points [x1, y1] ∈ V_- and [x2, y2] ∈ V_+, there exists at least one point [x0, y0] on the path L linking these two points such that [x0, y0] ∈ V_0.

II. Proof of Lemma 2

We use a geometrical argument similar to the one used in (Onn and Bruckstein, 1990) to prove this lemma; later we also give a purely algebraic proof. Before going into the details, we want to emphasize the difference between two classes of measurements. The first class has simple point-to-point mappings under coordinate system transformations. These measurements include the depth map z, the albedo ρ, the image intensity map I, the light source direction (α, τ), and hence r_I, S, T_1, T_2 and (a, b, c). The second class is not so simple; it includes the partial derivatives p and q. For example, there is no simple point-to-point mapping between p and p′ (obtained when the coordinate system is rotated). Instead, p′ and q′ are functions of both p and q and the rotation angle. This is easy to understand, since the definitions of the partial derivatives involve the coordinate system:

p′ = ∂z′/∂x′
q′ = ∂z′/∂y′.    (61)

p

(63)

At rst glance, the above equation seems to contract intuition because the R.H.S. has a much simpler form (no parameter Q0s at all) than the L.H.S. But we should realize that p0 encodes both p and q information in general, so essentially there is no change. Recall that for a Lambertian surface the image intensity is the dot product of the surface normal ~n = (p; q; 1) and the negative illumination vector ?L~ = (Ps ; Qs ; 1). So for a given illumination vector and the image intensity at one point [x0; y0], the possible solution for the surface normal is a cone with axis in the illumination direction and apex located at [x0 ; y0] (Figure 19). The cone also has an opening angle determined by arccos( I ). This geometrical interpretation clearly explains that regular SFS cannot resolve this ambiguity locally. However symmetric SFS brings in another constraint (19) to reduce the number of possible solutions to two. Now we can formulate symmetric SFS di erently: Consider it as a photometric stereo problem in either half plane of the whole image plane and assume a special lighting condition: Qs = 0. To see this, let us rewrite the image irradiance equations at points [x0 ; y0] and [?x0 ; y0] as follows: =  p1+p 1++qpPps 1+P s 1 ? pP s I [?x0; y0] =  p1+p +q p1+P : s

I [x0; y0]

2

2

2

2

2

2

(64)

This can be formulated as a photometric problem for a surface with (p; q ) lighted by the original illumination vector L~ 1 = (Ps ; 0; ?1) and the virtual illumination vector L~ 2 = (?Ps ; 0; ?1), or equivalently 1 = 2 and 1 = 0 and 2 = 1800. Since each illumination vector produces a cone, the possible solutions for surface normal ~n are the intersections of the two cones (Figure 19). When there is only one intersection the solution is unique which corresponds to T2 = 0. Since there must be a real solution, the case where the two cones do not intersect cannot occur (T2 < 0). In case there are two intersections (T2 > 0), the two solutions have a special structure, i.e., the 44

two solutions ~n? and ~n+ (corresponding to q? and q+ (26) respectively) are symmetric with respect to the plane de ned by the two illumination vectors L~ 1 and L~ 2 . Hence p? = p+ and q? = ?q+ . To generalize the lighting condition, let us obtain a new coordinate system by rotating the x-y plane about the z -axis by  degrees so that the new x0 -z 0 plane coincides with the illumination vector L~ 1 (Figure 19). In this new x0 -y 0 -z 0 coordinate system, we have the following relations: p0? = p0+ q?0 = ?q+0 ;

(65)

Now suppose that both solutions satisfy the integrability constraint. Applying the surface integrability constraint7 @p0=@y 0 = @q 0=@x0 and the above equation leads to @q?0 @p0? @p0+ @p0+ @q+0 = = ? = ? = ? : @y 0 @x0 @x0 @y 0 @y 0

(66)

Hence if we assume that both solutions satisfy the integrability constraint, the surface must satisfy the following special condition: @p0 @q 0 = = 0: @y 0 @x0

(67)

This immediately suggests that the surface must have a special form: z (x0 ; y 0) = F (x0) + G(y 0);

(68)

where both F and G are continuous functions symmetric with respect to the old x-z plane. The special surface form can be easily expressed in the original coordinate system as follows: z (x cos  + y sin ; ?x sin  + y cos  ) = F (x cos  + y sin  ) + G(?x sin  + y cos  );

(69)

We can arrive at the same conclusion di erently. In the rotated coordinate system x0-y 0 -z 0 , we have Q0s = 0; hence p0? = p0+ = PrIs from (18). In addition, we have a0 = S 0 (S 0 6= 0 for non-shadow points), b0 = 0 and c0 = [S 0(1 + ( PrIs )2) ? (1 + rI0 )2] from (23), immediately suggesting q+0 = ?q?0 . In summary, we arrive at the relation (65) and the reset of the proof is the same as in the above geometrical argument. 0

0

0

0

III. Proof of Lemma 3

Suppose that we have two possible solutions for albedo, t and , and  = ht , where h is a positive constant and t is the true solution. We also denote the recovered partial derivatives by 45

(p(t); q (t)) and (p(); q ()). To prove the lemma, we assume that both values of albedo satisfy (30) or equivalently surface integrability. As in the proof of Lemma 2, we rst need to rotate the coordinate system so that the new x0 -z 0 plane coincides with the plane de ned by the illumination vector L~ and axis z (Figure 19). We already knew that under this new coordinate system x-y 0 -z 0 we have Q0s = 0. This immediately suggests p0 (t) = p0 (): (70) From the de nitions of a, b and c (23), we have a0 = S 0 , b0 = 0 and c0 = [S 0(1+( PrIs )2 ) ? (1+ rI0 )2]. Hence we have the following equation from (25) without disambiguating the plus/minus sign: 0

0

r

?4S (t )[S (t )(1+( PrIs )2 )?(1+rI )2 ]

q 0 (t) = q 0 () =

0

r

0



0

0

0

0

2S (t) ?4S ()[S ()(1+( PrIs )2 )?(1+rI )2 ] : 2S ()

(71)

0

0

0

0

0

0

Without referring to the speci c albedo value, we have a common expression for the partial derivatives of q 0 (t) and q 0 (): r

@q @x

0

r0

(1+r 0 )2

@ ?(1+( PIs )2 )+ S I @x rI @rI 2(1+rI ) @rI (1+rI )2 @S ?2 (P )2 @x + S @x ? (S )2 @x

=

0

0

0

0



=

0

0

0

0

s

r

0

0

0

0

0

r0

(72)

0

0

0

2 ?(1+( PIs )2 )+ (1+SrI ) 0

2

0

0

Recall that one of the properties of the symmetric object is p[x; y ] = ?p[?x; y ] (14). This implies that on the middle line (x = 0) p[0; y ] = 0; hence rI [0; y ] = 0. Evaluating the partial derivative (72) on this middle line leads to the following simple expression: @r

1 @S 2 I @q 0 S @xq? (S )2 @x jx=0: j =  @x0 x=0 2 1 ?1 0

0

0

0

S

(73)

0

0

0

Using the integrability constraint or (30) and (70), we have q 0 (t) q 0 () = : @x0

(74)

@x0

Now substituting (73) into this equality and using the fact that S 0 () = h1 S 0(t) (S 0 > 0 for non-shadow points), we have 2

2 @rI 1 @S 1 @S 2 @rI S @xq? (S )2 @x S @xq? (S )2 @x jx=0 =  1 1 1 jx=0: 2 S1 ? 1 2 h S ? h2 0

0

0

0

0

0

0

0

0

0

0

0

0

46

0

(75)

Since the numerators of both sides in (75) are the same, we usually have h2

r

r

1 ? 1 =  h2 1 ? 1; S0 S0

(76)

except if the following special condition occurs: 0 0 2 @rI0 jx=0 = 10 @S0 jx=0 :

@x

S @x

(77)

From (76), rst it is obvious that the sign cannot be negative, and second we can obtain a quadratic equation for h2: (78) ( S10 ? 1)(h2)2 ? S10 (h2 ) + 1 = 0:

This equation has two solutions: h2 = 1 and h2 = 1?1 . Since h is a constant, the second S solution suggests that S 0, hence I or equivalently I can only be a constant along the middle line (x = 0): I j = constant: (79)  x=0 1

0

0

0

We know that p = 0 along the middle line, leading to

1 + qQs j = constant: (80) 1 + q 2 x=0 Using the fact that q is a continuous function, the above quadratic equation can only have one solution: q jx=0 = constant: (81) p

So excluding special conditions (77,81), we have proved that h = 1 and hence  = t ; :

(82)

One simple example of special condition (77) is when the underlying surface is planar. In such @S I a case, @r @x = @x = 0. For special condition (81), we have the example z = yF (x). 0

0

0

0

Note

1. Simple scaling of illumination intensity can be easily handled, and hence is not considered a change in illumination.
2. We also modified these algorithms, imposing additional constraints such as shape symmetry, but this did not seem to improve the results much.
3. Internet Address: http://www.wisdom.weizmann.ac.il/yael/.

4. Internet Address: http://cvc.yale.edu/projects/yalefaces/yalefaces.html.
5. Since one of the possible solutions is the true solution, b² - 4ac must be greater than or equal to zero.
6. In order to use Lemma 3 (Theorem 2) to prove the uniqueness of the albedo value, we need to make use of the surface property along the middle line (x = 0). Lemma 3 does not guarantee unique recovery of the albedo value in a particular region.
7. Surface integrability does not change under rotation of the coordinate system; it is an intrinsic second-order property of the surface.

References

[1] Adini, Y., Moses, Y., and Ullman, S. 1997. Face Recognition: The Problem of Compensating for Changes in Illumination Direction. IEEE Trans. on PAMI, Vol. 19, pp. 721-732.
[2] Atick, J., Griffin, P., and Redlich, N. 1996. Statistical Approach to Shape from Shading: Reconstruction of Three-Dimensional Face Surfaces from Single Two-Dimensional Images. Neural Computation, Vol. 8, pp. 1321-1340.
[3] Bathe, K. 1982. Finite Element Procedures in Engineering Analysis, Prentice-Hall: Englewood Cliffs, NJ.
[4] Belhumeur, P.N., Hespanha, J.P., and Kriegman, D.J. 1997. Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection. IEEE Trans. on PAMI, Vol. 19, pp. 711-720.
[5] Belhumeur, P.N. and Kriegman, D.J. 1997. What is the Set of Images of an Object Under All Possible Lighting Conditions? In Proc. Conference on Computer Vision and Pattern Recognition, San Juan, PR, pp. 52-58.
[6] Bertero, M., Poggio, T., and Torre, V. 1987. Ill-posed Problems in Early Vision. MIT Artificial Intelligence Laboratory, Technical Report 924.
[7] Black, M. and Jepson, A. 1996. Eigentracking: Robust Matching and Tracking of Articulated Objects Using a View-Based Representation. In Proc. European Conference on Computer Vision, Springer, Berlin, pp. 329-342.
[8] Boor, C.D. 1978. A Practical Guide to Splines, Springer: Berlin.
[9] Bruss, A.R. 1982. The Eikonal Equation: Some Results Applicable to Computer Vision. Journal of Mathematical Physics, Vol. 23, pp. 890-896.
[10] Chellappa, R., Wilson, C.L., and Sirohey, S. 1995. Human and Machine Recognition of Faces: A Survey. Proc. of the IEEE, Vol. 83, pp. 705-740.
[11] Dupuis, P. and Oliensis, J. 1992. Direct Method for Reconstructing Shape from Shading. In Proc. Conference on Computer Vision and Pattern Recognition, Urbana/Champaign, IL, pp. 453-458.
[12] Etemad, K. and Chellappa, R. 1997. Discriminant Analysis for Recognition of Human Face Images. Journal of the Optical Society of America A, Vol. 14, pp. 1724-1733.
[13] Farin, G. 1993. Curves and Surfaces for Computer Aided Geometric Design: A Practical Guide, Academic Press: San Diego, CA.
[14] Ferrie, F.P. and Levine, M.D. 1989. Where and When Local Shading Works. IEEE Trans. on PAMI, Vol. 11, pp. 198-206.


[15] Georghiades, A.S., Kriegman, D.J., and Belhumeur, P.N. 1998. Illumination Cones for Recognition Under Variable Lighting: Faces. In Proc. Conference on Computer Vision and Pattern Recognition, Santa Barbara, CA, pp. 52-58.
[16] Hallinan, P. 1994. A Low-Dimensional Representation of Human Faces for Arbitrary Lighting Conditions. In Proc. Conference on Computer Vision and Pattern Recognition, Seattle, WA, pp. 995-999.
[17] Hampel, F.R. et al. 1986. Robust Statistics: The Approach Based on Influence Functions, John Wiley & Sons: New York.
[18] Horn, B.K.P. and Brooks, M.J. 1989. Shape from Shading, MIT Press: Cambridge, MA.
[19] Horn, B.K.P. 1990. Height and Gradient from Shading. Int. Journal of Computer Vision, Vol. 5, pp. 37-75.
[20] Ikeuchi, K. and Horn, B.K.P. 1981. Numerical Shape from Shading and Occluding Boundaries. Artificial Intelligence, Vol. 17, pp. 141-184.
[21] Jacobs, D.W., Belhumeur, P.N., and Basri, R. 1998. Comparing Images under Variable Illumination. In Proc. Conference on Computer Vision and Pattern Recognition, Santa Barbara, CA, pp. 610-617.
[22] Lee, C.H. and Rosenfeld, A. 1989. Improved Methods of Estimating Shape from Shading Using the Light Source Coordinate System. In Shape from Shading, Eds. B.K.P. Horn and M.J. Brooks, MIT Press: Cambridge, MA, pp. 323-569.
[23] Manjunath, B.S., Chellappa, R., and Malsburg, C.V.D. 1992. A Feature Based Approach to Face Recognition. In Proc. Conference on Computer Vision and Pattern Recognition, Urbana/Champaign, IL, pp. 373-378.
[24] Marroquin, J.L. 1985. Probabilistic Solution of Inverse Problems. MIT Artificial Intelligence Laboratory, Technical Report 860.
[25] Metaxas, D.N. 1997. Physics-Based Deformable Models: Applications to Computer Vision, Graphics, and Medical Imaging, Kluwer Academic: Boston, MA.
[26] Moghaddam, B. and Pentland, A. 1997. Probabilistic Visual Learning for Object Representation. IEEE Trans. on PAMI, Vol. 19, pp. 696-710.
[27] Nayar, S.K., Ikeuchi, K., and Kanade, T. 1991. Surface Reflection: Physical and Geometrical Perspectives. IEEE Trans. on PAMI, Vol. 13, pp. 611-634.
[28] Nayar, S.K. and Murase, H. 1994. Dimensionality of Illumination Manifold in Eigenspace. Department of Computer Science, Columbia University, Technical Report CUCS-021-94.
[29] Oliensis, J. 1991. Uniqueness in Shape from Shading. Int. Journal of Computer Vision, Vol. 6, pp. 75-104.
[30] Onn, R. and Bruckstein, A. 1990. Integrability Disambiguates Surface Recovery in Two-Image Photometric Stereo. Int. Journal of Computer Vision, Vol. 5, pp. 105-113.
[31] Pentland, A.P. 1982. Finding the Illumination Directions. Journal of the Optical Society of America A, Vol. 72, pp. 448-455.
[32] Pentland, A. and Sclaroff, S. 1991. Closed-form Solutions for Physically Based Shape Modeling and Recognition. IEEE Trans. on PAMI, Vol. 14, pp. 715-729.
[33] Phillips, P.J., Moon, H., Rauss, R., and Rizvi, S.A. 1997. The FERET Evaluation Methodology for Face-Recognition Algorithms. In Proc. Conference on Computer Vision and Pattern Recognition, San Juan, PR, pp. 137-143.


[34] Phillips, P.J., Moon, H., Rizvi, S., and Rauss, P. 1998. The FERET Evaluation. In Face Recognition: From Theory to Applications, Eds. Wechsler, H., Phillips, P.J., Bruce, V., Soulie, F.F., and Huang, T.S. Springer: Berlin, pp. 244-261.
[35] Saxberg, B.V.H. 1989. A Modern Differential Geometric Approach to Shape from Shading. MIT Artificial Intelligence Laboratory, Technical Report 1117.
[36] Shashua, A. 1997. On Photometric Issues in 3D Visual Recognition from a Single 2D Image. Int. Journal of Computer Vision, Vol. 21, pp. 99-122.
[37] Sirovich, L. and Kirby, M. 1987. Low-dimensional Procedure for the Characterization of Human Faces. Journal of the Optical Society of America A, Vol. 4, pp. 519-524.
[38] Swets, D.L. and Weng, J. 1996. Using Discriminant Eigenfeatures for Image Retrieval. IEEE Trans. on PAMI, Vol. 18, pp. 831-836.
[39] Tsai, P.S. and Shah, M. 1992. A Fast Linear Shape from Shading. In Proc. Conference on Computer Vision and Pattern Recognition, Urbana/Champaign, IL, pp. 459-465.
[40] Turk, M. and Pentland, A. 1991. Eigenfaces for Recognition. Journal of Cognitive Neuroscience, Vol. 3, pp. 72-86.
[41] Wechsler, H., Phillips, P.J., Bruce, V., Soulie, F.F., and Huang, T.S. (Eds.) 1998. Face Recognition: From Theory to Applications. Springer: Berlin.
[42] Wei, G.Q. and Hirzinger, G. 1997. Parametric Shape-from-Shading by Radial Basis Functions. IEEE Trans. on PAMI, Vol. 19, pp. 353-365.
[43] Wiskott, L., Fellous, J.-M., and Malsburg, C.V.D. 1997. Face Recognition by Elastic Bunch Graph Matching. IEEE Trans. on PAMI, Vol. 19, pp. 775-779.
[44] Wolff, L.B. and Angelopoulou, E. 1994. 3-D Stereo Using Photometric Ratios. In Proc. European Conference on Computer Vision, Springer, Berlin, pp. 247-258.
[45] Woodham, R. 1980. Photometric Method for Determining Surface Orientation from Multiple Images. Optical Engineering, Vol. 19, pp. 139-144.
[46] Zhao, W., Chellappa, R., and Krishnaswamy, A. 1998a. Discriminant Analysis of Principal Components for Face Recognition. In Proc. 3rd Conference on Automatic Face and Gesture Recognition, Nara, Japan, pp. 336-341.
[47] Zhao, W., Krishnaswamy, A., Chellappa, R., Swets, D.L., and Weng, J. 1998b. Discriminant Analysis of Principal Components for Face Recognition. In Face Recognition: From Theory to Applications, Eds. Wechsler, H., Phillips, P.J., Bruce, V., Soulie, F.F., and Huang, T.S. Springer-Verlag: Berlin, pp. 73-85.
[48] Zhao, W. 1999. Improving the Robustness of Face Recognition. In Proc. 2nd International Conference on Audio- and Video-based Person Authentication, Washington, DC, pp. 78-83.
[49] Zhao, W., Chellappa, R., and Phillips, P.J. 1999. Subspace Linear Discriminant Analysis for Face Recognition. Center for Automation Research, University of Maryland, College Park, Technical Report CAR-TR-914.
[50] Zheng, Q. and Chellappa, R. 1991. Estimation of Illumination Direction, Albedo, and Shape from Shading. IEEE Trans. on PAMI, Vol. 13, pp. 680-702.
