The Online Gait Measurement for Characteristic Gait Animation Synthesis

Yasushi Makihara1, Mayu Okumura1, Yasushi Yagi1, and Shigeo Morishima2

1 The Institute of Scientific and Industrial Research, Osaka University, 8-1 Mihogaoka, Ibaraki, Osaka 567-0047, Japan
2 Department of Science and Engineering, Waseda University, 3-4-1 Okubo, Shinjuku-ku, Tokyo 169-8555, Japan
{makihara,okumura,yagi}@am.sanken.osaka-u.ac.jp, [email protected]

Abstract. This paper presents a method to measure gait features online from gait silhouette images and to synthesize characteristic gait animation for audience-participant digital entertainment. First, both static and dynamic gait features are extracted from the silhouette images captured by an online gait measurement system. Then, key motion data for various gaits are captured, and a new motion is synthesized by blending the key motion data. Finally, the blend ratios of the key motion data are estimated so as to minimize the gait feature errors between the blended model and the online measurement. In experiments, the effectiveness of the gait feature extraction was confirmed using 100 subjects from the OU-ISIR Gait Database, and characteristic gait animations were created based on the measured gait features.

1 Introduction

Recently, audience-participant digital entertainment, in which the individual features of participants or users are reflected in computer games and Computer Graphics (CG) cinemas, has gained increasing attention. At EXPO 2005 AICHI JAPAN [1], the Future Cast System (FCS) [2] in the Mitsui-Toshiba pavilion [3] was operated as one of the large-scale audience-participant digital entertainments. The system captures an audience member's facial shape and texture online, and a CG character's face in the digital cinema is replaced by the captured face. In addition, as an evolutionary version of the FCS, the Dive Into the Movie (DIM) project [4] tries to reflect not only the individual features of face shape and texture but also those of voice, facial expression, facial skin, body type, and gait. Among these, body type and gait are expected to attract more attention from the audience because they are reflected in the whole body of the CG character. For the purpose of gait measurement, acceleration sensors [5][6][7][8] and Motion Capture (MoCap) systems [9] have been widely used. These systems are, however, unsuitable for an online gait measurement system because it takes much time for the audience to wear the acceleration sensors or to attach the MoCap markers.


In the computer vision-based gait analysis area, both model-based approaches [10][11] and appearance-based approaches [12][13] have been proposed, which can measure gait features without any wearable sensors or attached markers. In the model-based methods, a human body is expressed as articulated links or generic cylinders and is fit to the captured image to obtain static features such as link lengths and dynamic features such as joint angles separately. Although these features can be used for gait feature measurement, the methods are unsuitable for online measurement because of the high computational cost and the difficulty of model fitting. The appearance-based methods extract gait features directly from the captured images without troublesome model fitting. In general, however, the extracted features are composites of both static and dynamic components. Although such composite features are still useful for gait-based person identification [10][11][12][13], they are unsuitable for separate measurement of the static and dynamic components. Therefore, we propose a method for online measurement of intuitive static and dynamic gait features from silhouette sequences, together with a method for characteristic gait animation synthesis in the digital cinema. Side-view and front-view cameras capture image sequences of the target subject's straight walk, and a Gait Silhouette Volume (GSV) is constructed via silhouette extraction, silhouette size normalization, and registration. Then, the static and dynamic components are measured separately from the GSV. Because the proposed method extracts the gait features directly from the silhouette sequence without model fitting, its computational cost is much lower than that of the model-based methods, which enables online measurement of the audience's gait.

2 Gait Feature Measurement

2.1 GSV Construction

The first step in gait feature measurement is the construction of a spatio-temporal Gait Silhouette Volume (GSV). First, gait silhouettes are extracted by background subtraction, and a silhouette image is defined as a binary image whose pixel value is 1 inside the silhouette and 0 otherwise. Second, the height and center of the silhouette region are computed for each frame. Third, the silhouette is scaled so that its height matches a pre-determined size, whilst maintaining the aspect ratio. In this paper, the size is set to a height of hg = 60 pixels and a width of wg = 40 pixels. Fourth, each silhouette is registered such that its center corresponds to the image center. Finally, a spatio-temporal GSV is produced by stacking the silhouettes along the temporal axis. Let f(x, y, n) be the silhouette value of the GSV at position (x, y) of the n-th frame.
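A minimal sketch of this GSV construction, assuming the silhouettes are already available as binary NumPy arrays; the background-subtraction step, function names, and rescaling details are illustrative, not taken from the paper:

```python
import numpy as np
from scipy.ndimage import zoom

H_G, W_G = 60, 40  # normalized silhouette size used in the paper


def normalize_silhouette(sil):
    """Scale a binary silhouette to height H_G keeping the aspect ratio,
    then register it so its centroid sits at the canvas center."""
    ys, xs = np.nonzero(sil)
    height = ys.max() - ys.min() + 1
    scale = H_G / height
    scaled = zoom(sil.astype(float), scale, order=1) > 0.5
    canvas = np.zeros((H_G, W_G), dtype=np.uint8)
    ys2, xs2 = np.nonzero(scaled)
    cy, cx = int(ys2.mean()), int(xs2.mean())
    for y, x in zip(ys2, xs2):
        yy, xx = y - cy + H_G // 2, x - cx + W_G // 2
        if 0 <= yy < H_G and 0 <= xx < W_G:
            canvas[yy, xx] = 1
    return canvas


def build_gsv(silhouettes):
    """Stack normalized silhouettes along the temporal axis: f(x, y, n)."""
    return np.stack([normalize_silhouette(s) for s in silhouettes], axis=-1)
```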

2.2 Estimation of Key Gait Phases

The next step is the estimation of two types of key gait phases: the Single Support Phase (SSP) and the Double Support Phase (DSP), where the subject's legs and arms are the closest together and the most spread apart, respectively. The SSP and DSP are estimated as the local minimum and maximum of the second-order moment around the central vertical axis of the GSV within each half gait cycle. The second-order moment at the n-th frame within the vertical range [ht, hb] is defined as

$$M(n) = \sum_{y=h_t}^{h_b} \sum_{x} (x - x_c)^2 f(x, y, n), \qquad (1)$$

where xc is the horizontal center of the GSV. Because the SSP and DSP occur alternately once per half gait cycle, those at the i-th half gait cycle are obtained as follows:

$$n^i_{SSP} = \mathop{\arg\min}_{n^{i-1}_{DSP} \le n < n^{i-1}_{DSP} + g_p/2} M(n), \qquad n^i_{DSP} = \mathop{\arg\max}_{n^{i}_{SSP} \le n < n^{i}_{SSP} + g_p/2} M(n), \qquad (2)$$

where $n^0_{SSP}$ and $n^0_{DSP}$ are initialized to zero, and the gait period gp is detected by maximizing the normalized autocorrelation of the GSV along the temporal axis [14]. In the following sections, NSSP and NDSP denote the numbers of SSPs and DSPs, respectively. Note that the SSP and DSP focused on the arms, the legs, and the entire body are computed by defining the vertical range [ht, hb] appropriately. In this paper, the ranges of the arms and the legs are defined as [0.33hg, 0.55hg] and [0.55hg, hg], respectively. Figure 2 shows the result of the estimation of the SSP and DSP.
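The phase estimation can be sketched as below, assuming a GSV array `gsv[y, x, n]` produced by the previous snippet; the autocorrelation-based period search and the half-cycle bookkeeping are simplified for illustration:

```python
import numpy as np


def second_order_moment(gsv, h_t, h_b):
    """M(n): second-order moment around the central vertical axis, Eq. (1)."""
    h, w, _ = gsv.shape
    x_c = (w - 1) / 2.0
    weights = (np.arange(w) - x_c) ** 2            # (x - x_c)^2
    band = gsv[h_t:h_b + 1]                         # restrict to [h_t, h_b]
    return np.einsum('yxn,x->n', band, weights)     # sum over y and x per frame


def gait_period(gsv, min_p=20, max_p=40):
    """Detect the gait period by maximizing a normalized autocorrelation."""
    sig = gsv.reshape(-1, gsv.shape[-1]).astype(float)
    best_p, best_c = min_p, -np.inf
    for p in range(min_p, max_p + 1):
        a, b = sig[:, :-p], sig[:, p:]
        c = np.sum(a * b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9)
        if c > best_c:
            best_p, best_c = p, c
    return best_p


def key_phases(moment, g_p):
    """Alternate local minima (SSP) and maxima (DSP) per half gait cycle, Eq. (2),
    here simplified to fixed consecutive half-cycle windows."""
    half = g_p // 2
    ssp, dsp, start = [], [], 0
    while start + half <= len(moment):
        seg = moment[start:start + half]
        ssp.append(start + int(np.argmin(seg)))
        dsp.append(start + int(np.argmax(seg)))
        start += half
    return ssp, dsp
```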

Fig. 1. Example of GSV

Fig. 2. SSP (left) and DSP (right)

Fig. 3. Measurement of the waist size

2.3 Measurement of the Static Feature

The static features comprise the height and the waist size in terms of width and bulge. When the static features are measured, the GSV at an SSP is used to reduce the influence of the arm swing and the leg bend. First, the numbers of silhouette pixels at height y of the side-view and front-view GSVs are defined as the width w(y) and the bulge b(y), respectively, as shown in Fig. 3. Then, their averages WGSV and BGSV within the waist ranges of heights [yWt, yWb] and [yBt, yBb] are calculated, respectively.


$$w(y) = \sum_{x} f_{side}(x, y, n_{SSP}), \qquad b(y) = \sum_{x} f_{front}(x, y, n_{SSP}) \qquad (3)$$

$$W_{GSV} = \frac{1}{y_{Wb} - y_{Wt} + 1} \sum_{y=y_{Wt}}^{y_{Wb}} w(y) \qquad (4)$$

$$B_{GSV} = \frac{1}{y_{Bb} - y_{Bt} + 1} \sum_{y=y_{Bt}}^{y_{Bb}} b(y) \qquad (5)$$

Then, given the height Hp on the original-size image, the waist size in width Wp and in bulge Bp on the image are computed as

$$W_p = \frac{H_p}{h_g} W_{GSV}, \qquad B_p = \frac{H_p}{h_g} B_{GSV}. \qquad (6)$$

Finally, given the distance l from the camera to the subject and the focal length f of the camera, the real height H and the waist size in width W and in bulge B are

$$H = \frac{l}{f} H_p, \qquad W = \frac{l}{f} W_p, \qquad B = \frac{l}{f} B_p. \qquad (7)$$

For statistical analysis of the static features, 100 subjects in the OU-ISIR Gait Database [15] were chosen at random, and front- and left side-view cameras were used for the measurement. Figure 4(a) shows the relation between the measured height and the height range obtained from a questionnaire. The measured heights mostly lie within the questionnaire range. The measured heights of some short subjects (children) are, however, below the lower bound of the questionnaire result. These errors may result from self-report errors due to the rapid growth of the children.
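A sketch of the static-feature computation under the same assumptions as the earlier snippets; the waist bands, camera distance, focal length, and pixel pitch below are placeholder values, not the paper's:

```python
import numpy as np

H_G = 60  # normalized silhouette height


def waist_features(side_sil, front_sil, w_range=(25, 35), b_range=(25, 35)):
    """Average waist width W_GSV and bulge B_GSV from the SSP silhouettes,
    following Eqs. (3)-(5); w_range/b_range are illustrative waist bands."""
    w_y = side_sil.sum(axis=1)             # w(y): pixels per row, side view
    b_y = front_sil.sum(axis=1)            # b(y): pixels per row, front view
    W_gsv = w_y[w_range[0]:w_range[1] + 1].mean()
    B_gsv = b_y[b_range[0]:b_range[1] + 1].mean()
    return W_gsv, B_gsv


def to_real_size(H_p, W_gsv, B_gsv, l=4.0, f=0.008, pixel_pitch=1e-5):
    """Scale back to the original image (Eq. (6)) and to metric size (Eq. (7)).
    l, f, and pixel_pitch are hypothetical camera parameters."""
    W_p = H_p / H_G * W_gsv                # waist width in original-image pixels
    B_p = H_p / H_G * B_gsv                # waist bulge in original-image pixels
    scale = l / f * pixel_pitch            # pixels -> meters under a pinhole model
    return H_p * scale, W_p * scale, B_p * scale
```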

Fig. 4. The measurements and questionnaire results


Figure 4(b) shows the relation between the measured waist size in bulge and the weight reported in the questionnaire, and it indicates that the waist size correlates with the weight to some extent. For example, the waist sizes of the light subject A and the heavy subject B are small and large, respectively, as shown in Fig. 4(b). As an exceptional case, the waist size of the light subject C is erroneously measured as large because the subject wears a down jacket.

Fig. 5. Arm swing areas. In (a), front and back lines are depicted as red and blue lines, respectively. In (b), front and back arm swing areas are painted in red and blue, respectively.

2.4 Measurement of the Dynamic Feature

Step: Side-view silhouettes are used for step estimation. First, the walking speed v is computed from the distance between the silhouette positions at the first and last frames and the elapsed time. Then, the average step length is computed by multiplying the walking speed v by the half gait cycle gp/2.
Arm swing: The side-view GSV from an SSP to the next DSP is used for the arm swing measurement. First, the body front and back boundary lines (denoted lf and lb, respectively) are extracted from the gait silhouette image at the SSP, and then the front and back arm swing candidate areas RAf and RAb are set, respectively, as shown in Fig. 5(a). Next, the silhouette sweep image Fi(x, y) is calculated for the i-th interval from the SSP to the next DSP as

$$F_i(x, y) = \mathrm{sign}\left( \sum_{n = n^i_{SSP}}^{n^i_{DSP}} f(x, y, n) \right) \qquad (8)$$

$$\mathrm{sign}(a) = \begin{cases} 1 & (a > 0) \\ 0 & (a \le 0) \end{cases} \qquad (9)$$

where sign(·) is the sign function. Finally, the front and back arm swing areas are computed as the areas of the swept pixels of Fi(x, y) in RAf and RAb, respectively:

$$A^i_{f} = \sum_{(x, y) \in R_{Af}} F_i(x, y) \qquad (10)$$

$$A^i_{b} = \sum_{(x, y) \in R_{Ab}} F_i(x, y) \qquad (11)$$
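The step and arm-swing measurements can be sketched as follows, assuming the GSV and key-phase indices from the previous snippets; the boundary-line extraction is omitted and the candidate regions RAf and RAb are represented by precomputed boolean masks, which is a simplification for illustration:

```python
import numpy as np


def step_length(first_pos_m, last_pos_m, elapsed_s, g_p, fps=30.0):
    """Average step length = walking speed v times half the gait cycle g_p/2."""
    v = abs(last_pos_m - first_pos_m) / elapsed_s   # walking speed [m/s]
    return v * (g_p / 2.0) / fps                    # half gait cycle in seconds


def arm_swing_areas(gsv, n_ssp, n_dsp, region_front, region_back):
    """Front/back arm swing areas from the silhouette sweep image, Eqs. (8)-(11).
    region_front/region_back are boolean masks standing in for R_Af and R_Ab."""
    sweep = gsv[:, :, n_ssp:n_dsp + 1].sum(axis=2)  # accumulate silhouettes over the interval
    F = (sweep > 0).astype(np.uint8)                # sign(.) of the accumulation
    area_front = int(F[region_front].sum())
    area_back = int(F[region_back].sum())
    return area_front, area_back
```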


For statistical analysis of the arm swing, the same 100 subjects as in the static feature analysis are used. Figure 6 shows the distribution of the measured front arm swing areas. The measured arm swings are widely distributed, which means they are useful cues for synthesizing characteristic gait animation. For example, the arm swings of subject A (small), B (middle), and C (large) can be confirmed in the corresponding gait silhouette images at the DSP on the graph. In addition, although the asymmetry of the arm swing is generally not outstanding, some subjects, such as subject D, exhibit such asymmetry.
Stoop: We propose two methods of stoop measurement: a slope-based and a curvature-based method. In both methods, a side-view gait silhouette image at an SSP is used to reduce the influence of the arm swing, and the back contour is extracted from the image. Then, the slope of the back is computed by fitting the line lb to the back contour, and the curvature is obtained as the maximum k-curvature of the back contour (k is set to 8 empirically). For statistical analysis of the stoop, the same 100 subjects are used. Figure 7 shows the distribution of the measured stoop. By measuring both the slope and the curvature of the back contour, various kinds of stoop can be distinguished, such as large slope and small curvature (e.g., subject A), large slope and large curvature (e.g., subject B), and small slope and large curvature (e.g., subject C).
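A sketch of the two stoop measures under the same assumptions; the back-contour extraction is reduced to taking the rear-most silhouette pixel per row, which is a stand-in for the paper's contour extraction, and k = 8 follows the paper:

```python
import numpy as np


def back_contour(side_sil, walking_right=True):
    """Rear-most silhouette pixel per row of the side-view SSP silhouette
    (a simplified stand-in for the paper's back-contour extraction)."""
    pts = []
    for y in range(side_sil.shape[0]):
        xs = np.nonzero(side_sil[y])[0]
        if xs.size:
            pts.append((xs.min() if walking_right else xs.max(), y))
    return np.array(pts, dtype=float)


def stoop_slope(contour):
    """Slope of the back line l_b obtained by least-squares line fitting."""
    x, y = contour[:, 0], contour[:, 1]
    slope, _ = np.polyfit(y, x, 1)      # x as a linear function of height y
    return slope


def stoop_curvature(contour, k=8):
    """Maximum k-curvature along the back contour (angle between k-spaced chords)."""
    best = 0.0
    for i in range(k, len(contour) - k):
        v1 = contour[i] - contour[i - k]
        v2 = contour[i + k] - contour[i]
        cosang = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2) + 1e-9)
        best = max(best, float(np.arccos(np.clip(cosang, -1.0, 1.0))))
    return best
```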

Fig. 6. Distribution of front arm swing areas and its asymmetry

Fig. 7. Distribution of stoop with slope and curvature

3 Gait Animation Synthesis

3.1 Motion Blending

Basically, a new gait animation is synthesized by blending a small number of motion data, called key motions, in the same way as [16]. First, a subject was asked to walk on a treadmill at a speed of 4 km/h, and his/her 3D motion data were captured with the motion capture system "Vicon" [9]. The motion data are composed of n walking styles with variations in terms of step width, arm swing, and stoop. Second, a two-step sequence is clipped from each whole sequence to produce the key motions M = {mi}. Third, because all the motion data M need to be synchronized before blending, synchronized motions S = {si} are generated by time warping [16]. Finally, a blended motion is synthesized as


$$s(\alpha) = \sum_{i=1}^{n} \alpha_i s_i, \qquad (12)$$

where αi is the blend ratio for the i-th key motion data, and the set of blend ratios is denoted by α = {αi}. The remaining issue is how to estimate the blend ratios from the appearance-based gait features measured in the proposed framework; this is described in the following section.
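A minimal sketch of the blending step of Eq. (12), assuming each synchronized key motion si is stored as an array of joint parameters per frame; the array layout and the linear blending of joint parameters are assumptions, not the paper's data format:

```python
import numpy as np


def blend_motions(key_motions, alpha):
    """Blended motion s(alpha) = sum_i alpha_i * s_i, Eq. (12).
    key_motions: list of arrays of identical shape (frames x joint parameters),
    already time-warped into synchrony; alpha: blend ratios summing to one.
    Linear blending is a simplification suitable for Euler-style parameters."""
    alpha = np.asarray(alpha, dtype=float)
    stacked = np.stack(key_motions, axis=0)        # (n, frames, params)
    return np.tensordot(alpha, stacked, axes=1)    # weighted sum over key motions
```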

3.2 Blend Ratio Estimation

First, a texture-less CG image sequence is rendered for each key motion as shown in Fig. 8. Second, an m-dimensional appearance-based gait feature vector vi is measured for the i-th key motion si in the same way as described before. The gait feature vector of the blended motion is then assumed to be approximated by a weighted linear sum of those of the key motions:

$$v(\alpha) \approx \sum_{i=1}^{n} \alpha_i v_i = V\alpha, \qquad V = [v_1, \ldots, v_n]. \qquad (13)$$

The blend ratio α is then estimated so as to minimize the error between the gait features of the blended model and the online measured gait features v (called the input feature vector hereafter). The minimization problem is formulated as the following convex quadratic programming:

$$\min_{\alpha} \| V\alpha - v \|^2 \quad \mathrm{s.t.} \quad \sum_{i=1}^{n} \alpha_i = 1, \quad \alpha_i \ge 0 \ (i = 1, \ldots, n) \qquad (14)$$

The above minimization problem is solved with the active set method. However, when the number of key motions n is larger than the dimension of the gait features m, the solution to Eq. (14) is indeterminate. Moreover, Eq. (13) is only an approximation because the mapping from the motion data domain to the appearance-based gait feature domain is generally nonlinear. It is therefore desirable to choose key motions whose gait features are near the input feature. Thus, another cost function is defined as the inner product of the blend ratio and a cost weight vector w composed of the Euclidean distances from the features of each key motion to the input feature, w = [||v1 − v||, ..., ||vn − v||]T. Given one solution of Eq. (14) as αCQP and the resultant blended feature vα = V αCQP, the minimization problem can be rewritten as the following linear programming:

$$\min_{\alpha} w^{T}\alpha \quad \mathrm{s.t.} \quad V\alpha = v_{\alpha}, \quad \sum_{i=1}^{n} \alpha_i = 1, \quad \alpha_i \ge 0 \ (i = 1, \ldots, n) \qquad (15)$$

Finally, this linear programming is solved with the simplex method.
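The two-stage estimation can be sketched with SciPy as below, assuming the gait-feature matrix V (columns vi) and the input feature v are already z-normalized; the SLSQP and HiGHS solvers stand in for the paper's active set and simplex methods:

```python
import numpy as np
from scipy.optimize import minimize, linprog


def estimate_blend_ratios(V, v):
    """Stage 1: convex QP of Eq. (14); Stage 2: LP of Eq. (15) that prefers
    key motions whose features lie near the input feature v."""
    m, n = V.shape

    # --- Stage 1: min ||V a - v||^2  s.t.  sum(a) = 1, a >= 0 ----------------
    def cost(a):
        r = V @ a - v
        return r @ r

    cons = [{'type': 'eq', 'fun': lambda a: a.sum() - 1.0}]
    bounds = [(0.0, 1.0)] * n
    a0 = np.full(n, 1.0 / n)
    res = minimize(cost, a0, method='SLSQP', bounds=bounds, constraints=cons)
    a_cqp = res.x
    v_alpha = V @ a_cqp                               # blended feature fixed for stage 2

    # --- Stage 2: min w^T a  s.t.  V a = v_alpha, sum(a) = 1, a >= 0 ---------
    w = np.linalg.norm(V - v[:, None], axis=0)        # ||v_i - v|| per key motion
    A_eq = np.vstack([V, np.ones((1, n))])
    b_eq = np.concatenate([v_alpha, [1.0]])
    lp = linprog(w, A_eq=A_eq, b_eq=b_eq, bounds=[(0.0, None)] * n, method='highs')
    return lp.x if lp.success else a_cqp              # fall back to the QP solution
```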

3.3 Experimental Results

In this experiment, three gait features, namely the arm swing, the step, and the stoop, are reflected in the CG characters, and seven key motions and three test subjects are used as shown in Table 1. Note that the gait features are z-normalized so as to adjust the scales among them. The resultant blend ratios are shown in Table 2. For example, subject A, who has a large arm swing, gains the largest blend ratio for Arm swing L, and this results in a synthesized silhouette of the blended model with a large arm swing (Fig. 8). The gait features of subject B (large step and small arm swing) and subject C (large stoop and large arm swing) are also successfully reflected in terms of both the blend ratios and the synthesized silhouettes, as shown in Table 2 and Fig. 8. Moreover, the gait animation synthesis in conjunction with texture mapping was realized in an audience-participant digital movie as shown in Fig. 9, and it plays a certain role in identifying the audience in the digital movie.

Table 1. Gait features for key motions and inputs

Key motion      Arm swing   Step    Stoop
Arm swing L       3.50       0.51   -0.55
Arm swing S      -1.00      -1.44   -0.63
Step L           -0.34       3.00   -1.27
Step S           -0.30      -2.30    1.12
Stoop             1.11      -0.61    2.50
Recurvature       1.48      -0.68   -3.00
Average           0.30      -1.02   -1.49

Input           Arm swing   Step    Stoop
A                 3.40       0.34    0.23
B                -0.84       1.13   -1.23
C                 1.83       0.52    2.46

Fig. 8. The synthetic result


Table 2. The blend ratios of the key motion data

Key motion / Input     A      B      C
Arm swing L          0.83    0      0.16
Arm swing S          0       0.42   0
Step L               0       0.58   0.04
Step S               0       0      0
Stoop                0.17    0      0.80
Recurvature          0       0      0
Average              0       0      0

Fig. 9. Screenshot of the digital movie

4 Conclusion

This paper has presented a method to measure online the static and dynamic gait features separately from gait silhouette images. A sufficiently wide distribution of the gait features was observed in the statistical analysis with a large-scale gait database. Moreover, a method of characteristic gait animation synthesis was proposed based on blend ratio estimation for the key motion data. The experimental results show that gait features such as the arm swing, the step, and the stoop are effectively reflected in the synthesized blended model.

Acknowledgement. This work was supported by the Special Coordination Funds for Promoting Science and Technology of the Ministry of Education, Culture, Sports, Science and Technology.

References

1. EXPO 2005 AICHI JAPAN, http://www.expo2005.or.jp/en/
2. Morishima, S., Maejima, A., Wemler, S., Machida, T., Takebayashi, M.: Future cast system. In: ACM SIGGRAPH 2005 Sketches, SIGGRAPH 2005. ACM, New York (2005)


3. MITSUI-TOSHIBA Pavilion, http://www.expo2005.or.jp/en/venue/pavilionprivateg.html
4. Morishima, S., Yagi, Y., S.N.: Instant movie casting with personality: Dive into the movie system. In: Proc. of Invited Workshop on Vision Based Human Modeling and Synthesis in Motion and Expression, Xian, China, pp. 1–10 (September 2009)
5. Gafurov, D., Helkala, K., Sondrol, T.: Biometric gait authentication using accelerometer sensor. Journal of Computers 1(7), 51–59 (2006)
6. Gafurov, D., Snekkenes, E., Bours, P.: Improved gait recognition performance using cycle matching. In: 2010 IEEE 24th Int. Conf. on Advanced Information Networking and Applications Workshops (WAINA), pp. 836–841 (2010)
7. Rong, L., Jianzhong, Z., Ming, L., Xiangfeng, H.: A wearable acceleration sensor system for gait recognition. In: 2nd IEEE Conf. on Industrial Electronics and Applications, pp. 2654–2659 (2007)
8. Rong, L., Zhiguo, D., Jianzhong, Z., Ming, L.: Identification of individual walking patterns using gait acceleration. In: The 1st Int. Conf. on Bioinformatics and Biomedical Engineering, pp. 543–546 (2007)
9. Motion Capture Systems Vicon, http://www.crescentvideo.co.jp/vicon/d
10. Yam, C., Nixon, M., Carter, J.: Extended model-based automatic gait recognition of walking and running. In: Proc. of the 3rd Int. Conf. on Audio- and Video-based Person Authentication, Halmstad, Sweden, pp. 278–283 (June 2001)
11. Cuntoor, N., Kale, A., Chellappa, R.: Combining multiple evidences for gait recognition. In: Proc. of IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, vol. 3, pp. 33–36 (2003)
12. Sarkar, S., Phillips, J., Liu, Z., Vega, I., Grother, P., Bowyer, K.: The HumanID gait challenge problem: Data sets, performance, and analysis. IEEE Trans. on Pattern Analysis and Machine Intelligence 27(2), 162–177 (2005)
13. Han, J., Bhanu, B.: Individual recognition using gait energy image. IEEE Trans. on Pattern Analysis and Machine Intelligence 28(2), 316–322 (2006)
14. Makihara, Y., Sagawa, R., Mukaigawa, Y., Echigo, T., Yagi, Y.: Gait recognition using a view transformation model in the frequency domain. In: Leonardis, A., Bischof, H., Pinz, A. (eds.) ECCV 2006. LNCS, vol. 3953, pp. 151–163. Springer, Heidelberg (2006)
15. OU-ISIR Gait Database, http://www.am.sanken.osaka-u.ac.jp/gaitdb/index.html
16. Kovar, L., Gleicher, M.: Flexible automatic motion blending with registration curves. In: ACM SIGGRAPH 2003, pp. 214–224 (2003)