Article

New Vehicle Detection Method with Aspect Ratio Estimation for Hypothesized Windows

Jisu Kim 1, Jeonghyun Baek 1, Yongseo Park 2 and Euntai Kim 1,*

Received: 26 August 2015; Accepted: 3 December 2015; Published: 9 December 2015
Academic Editor: Felipe Jimenez

1 The School of Electrical and Electronic Engineering, Yonsei University, Seoul 120-749, Korea; [email protected] (J.K.); [email protected] (J.B.)
2 Department of Electrical Engineering, Gachon University, Seongnam 461-701, Korea; [email protected]
* Correspondence: [email protected]; Tel.: +82-2-2123-7729

Abstract: All kinds of vehicles have different ratios of width to height, which are called the aspect ratios. Most previous works, however, use a fixed aspect ratio for vehicle detection (VD). The use of a fixed vehicle aspect ratio for VD degrades the performance. Thus, the estimation of a vehicle aspect ratio is an important part of robust VD. Taking this idea into account, a new on-road vehicle detection system is proposed in this paper. The proposed method estimates the aspect ratio of the hypothesized windows to improve the VD performance. Our proposed method uses an Aggregate Channel Feature (ACF) and a support vector machine (SVM) to verify the hypothesized windows with the estimated aspect ratio. The contribution of this paper is threefold. First, the estimation of the vehicle aspect ratio is inserted between the HG (hypothesis generation) and the HV (hypothesis verification). Second, a simple HG method named the signed horizontal edge map is proposed to speed up VD. Third, a new measure is proposed to represent the overlapping ratio between the ground truth and the detection results. This new measure is used to show that the proposed method is better than previous works in terms of robust VD. Finally, the Pittsburgh dataset is used to verify the performance of the proposed method.

Keywords: vehicle detection; ACF; ROI estimation

1. Introduction

Vehicle detection (VD) is one of the major research issues within intelligent transportation system (ITS) organizations, and considerable research has been conducted. Most of the research works consist of two steps: HG (hypothesis generation) and HV (hypothesis verification). Concerning the HG, symmetry [1,2], color [3,4], shadow [5,6] and edges [7,8] were used to select the vehicle candidates. Further, search space reduction methods were developed to save computational resources in the HG. For example, in [9], the linear model between the vehicle position and the vehicle size is updated using a recursive least squares algorithm. This linear model helps to generate the regions of interest (ROIs) such that they are likely to include vehicle regions. Therefore, this approach can reduce false positives as compared with the previous exhaustive search or sliding window approaches. Interestingly, in [10], image inpainting is used to verify the detection results. Image inpainting is actually a method for restoring damaged images. This approach also reduces false positives. Concerning the HV, much research has focused on the application of machine vision technologies to VD, as in [11,12]. The HV works based on machine vision mainly consist of features and classifiers. In the case of the features, the histogram of oriented gradients (HOG) [13], Haar-like wavelet [14], Gabor feature [15] and aggregate channel features (ACF) [16] are generally used.


The Haar-like wavelet takes less computational time than the HOG or Gabor feature. However, the detection performance using the Haar-like wavelet is lower than that of the HOG or Gabor feature. In [17], the HOG and Haar-like wavelet are combined in cascade form to reduce the computational time and to improve the detection performance. The Haar-like wavelet accelerated the hypothesis generation (HG), while the HOG verified the generated hypotheses. In addition to these features, the Gabor feature is also an effective feature for VD. The Gabor filter is a kind of band-pass filter that extracts specific information, called the Gabor feature, from the frequency domain. In [18], a gamma distribution is used to represent the Gabor feature for better detection performance when compared with the Gaussian distribution. The Gabor feature, however, takes a long computational time because it requires the computation of a convolution. Some works have been reported [19,20] to reduce the computational time of the Gabor filter. In the case of the classifiers, the support vector machine (SVM), Adaboost [21] and neural network (NN) are used to train the various features. Recently, the latent SVM has been researched for the deformable part-based model (DPM) [22]. This model can capture significant deformation in the object appearance.

On the other hand, all kinds of vehicles have different ratios of width to height, which are called the aspect ratios. Most previous works, however, use a fixed aspect ratio for vehicle detection (VD). The use of a fixed vehicle aspect ratio [23–26] for VD degrades the performance. Thus, a new step named HI (hypothesis improvement) is developed in this paper to enhance the VD performance. In the HI, the aspect ratio of the hypothesized windows is estimated and the result is applied to the classifier in the HV. Thus, the HI is positioned between the HG and the HV. A part of this paper was presented in [27]. The contribution of this paper is threefold: (1) the HI, based on the estimation of the vehicle aspect ratio, is inserted between the HG (hypothesis generation) and the HV (hypothesis verification); (2) a simple HG method named the signed horizontal edge map is proposed to speed up VD; (3) a new measure is proposed to quantify how well the detection result matches the ground truth. This measure can be used to show that the proposed method is better than previous methods in terms of robust VD.

The remainder of this paper is organized as follows: In Section 2, the proposed vehicle detection system is briefly outlined. In Section 3, the vehicle aspect ratio is estimated and is used to generate an efficient ROI. In Section 4, some experiments are conducted to verify the validity of the proposed method. Some conclusions are drawn in Section 5.

2. Motivation and System Overview

Figure 1 shows the distribution of vehicle aspect ratios in the Pittsburgh dataset.

Figure 1. The distribution of vehicle aspect ratios in the Pittsburgh dataset.


A total of 10,907 vehicles are used in the Pittsburgh dataset, as shown in Figure 1. As can be seen in the figure, the vehicle aspect ratio is approximately 1, but it varies from 0.5 to 2, depending on the types of vehicles and the camera viewpoint. Examples of vehicle images are given in Figure 2. In general, sedans have low vehicle aspect ratios, while trucks or buses have high vehicle aspect ratios, as shown in Figure 2. Thus, the use of a fixed aspect ratio in VD can degrade the performance.

Figure 2. Vehicle images for various vehicle aspect ratios: (a) 0.7; (b) 1; and (c) 1.5.

The proposed system is outlined in Figure 3; as shown in the figure, the HI is positioned between the HG and the HV. In the HG, a simple method named the signed horizontal edge map is developed to provide good hypothesized windows. In the HI, the aspect ratios of the hypothesized windows are estimated by combining the symmetry and horizontal edges of the vehicles. In the HV, the ACF is employed with an SVM to test the hypothesized windows with the estimated aspect ratios.

Figure 3. Flowchart of the proposed vehicle detection system.
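To make the flow of Figure 3 concrete, the following minimal sketch chains the three stages in code. It is only an illustration of the structure described above; the helper names (generate_hypotheses, estimate_height, acf_svm_verify) are hypothetical placeholders, not functions from the authors' implementation.

```python
# Minimal sketch of the HG -> HI -> HV pipeline of Figure 3.
# The three helpers are hypothetical stand-ins for the steps detailed in Section 3.
def detect_vehicles(frame, generate_hypotheses, estimate_height, acf_svm_verify):
    detections = []
    for (x, y, w, h) in generate_hypotheses(frame):   # HG: signed horizontal edge map
        h_hat = estimate_height(frame, x, y, w)       # HI: aspect ratio (height) estimation
        if acf_svm_verify(frame, x, y, w, h_hat):     # HV: ACF feature + SVM classifier
            detections.append((x, y, w, h_hat))
    return detections
```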


3. Proposed Method

In this section, the three steps of the proposed method are explained. The results of the three steps are summarized as in Figure 4. As in [28], the hypotheses for the vehicles are generated in the HG. Here, let us denote the $i$-th hypothesis for the vehicle as $\mathbf{w}_i = (x_i, y_i, w_i, h_i)$, where $(x_i, y_i)$ denotes the left-lower position of the $i$-th hypothesis, and $w_i$ and $h_i$ are the associated width and height of the window, respectively. In the HI, the aspect ratio, or the equivalent height, of the hypothesized window $\mathbf{w}_i$ is estimated. Initially, the height of the hypothesis is set to $h_i = 2w_i$, which is long enough to include all kinds of vehicles, as shown in Figure 1. As in Figure 4a, the candidates of the vehicle width are generated. Figure 4b shows the results of the estimation of the vehicle height $h_i$ for the given vehicle width $w_i$. Let us denote the estimated value for the vehicle height by $\hat{h}_i$. Then, the $i$-th hypothesis is computed by $\hat{\mathbf{w}}_i = (x_i, y_i, w_i, \hat{h}_i)$. Figure 4c shows the hypothesized windows given as the result of the HI. Finally, the ACF and SVM are used to test all of the hypothesized windows given from the HI. The vehicle detection results of the proposed method are shown in Figure 4d.

Figure 4. The framework of the proposed method: (a) shows the result of the HG by signed horizontal edges ($h_i = 2w_i$); (b) is the result of the estimation of the vehicle height; (c) shows the hypothesized windows given as the result of the HI; and (d) shows the vehicle detection results in the HV.
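As a minimal illustration of the notation above, a hypothesis $\mathbf{w}_i = (x_i, y_i, w_i, h_i)$ and its initialization $h_i = 2w_i$ could be represented as follows. The class and function names are hypothetical and only mirror the description in this section, not the authors' code.

```python
from dataclasses import dataclass

@dataclass
class Hypothesis:
    x: int  # left-lower x position of the window
    y: int  # left-lower y position of the window
    w: int  # window width produced by the HG
    h: int  # window height, later refined to h_hat by the HI

def init_hypothesis(x, y, w):
    # The height is initialized to 2*w, tall enough to cover the
    # aspect-ratio range (roughly 0.5 to 2) observed in Figure 1.
    return Hypothesis(x=x, y=y, w=w, h=2 * w)
```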

3.1. Hypothesis Generation (HG)—A Signed Horizontal Edge Map

An efficient hypothesis generation method for VD was reported by Alonso et al. in [28]. The method uses an absolute edge map defined by

$$E(x, y) = \left| E_h(x, y) - E_v(x, y) \right| \qquad (1)$$

where $E_h(x, y)$ and $E_v(x, y)$ represent the horizontal and vertical gradient images, respectively. Figure 5a,c shows an original vehicle image $I(x, y)$, the associated absolute edge map $E(x, y)$ and the generated hypotheses. The absolute edge map method in [28] is very efficient, but it has the drawback that the absolute edge map $E(x, y)$ sometimes misses some weak horizontal edges, such as vehicle shadows, degrading the VD performance. To avoid missing some weak horizontal edges, such as shadow edges, the signed horizontal edge map is computed by

$$E_s(x, y) = I(x, y) * H, \qquad H = \begin{bmatrix} 1 & 2 & 1 \\ 0 & 0 & 0 \\ -1 & -2 & -1 \end{bmatrix} \qquad (2)$$

and is used as in Figure 5b. The signed horizontal edge map $E_s(x, y)$ takes into account both the sign and the magnitude of the horizontal edges and outperforms the absolute edge map $E(x, y)$ in detecting the edges between shadows and roads, since the shadows tend to be darker than the roads. Figure 5b,d shows the same image $I(x, y)$, the associated signed edge map $E_s(x, y)$ and the generated hypotheses. The signed horizontal edge map $E_s(x, y)$ outperforms the absolute edge map $E(x, y)$ in the HG.


Figure 5. Hypothesis generation methods for VD: (a) Absolute edge image by [28]; (b) Signed horizontal edge image; (c) HG by absolute edge image [28]; and (d) HG by signed horizontal edge image.
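The two edge maps of Equations (1) and (2) can be sketched as below. This is a rough illustration assuming a grayscale input and 3x3 Sobel derivatives for the gradient images $E_h$ and $E_v$; the exact gradient operator used in [28] is not specified in the text, so treat those choices as assumptions.

```python
import numpy as np
from scipy.ndimage import sobel
from scipy.signal import convolve2d

def absolute_edge_map(gray):
    """Absolute edge map E(x, y) = |E_h(x, y) - E_v(x, y)| of Equation (1)."""
    img = gray.astype(np.float32)
    e_h = sobel(img, axis=0)  # derivative along rows -> horizontal edges (assumed operator)
    e_v = sobel(img, axis=1)  # derivative along columns -> vertical edges (assumed operator)
    return np.abs(e_h - e_v)

def signed_horizontal_edge_map(gray):
    """Signed horizontal edge map E_s(x, y) = I(x, y) * H of Equation (2)."""
    H = np.array([[ 1,  2,  1],
                  [ 0,  0,  0],
                  [-1, -2, -1]], dtype=np.float32)
    # True 2-D convolution keeps the sign of the response, so shadow edges
    # (shadows being darker than the road) get a consistent sign.
    return convolve2d(gray.astype(np.float32), H, mode='same', boundary='symm')
```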

3.2. Hypothesis Improvement (HI)—Aspect Ratio Estimation

In this subsection, the symmetry of the vehicle images, the horizontal edges and the prior knowledge about the aspect ratio of the vehicles are combined to estimate the aspect ratio of the hypothesized windows provided by the HG.


3.2.1. Symmetry

The basic idea of this subsection is to exploit the fact that vehicles are symmetric while the backgrounds are not, as shown in Figure 6. The symmetry for each value in the y axis is computed as follows:

(1) The given hypothesized window is flipped horizontally as in Figure 6, making a mirror image. Figure 6a,b show examples of the original image and the corresponding flipped image, respectively. For $j$s that belong to vehicles, the two images are almost the same while, for $j$s that belong to backgrounds, the two images are different from each other.

(2) In order to quantify the symmetry of the given hypothesized window, the similarity between the hypothesized window and the mirror image is computed. Instead of intensity values, gradient values in images are used because they are more robust than intensity values under various illuminations. Thus, the HOG feature vector is used. The HOG feature is a part of the ACF and includes the gradient magnitude and orientation. The HOG feature vector for a hypothesized window can be denoted by $\mathbf{H} = \left[ \mathbf{F}_{1,1}, \cdots, \mathbf{F}_{I,1}, \cdots, \mathbf{F}_{1,J}, \cdots, \mathbf{F}_{I,J} \right]^T \in \Re^{TIJ}$, where $\mathbf{F}_{i,j} = \left[ B^1_{i,j}, \cdots, B^T_{i,j} \right]^T \in \Re^T$ denotes the histogram of the $(i, j)$ block, and $B^t_{i,j}$ denotes the sum of the gradient magnitudes according to the orientation bin $t$ in the $(i, j)$ block. Here, $I$ and $J$ are the numbers of column and row blocks of the window, respectively, as shown in Figure 6; $T$ denotes the number of orientation bins in the HOG; and $i$, $j$ and $t$ are the indices for $I$, $J$ and $T$, respectively.

Figure 6. The procedure of using the symmetry: (a) is the HOG feature of the hypothesized window; and (b) is the HOG feature of the flipped hypothesized window ($T = 9$).

(3) For the HOG of the hypothesized window $\mathbf{H} = \left[ \mathbf{F}_{1,1}, \cdots, \mathbf{F}_{I,1}, \cdots, \mathbf{F}_{1,J}, \cdots, \mathbf{F}_{I,J} \right]^T \in \Re^{TIJ}$ and the HOG of the associated flipped image $\mathbf{H}^F = \left[ \mathbf{F}^F_{1,1}, \cdots, \mathbf{F}^F_{I,1}, \cdots, \mathbf{F}^F_{1,J}, \cdots, \mathbf{F}^F_{I,J} \right]^T \in \Re^{TIJ}$, the similarity between the two HOGs is defined by

$$\mathbf{H} \circ \mathbf{H}^F = \left[ \mathbf{F}_{1,1} \circ \mathbf{F}^F_{1,1}, \cdots, \mathbf{F}_{I,1} \circ \mathbf{F}^F_{I,1}, \cdots, \mathbf{F}_{1,J} \circ \mathbf{F}^F_{1,J}, \cdots, \mathbf{F}_{I,J} \circ \mathbf{F}^F_{I,J} \right]^T = \left[ B^1_{1,1} B^{1F}_{1,1}, \cdots, B^T_{1,1} B^{TF}_{1,1}, \cdots, B^1_{I,1} B^{1F}_{I,1}, \cdots, B^T_{I,1} B^{TF}_{I,1}, \cdots, B^1_{1,J} B^{1F}_{1,J}, \cdots, B^T_{1,J} B^{TF}_{1,J}, \cdots, B^1_{I,J} B^{1F}_{I,J}, \cdots, B^T_{I,J} B^{TF}_{I,J} \right]^T \qquad (3)$$

where $\circ$ denotes the component-wise multiplication and $\mathbf{s}_{i,j} = \mathbf{F}_{i,j} \circ \mathbf{F}^F_{i,j} = \left[ B^1_{i,j} B^{1F}_{i,j}, \cdots, B^T_{i,j} B^{TF}_{i,j} \right]^T \in \Re^T$ denotes the block-wise similarity. The symmetry for the $j$-th row block can then be quantified by a vector $\mathbf{S}_j \in \Re^T$ formed from the block-wise similarities $\mathbf{s}_{i,j}$.
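A rough sketch of steps (1) through (3) is given below, assuming the simplified block HOG described above (gradient magnitudes accumulated into $T$ unsigned orientation bins per block). The aggregation of the block-wise products into a single per-row symmetry score is an assumption here (summing over column blocks and bins), since the exact form of the row symmetry is not given in this excerpt.

```python
import numpy as np

def block_hog(window, I=8, J=8, T=9):
    """Simplified block HOG: B[i, j, t] is the sum of gradient magnitudes of
    block (i, j) falling into orientation bin t (notation of Section 3.2.1)."""
    gy, gx = np.gradient(window.astype(np.float32))
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), np.pi)             # unsigned orientation in [0, pi)
    bins = np.minimum((ang / np.pi * T).astype(int), T - 1)
    bh, bw = window.shape[0] // J, window.shape[1] // I  # block height / width
    B = np.zeros((I, J, T), dtype=np.float32)
    for j in range(J):                                   # row blocks
        for i in range(I):                               # column blocks
            m = mag[j * bh:(j + 1) * bh, i * bw:(i + 1) * bw]
            b = bins[j * bh:(j + 1) * bh, i * bw:(i + 1) * bw]
            for t in range(T):
                B[i, j, t] = m[b == t].sum()
    return B

def row_symmetry(window):
    """Component-wise product of the HOGs of a window and of its horizontal
    mirror (Equation (3)), aggregated into one symmetry score per row block j."""
    B = block_hog(window)
    B_flip = block_hog(window[:, ::-1])                  # horizontally flipped window
    s = B * B_flip                                       # component-wise multiplication
    return s.sum(axis=(0, 2))                            # assumed aggregation over i and t
```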