
AN EFFECTIVE HISTOGRAM BINNING FOR MUTUAL INFORMATION BASED REGISTRATION OF OPTICAL IMAGERY AND 3D LIDAR DATA

Ebadat G. Parmehr, Clive S. Fraser, Chunsun Zhang, Joseph Leach
Cooperative Research Centre for Spatial Information, Department of Infrastructure Engineering, University of Melbourne, VIC 3010, Australia
[email protected], {c.fraser, chunsunz, leach}@unimelb.edu.au

ABSTRACT

Automatic registration of multi-sensor data is a basic step in data fusion applications. Mutual information (MI) has been widely used in medical and remote sensing image registration. In this paper, an effective histogram binning technique is proposed to improve the robustness of image registration using MI and Normalized MI (NMI). Increasing the bin size improves the robustness of MI to local maxima that occur in the convergence surface of MI. In addition, the computation cost of registration is decreased due to use of a smaller joint pdf, without decreasing the accuracy. The performance of the proposed method in the registration of aerial imagery with LiDAR data has been experimentally evaluated and the results obtained are presented.

Index Terms: Mutual information, registration, LiDAR, optical imagery

1. INTRODUCTION

The increasing availability of Earth observation data has facilitated the generation of spatial information from integrated multi-sensor data. The associated integration of multi-sensor data, especially imagery and ranging data, is thus an active research topic in remote sensing. For instance, integration of data from LiDAR (Light Detection And Ranging) and electro-optical sensors is required for applications such as feature extraction, change detection analysis, classification, 3D city modelling, virtual reality and urban planning [1]. Accurate, automated co-registration of data within a common geodetic reference system is a necessary prerequisite for the data fusion process.
Feature-based image registration methods rely on the robust extraction and matching of common features such as points, line segments and patches between data sets[2]. However, matching can suffer from an absence of sufficient physical correspondences and so produce erroneous results. In contrast, intensity-based image registration methods utilize statistical relationships between data sets. The statistical relationships can be defined via either image grey values, such as with the sum of squared intensity differences and normalized cross-correlation (limited to registration of

978-1-4799-2341-0/13/$31.00 ©2013 IEEE

data acquired from the same sensor type), or the joint probability distribution function (pdf). Mutual Information (MI), originating from information theory, measures the transformable information between two data sets using the joint pdf. This property makes MI suitable for registration of images from either the same or different sensor types. The effectiveness of MI-based methods for registration of multi-modal medical images has encouraged the Computer Vision and Remote Sensing communities to adopt this approach for multi-sensor data registration [3]. For a comprehensive review of mutual information-based registration methods, see [4]. The maximum value of MI is expected to be achieved when the images are geometrically aligned [5], [6]. Since the joint pdf changes with the relative transformation of the images, the joint pdf must be re-estimated for each candidate transformation. The joint pdf can be estimated by either Parzen windowing, where a non-parametric distribution is fit to the data, or histogramming, which can be used when estimates of derivatives are not needed [7]. While MI has proven to be effective in multi-sensor registration, it suffers from the appearance of local maxima in the convergence surface, which can degrade registration accuracy and robustness [8]. Local maxima can occur because of good local matches between the images and because of noise in the data [4]. In fact, image noise can cause registration to fail by generating artificial structures that have higher statistical similarity than the real regions of the images. Until now, the general approach to noise-related local optima in the convergence surface of MI has been either to reduce the noise by filtering the images, or to increase the bin size of the histogram (the kernel bandwidth in Parzen windowing).
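Increasing the bin size amounts to requantizing the grey values before the joint histogram is formed. The following is a minimal sketch of that step, assuming 8-bit input and equal-width bins; the function name `requantize` and the sample array are illustrative and do not come from the paper.

```python
import numpy as np

def requantize(image_8bit, n_bins):
    """Map 8-bit grey values (0..255) onto n_bins equal-width bins (0..n_bins-1).

    Assumption: n_bins divides 256, so every bin has the same width.
    """
    if 256 % n_bins != 0:
        raise ValueError("n_bins must divide 256")
    width = 256 // n_bins  # grey levels per bin
    return np.asarray(image_8bit, dtype=np.uint8) // width

# Hypothetical 2x3 image, reduced from 256 grey levels to 4 bins (2 bits).
img = np.array([[0, 63, 64], [127, 128, 255]], dtype=np.uint8)
q4 = requantize(img, 4)
print(q4)
```

With 4 bins each bin spans 64 grey levels, so intensity differences smaller than that, including much of the sensor noise, no longer contribute separate entries to the joint histogram.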
The idea of optimal binning is to determine a number of bins sufficient to capture the major features in the data while ignoring fine details. Several optimal data-based binning methods for the histogram have been proposed in the statistics literature [9–11]. These classical approaches operate under the assumption that the density function is known, whereas the joint pdf of imaging and ranging data is unknown. Moreover, recently proposed methods for optimal binning of the histogram, such as the Bayesian framework of [12], do not consider the


ICIP 2013

effect of bin size on the MI value. This is a limitation because optimal data-based binning may highlight non-prominent information instead of structural information and so decrease the registration accuracy. Improving the robustness of registration by decreasing the number of bins has been reported in medical imaging [13–15] as well as in remote sensing multi-sensor image registration [16]. In contrast to the method proposed in this paper, those works do not analyze the effect of bin size on the behaviour of the MI value. In addition, they do not measure the peakedness of the convergence surface of MI, which plays a crucial role in the optimization procedure. Aerial imagery and 3D LiDAR data have been utilized in this paper to analyze the effect of histogram bin size on the automatic registration of multi-sensor data. The presentation is organized as follows: Section 2 defines the MI and NMI similarity measures and introduces the proposed histogram binning method. Next, in Section 3, the performance of the proposed technique is experimentally evaluated. Conclusions follow in Section 4.

2. METHODOLOGY

In the intensity-based image registration approach, the maximum value of the similarity measure is expected when the images are geometrically aligned. The capability of MI as a similarity measure to establish the non-linear relationship between multi-sensor images has been reported for a wide range of applications [4].

2.1. Mutual information

Mutual information is based on information theory and has been defined in the literature [17], [18] in various forms, such as entropy, conditional entropy and probability divergence. In this paper, the entropy-based definition of MI is used. Shannon entropy [19] for a random variable $A$ with values $a_1, \dots, a_n$ and probabilities $p(a_i)$ is defined as

$$H(A) = -\sum_{i=1}^{n} p(a_i) \log p(a_i). \quad (1)$$

It can be interpreted as the expected amount of information gained when an event $a_i$ occurs. The MI of two random variables $A$ and $B$ is related to entropy by

$$MI(A, B) = H(A) + H(B) - H(A, B). \quad (2)$$

Here, $H(A)$ and $H(B)$ are the entropies of $A$ and $B$, respectively, and $H(A, B)$ is their joint entropy [20]. In equation (2), the maximum value of MI can be achieved when the joint entropy of the images is minimized. In the case of misregistration, corresponding regions in the images are duplicated in the joint pdf, which increases the value of the joint entropy, while in registered images the corresponding regions of both images appear once, which yields a smaller joint entropy. MI does not take spatial information into account, and it can therefore be influenced by the size of the overlapping area of the images [4].
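Equations (1) and (2) can be evaluated directly from a joint histogram. The sketch below is one possible histogram-based estimate, not the authors' implementation; the function names, the choice of 16 bins, base-2 logarithms, and the random test image are all assumptions made for illustration.

```python
import numpy as np

def entropy(p):
    """Shannon entropy in bits of a probability array, equation (1); empty cells are skipped."""
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def mutual_information(a, b, bins=16):
    """MI(A, B) = H(A) + H(B) - H(A, B), equation (2), from a joint histogram."""
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    p_ab = joint / joint.sum()   # joint pdf estimate
    p_a = p_ab.sum(axis=1)       # marginal pdf of A
    p_b = p_ab.sum(axis=0)       # marginal pdf of B
    return entropy(p_a) + entropy(p_b) - entropy(p_ab)

# Hypothetical test pair: a random image and its grey-level inverse,
# standing in for two modalities that share the same structure.
rng = np.random.default_rng(0)
a = rng.integers(0, 256, size=(64, 64)).astype(float)
b = 255.0 - a
mi_aligned = mutual_information(a, b)
mi_shifted = mutual_information(a, np.roll(b, 5, axis=1))  # 5-pixel misalignment
print(mi_aligned, mi_shifted)
```

In this toy case the aligned pair gives a much larger MI than the shifted pair, because the shift spreads the joint histogram over many more cells and so raises the joint entropy.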

2.2. Normalized mutual information

Normalized mutual information (NMI) was proposed to provide an overlap-invariant form of MI [21] and is defined as

$$NMI(A, B) = \frac{H(A) + H(B)}{H(A, B)}. \quad (3)$$

In fact, the MI value can be increased simply by increasing the marginal entropies, provided that their sum grows by more than the joint entropy does. In that case the MI value reflects the relationship between the images less strongly, which can cause misregistration. In contrast, NMI, which uses the ratio of the marginal and joint entropies, is able to take into account any increase in the marginal entropies through the change in the joint entropy, and consequently improves the robustness of MI.

2.3. Effective histogram binning

The idea is to choose an appropriate number of bins that captures the features in the images which play an effective role in estimating the transformation parameters. In comparison, conventional histogram binning methods try to keep the major information of an image and assume that the estimated MI is fully associated with the geometric transformation. This is not always true. Indeed, the maximum value of MI can be obtained via a large increment of the marginal entropies rather than of the joint entropy (see equation (2)), and may have no effect on the improvement in geometric alignment. For instance, random noise in the data can increase the MI value without leading to real correspondence in the images. It is clear that image data can contain information that is both relevant and irrelevant to the geometric transformation. Therefore, the choice of a number of bins that preserves the information relevant to the image transformation is crucial and can improve the performance of registration, since information irrelevant to registration can be minimized through random sampling. To make this idea clear, a synthetic data set (shown in Figure 1) was generated and Gaussian noise added to increase its similarity to a real data set.

Fig. 1. Synthetic multimodal images.

As shown in Figure 2, for the images (without geometric transform), the sum of the marginal entropies increases more than the joint entropy as the number of bins grows. This indicates that the increase of the MI value is not caused by changes in image alignment. In fact, the part of MI relevant to the image transformation is represented by the joint entropy. Figure 3


shows the behavior of MI and NMI as the number of bins is increased from 4 to 256 bins (2-8 bits). It is noteworthy that the MI values of the images rise as the number of bins increases, while the NMI values decline. Consequently, NMI outperforms MI in describing the information relevant to the geometric transformation. This also indicates that the use of a smaller number of bins can provide more relevant information.
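The trend described above can be reproduced on a toy example. The sketch below builds a hypothetical two-region scene rendered with different grey levels in two "modalities", adds Gaussian noise, and evaluates MI (equation (2)) and NMI (equation (3)) for 4 to 256 bins. The scene, grey levels, noise level and seed are assumptions for illustration and are not the synthetic data of Figure 1.

```python
import numpy as np

def entropies(a, b, bins):
    """Return H(A), H(B), H(A,B) in bits from a joint histogram with bins x bins cells."""
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    p_ab = joint / joint.sum()

    def h(p):
        p = p[p > 0]
        return float(-np.sum(p * np.log2(p)))

    return h(p_ab.sum(axis=1)), h(p_ab.sum(axis=0)), h(p_ab)

rng = np.random.default_rng(1)
scene = np.zeros((128, 128))
scene[32:96, 32:96] = 1.0  # hypothetical "building" on flat ground

# Same structure, different radiometry, independent Gaussian noise.
img_a = 60 + 120 * scene + rng.normal(0, 10, scene.shape)
img_b = 200 - 100 * scene + rng.normal(0, 10, scene.shape)

results = {}
for bits in range(2, 9):  # 4 .. 256 bins
    bins = 2 ** bits
    h_a, h_b, h_ab = entropies(img_a, img_b, bins)
    mi = h_a + h_b - h_ab
    nmi = (h_a + h_b) / h_ab
    results[bins] = (mi, nmi)
    print(f"{bins:4d} bins  MI={mi:.3f}  NMI={nmi:.3f}")
```

Although the images are perfectly aligned throughout, MI is larger at 256 bins than at 4 bins while NMI is smaller. The extra MI comes from finely binned noise inflating the marginal entropies, not from any change in alignment.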

parameters for registration of the data set were obtained using Powell’s optimization method [22] through maximizing the NMI value.
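For orientation, the standard 11-parameter DLT maps object coordinates (X, Y, Z) to image coordinates (x, y) as a ratio of linear terms. The sketch below shows only that projection step; the parameter values and points are made-up numbers, and the function name is illustrative, not taken from the paper.

```python
import numpy as np

def dlt_project(points_xyz, L):
    """Project N x 3 object points to image coordinates with the 11-parameter DLT.

    x = (L1 X + L2 Y + L3 Z + L4) / (L9 X + L10 Y + L11 Z + 1)
    y = (L5 X + L6 Y + L7 Z + L8) / (L9 X + L10 Y + L11 Z + 1)
    """
    X, Y, Z = points_xyz[:, 0], points_xyz[:, 1], points_xyz[:, 2]
    denom = L[8] * X + L[9] * Y + L[10] * Z + 1.0
    x = (L[0] * X + L[1] * Y + L[2] * Z + L[3]) / denom
    y = (L[4] * X + L[5] * Y + L[6] * Z + L[7]) / denom
    return np.column_stack([x, y])

# Hypothetical DLT parameters L1..L11 and two hypothetical LiDAR points.
L = np.array([1.0, 0.0, 0.0, 10.0, 0.0, -1.0, 0.0, 20.0, 0.0, 0.0, 0.01])
pts = np.array([[0.0, 0.0, 0.0], [100.0, 50.0, 100.0]])
uv = dlt_project(pts, L)
print(uv)
```

In a registration loop, a derivative-free optimizer such as `scipy.optimize.minimize(..., method="Powell")` could adjust the eleven parameters so that the NMI between the image and the projected LiDAR values is maximized. How the authors set up that loop is not described in this text.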

Fig. 4. Orthoimage (left), LiDAR DSM (right).

It is noteworthy that, as with the synthetic images (Figure 3), the MI values of the imagery and LiDAR data grow as the number of bins increases, while the NMI values decline (as shown in Figures 5 and 6).

Fig. 2. Entropies of synthetic images with different number of bins.

Fig. 5. MI values of optical imagery with LiDAR DSM and intensity data using different number of bins.

Fig. 3. MI and NMI of synthetic images with different number of bins.

3. EXPERIMENTAL RESULTS AND DISCUSSIONS

The selected experimental data, shown in Figure 4, cover the suburban area of Aitkenvale, Queensland. They comprise 5 cm GSD aerial orthoimagery and a LiDAR DSM, along with LiDAR intensity data with a point density of 35 pts/m², acquired in May 2010. Both the elevation (DSM) and LiDAR intensity information were used in the registration of the imagery with the 3D LiDAR point cloud. In order to model the perspective geometry of the imagery, a Direct Linear Transform (DLT) was used, and the LiDAR data were accordingly transformed into image coordinates. Optimum DLT


Fig. 6. NMI values of optical imagery with LiDAR DSM and intensity data using different number of bins.

In this paper, kurtosis is utilized as a measure of the peakedness of a surface, in order to evaluate the quality of the convergence surface of the similarity measure in terms of the appearance of local maxima. The kurtosis value of the NMI convergence surface for translation in the X and Y directions in registration of the image with the LiDAR DSM and intensity data is shown in Figure 7. Since a higher value of kurtosis indicates a sharper peak, the graph shows that the peakedness of the convergence surface falls as the number of bins increases. This supports the idea of decreasing local maxima by using a smaller number of bins. In addition, 3D plots of the NMI convergence surfaces for 4 and 256 bins are shown in Figure 8, to allow visual inspection of the appearance of local maxima.
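The text does not state how kurtosis is computed over a two-dimensional convergence surface. One plausible reading, assumed here, is the Pearson kurtosis (fourth central moment divided by the squared variance) of the sampled surface values. The sketch compares two hypothetical Gaussian-shaped surfaces over a grid of X and Y translations; neither the surfaces nor the function name come from the paper.

```python
import numpy as np

def surface_kurtosis(surface):
    """Pearson kurtosis m4 / m2**2 of the surface values (assumed definition)."""
    z = np.asarray(surface, dtype=float).ravel()
    d = z - z.mean()
    m2 = np.mean(d ** 2)  # variance
    m4 = np.mean(d ** 4)  # fourth central moment
    return float(m4 / m2 ** 2)

# Grid of hypothetical X/Y translation offsets, -10 .. +10 pixels.
tx, ty = np.meshgrid(np.arange(-10, 11), np.arange(-10, 11))
r2 = tx ** 2 + ty ** 2

sharp = np.exp(-r2 / (2 * 1.0 ** 2))  # narrow, isolated peak at zero offset
broad = np.exp(-r2 / (2 * 8.0 ** 2))  # wide, flat-topped surface

k_sharp = surface_kurtosis(sharp)
k_broad = surface_kurtosis(broad)
print(k_sharp, k_broad)
```

Under this definition the narrow peak yields a far larger kurtosis than the broad surface, which matches the interpretation that a higher value indicates a sharper peak.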

resolution yields misregistration. Figure 9 shows the results of registration using 4 and 256 histogram bins.

Fig. 9. Registered imagery with LiDAR DSM using 4 bins (left), 256 bins (right).

In addition, colorization of the LiDAR point cloud with the registered imagery provides a better visual assessment of registration quality, as shown in Figure 10.

Fig. 7. Kurtosis values for NMI convergence surface of optical imagery with LiDAR DSM and intensity data using different bin numbers.

Fig. 10. Colorized LiDAR point clouds.

4. CONCLUSIONS

Fig. 8. NMI convergence surface of imagery and LiDAR DSM for translation in X and Y directions using 4 bins (left), 256 bins (right).

As shown in Figure 8, the NMI convergence surface using 4 bins is smooth and sharp, whereas using 256 bins (full radiometric resolution) yields a non-monotonic surface with local maxima, which makes the optimization procedure difficult. Moreover, using more bins increases computation cost because a larger joint pdf must be estimated. The experimental testing highlighted that automatic registration of aerial imagery to LiDAR data is viable using lower radiometric resolution, whereas using full radiometric

The ability of the proposed histogram binning method to enhance the registration accuracy of multi-sensor data has been highlighted, and choosing an effective number of bins has been shown to benefit intensity-based registration of optical imagery with LiDAR data. Based on the testing, which involved a number of data sets, the authors conclude that using more bins in joint pdf estimation not only increases computation cost but also decreases the registration accuracy, due to a decline in the information relevant to registration. A small number of bins, on the other hand, can overcome this problem by retaining more of the information relevant to the geometric transformation. This increases the reliability of optimization by providing a smoother convergence surface for the similarity measures, and it also speeds up the registration procedure. Finally, this effective method of histogram binning can be used in intensity-based registration of multi-sensor remote sensing data with a higher level of robustness and accuracy for a wide range of applications.


5. REFERENCES

[1] A. Habib, M. Ghanma, and E. M. Kim, “LIDAR Data for Photogrammetric Georeferencing,” in Proc. FIG Working Week and GSDI, 2005, vol. 8.

[2] A. A. Goshtasby, Image Registration: Principles, Tools and Methods. Springer, 2012.

[3] J. LeMoigne, N. S. Netanyahu, and R. D. Eastman, Image Registration for Remote Sensing. Cambridge University Press, 2011.

[4] J. P. W. Pluim, J. B. A. Maintz, and M. A. Viergever, “Mutual-information-based registration of medical images: a survey,” Medical Imaging, IEEE Transactions on, vol. 22, pp. 986–1004, 2003.

[5] F. Maes, A. Collignon, D. Vandermeulen, G. Marchal, and P. Suetens, “Multimodality image registration by maximization of mutual information,” Medical Imaging, IEEE Transactions on, vol. 16, pp. 187–198, 1997.

[6] P. Viola and W. M. Wells III, “Alignment by maximization of mutual information,” International Journal of Computer Vision, vol. 24, pp. 137–154, 1997.

[7] A. C. S. Chung, R. Gan, and W. Wells III, “Robust multi-modal image registration based on prior joint intensity distributions and minimization of Kullback-Leibler distance,” HKUST CES Technical Report, pp. 1–4, 2007.

[8] A. Roche, X. Pennec, M. Rudolph, D. Auer, G. Malandain, S. Ourselin, L. Auer, and N. Ayache, “Generalized correlation ratio for rigid registration of 3D ultrasound with MR images,” in Medical Image Computing and Computer-Assisted Intervention (MICCAI 2000), 2000, pp. 203–220.

[9] D. Freedman and P. Diaconis, “On the histogram as a density estimator: L2 theory,” Probability Theory and Related Fields, vol. 57, pp. 453–476, 1981.

[10] D. W. Scott, “On optimal and data-based histograms,” Biometrika, vol. 66, pp. 605–610, 1979.

[11] H. A. Sturges, “The choice of a class interval,” Journal of the American Statistical Association, vol. 21, pp. 65–66, 1926.

[12] K. H. Knuth, “Optimal data-based binning for histograms,” arXiv preprint physics/0605197, 2006.

[13] P. A. Legg, P. L. Rosin, D. Marshall, and J. E. Morgan, “Improving accuracy and efficiency of registration by mutual information using Sturges’ histogram rule,” Proc. Medical Image Understanding and Analysis, University of Aberystwyth, pp. 17–18, 2007.

[14] D. A. Hahn, V. Daum, and J. Hornegger, “Automatic parameter selection for multimodal image registration,” Medical Imaging, IEEE Transactions on, vol. 29, pp. 1140–1155, 2010.

[15] Z. Knops, J. Maintz, M. Viergever, and J. Pluim, “Normalized mutual information based registration using k-means clustering and shading correction,” Medical Image Analysis, vol. 10, pp. 432–439, 2006.

[16] S. Suri and P. Reinartz, “Application of generalized partial volume estimation for mutual information based registration of high resolution SAR and optical imagery,” in Information Fusion, 2008 11th International Conference on, 2008, pp. 1–8.

[17] T. M. Cover and J. A. Thomas, Elements of Information Theory. John Wiley, 1991.

[18] S. Kullback, Information Theory and Statistics. Dover Publications, 1997.

[19] C. E. Shannon, W. Weaver, R. E. Blahut, and B. Hajek, The Mathematical Theory of Communication, vol. 117. University of Illinois Press, Urbana, 1949.

[20] F. Maes, D. Vandermeulen, and P. Suetens, “Medical image registration using mutual information,” Proceedings of the IEEE, vol. 91, pp. 1699–1722, 2003.

[21] C. Studholme, D. L. G. Hill, D. J. Hawkes, and others, “An overlap invariant entropy measure of 3D medical image alignment,” Pattern Recognition, vol. 32, pp. 71–86, 1999.

[22] W. H. Press, S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery, Numerical Recipes in C: The Art of Scientific Computing, 2nd ed. Cambridge University Press, 1992.