Visual Motion Estimation: Localization Performance Evaluation Tool for Planetary Rovers

Joseph Nsasi Bakambu, Chris Langley, and Raja Mukherji
MDA Corporation, 9445 Airport Road, Brampton, Ontario, Canada

Abstract. This paper describes a tool to evaluate the performance of localization algorithms for planetary rovers. Given the customer (e.g., ExoMars) requirements, a terrain definition (including rock size, density and distribution), a rover speed, and a goal, the tool predicts the goal reachability and the localization accuracy. Several Visual Motion Estimation (VME) algorithms have been implemented and improved to support the tool, including frame-to-frame, multi-frame, and bounded state space size Simultaneous Localization and Mapping (SLAM) VME. Based on the advantages and disadvantages of these algorithms, a coupled, observable, multi-frame VME algorithm is proposed. To make the tool realistic, the rover model, sensor uncertainties, and wheel slippage have been considered. Simulation results using the Localization Performance Evaluation Tool (LPET) show that the new VME approach is more accurate than the other algorithms.

1. Introduction
It is well known that the major space agencies around the world are targeting Mars as a planetary body of prime interest. The Canadian Space Agency (CSA) has identified Mars as a current interest in its Space Science programme, including participation in the European Space Agency (ESA) ExoMars mission. One of the critical derived requirements for the ESA ExoMars mission is that the rover autonomously traverse one kilometre daily at speeds of up to 100 m/h. Throughout the traverse, a localization accuracy of one percent of the distance traveled and one degree about each of the three attitude axes must be maintained. Accurate localization is arguably the most fundamental competence required for long-range autonomous navigation.

In this context, MDA Space Missions and the CSA have embarked on a project to further their expertise in the area of visual motion estimation (VME) for planetary rovers. VME algorithms have recently attracted considerable interest from the planetary rover exploration community as a solution for accurate localization. On the Mars Exploration Rovers (MERs), VME was not considered part of the main localization system, but was shown to work relatively well. In [1], JPL reported that "Visual Odometry software has enabled precision drives over distances as long as 8 m on slopes greater than 20 degrees, and has made it possible to safely traverse the loose sandy plains of Meridiani". The algorithm works by tracking features (Harris corners) in a stereo image pair from one frame to the next (frame-to-frame). Thus, the problem is one of determining "the change in position and attitude for two pairs of stereo images by propagating uncertainty in a 3D to 3D pose estimation formulation using maximum likelihood estimation". Evaluation tests conducted at the JPL Marsyard and at Johnson Valley, California, showed that the absolute position errors were less than 2.5% over the 24-meter Marsyard course and less than 1.5% over the 29-meter Johnson Valley course; the rotation error was less than 5.0 degrees in each case. The VME of LAAS/CNRS (Laboratoire d'Analyse et d'Architecture des Systèmes / Centre National de la Recherche Scientifique) is based on the frame-to-frame pixel tracking method. Landmarks are extracted from the images by finding points of interest identified by image intensity gradients. Test results using the Lama rover [2] showed an error of 4% on a 25-meter traverse. After the algorithm was improved in [3], an overall error of 2% was maintained on a 70 m traverse. The survey of the literature [4][5][6] shows that the state of the art in localization has yet to meet the ExoMars localization accuracy requirement of 1% of the traveled distance.

This paper describes MDA's Localization Performance Evaluation Tool (LPET) and the latest VME capabilities. MDA's VME approach uses odometry and a stereo pair to identify visual landmarks. A high-level set of natural visual features, the Scale Invariant Feature Transform (SIFT) [7], is used as the visual landmarks. SIFT features are invariant to image translation, scaling and rotation, and partially invariant to illumination changes and affine projection, making them suitable landmarks for motion estimation. Previous work at MDA has shown that a full simultaneous localization and mapping (SLAM) approach yields good accuracy, but has a high computational burden that results in reduced speed. Conversely, if no map is maintained, a frame-to-frame technique operates with acceptable speed, but at the cost of decreased accuracy. This paper introduces a multi-frame, bounded state space size VME algorithm that attempts to strike a suitable balance between accuracy and speed, and briefly discusses the stability issue of the SLAM problem.
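As an illustration of the kind of landmark extraction assumed by the VME algorithms discussed here, the following Python sketch uses OpenCV's SIFT implementation to extract and match features between two consecutive images. It is not the implementation used in the paper; the file paths and the ratio-test threshold are illustrative assumptions.

    import cv2

    def match_sift_features(prev_image_path, curr_image_path, ratio=0.75):
        """Extract SIFT keypoints in two consecutive images and keep matches
        that pass Lowe's ratio test (illustrative sketch, not the paper's code)."""
        img_prev = cv2.imread(prev_image_path, cv2.IMREAD_GRAYSCALE)
        img_curr = cv2.imread(curr_image_path, cv2.IMREAD_GRAYSCALE)

        sift = cv2.SIFT_create()
        kp_prev, des_prev = sift.detectAndCompute(img_prev, None)
        kp_curr, des_curr = sift.detectAndCompute(img_curr, None)

        # Two nearest neighbours per descriptor, then Lowe's ratio test.
        matcher = cv2.BFMatcher(cv2.NORM_L2)
        candidates = matcher.knnMatch(des_prev, des_curr, k=2)
        good = [pair[0] for pair in candidates
                if len(pair) == 2 and pair[0].distance < ratio * pair[1].distance]
        return kp_prev, kp_curr, good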

2. Localization Performance Evaluation Tool Description
The LPET aims to evaluate the performance of localization algorithms against realistically simulated terrain and sensors. To make the tool realistic, the rover model, the sensor uncertainties, and wheel slippage have been considered.

Given the customer (e.g., ExoMars) requirements, a terrain definition (including rock size, density and distribution), a rover speed and a goal, the tool predicts the goal reachability¹ using Monte Carlo Path Planning and Analysis (MCPPA). The outputs of the MCPPA are the goal reachability for a given terrain and a set of worst-case paths (e.g., paths with many high-curvature segments). The worst-case paths and the localization accuracy requirements are input to the Sensor Tradeoff and Analysis (STA) block. The result of the sensor tradeoff is an appropriate sensor specification (parameters and location) that allows the localization accuracy requirements to be met. The corresponding sensor is simulated and used to extract visual features in the rover's surrounding environment for the VME Algorithm Driver and Tradeoff Study. Finally, the tool predicts the VME algorithm, and its parameters, that provide the best localization accuracy given the sensor and the terrain. A preliminary comparison of the LPET predictions with real rover field test results validated the tool.

The LPET is useful during the study phase of an autonomous rover to evaluate localization algorithms, to assess the goal reachability for a given terrain (e.g., a Mars-like terrain), and to design the sensor specifications and locations. It is also useful during the development phase, when one cannot afford immediate rover implementation, testing and validation. Moreover, the LPET is valuable for VME algorithm debugging and comparison because it makes it possible to replay exactly the same sequences of static input data, which is hardly conceivable with a real rover.

¹ Given a terrain model and an initial state A, a state B is called reachable if there is a sequence [q1, q2, ..., qk] of actions which, when executed from state A, leads to a state in which B holds.

3. LPET Functionalities
The main functionalities of the localization performance evaluation tool are depicted in Figure 1.

Figure 1. LPET Architecture

3.1. Monte Carlo Path Planning and Analysis
The purpose of the MCPPA is to assess the goal reachability and to generate worst-case paths for the sensor tradeoff study. This process is accomplished in three steps (a minimal code sketch of the procedure is given after Figure 2):
1. Create an internal representation of the terrain based on the customer terrain definition in terms of rock size, density and distribution. In practice, the rocks are randomly distributed over the terrain. Typically, the ExoMars terrain definition provides the size and number of rocks per m² and the area of the terrain covered by rocks [16]. To simulate a worst-case terrain, the number of rocks is multiplied by a factor K (K > 1).
2. Select a goal B. Figure 2 illustrates a 2D representation of a terrain generated from the ExoMars terrain definition (K = 1).
3. Choose N random initial states and verify whether a path exists from each initial state to the selected goal B. To simplify the path search, it is assumed that the size and location of the obstacles are known (i.e., the sensor is perfect).
Steps 1 and 2 are repeated several times, and the number of successful paths found is quantified as a percentage of the number of starting locations.

Figure 2. ExoMars Terrain Reference
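The Monte Carlo reachability check described above can be pictured with the minimal sketch below. The grid size, rock count, point-rover assumption, and breadth-first path search are illustrative simplifications of the actual MCPPA, which works from the ExoMars terrain definition and a full path planner.

    import random
    from collections import deque

    def make_terrain(size=100, rock_cells=400, k=1.0, seed=None):
        """Occupancy grid with randomly placed rock cells; K > 1 simulates a
        worst-case terrain by multiplying the rock count (illustrative values)."""
        rng = random.Random(seed)
        grid = [[0] * size for _ in range(size)]
        for _ in range(int(rock_cells * k)):
            grid[rng.randrange(size)][rng.randrange(size)] = 1
        return grid

    def random_free_cell(grid, rng):
        size = len(grid)
        while True:
            x, y = rng.randrange(size), rng.randrange(size)
            if grid[y][x] == 0:
                return (x, y)

    def path_exists(grid, start, goal):
        """4-connected breadth-first search; assumes a perfect obstacle map."""
        size = len(grid)
        seen, frontier = {start}, deque([start])
        while frontier:
            x, y = frontier.popleft()
            if (x, y) == goal:
                return True
            for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                if 0 <= nx < size and 0 <= ny < size and \
                        grid[ny][nx] == 0 and (nx, ny) not in seen:
                    seen.add((nx, ny))
                    frontier.append((nx, ny))
        return False

    def reachability(n_starts=200, k=1.0, seed=0):
        """Percentage of random start states from which the goal is reachable."""
        rng = random.Random(seed)
        grid = make_terrain(k=k, seed=seed)
        goal = random_free_cell(grid, rng)
        hits = sum(path_exists(grid, random_free_cell(grid, rng), goal)
                   for _ in range(n_starts))
        return 100.0 * hits / n_starts

Calling reachability() for several seeds corresponds to repeating steps 1 and 2 above and averaging the resulting success percentages.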

3.2. Sensor Tradeoff Analysis
The goal of the sensor tradeoff is to derive an appropriate sensor specification and configuration from the worst-case paths found in Section 3.1 and from the localization accuracy requirements. The tradeoff optimizes the sensor parameters (field of view, accuracy, resolution, mast height, depression angle, etc.) through static analysis and simulation.
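One ingredient of such a static analysis is the depth uncertainty of a stereo pair as a function of baseline, focal length, and disparity (matching) error. The sketch below evaluates the standard first-order stereo error model over a few candidate baselines; the numerical values are placeholders, not the ExoMars sensor specification or the STA procedure itself.

    def stereo_depth_error(depth_m, baseline_m, focal_px, disparity_std_px):
        """First-order stereo depth uncertainty: dZ ~ Z^2 / (f * B) * sigma_d."""
        return (depth_m ** 2) / (focal_px * baseline_m) * disparity_std_px

    if __name__ == "__main__":
        # Sweep candidate baselines and report the depth error at a few ranges
        # (all parameter values below are illustrative assumptions).
        for baseline in (0.10, 0.15, 0.20, 0.30):
            for depth in (1.0, 2.0, 4.0, 8.0):
                err = stereo_depth_error(depth, baseline,
                                         focal_px=1000.0, disparity_std_px=0.25)
                print(f"B = {baseline:.2f} m, Z = {depth:4.1f} m, dZ = {err * 100:5.2f} cm")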

3.3. Sensor Simulation
The goal of the sensor simulation is to provide input data sets for exhaustive testing of the VME algorithms. By its very nature, the performance of any VME algorithm depends on the visual scene and on the path being executed. Running hundreds of simulated experiments allows a statistically significant performance appraisal to be carried out. The data sets from the simulator must be quickly generated and highly controlled in order to ensure valid comparisons between algorithms during the tradeoff study. The simulation must model all relevant sources of error, with model parameter values traceable to mission requirements or to rover hardware specifications.

The inputs to the simulator are a set of paths and a set of visual landmarks (both outputs of the Monte Carlo Path Planning Tool of Section 3.1), the vision sensor parameters (outputs of the Sensor Tradeoff Analysis of Section 3.2), and the rover parameters (including the size of the rover and its odometry error parameters). The simulation then executes the path and, at each image frame, computes a noisy odometry measurement, a set of noisy landmark observations, and the ground truth pose of the rover. These data are saved to a set of files which can be read by the VME Algorithm Driver (Section 3.4), so that multiple algorithms can be tested using exactly the same input data. The simulator operates in full 3D, so that elevation and pitch/roll changes can be modeled (Figure 3). The input parameters which affect the noise models include:
• Vision sensor placement and field of view
• Vision sensor frame rate
• Vision sensor angular uncertainty, resolution, and stereo baseline
• Probabilities of detection and misclassification in the image processing algorithm
• Width of the rover
• Odometry uncertainty, resolution, and scale factor error
• Wheel slippage
In addition, the user can enforce a large slippage; for example, a five-second slip in which the rover does not move but the odometry measurements continue to advance. A minimal sketch of the per-frame measurement generation is given after Figure 3.

Figure 3. 3D visualization of the simulated terrain (shaded by elevation) and random visual landmarks (black dots).
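The per-frame measurement generation can be sketched as follows. The sketch is reduced to 2D and uses assumed noise values and a simple proportional slip model; the actual simulator operates in full 3D with the parameters listed above.

    import math
    import random

    def simulate_frame(true_pose, commanded_motion, landmarks, rng,
                       odo_std=(0.01, 0.01, 0.002),       # assumed odometry noise (m, m, rad)
                       range_std=0.02, bearing_std=0.005,  # assumed sensor noise (m, rad)
                       slip=0.0):
        """Advance the ground-truth pose, then return (noisy odometry, noisy
        landmark observations, new true pose). 2D illustrative sketch only."""
        x, y, th = true_pose
        dx, dy, dth = commanded_motion

        # Ground truth: slip scales down the actual motion but not the odometry.
        gx = x + (1.0 - slip) * (dx * math.cos(th) - dy * math.sin(th))
        gy = y + (1.0 - slip) * (dx * math.sin(th) + dy * math.cos(th))
        gth = th + (1.0 - slip) * dth

        # Odometry reports the commanded motion corrupted by noise.
        odo = (dx + rng.gauss(0.0, odo_std[0]),
               dy + rng.gauss(0.0, odo_std[1]),
               dth + rng.gauss(0.0, odo_std[2]))

        # Noisy range/bearing observation of each landmark in the rover frame
        # (a real simulator would also apply field-of-view and detection checks).
        obs = []
        for lid, (lx, ly) in landmarks.items():
            r = math.hypot(lx - gx, ly - gy) + rng.gauss(0.0, range_std)
            b = math.atan2(ly - gy, lx - gx) - gth + rng.gauss(0.0, bearing_std)
            obs.append((lid, r, b))
        return odo, obs, (gx, gy, gth)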

3.4. VME Algorithm Driver and Tradeoff Study
The goal of the VME algorithm driver is to read the input data set(s) generated by the Sensor Simulation (Section 3.3) and to determine the localization accuracy of a candidate VME algorithm by comparing its output with the ground truth motion of the simulated rover. The algorithm driver runs in batch mode, adding to a log file as it executes. For each run, the log records:
• The input data set being used
• The VME algorithm parameters (see Section 4)
• The total arc length of the path
• The endpoint localization error
• Algorithm statistics (e.g., average number of new/matched/dropped landmarks, mean and maximum size of the SLAM state vector, etc.)
A script automatically parses the log file to generate performance comparisons over different VME algorithms or parameter values. The performance is averaged over many input data sets in order to give a statistically meaningful appraisal. A sketch of this bookkeeping is given after Figure 4. In addition, the algorithm driver can produce a visualization of the ground truth, odometry, and VME localization results; a typical result is shown in Figure 4.

Figure 4. Visualization of a typical run of the VME algorithm driver, showing the ground truth path, the wheel odometry localization, the VME localization, and a user-induced slip (axes in metres).
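The bookkeeping performed by the algorithm driver can be sketched as follows: accumulate the arc length of the ground-truth path, compute the endpoint error of a candidate estimate, and append a record to the batch log for later parsing. The record fields and the JSON-lines format are assumptions for illustration, not the tool's actual log format.

    import json
    import math

    def endpoint_error_percent(truth_xy, estimate_xy):
        """Endpoint localization error as a percentage of the path arc length."""
        arc = sum(math.dist(a, b) for a, b in zip(truth_xy[:-1], truth_xy[1:]))
        err = math.dist(truth_xy[-1], estimate_xy[-1])
        return 100.0 * err / arc, arc, err

    def log_run(logfile, dataset, params, truth_xy, estimate_xy, stats):
        """Append one run record to the batch log (illustrative format)."""
        pct, arc, err = endpoint_error_percent(truth_xy, estimate_xy)
        record = {"dataset": dataset, "params": params, "arc_length_m": arc,
                  "endpoint_error_m": err, "endpoint_error_pct": pct,
                  "stats": stats}
        with open(logfile, "a") as fh:
            fh.write(json.dumps(record) + "\n")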

4. VME Improvements
The VME algorithms tested are based on the Simultaneous Localization and Mapping (SLAM) formulation.


The SLAM problem is well known in the literature, and its development will not be repeated here; interested readers are referred to [8][9][10]. It is sufficient for the current discussion to note a few key facts:
• The SLAM problem can be formulated as an Extended Kalman Filter (EKF) in which the system state includes both the rover's pose and the locations of a set of landmarks, all with respect to a fixed inertial frame.
• The term relative map describes the locations of the landmarks with respect to each other (i.e., in locally defined relative coordinates). The absolute map is the locations of the landmarks with respect to, and expressed in, the fixed inertial frame.
• In the EKF solution to the SLAM problem, the main computational bottleneck is the inversion of a matrix whose size is proportional to the number of landmarks currently maintained in the system state vector. Many techniques have focused on removing the correlations between landmarks in order to block-diagonalize this matrix and make the inversion faster [11][12]. However, doing so violates one of the conditions for guaranteed convergence of the relative map [10], and has further been shown in our simulations to perform poorly (see Section 5). Other improvements to the implementation have been published [13], with some success. However, the largest gain is realized by reducing the size of the matrix inversion.
It is also important to note that in the planetary exploration context, the rover is continually exploring new terrain; therefore, maintaining an exhaustive list of previously viewed landmarks in the map does not help the rover to localize itself. As such, removing old landmarks from the map saves both memory and computational burden during the matrix inversion. There are two straightforward methods for determining which landmarks should be dropped; these methods are referred to as multiframe and bounded state space size, and are discussed below.
Multiframe SLAM is implemented in the same manner as EKF-SLAM, but with one addition. Each landmark has an associated scalar parameter, m, called the time to live (note that the time to live does not form part of the SLAM state space; it is merely a separate variable in memory). At any time step in which the landmark is observed in the image, its time to live is reset to the tunable value M, which is effectively the number of previous image frames in which to search for a corresponding observation. Landmarks which are not observed in the current image have their time to live decremented by one. When m = 0, the landmark is considered dead, and its corresponding elements are removed from both the SLAM state vector and covariance matrix (a minimal sketch of this bookkeeping follows).
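The time-to-live bookkeeping amounts to the sketch below, in which removing a dead landmark means deleting its rows and columns from the EKF state vector and covariance. The 6-state rover pose and 3-state landmarks are assumptions for illustration, not a description of the paper's implementation.

    import numpy as np

    POSE_DIM = 6       # assumed rover pose states at the head of the state vector
    LANDMARK_DIM = 3   # assumed per-landmark state size (x, y, z)

    def multiframe_prune(x, P, ids, ttl, observed_ids, M):
        """Reset the time to live of observed landmarks, decrement the rest,
        and remove dead landmarks (ttl == 0) from the state x and covariance P."""
        for lid in ids:
            ttl[lid] = M if lid in observed_ids else ttl[lid] - 1

        keep = [lid for lid in ids if ttl[lid] > 0]
        idx = list(range(POSE_DIM))                 # rover pose is always kept
        for lid in keep:
            base = POSE_DIM + ids.index(lid) * LANDMARK_DIM
            idx.extend(range(base, base + LANDMARK_DIM))

        x_new = x[idx]
        P_new = P[np.ix_(idx, idx)]
        ttl_new = {lid: ttl[lid] for lid in keep}
        return x_new, P_new, keep, ttl_new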

Note that a landmark can survive for more than M image frames if it is continually observed; in other words, the measurement information can be maintained (via the recursive update) for longer than M frames. Note also that frame-to-frame VME can be considered the special case M = 1.
The bounded state space variation on the SLAM algorithm is similar in motivation to the multiframe variation, in that it attempts to decrease the computational load while maintaining accurate performance. Whereas the multiframe version applies a temporal constraint which results in an effective limiting of the size of the state space, this version applies a constraint on the size of the state space which results in an effective temporal pruning of landmarks. The algorithm is parameterized by N, the maximum number of landmarks which can be carried in the SLAM state vector. The landmark state vector can therefore be considered a first-in-first-out queue of fixed size. Upon its first observation, a landmark is added to a waiting pool, where it is considered for potential matching in future image frames. Once the landmark is matched between two frames, it is pushed into the queue. If the queue is already full, the oldest landmark is popped out of the queue. A diagram of the process is shown in Figure 5, and a code sketch of the queue follows. Note that the queue could also be re-sorted such that landmarks with the shortest time to live, or with the least confidence from the image extraction, are popped first.
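The fixed-size queue can be sketched as below. The waiting pool, the promotion after a second match, and the eviction of the oldest landmark follow the description above; the class and method names are illustrative only, and the evicted identifier would be used to delete the corresponding entries from the EKF state.

    from collections import deque

    class BoundedLandmarkMap:
        """First-in-first-out landmark queue of capacity N (illustrative sketch)."""

        def __init__(self, capacity):
            self.capacity = capacity
            self.queue = deque()     # landmark ids currently in the SLAM state
            self.waiting = set()     # ids seen once, awaiting a second match

        def observe(self, landmark_id):
            """Return the id of an evicted landmark (to be removed from the EKF
            state), or None if nothing needs to be dropped."""
            if landmark_id in self.queue:
                return None                       # already tracked in the state
            if landmark_id not in self.waiting:
                self.waiting.add(landmark_id)     # first sighting: wait for a match
                return None
            # Second sighting: promote the landmark into the SLAM state vector.
            self.waiting.discard(landmark_id)
            evicted = None
            if len(self.queue) >= self.capacity:
                evicted = self.queue.popleft()    # oldest landmark is popped out
            self.queue.append(landmark_id)
            return evicted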

Figure 5. Diagram of the fixed-size queue used in the bounded state space size variant of SLAM.

It should be pointed out that the typical formulation of the SLAM problem (e.g., [10]) is unobservable [14][15]. This means that, while the relative map is guaranteed to converge, the errors in the absolute map grow unbounded over time (see Figure 6). Simulation results have shown that, under reasonable amounts of slippage and odometry error, the 1% (absolute) localization accuracy requirement can never be met with an unobservable formulation of SLAM. To make the problem observable, the rover, while stationary, repeatedly measures one or more landmarks to map them with respect to its initial frame of reference. Once the traverse begins, these landmarks are used to "anchor" the relative map to the absolute frame. Enforcing observability allows the estimator to drive the absolute errors toward zero, within an error band defined by the process and measurement noise in the system.

Figure 6. In the unobservable formulation of SLAM, the estimated relative map (grey, right) converges to the true relative map (black, left), but the absolute location of the map and rover with respect to the inertial frame does not converge.


Figure 7. Typical example of localization errors as a function of time. The unobservable formulation's error grows with time, whereas that of the observable formulation remains bounded.
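The anchoring step can be pictured as follows: while the rover is stationary, a few landmarks are observed repeatedly, and their positions in the initial (absolute) frame are inserted into the SLAM state with a very small covariance, tying the relative map to that frame. The simple averaging of repeated fixes and the covariance value below are assumptions for illustration; the paper does not give the exact procedure.

    import numpy as np

    def anchor_landmarks(x, P, anchor_obs, anchor_cov=1e-6):
        """Append anchor landmarks to the state x and covariance P.
        anchor_obs maps landmark id -> list of repeated 3D fixes taken while the
        rover is stationary, expressed in the initial (absolute) frame."""
        anchored_ids = []
        for lid, samples in anchor_obs.items():
            mean = np.mean(np.asarray(samples, dtype=float), axis=0)
            x = np.concatenate([x, mean])           # grow the state vector
            n_old = P.shape[0]
            P_grown = np.zeros((n_old + 3, n_old + 3))
            P_grown[:n_old, :n_old] = P
            # Near-zero uncertainty ties these landmarks to the absolute frame.
            P_grown[n_old:, n_old:] = anchor_cov * np.eye(3)
            P = P_grown
            anchored_ids.append(lid)
        return x, P, anchored_ids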


Figure 8 illustrates the overall technical approach taken to improve the localization of the MDA rover testbed. Note that improvements have been made to both the exteroceptive and proprioceptive localization schemes, namely the VME and the IMU-fused 3D odometry, respectively.

Figure 8. The technical approach used to advance the rover testbed localization capabilities.


5. Experimental Results
The simulation was used to generate 100 input data sets based on the ExoMars terrain specifications and on the outputs of the sensor tradeoff study. In addition to the stochastic slippage model, each run also included an artificially induced five-second slip. These data sets were used to examine the performance of the full SLAM, multiframe, and bounded state space size VME algorithms. The results for a rover speed of 2 cm per frame are shown in Figures 9 and 10. The ExoMars requirement of 1% can be met using the observable formulation of SLAM. The observable formulation also consistently outperforms both the coupled and decoupled SLAM formulations. Further, Figure 10 shows that the multiframe algorithm yields slightly better performance than the bounded state space size algorithm given equivalent computational resources.


Figure 9. Results of the algorithm tradeoff (mean % error). Top: multiframe SLAM VME vs. number of image frames. Bottom: bounded state space size SLAM VME vs. maximum number of landmarks. Dashed lines indicate the performance using all images and landmarks. In both cases, the decoupled version performed worst, followed by the coupled but unobservable formulation. The ExoMars requirement of 1% can only be met using the observable formulation.


Figure 10. Comparison of the observable multiframe and bounded state space size algorithms as a function of the maximum state space size (mean % error).

6. Conclusion and Future Work
This paper has presented a Localization Performance Evaluation Tool that predicts the VME algorithm, and its parameters, that provide the best localization accuracy for a given terrain. The tool can be used during the study phase and/or the development phase of an autonomous rover. Several variations of the SLAM algorithm for planetary exploration rovers have been implemented, including the frame-to-frame, multi-frame, and bounded state space size SLAM VME algorithms. The coupled, observable, multi-frame VME, which exploits the advantages of the above algorithms, has been presented. The simulation results obtained with the LPET show that the new VME approach is consistently more accurate than the other algorithms. Future work includes implementing the new VME algorithm on the rover for field trials, extensively comparing the field test results with the LPET simulation results, and adjusting the tool's parameters based on that comparison.

7. Acknowledgment
The authors would like to thank Giri Pushpanathan and Pritam Sarkar from MDA for their contributions to the SLAM implementation, rover setup and field tests. We would also like to thank Erick Dupuis from the Canadian Space Agency for his valuable comments on this work. Thanks also to Professor Joshua Marshall of Carleton University in Ottawa for his contributions and useful discussions.

8. References
[1] M. Maimone, Y. Cheng, and L. Matthies, "Two years of visual odometry on the Mars Exploration Rovers", Journal of Field Robotics, 24(3): 169-186, 2007.
[2] S. Lacroix, A. Mallet, D. Bonnafous, G. Bauzil, S. Fleury, M. Herrb, and R. Chatila, "Autonomous rover navigation on unknown terrains: Demonstrations in the Space Museum Cité de l'Espace at Toulouse", in Proceedings of the 7th International Symposium on Experimental Robotics, Honolulu, HI, USA, 2000.
[3] S. Lacroix, A. Mallet, D. Bonnafous, G. Bauzil, S. Fleury, M. Herrb, and R. Chatila, "Autonomous rover navigation on unknown terrains: Functions and integration", International Journal of Robotics Research, 21(10-11): 917-942, 2002.
[4] P.I. Corke, D. Strelow, and S. Singh, "Omnidirectional visual odometry for a planetary rover", in Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2004.
[5] J.J. Biesiadecki et al., "Mars Exploration Rover surface operations: Driving Opportunity at Meridiani Planum", in Proceedings of the IEEE Conference on Systems, Man and Cybernetics, The Big Island, Hawaii, USA, October 2005.
[6] S. Se, T. Barfoot, and P. Jasiobedzki, "Visual motion estimation and terrain modeling for planetary rovers", in Proceedings of iSAIRAS 2005, Munich, Germany, 2005.
[7] D.G. Lowe, "Distinctive image features from scale-invariant keypoints", International Journal of Computer Vision, 60(2): 91-110, 2004.
[8] H. Durrant-Whyte and T. Bailey, "Simultaneous localization and mapping: Part I", IEEE Robotics and Automation Magazine, 13(2): 99-110, 2006.
[9] T. Bailey and H. Durrant-Whyte, "Simultaneous localization and mapping: Part II", IEEE Robotics and Automation Magazine, 13(3): 108-119, 2006.
[10] M.W.M.G. Dissanayake, P. Newman, S. Clark, H.F. Durrant-Whyte, and M. Csorba, "A solution to the simultaneous localization and map building (SLAM) problem", IEEE Transactions on Robotics and Automation, 17(3): 229-241, 2001.
[11] J. Neira, J.D. Tardos, and J.A. Castellanos, "Linear time vehicle relocation in SLAM", in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), 1: 427-433, 2003.
[12] M. Montemerlo, "FastSLAM: A Factored Solution to the Simultaneous Localization and Mapping Problem with Unknown Data Association", PhD thesis, Carnegie Mellon University, 2003.
[13] J.E. Guivant and E.M. Nebot, "Optimization of the simultaneous localization and map-building algorithm for real-time implementation", IEEE Transactions on Robotics and Automation, 17(3): 242-257, 2001.
[14] J. Andrade-Cetto and A. Sanfeliu, "The effects of partial observability in SLAM", in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), 1: 397-402, 2004.
[15] K.W. Lee, W.S. Wijesoma, and J. Ibanez-Guzman, "On the observability and observability analysis of SLAM", Proceedings of IEEE ICRA, 1: 3569-3574, 2006.
[16] ESA, "ExoMars Phase B1: Mission and System Requirements", EXM-MS-RS-ESA-00001, ESTEC, Noordwijk, The Netherlands, 2005.