3D registration of laser range scenes by coincidence of coarse binary cubes


Citation Information: Martínez, J.L., Reina, A.J., Mandow, A., Morales, J. 3D registration of laser range scenes by coincidence of coarse binary cubes (2012) Machine Vision and Applications, 23 (5), pp. 857-867. doi: 10.1007/s00138-011-0387-z

@ARTICLE{Martinez2012MVA,
  author  = {Mart\'{\i}nez, J.L. and Reina, A.J. and Mandow, A. and Morales, J.},
  title   = {3D registration of laser range scenes by coincidence of coarse binary cubes},
  journal = {Machine Vision and Applications},
  year    = {2012},
  volume  = {23},
  number  = {5},
  pages   = {857--867},
}

__________________________________________________________________________________________ © 2012, the Authors. This document is a self-archiving PREPRINT of a copyrighted publication. The final publication is available at www.springerlink.com Please find the published article at: http://dx.doi.org/10.1007/s00138-011-0387-z

Machine Vision and Applications, doi:10.1007/s00138-011-0387-z

3D Registration of Laser Range Scenes by Coincidence of Coarse Binary Cubes

Jorge L. Martínez · Antonio J. Reina · Anthony Mandow · Jesús Morales

PREPRINT VERSION. The final publication is available at www.springerlink.com

Abstract Scene registration of a pair of three dimensional (3D) range images is a 6D optimization problem usually required in mobile robotics. This problem is different from object registration, since all scan directions and depths may contain relevant data, and because farther regions are sampled with lower densities. The paper proposes an efficient scene matching method based on the concept of coarse binary cubes. An integer objective function is defined as the number of coincident cubes between both scans. This is a metric of the degree of overlap that does not employ point distances. Its value is obtained without actually using any 3D grid data structure, with a computational complexity of order O(n), where n represents the number of laser points. This objective function is optimized with a globalized version of the downhill Simplex algorithm to avoid local minima. Experimental results are presented from indoor and outdoor environments with different degrees of structuring. The effect of cube size and the number of vertices on registration performance has been analyzed. Besides, experiments show that the proposed method achieves similar accuracy as Iterative Closest Points (ICP) and Normal Distribution Transform (NDT), while it improves both computation time and robustness against initial misalignments.

Keywords 3D measurement system · Laser ranging · Scene registration · Point matching · Mobile robotics

Departamento de Ingeniería de Sistemas y Automática, E.T.S. Ingenieros Industriales, Universidad de Málaga, 29071 Málaga, Spain. Tel.: (+34) 951 952 322. E-mail: [email protected]

1 Introduction

Three dimensional (3D) laser range-finders are increasingly finding their way into new fields that demand accurate sampling of complex sites [7] and outdoor environments [1]. These sensors have also been proposed for scene perception and modeling in mobile robot applications such as simultaneous localization and mapping [5], planetary exploration [25], and surveillance [2]. Typically, 3D data processing involves registration, i.e., aligning two partially overlapping point clouds from subsequent scans in real time [23]. This is a highly nonlinear problem with no analytical solution that is commonly solved around an initial estimation [32].

Scenes pose challenging differences with respect to the widely studied case of object registration [22]. Instead of different views of a single object from a similar distance, registration has to deal with different perspectives of an environment, so all scan directions and depths in the scene may contain relevant information [18]. Moreover, scene dimensions can surpass the maximum sensor range, so areas can emerge or vanish from successive scans. In addition, object modeling sensors usually provide Cartesian or cylindrical coordinates, whereas most scanners in mobile robots obtain spherical coordinates by composition of two rotations of the laser beam [17].

Using raw range data (i.e., point clouds) for scene registration [19][5][16] is a more general approach than relying on a priori geometric features, which can be expected in urban [10] or indoor areas [24] but not so much in less structured environments. Point cloud matching is an optimization problem because exact correspondence is actually impossible due to spurious ranges, random noise, mixed pixels, occlusions, and sensor discretization. The objective functions optimized for


point cloud matching methods, such as the Iterative Closest Point (ICP) algorithm [19] or the Levenberg-Marquardt method [5], are related to the distance between matched points. These functions pose three general limitations: the computational load of finding point correspondences, the need for a good initial guess to avoid falling into local minima, and the systematic bias introduced by pairing boundary points between non-overlapping scans [16]. Besides, point distance optimization has specific drawbacks for spherical coordinate scene scans. First, the search is dominated by the enormous amount of data from the closest areas, such as the ground in terrestrial vehicles, which can distort registration. Furthermore, the threshold employed to discard wrong point matches usually throws away information from sparse distant ranges. Yet appropriate matching of farther points can be valuable to improve the accuracy of orientation estimations [31]. This is why far points have traditionally been considered as landmarks for triangulation in unstructured environments [29].

Occupancy grids have been widely used to represent maps of the environment in mobile robotics [6]. For mapping, each cell does not convey explicit sensor data but probabilistic occupancy information based on a history of multiple sensor readings [30]. Since a complete regular grid is not an efficient data structure, compressed 1D representations, such as occupied voxel lists [26], have been proposed.

Furthermore, several works have also proposed the grid concept as a tool for pairwise registration of 3D point clouds. Thus, some registration methods have used 3D grids to speed up point distance optimization. In [28][19], the ICP nearest neighborhood search considers the center points of octree cells instead of the raw point set of the first scan.
In the Normal Distributions Transform (NDT) [16], a Gaussian probability distribution calculated from all the points within each coarse cell allows a fast gradient-based registration scheme.

Grids have also been used to define objective functions that avoid point distance metrics. In this way, [11] defines the fitness value for 2D scan matching with Genetic Algorithms as the point count from the second scan with correspondences to a binary grid of the first scan, assuming that points are subsampled to prevent superposition of denser areas. For the 3D case, [21] proposes maximizing the number of overlapping coarse cubes between two octrees in an exhaustive discrete search in order to obtain an adequate initial estimation for ICP. In [18], coarse binary cubes are also employed as a common frame to match a pair of scans. An integer objective function is defined as the number of coincident cubes. This value is computed from the intersection of 1D vectors, so it does not actually implement a 3D data structure. This scene registration method provides a matching accuracy similar to ICP with increased robustness to initial misalignments.

This paper improves the work presented in [18] in the following ways:

– The evaluation of a prospective solution is reduced to order O(n), where n represents the number of laser points.
– The integer objective function is maximized with a globalized version of the Nelder-Mead algorithm to avoid local minima.
– Performance in real environments with different degrees of structuring is compared with two standard methods in mobile robotics [15]: ICP and NDT.

The paper is organized as follows. The next section describes the proposed scan matching strategy for scene registration. Experimental results are presented in Sect. 3. Conclusions and future work are outlined in Sect. 4. Finally, acknowledgments and references complete the paper.

2 Scan matching with binary cubes

2.1 Evaluation of the integer objective function

In mobile robotics, a 3D range scan is usually obtained from a laser beam subject to two rotations, which yields a spherical scan [17]. Let us define a sensor Cartesian frame such that φ and ψ are expressed as rotations around the Z and X axes, respectively, and the Y axis corresponds to the direction defined by φ = 90° and ψ = 0° (see Fig. 1). Then, assuming that the rotations are performed as a composite sequence, φ before ψ, the Cartesian coordinates [x, y, z]^T of a range given in spherical coordinates [r, φ, ψ] can be computed as

[x, y, z]^T = [Cφ r, Cψ Sφ r, Sψ Sφ r]^T,   (1)

where S and C denote the sin() and cos() functions, respectively. Let us assume that a second scan is obtained after a translation [x0, y0, z0]^T and a rotation [α, β, γ] (with the Roll-Pitch-Yaw convention). Then, the points of the second scan [x*, y*, z*]^T, as obtained from (1), can be

Figure 1 Spherical coordinates of a laser point

projected onto the local frame of the first scan as

[x]   [x0]   [ CαCβ   CαSβSγ − SαCγ   SαSγ + CαSβCγ ] [x*]
[y] = [y0] + [ SαCβ   CαCγ + SαSβSγ   SαSβCγ − CαSγ ] [y*] ,   (2)
[z]   [z0]   [ −Sβ    CβSγ            CβCγ          ] [z*]

The proposed integer objective function J is based on the concept of 3D binary cubes. A 3D spatial grid is composed of cubes of the same edge length E, whose Boolean value is set to one if any Cartesian point of a scan falls within its limits. Evaluation of J does not actually require a 3D data structure, such as a matrix or an octree. In fact, the value of J can be easily calculated with one binary vector V and one unsorted integer list L:

– The length v of V is given by the axis-aligned minimum bounding box for the actual readings of the first scan, whose Cartesian coordinates are (xmax, xmin, ymax, ymin, zmax, zmin). Then

  v = Ixmax Iymax Izmax,   (3)

  where

  Ixmax = round((xmax − xmin)/E),
  Iymax = round((ymax − ymin)/E),   (4)
  Izmax = round((zmax − zmin)/E).

  Each cube has a unique integer index in V. The index associated to a Cartesian scan point [x, y, z]^T can be obtained as

  I = Ix + Iy Ixmax + Iz Ixmax Iymax,   (5)

  where Ix, Iy, Iz are the integer grid coordinates

  Ix = round((x − xmin)/E),
  Iy = round((y − ymin)/E),   (6)
  Iz = round((z − zmin)/E).

– The list L is a non-sparse data structure that contains the same information as V. The length ℓ of L corresponds to the number of cubes set to one, which is always less than or equal to the number n of points from a 3D scan. Note that with coarse cubes, the relation ℓ ≪ n ≪ v holds.

In the registration process, the first scan is only processed once in order to build V and L. Initially, V is created as a zero vector, and L is empty. For each point of the first scan (1), its index I is computed from (5)-(6). If V(I) = 0, then V(I) is set to one, and I is inserted into L; otherwise, no action is taken, to avoid repetitions in the list.

The objective function J, i.e., the number of coincident cubes, has to be evaluated for each prospective solution [x0, y0, z0, α, β, γ] in the optimization process. Prior to each evaluation, J is initialized to zero. Next, for each point of the second scan, its projection is obtained from (2). If its coordinates fall within the minimum bounding box of the first scan, the corresponding index I is computed from (5)-(6). In this case, if V(I) = 1, then V(I) is set to zero and J is incremented by one. Once J has been obtained for a prospective solution, V can be recovered from L for a new evaluation.
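The whole evaluation pipeline of eqs. (1)-(6) can be sketched in a few dozen lines. This is a minimal illustration, not the authors' implementation: the `BinaryCubes` name is hypothetical, V is stored one byte per cube for clarity (the paper packs it into bits), and a `+1` guard is added to the grid extents of (4) to cover rounding at the upper boundary.

```python
import math

def spherical_to_cartesian(r, phi, psi):
    """Eq. (1): range r with rotations phi (about Z) and psi (about X), in radians."""
    return (math.cos(phi) * r,
            math.cos(psi) * math.sin(phi) * r,
            math.sin(psi) * math.sin(phi) * r)

def rpy_matrix(a, b, g):
    """Roll-Pitch-Yaw rotation matrix of eq. (2)."""
    Ca, Sa = math.cos(a), math.sin(a)
    Cb, Sb = math.cos(b), math.sin(b)
    Cg, Sg = math.cos(g), math.sin(g)
    return ((Ca * Cb, Ca * Sb * Sg - Sa * Cg, Sa * Sg + Ca * Sb * Cg),
            (Sa * Cb, Ca * Cg + Sa * Sb * Sg, Sa * Sb * Cg - Ca * Sg),
            (-Sb, Cb * Sg, Cb * Cg))

class BinaryCubes:
    """Binary vector V plus unsorted index list L built from the first scan."""

    def __init__(self, points, E):
        xs, ys, zs = zip(*points)
        self.E = E
        self.mins = (min(xs), min(ys), min(zs))
        # Eq. (4), with +1 to guard the rounding at the upper boundary.
        self.nx = round((max(xs) - min(xs)) / E) + 1
        self.ny = round((max(ys) - min(ys)) / E) + 1
        self.nz = round((max(zs) - min(zs)) / E) + 1
        self.V = bytearray(self.nx * self.ny * self.nz)  # one byte per cube
        self.L = []
        for p in points:
            i = self.index(p)
            if not self.V[i]:
                self.V[i] = 1
                self.L.append(i)  # avoid repetitions in the list

    def index(self, p):
        """Eqs. (5)-(6): unique index of the cube containing p, or -1 if outside."""
        ix = round((p[0] - self.mins[0]) / self.E)
        iy = round((p[1] - self.mins[1]) / self.E)
        iz = round((p[2] - self.mins[2]) / self.E)
        if 0 <= ix < self.nx and 0 <= iy < self.ny and 0 <= iz < self.nz:
            return ix + iy * self.nx + iz * self.nx * self.ny
        return -1

    def coincidences(self, points2, pose):
        """Objective J: coincident cubes for the second scan under a 6D pose."""
        x0, y0, z0, a, b, g = pose
        R = rpy_matrix(a, b, g)
        J = 0
        for (x, y, z) in points2:
            # Eq. (2): project a second-scan point onto the first-scan frame.
            q = (x0 + R[0][0] * x + R[0][1] * y + R[0][2] * z,
                 y0 + R[1][0] * x + R[1][1] * y + R[1][2] * z,
                 z0 + R[2][0] * x + R[2][1] * y + R[2][2] * z)
            i = self.index(q)
            if i >= 0 and self.V[i]:
                self.V[i] = 0  # clear so each coincident cube is counted once
                J += 1
        for i in self.L:       # recover V from L for the next evaluation
            self.V[i] = 1
        return J
```

Building V and L takes one pass over the first scan; each call to `coincidences` takes one pass over the second scan plus a restore pass over L, which is the O(n) cost derived in Sect. 2.2.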

2.2 Computational effort

Memory requirements are of order O(v), because the vector V is the largest data structure. Let R be the maximum range of the laser sensor. Memory requirements can also be expressed as O((R/E)³), which means that they have a cubic growth with smaller edge lengths E. In any case, the memory cost is affordable because the state of each cube is represented by a single bit.

Regarding computational time, filling the data structures requires one step for each of the n points in the first scan. After that, counting coincident cubes also needs n steps corresponding to the points of the second scan. Finally, recovering vector V from L needs ℓ steps. Therefore, as ℓ ≤ n, the evaluation of the objective function is O(n). Contrary to memory requirements, running time is independent of E. In the solution provided in [18], the computational time cost was O(v) due to the intersection of the binary vectors calculated for both scans. This has been improved by taking into account the sparsity of V with the unsorted list L.

Note that the evaluation of objective functions for point distance minimization methods is dominated by the calculation of the closest points. In Cartesian coordinates, corresponding points can be calculated exhaustively with a time cost of O(n²) [3]. This can be improved by using k-d trees to take time O(n log(n)) [32].


Only if the search for the closest points is implemented in polar coordinates by inverse calibration, complexity can be reduced to O(n) [4].
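As a numerical sanity check of the O((R/E)³) memory growth, the following sketch evaluates v for the maximum scan volume of the sensor described later in Sect. 3.1 (a 162 m × 81 m × 81 m box); these are worst-case figures, larger than the per-scene values of v reported in Table 2.

```python
def v_bits(dx, dy, dz, E):
    """Length v of the binary vector V for a dx x dy x dz bounding box, in bits (eqs. (3)-(4))."""
    return round(dx / E) * round(dy / E) * round(dz / E)

# Maximum scan volume: X in [-81, 81] m, Y in [0, 81] m, Z in [-40.5, 40.5] m.
# Halving E multiplies the memory cost by 8 (cubic growth).
for E in (1.0, 0.5, 0.25):
    print(E, v_bits(162.0, 81.0, 81.0, E) / 1e6, "Mbits")
```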


2.3 Globalized Nelder-Mead optimization

The previous subsections have proposed an integer objective function with evaluation time O(n) that can be used to search for a transformation in the 6D space. This function is appropriate for optimization techniques that do not require explicit knowledge of the gradient. Some derivative-free techniques, such as genetic algorithms [12] or simulated annealing [4], require a great number of objective function evaluations. The downhill Simplex method by Nelder and Mead [20] provides a simpler heuristic and has been applied for object registration with a reduced number of feature points [9].

The Nelder-Mead method uses the simplex concept. In 3D registration, the simplex is a set of 7 vertices that envelope the continuous 6D search space. These vertices are initialized randomly as tentative solutions around the initial estimation. Then, the objective function is evaluated for each vertex so that the worst vertex is substituted by a new one, generated by testing operations such as reflection, expansion, or contraction through the centroid of the remaining vertices. The new vertex has to improve on the value of the second-worst vertex; if this does not happen, then the entire simplex is shrunk towards the best vertex. Generally, the objective function is not evaluated more than twice in each iteration.

The Nelder-Mead algorithm is a deterministic method where the simplex can eventually get stuck on local optima. To avoid this, a Globalized Nelder-Mead Method (GNMM) is proposed in this work, with two major modifications of the search heuristics. First, restarting the simplex is a common way to make the search stochastic [13][33]. In particular, GNMM avoids shrinking the entire simplex towards the best vertex; instead, it keeps the best vertex and randomly restarts the rest around the initial estimation. Second, evolution of a large population with the Nelder-Mead algorithm has been proposed to improve solution quality [14]. Based on this idea, instead of a simplex, a larger set of vertices is considered in GNMM. Both modifications slow down the search, but they avoid premature convergence to local optima.
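A restart-based variant in this spirit can be sketched as follows. This is an illustrative simplification, not the paper's exact GNMM: the `gnmm` name, the acceptance rules, and the reflection/expansion/contraction coefficients are choices of this sketch. It maximizes J over a population of `n_vertices` tentative solutions and, when all trial operations fail, restarts every vertex but the best one around the initial estimation instead of shrinking.

```python
import random

def gnmm(J, x0, spread, n_vertices=15, iters=500, seed=0):
    """Globalized Nelder-Mead sketch: maximize J starting from estimation x0,
    with restart vertices drawn uniformly within +/- spread of x0."""
    rng = random.Random(seed)
    dim = len(x0)

    def rand_vertex():
        return [xk + rng.uniform(-s, s) for xk, s in zip(x0, spread)]

    # Population of tentative solutions around the initial estimation.
    P = [(list(x0), J(x0))]
    P += [(v, J(v)) for v in (rand_vertex() for _ in range(n_vertices - 1))]
    for _ in range(iters):
        P.sort(key=lambda t: t[1], reverse=True)       # best vertex first
        w, f_sw = P[-1][0], P[-2][1]                   # worst vertex, second-worst value
        c = [sum(v[k] for v, _ in P[:-1]) / (len(P) - 1) for k in range(dim)]
        refl = [2 * ck - wk for ck, wk in zip(c, w)]   # reflect worst through centroid
        f_refl = J(refl)
        if f_refl > P[0][1]:                           # beats the best: try expansion
            exp = [3 * ck - 2 * wk for ck, wk in zip(c, w)]
            f_exp = J(exp)
            P[-1] = (exp, f_exp) if f_exp > f_refl else (refl, f_refl)
        elif f_refl > f_sw:                            # improves on the second worst
            P[-1] = (refl, f_refl)
        else:
            contr = [0.5 * (ck + wk) for ck, wk in zip(c, w)]  # contraction
            f_contr = J(contr)
            if f_contr > f_sw:
                P[-1] = (contr, f_contr)
            else:
                # GNMM twist: keep the best vertex, restart the rest around x0.
                P = [P[0]] + [(v, J(v)) for v in (rand_vertex() for _ in range(n_vertices - 1))]
    P.sort(key=lambda t: t[1], reverse=True)
    return P[0]
```

For the registration problem, J would be the coincident-cube count of Sect. 2.1 evaluated at a 6D pose, with `spread` matching the expected initial misalignment.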


Figure 2 The DGPS/IMU-aided 3D scanning system

3 Experiments

3.1 DGPS/IMU-aided 3D Scanner system

A customized DGPS/IMU-aided 3D scanning system (see Fig. 2) has been employed in the experiments. A webcam has been added for documentation purposes. The Inertial Measurement Unit (IMU) combines accelerometers, gyroscopes and magnetometers. The static accuracy of the unit is ±0.75° for roll and pitch angles, and ±2° for yaw. The DGPS is a dual-frequency Real-Time Kinematic (RTK) receiver that accepts differential corrections from a local base station via a radio link. With good sky visibility, it provides horizontal and vertical positioning errors around 1.5 cm and 2 cm, respectively.

This 3D scanner device has been constructed by adding an extra degree of freedom to a commercial 2D time-of-flight rangefinder. The 2D rangefinder provides up to 81 m range, with ±4 cm range error for ranges of 20 m. Its field of view is 180°, with a horizontal angular resolution of Δφ = 1/2°. The extra degree of freedom consists of a mechanical articulation driven by a


stepper motor. Its rotation axis has been made to coincide with the rangefinder's X axis through a holed cogwheel. The vertical angular resolution of a 3D scan is Δψ = 1/3°, with a 60° field of view. Thus, the total number of ranges is n = 361 × 181 = 65341. The maximum scan volume is contained by a rectangular prism defined by X = [−81 m, 81 m], Y = [0 m, 81 m], Z = [−40.5 m, 40.5 m] on the local sensor axes.
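These figures determine the scan size directly (a quick sanity check, assuming both endpoints of each angular sweep are sampled):

```python
# 180 deg horizontal FOV at steps of 1/2 deg; 60 deg vertical FOV at steps of 1/3 deg.
h_samples = round(180 / 0.5) + 1      # 361 bearings per 2D sweep
v_samples = round(60 / (1 / 3)) + 1   # 181 elevation steps
print(h_samples * v_samples)          # 65341 ranges per 3D scan
```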

3.2 Case study scenes

The results presented hereunder correspond to an indoor scene and three outdoor sites (an urban terrace, a park, and a natural environment). These sites, with decreasing degrees of structuring, offer representative case studies of environment types in mobile robotics applications. Panoramic photographs of these environments are shown in Fig. 3. Next to each photograph, the scenes are depicted in range matrix format as recorded by the laser scanner. 3D data are represented in gray-scale, with closer ranges having a darker gray level. A logarithmic scale has been used so that range details in the foreground can be distinguished in the figure. For illustration purposes, the pixel aspect ratio of all the range images has been set to 3:2 on account of the sensor resolutions Δφ and Δψ. White points in the range images correspond to the maximum range of the sensor. This indicates either that objects are beyond the sensor limits (e.g., the sky in outdoor scenes) or that the range is erroneous (as in certain surfaces in the indoor and urban scenes). Therefore, these ranges are not employed for registration.

For each scene, a pair of scans has been obtained to resemble the forward movement of the sensor on a mobile robot. A ground truth estimation (see Table 1) has been adjusted manually by aligning salient features [8], starting from the DGPS and INS estimation. For this purpose, cardboard boxes have been intentionally introduced in the natural environment.

An illustration of the coarse binary cube concept is presented in Fig. 4 for the urban terrace. The point cloud of the first scan is shown in Fig. 4(a), where it can be observed that areas closer to the sensor (i.e., with coordinates near x = y = z = 0 m) are sampled with higher density than farther areas. Coarse binary cubes avoid this problem, as seen in Fig. 4(b). In this sense, it can be considered that this process subsamples (i.e., filters) the original data. Fig. 5 illustrates cube coincidences when the second scan of the urban terrace is transformed according to the ground truth. It can be observed that most of the cubes overlap, even though others are only visible from the first or the second point of view.

3.3 Registration results

The proposed registration method has been analyzed through multiple experiments in each of the four environment types. Results have been obtained from a PC with an Intel Core i7-2720QM (2.20 GHz, 6 MBytes cache, Quad Core) and 8 GBytes of RAM (1.33 GHz) under the Linux operating system. Each row of Tables 2, 3 and 4 offers the average values for 25 different matches. For each scene, different experiments have been generated by adding random pose errors around the ground truth, which emulates mobile robot localization errors.

The effect of cube edge size E is evaluated in Table 2. As expected, the vector size v grows with smaller values of E. It can also be observed that v is reduced for the indoor case, whose readings fit in a smaller minimum bounding box. The number of cubes set to one, ℓ, also increases with smaller cubes, but not proportionally to v. The value of J for the ground truth transformation grows with a smaller E up to a certain value, but it can decrease if E is too small (e.g., E = 0.1 m for the park scene).

GNMM has been executed with 15 vertices and 500 iterations. Initial estimations have been obtained by adding random errors within ±1 m for translations and ±8° for rotations to the ground truth. Table 2 shows the average value of the optimized objective function J. It can be observed that this value is very similar to the one obtained for the estimated ground truth, with the exception of the smallest cube size E = 0.1 m, where GNMM fails to find an adequate solution due to the size of the search space, as ℓ approaches n = 65341. Interestingly, the average value of J for coarse cubes is even slightly better than for the ground truth, which can be explained by the limited accuracy of manual ground truth estimation.

Let Δx, Δy, Δz, Δα, Δβ, Δγ be the errors between the estimated pose and the ground truth. Then, two accuracy measures [27] have been considered:

Es = √(Δx² + Δy² + Δz²),   (7)

Ea = √(Δα² + Δβ² + Δγ²),   (8)

where Es represents the Euclidean distance error, and Ea represents the angular error of the three rotations. The most accurate results are usually obtained for cube edges of 0.3 m or 0.5 m. By using coarser cubes, matching accuracy is gradually degraded.
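In code, the two accuracy measures amount to the following trivial helper (the pose layout [x0, y0, z0, α, β, γ] follows Sect. 2.1):

```python
import math

def registration_errors(estimate, ground_truth):
    """Eqs. (7)-(8): Euclidean error Es and angular error Ea between two poses."""
    d = [e - g for e, g in zip(estimate, ground_truth)]
    Es = math.sqrt(d[0] ** 2 + d[1] ** 2 + d[2] ** 2)
    Ea = math.sqrt(d[3] ** 2 + d[4] ** 2 + d[5] ** 2)
    return Es, Ea
```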


Figure 3 Panoramic photographs (a),(c),(e),(g) and gray-scale 3D range images (b),(d),(f),(h) for the building hall, urban terrace, park and natural environment, respectively

Table 1 Estimations of the ground truth for different environments

Scene     x0 (m)   y0 (m)   z0 (m)   α (°)     β (°)     γ (°)
Indoor    -2.570   3.144    -0.029   -20.858   1.651     5.999
Urban     0.406    3.662    0.053    27.126    5.282     -6.127
Park      2.541    4.696    0.133    -44.106   -0.717    6.294
Natural   0.732    2.516    -0.049   -1.675    -14.666   -2.258

Computation time has also been measured as a performance index. It is almost independent of the cube edge E, which corroborates the computational cost analysis offered in Sect. 2.2. Registration time depends on the number of objective function evaluations. In the worst case, with E = 0.1 m in the indoor scene, each GNMM iteration requires an average of more than two evaluations of J, as vertices are restarted frequently due to the complexity of the search space. Note that outdoor scenes need less computation time than the indoor one because sky points from the second scan are discarded without computing their projections, I indices, and coincidences in V.

The effect of adding vertices to the simplex is analyzed in Table 3. In this case, GNMM has been executed with the same number of iterations and the same erroneous initial estimations employed in the previous table, but with the cube edge fixed to E = 0.5 m. It can be observed that the computing time is not affected, as the number of evaluations of the objective function is kept similar. In general, accuracy is improved with a number of vertices around 15. Note that the values of the E = 0.5 m rows in Table 2 do not coincide with those of the 15-vertex rows of Table 3, because vertices have been initialized randomly, providing different results.

Registration of coarse binary cubes with GNMM has been compared with ICP and NDT in Table 4. The ICP algorithm [3] has been implemented with a fixed threshold of 0.5 m to discard outliers. For termination,

Figure 4 Point cloud (a), and coarse binary cubes of E = 1 m (b), for the first scan of the urban terrace

Table 2 Effect of cube edge on registration performance (each row shows the average for 25 matches)

Scene    E (m)  Es (m)   Ea (°)   Time (s)  Evaluations  Averaged J  J (ground truth)  v (Mbits)  ℓ
Indoor   0.1    0.4855   2.2977   3.53      1312         4294        6751              20.442     21762
Indoor   0.3    0.1383   0.1720   2.68      1088         4296        4295              0.779      7049
Indoor   0.5    0.0363   0.2573   2.61      1102         2382        2377              0.166      3369
Indoor   0.7    0.0629   0.3025   2.57      1064         1583        1582              0.061      2137
Indoor   0.9    0.0551   0.2953   2.62      1070         1090        1084              0.030      1450
Urban    0.1    0.5422   2.8458   2.73      1158         3425        4801              76.331     17580
Urban    0.3    0.0337   0.1618   2.14      1069         3509        3481              2.859      7158
Urban    0.5    0.0358   0.1218   2.02      1055         2437        2412              0.621      3973
Urban    0.7    0.0548   0.1233   2.06      1055         1709        1694              0.226      2500
Urban    0.9    0.0545   0.2066   2.05      1055         1230        1224              0.108      1715
Park     0.1    2.5665   6.2378   2.29      1117         2015        2627              66.531     21440
Park     0.3    0.0595   0.1250   1.89      1077         3706        3682              2.513      8291
Park     0.5    0.0913   0.1990   1.81      1075         2366        2351              0.546      4708
Park     0.7    0.0687   0.1576   1.82      1061         1563        1550              0.204      2982
Park     0.9    0.1094   0.3327   1.81      1060         1128        1113              0.099      2100
Natural  0.1    0.2711   1.9608   2.61      1062         4512        4591              92.607     21698
Natural  0.3    0.0327   0.8858   2.19      1050         2960        2871              3.441      7224
Natural  0.5    0.0308   0.7996   2.12      1056         1830        1776              0.750      3928
Natural  0.7    0.0460   0.8414   2.17      1058         1273        1242              0.274      2576
Natural  0.9    0.0787   0.7887   2.16      1060         970         957               0.128      1896

Table 3 Effect of the number of GNMM vertices on registration performance (each row shows the average for 25 matches)

Scene    Vertices  Es (m)   Ea (°)   Time (s)  Evaluations  Averaged J  J (ground truth)
Indoor   7         0.0376   0.2498   2.60      1097         2380        2377
Indoor   11        0.0370   0.2518   2.59      1096         2381        2377
Indoor   15        0.0256   0.2455   2.60      1100         2385        2377
Indoor   19        0.0494   0.2992   2.65      1120         2370        2377
Indoor   23        0.0516   0.3469   2.69      1138         2339        2377
Urban    7         0.0723   0.1227   2.05      1072         2415        2412
Urban    11        0.0695   0.1409   2.02      1059         2410        2412
Urban    15        0.0352   0.1201   2.01      1053         2436        2412
Urban    19        0.0414   0.1173   2.02      1056         2432        2412
Urban    23        0.0347   0.0998   2.01      1053         2435        2412
Park     7         0.1021   0.2053   1.80      1072         2355        2351
Park     11        0.0830   0.1733   1.81      1074         2362        2351
Park     15        0.1152   0.2318   1.81      1072         2347        2351
Park     19        0.0713   0.1273   1.80      1065         2369        2351
Park     23        0.1097   0.2367   1.81      1076         2351        2351
Natural  7         0.0337   0.7902   2.11      1055         1829        1776
Natural  11        0.0287   0.7687   2.10      1050         1831        1776
Natural  15        0.0302   0.7869   2.12      1058         1831        1776
Natural  19        0.0251   0.7755   2.12      1056         1830        1776
Natural  23        0.0314   0.7854   2.12      1060         1831        1776


Figure 5 Ground truth transformation with coarse binary cubes of length E = 1 m

it applies a maximum of 150 iterations or a convergence threshold of 10⁻³. A standard k-d tree structure with a bucket size of 10 points has been employed to speed up the search for the closest points [32]. The following parameters were used for NDT: three executions with decreasing cell sizes (2 m, 1 m and 0.5 m), unoccupied cells linked to the closest occupied cells, and Newton's method with line search and a convergence threshold of 10⁻⁶ [15].
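For context, the correspondence search that dominates each ICP iteration can be sketched with a k-d tree as follows. This is an illustration only, not the compared implementation: `scipy.spatial.cKDTree` stands in for the referenced k-d tree, and the 0.5 m bound mirrors the outlier threshold quoted above.

```python
import numpy as np
from scipy.spatial import cKDTree

def icp_correspondences(scan1, scan2, threshold=0.5):
    """Closest point of scan1 for every point of scan2, discarding pairs farther
    apart than `threshold` (outliers). One such O(n log n) query per ICP iteration."""
    tree = cKDTree(scan1)                                  # k-d tree over the first scan
    dists, idx = tree.query(scan2, distance_upper_bound=threshold)
    keep = np.isfinite(dists)                              # unmatched points get dist = inf
    return np.flatnonzero(keep), idx[keep], dists[keep]
```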

Table 4 shows the effect of two different bounds for random initial misalignments. The first case is the same as the one presented in Table 2. The second corresponds to a more accurate initial estimation, where random errors are limited to ±0.25 m for translations and ±2° for rotations. The GNMM has been tuned with 15 vertices, with 500 iterations in the first case and 250 iterations in the second. The coarse binary cube method has been tested with cube edges of 0.3 m and 0.5 m.

With small initial errors, the proposed method achieves an accuracy similar to ICP and NDT in all outdoor scenes. Indoors, results are comparable to ICP, but they are improved upon by NDT. With big initial errors, the coarse binary cube method is capable of obtaining similarly accurate results for all the scenes, whereas the other methods do not always converge, especially in less structured environments (i.e., park and natural). In relation to computation time, the coarse binary cube method prevails over ICP and NDT in all cases: it is halved for big initial errors, and reduced to about one third for small initial errors.

Table 4 Comparison between ICP, NDT and binary cube registration with GNMM (each row shows the average for 25 matches)

Scene    Initial errors   Method              Es (m)   Ea (°)   Time (s)
Indoor   ±1 m, ±8°        ICP                 0.1501   0.7479   15.61
Indoor   ±1 m, ±8°        NDT                 0.0054   0.0596   4.36
Indoor   ±1 m, ±8°        Cubes (E = 0.3 m)   0.0205   0.1921   2.75
Indoor   ±1 m, ±8°        Cubes (E = 0.5 m)   0.0314   0.2343   2.59
Indoor   ±0.25 m, ±2°     ICP                 0.0168   0.1232   4.15
Indoor   ±0.25 m, ±2°     NDT                 0.0054   0.0596   2.51
Indoor   ±0.25 m, ±2°     Cubes (E = 0.3 m)   0.0226   0.1998   1.32
Indoor   ±0.25 m, ±2°     Cubes (E = 0.5 m)   0.0412   0.2586   1.29
Urban    ±1 m, ±8°        ICP                 0.0664   0.3045   13.62
Urban    ±1 m, ±8°        NDT                 0.0499   0.2145   4.67
Urban    ±1 m, ±8°        Cubes (E = 0.3 m)   0.0771   0.1445   2.14
Urban    ±1 m, ±8°        Cubes (E = 0.5 m)   0.0327   0.1108   2.01
Urban    ±0.25 m, ±2°     ICP                 0.0252   0.1307   3.43
Urban    ±0.25 m, ±2°     NDT                 0.0093   0.2102   4.39
Urban    ±0.25 m, ±2°     Cubes (E = 0.3 m)   0.0317   0.1596   1.09
Urban    ±0.25 m, ±2°     Cubes (E = 0.5 m)   0.0383   0.1220   1.05
Park     ±1 m, ±8°        ICP                 0.1342   0.4211   11.05
Park     ±1 m, ±8°        NDT                 0.1477   0.8951   4.94
Park     ±1 m, ±8°        Cubes (E = 0.3 m)   0.1274   0.2926   1.87
Park     ±1 m, ±8°        Cubes (E = 0.5 m)   0.0683   0.1343   1.83
Park     ±0.25 m, ±2°     ICP                 0.0821   0.0927   3.33
Park     ±0.25 m, ±2°     NDT                 0.0728   0.2065   3.69
Park     ±0.25 m, ±2°     Cubes (E = 0.3 m)   0.0587   0.1321   0.95
Park     ±0.25 m, ±2°     Cubes (E = 0.5 m)   0.0792   0.1515   0.92
Natural  ±1 m, ±8°        ICP                 0.0736   0.9261   10.10
Natural  ±1 m, ±8°        NDT                 0.4452   3.4294   5.47
Natural  ±1 m, ±8°        Cubes (E = 0.3 m)   0.0338   0.8926   2.19
Natural  ±1 m, ±8°        Cubes (E = 0.5 m)   0.0291   0.8026   2.11
Natural  ±0.25 m, ±2°     ICP                 0.0733   0.9038   3.57
Natural  ±0.25 m, ±2°     NDT                 0.0383   0.8081   4.56
Natural  ±0.25 m, ±2°     Cubes (E = 0.3 m)   0.0297   0.8745   1.14
Natural  ±0.25 m, ±2°     Cubes (E = 0.5 m)   0.0291   0.7833   1.10

4 Conclusions

The paper proposes a registration method for 3D laser data that takes into account the special characteristics of scene perception. Scene registration, as usually required by field robotics, differs from object registration

because all scan directions and depths may contain relevant data. The principle of the presented method is that ranges are assigned to a binary spatial grid in Cartesian space, which is a uniform spatial distribution of data. Then, a globalized Nelder-Mead method is proposed to maximize the coincidence of one-valued cubes between two scans. The integer objective function is evaluated without using any 3D data structure with a computational order O(n), where n is the number of scan points. The key advantages of this solution are the following: – Denser range areas do not outweigh farther points in the optimization problem.

– Occlusions are considered in a natural way, with no need for special registration heuristics.
– No planar surfaces are required, which is useful for natural environments.
– The grid representation filters measurement noise, i.e., range errors and angular deviations.
– It can be applied to the raw point cloud without prior subsampling.
– It avoids the drag bias problem, where non-overlapping points match to border points.
– The possibility of falling into local optima is reduced with respect to point-distance optimization.
– There is no need to compute the corresponding points for all scan readings.
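The O(n) evaluation of the coincidence objective can be illustrated with a short sketch: each scan is quantized into a hash set of integer cube indices, so no explicit 3D grid is needed. This is a minimal illustration under stated assumptions, not the authors' implementation; the pose parameterization (tx, ty, tz, roll, pitch, yaw) with a ZYX rotation convention and all function names are chosen here for exposition.

```python
import math

def cube_index(point, e):
    # Quantize a 3D point to the integer coordinates of its cube of edge e.
    return (math.floor(point[0] / e),
            math.floor(point[1] / e),
            math.floor(point[2] / e))

def transform(points, pose):
    # Apply a 6D pose (tx, ty, tz, roll, pitch, yaw), ZYX Euler convention.
    tx, ty, tz, roll, pitch, yaw = pose
    cr, sr = math.cos(roll), math.sin(roll)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cy, sy = math.cos(yaw), math.sin(yaw)
    R = ((cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr),
         (sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr),
         (-sp,     cp * sr,                cp * cr))
    return [(R[0][0] * x + R[0][1] * y + R[0][2] * z + tx,
             R[1][0] * x + R[1][1] * y + R[1][2] * z + ty,
             R[2][0] * x + R[2][1] * y + R[2][2] * z + tz)
            for (x, y, z) in points]

def coincidence(ref_scan, new_scan, pose, e=0.5):
    # Integer objective: number of one-valued (occupied) cubes shared by
    # both scans. Building and intersecting the hash sets is O(n) in the
    # number of laser points, with no 3D grid stored in memory.
    ref_cubes = {cube_index(p, e) for p in ref_scan}
    new_cubes = {cube_index(p, e) for p in transform(new_scan, pose)}
    return len(ref_cubes & new_cubes)
```

Registration then amounts to maximizing `coincidence` over the six pose parameters; since the objective is integer-valued and non-smooth, a derivative-free optimizer such as the globalized Nelder-Mead method used in the paper is appropriate.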

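The globalization of the downhill search, restarting a derivative-free local stage from several random initial poses so that the result does not depend on a single basin of attraction, can be sketched as follows. This is a simplified illustration, not the paper's algorithm: a compass (pattern) search stands in for the Nelder-Mead simplex stage, and all names and parameters are hypothetical.

```python
import random

def local_search(f, x0, step=0.5, tol=1e-6, max_iter=500):
    # Compass search: probe each coordinate in both directions and keep
    # improving moves; halve the step when no move improves. A stand-in
    # for the Nelder-Mead local stage.
    x = list(x0)
    while step > tol and max_iter > 0:
        max_iter -= 1
        improved = False
        for i in range(len(x)):
            for d in (step, -step):
                y = list(x)
                y[i] += d
                if f(y) < f(x):
                    x = y
                    improved = True
        if not improved:
            step *= 0.5
    return x

def globalized_search(f, bounds, n_restarts=20, seed=0):
    # Globalization by restarts: run the local stage from random initial
    # points inside the search bounds and keep the best local optimum.
    rng = random.Random(seed)
    best_x, best_val = None, float("inf")
    for _ in range(n_restarts):
        x0 = [rng.uniform(lo, hi) for lo, hi in bounds]
        x = local_search(f, x0)
        v = f(x)
        if v < best_val:
            best_x, best_val = x, v
    return best_x, best_val
```

For minimizing a multimodal function (or, in the registration case, the negated coincidence count), the restart loop makes reaching the global optimum far more likely than a single local descent from the initial pose estimate.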

Experimental validation has been based on complex range images from four real case-study environments with different degrees of structuring. All in all, the paper discusses results from 1400 different matches performed by the coarse binary cube method, as well as 200 ICP and 200 NDT matches for comparison purposes. Experimental analysis regarding grid resolution has shown that coarse grids (around 0.5 m) can be employed with good results. With this cube size, memory requirements for the data structures amount to only a few Mbits. In comparison with ICP and NDT, experimental tests in indoor and outdoor scenes indicate that the proposed method achieves similar accuracy, but with less computation time and better performance against initial misalignments.

Future work includes tests with very long range laser scanners (about 150 m). It would also be interesting to extend the proposed approach to operate in parallel and to match multiple 3D scans.

Acknowledgements This work was partially supported by the Spanish CICYT project DPI 2008-00533. The authors are grateful to Martin Magnusson from Örebro University (Sweden) for the 3D-NDT code.

