
- Acta Photonica Sinica
- Vol. 49, Issue 12, 61 (2020)
0 Introduction
With the rapid development of various technologies in the robotics field, a single sensor is insufficient for perceiving complex environments. Thus, information from different sensors is fused to identify the category and location of objects.
For 3D LIDAR and camera calibration, features are easier to extract by using a polygonal board or a combination of a rectangle and a checkerboard.
The disadvantage of the traditional two-step method is that the two calibration procedures are relatively cumbersome and inefficient. In addition, the calibration errors of the first step propagate to the second step and accumulate.
In this paper, a single-step calibration method without a specific calibration object is proposed for a 2D LIDAR and a camera. A mathematical model of single-step calibration between the 2D LIDAR and the camera is established. The point and line features of pillars in the indoor scene are extracted by image processing and point cloud processing, and these features are then used to solve the projection matrix by singular value decomposition. An experimental platform is built to verify the superiority of the proposed method.
1 Calibration model for camera and 2D LIDAR
The calibration model of 2D LIDAR and camera is shown in Fig. 1. The 2D LIDAR and camera are fixed on the mobile robot, and the corners or pillars with obvious line features are used as the calibration objects.

The image captured by the camera is represented by the two-dimensional coordinate system (u, v). Similarly, the 2D point cloud data is represented by the LIDAR coordinate system (x, y). In order to facilitate the derivation of the formula, the z coordinate of the point cloud on the same plane is uniformly specified as 0. Therefore, 2D point cloud coordinates can be expressed as 3D points. Meanwhile, the pixel coordinates in the image can be expressed as (u, v). In this way, the transformation from 3D point (x, y, 0) to 2D point (u, v) can be expressed as
where the 3D point is represented by homogeneous coordinates, fu and fv are the effective focal lengths of the camera in the horizontal and vertical directions, respectively, u0 and v0 are the coordinates of the image center point, and the remaining parameters describe the rotation and translation between the LIDAR and camera coordinate systems.
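A hedged LaTeX sketch of the projection model of Eq. (1), under the standard pinhole assumption that R and t denote the rotation and translation from the LIDAR frame to the camera frame and s is a scale factor:

```latex
s\begin{bmatrix} u \\ v \\ 1 \end{bmatrix}
=
\begin{bmatrix} f_u & 0 & u_0 \\ 0 & f_v & v_0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} \mathbf{R} & \mathbf{t} \end{bmatrix}
\begin{bmatrix} x \\ y \\ 0 \\ 1 \end{bmatrix}
```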
From Eq. (1), it can be further obtained
Eq. (2) can be written as
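Because z = 0, the third column of [R | t] drops out and the intrinsic and extrinsic parameters collapse into a single 3×3 matrix M. A hedged sketch of the forms that Eqs. (2) and (3) take under this assumption, with m_ij denoting the entries of M:

```latex
s\begin{bmatrix} u \\ v \\ 1 \end{bmatrix}
= \mathbf{M}\begin{bmatrix} x \\ y \\ 1 \end{bmatrix},\qquad
\mathbf{M}=\begin{bmatrix} m_{11} & m_{12} & m_{13} \\ m_{21} & m_{22} & m_{23} \\ m_{31} & m_{32} & m_{33} \end{bmatrix},\qquad
u=\frac{m_{11}x+m_{12}y+m_{13}}{m_{31}x+m_{32}y+m_{33}},\quad
v=\frac{m_{21}x+m_{22}y+m_{23}}{m_{31}x+m_{32}y+m_{33}}
```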
The equation of the intersection line of the pillar surfaces in the pixel plane can be expressed as
where a, b, and c are the parameters of the line equation.
Using a point that lies on the straight line to establish the equation, namely the point-to-line correspondence, is the vital condition. If the pillar is taken as the calibration object for feature extraction, it is convenient to fit the intersection of the two lines projected by the 2D LIDAR scan onto the pillar surfaces. Meanwhile, the pixel coordinates of this intersection must lie on the pillar's intersection line in the pixel plane. According to the point-to-line correspondence, substituting Eq. (3) into Eq. (4) gives the line equation
Eq. (5) can be rewritten as
It can be seen from Eq. (6) that 9 unknown parameters need to be solved to obtain the pose conversion relationship between the two sensors, because the z coordinate of the space point is assumed to be 0. When the coordinates of a space point and its corresponding line equation are known, one equation about the projection matrix can be constructed. Thus, at least 9 sets of correspondences are needed to determine all the parameters.
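Substituting the expressions for u and v into the line equation au + bv + c = 0 of Eq. (4) and clearing the common denominator gives the point-to-line constraint. A hedged sketch of the forms that Eqs. (5) and (6) take under the conventions assumed above, with m1, m2, m3 denoting the rows of M and p = (x, y, 1)^T:

```latex
a\,\frac{\mathbf{m}_1^{\top}\mathbf{p}}{\mathbf{m}_3^{\top}\mathbf{p}}
+ b\,\frac{\mathbf{m}_2^{\top}\mathbf{p}}{\mathbf{m}_3^{\top}\mathbf{p}} + c = 0
\;\Longrightarrow\;
\bigl(a\,\mathbf{m}_1 + b\,\mathbf{m}_2 + c\,\mathbf{m}_3\bigr)^{\top}\mathbf{p} = 0
```

Expanded, each correspondence yields one homogeneous linear equation, a·x·m11 + a·y·m12 + a·m13 + b·x·m21 + b·y·m22 + b·m23 + c·x·m31 + c·y·m32 + c·m33 = 0, in the 9 unknown entries of M.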
2 Projection matrix solution based on point-to-line correspondences
The proposed method does not require a specific calibration object; it can accomplish the calibration by using an indoor scene with angular features, such as wall corners, square pillars, or other objects. Afterwards, the correspondences between multiple sets of points and straight lines are obtained, and a linear system of equations is established to solve the projection matrix. The flow chart of the proposed calibration method is shown in Fig. 2.

The proposed method is as follows:
1) Data collection: Firstly, the relative poses of the 2D LIDAR and the camera are fixed. Then, the distance and angle between the mobile robot and the pillar are changed to generate different observation data. It is worth mentioning that all data are collected in a static scene, so data synchronization is not required.
2) Extraction of straight-line features from the image: The Canny operator and the Hough transform are used to extract the intersection-line features of the pillar surfaces. Afterwards, the intersection line equation for each pose is obtained.
3) Fitting of the corner points: The two straight lines projected by the 2D LIDAR scan onto the pillar surfaces are fitted from the point cloud data on the two walls using the RANSAC method, and their intersection is calculated. Obviously, the intersection must satisfy the line equation extracted in step 2).
4) Calculation of the projection matrix: According to the correspondences between points and straight lines obtained in steps 2) and 3), the projection matrix is calculated. The reprojection error is then computed, and data with large noise are eliminated. The projection matrix is recalculated to achieve the optimized result.
2.1 Line features extraction
It is necessary to process the pillar image captured at each pose to obtain the line equation in the corresponding pixel plane. The Canny operator and the Hough transform are used to extract the edge features of the image.
Firstly, grayscale processing is performed on the original image, and a Gaussian filter is adopted to smooth the grayscale image and prevent noise from affecting the result of feature extraction. Secondly, the gradient magnitude and direction of the processed image are calculated, and non-maximum suppression is performed according to the gradient direction. Finally, the Hough transform is applied to extract the line features.
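As an illustration of this processing chain, a minimal Python/OpenCV sketch is given below; the file name, Gaussian kernel size, Canny thresholds, and Hough vote threshold are illustrative assumptions rather than the values used by the authors:

```python
import cv2
import numpy as np

# Load the pillar image captured at one pose (file name is hypothetical)
img = cv2.imread("pose_01.png")

# 1) Grayscale conversion and Gaussian smoothing to suppress noise
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
blurred = cv2.GaussianBlur(gray, (5, 5), 1.5)

# 2) Canny edge detection (gradient computation and non-maximum
#    suppression are performed internally); thresholds are illustrative
edges = cv2.Canny(blurred, 50, 150)

# 3) Standard Hough transform to extract straight-line features;
#    rho/theta resolution and the vote threshold are illustrative
lines = cv2.HoughLines(edges, 1, np.pi / 180, 120)

# Each detected line is returned as (rho, theta) in Hough space
if lines is not None:
    for rho, theta in lines[:, 0]:
        print(f"rho = {rho:.1f}, theta = {np.degrees(theta):.1f} deg")
```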
The image captured in a certain pose is shown in Fig. 3(a), and the image after grayscale transformation and Gaussian smoothing is shown in Fig. 3(b).

The Canny operator and the standard Hough transform are applied to extract lines from the image, as shown in Fig. 4. The extreme points are marked in Fig. 4(a), and the feature line of the pillar is shown in Fig. 4(b).
Assuming that the points p1(u1, v1) and p2(u2, v2) are the maximum response points of the line feature corresponding to the intersection of the two lines projected by the 2D LIDAR scan onto the pillar surfaces, the relevant parameters of Eq. (4) can be obtained.
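A hedged reconstruction of these parameters (the sign convention may differ from the original): the line through p1 and p2 can be written with

```latex
a = v_2 - v_1,\qquad b = u_1 - u_2,\qquad c = u_2\,v_1 - u_1\,v_2
```

so that both p1(u1, v1) and p2(u2, v2) satisfy au + bv + c = 0.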

The same operation is carried out to solve the parameters of line equations for all images.
2.2 Corners fitting
It is necessary to find the point cloud coordinates on the straight line in the 2D LIDAR observation data. The key to ensuring the correctness of the calibration is accurately finding the intersection of the two lines projected by the 2D LIDAR scan onto the pillar surfaces.
A straight-line fitting method is used to solve for the feature point. A large number of points are distributed on the pillar surfaces when the LIDAR obtains a frame of point cloud data; however, it is difficult to ensure that any particular laser point lies exactly on the extracted edge line of the pillar. Thus, the feature point must be estimated. The scanning plane of the LIDAR intersects the two planes of the pillar in two straight lines, so the corresponding point cloud data can fit two straight lines well. The point cloud observed at a certain pose is shown in Fig. 5(b), where the color of the point cloud changes with distance.

As shown in the schematic diagram of Fig. 6(a), the 2D LIDAR generates two scan lines on the pillar. The generated point cloud is extracted and drawn separately, as shown in Fig. 6(b).

After obtaining the target point cloud, the RANSAC method is employed to fit the two straight lines separately. Firstly, two points are randomly selected from the target point cloud and a straight line through them is generated. Then the distances from all points to this straight line are calculated. Afterwards, two points are repeatedly selected at random and the distance sums are compared until the distance converges or the maximum number of iterations is reached. Once one straight line is fitted, the same operation is performed on the points with larger errors to find the other straight line. Finally, the intersection of the two straight lines is found, as shown in Fig. 7.
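The following numpy sketch illustrates this two-line fitting and intersection step. It uses a common inlier-counting RANSAC variant rather than the distance-sum criterion described above, and the thresholds, iteration count, and array names are illustrative assumptions:

```python
import numpy as np

def ransac_line(points, n_iters=200, inlier_thresh=0.01):
    """Fit a line a*x + b*y + c = 0 to 2D points with RANSAC.
    Returns the normalized line (a, b, c) and a boolean inlier mask."""
    best_inliers, best_line = None, None
    rng = np.random.default_rng(0)
    for _ in range(n_iters):
        p1, p2 = points[rng.choice(len(points), 2, replace=False)]
        a, b = p2[1] - p1[1], p1[0] - p2[0]
        c = p2[0] * p1[1] - p1[0] * p2[1]
        norm = np.hypot(a, b)
        if norm < 1e-9:
            continue
        dist = np.abs(points @ np.array([a, b]) + c) / norm
        inliers = dist < inlier_thresh          # 1 cm threshold (illustrative)
        if best_inliers is None or inliers.sum() > best_inliers.sum():
            best_inliers, best_line = inliers, (a / norm, b / norm, c / norm)
    return best_line, best_inliers

def intersect(l1, l2):
    """Intersection of two lines given in homogeneous form (a, b, c)."""
    p = np.cross(l1, l2)                        # homogeneous intersection point
    return p[:2] / p[2]

# scan_xy: (N, 2) array of LIDAR points on the two pillar surfaces (assumed available)
# line1, in1 = ransac_line(scan_xy)
# line2, _ = ransac_line(scan_xy[~in1])         # fit the second line to the remaining points
# corner = intersect(np.array(line1), np.array(line2))
```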

The blue and red points in Fig. 7 represent two different RANSAC operations. The black point is the intersection of the two lines projected by the 2D LIDAR scan onto the pillar surfaces. The same operations are performed on all point cloud data to obtain the feature point corresponding to each pose.
2.3 Remove outlier corresponding data
After the preliminary projection matrix is obtained, the pixel distance between the projected fitted intersection and the corresponding image line is defined as the reprojection error. Some point-to-line correspondences will affect the experimental results due to noise. Therefore, if the reprojection error of a correspondence is more than twice the average value, that data is removed. Afterwards, the projection matrix is recalculated, and the optimized projection matrix is obtained.
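A minimal numpy sketch of this reprojection-error filtering is given below; the function and array names are hypothetical, and the 3×3 matrix M follows the convention sketched for Eqs. (2)-(3):

```python
import numpy as np

def reprojection_errors(M, corners_xy, lines_abc):
    """Pixel distance between each projected LIDAR corner and its image line.
    M: 3x3 projection matrix; corners_xy: (N, 2) fitted corners (x, y);
    lines_abc: (N, 3) line parameters (a, b, c) with a*u + b*v + c = 0."""
    pts_h = np.hstack([corners_xy, np.ones((len(corners_xy), 1))])
    uvw = (M @ pts_h.T).T                   # project corners into the image
    uv = uvw[:, :2] / uvw[:, 2:3]           # dehomogenize to pixel coordinates
    a, b, c = lines_abc[:, 0], lines_abc[:, 1], lines_abc[:, 2]
    return np.abs(a * uv[:, 0] + b * uv[:, 1] + c) / np.hypot(a, b)

# err = reprojection_errors(M0, corners, lines)
# keep = err <= 2.0 * err.mean()            # drop data above twice the average error
# M_opt = solve_projection(corners[keep], lines[keep])  # re-solve (SVD sketch in Sec. 3)
```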
3 Experimental results
The 2D LIDAR used is an RPLIDAR A2 with an angular resolution of 0.9° and a scanning range of 0°~360°. The color camera of a Kinect is used. The mobile robot is shown in Fig. 8.

A total of 12 sets of images and point cloud data were collected at different poses. It is ensured that the pillar observed at each pose has a different inclination angle and position in the image. In this way, the parameters of the extracted line equations are not linearly correlated, which ensures the validity of the data and improves the calibration accuracy. The 12 collected images are shown in Fig. 9.

The images and point clouds of each group of data are processed accordingly. The parameters of the 12 sets of fitted feature points and straight-line equations are shown in Table 1.
S/N | x/m | y/m | a | b | c |
---|---|---|---|---|---|
1 | -0.645 679 53 | 0.025 819 36 | -275 | 172 | 83 874 |
2 | -0.663 591 86 | -0.157 623 35 | -269 | 219 | 32 780 |
3 | -0.643 988 79 | -0.256 961 64 | -307 | 131 | 22 593 |
4 | -0.892 779 23 | -0.220 791 22 | -423 | 244 | 56 064 |
5 | -0.919 356 82 | -0.116 384 39 | -381 | 177 | 82 758 |
6 | -0.930 696 01 | 0.445 599 08 | -357 | -268 | 260 015 |
7 | -1.013 609 30 | 0.073 209 52 | -474 | -182 | 210 346 |
8 | -0.919 355 99 | -0.183 984 88 | -189 | -133 | 62 447 |
9 | -1.143 314 60 | -0.059 158 44 | -330 | -183 | 132 366 |
10 | -1.206 768 00 | 0.121 189 59 | -346 | -359 | 193 661 |
11 | -1.210 898 40 | 0.296 762 59 | -477 | -243 | 269 037 |
12 | -1.266 390 80 | -0.534 483 91 | -258 | -99 | 45 486 |
Table 1. Data of fitted corner points and linear equation parameters
According to Eq. (6) and the data in Table 1, an overdetermined system of linear equations is constructed. The singular value decomposition method is used to solve the 9 unknown parameters, which yields the initial projection matrix.
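A minimal numpy sketch of this SVD solution is given below; it assumes each correspondence contributes the row [ax, ay, a, bx, by, b, cx, cy, c] of the homogeneous system, consistent with the expanded constraint sketched after Eq. (6):

```python
import numpy as np

def solve_projection(corners_xy, lines_abc):
    """Solve the 3x3 projection matrix M from point-to-line correspondences.
    Each correspondence contributes one row of the homogeneous system A m = 0;
    the solution is the right singular vector of A with the smallest singular value."""
    rows = []
    for (x, y), (a, b, c) in zip(corners_xy, lines_abc):
        rows.append([a * x, a * y, a,
                     b * x, b * y, b,
                     c * x, c * y, c])
    _, _, vt = np.linalg.svd(np.asarray(rows))
    m = vt[-1]                               # direction spanning the (near) null space
    return (m / np.linalg.norm(m)).reshape(3, 3)

# Usage with the Table 1 data loaded as arrays (values abbreviated):
# corners = np.array([[-0.64567953, 0.02581936], [-0.66359186, -0.15762335], ...])
# lines   = np.array([[-275.0, 172.0, 83874.0], [-269.0, 219.0, 32780.0], ...])
# M0 = solve_projection(corners, lines)
```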
According to the definition of the reprojection error in Section 2.3, the error values for the current result are calculated and shown in Table 2. The average reprojection error at this stage is 0.375 5 pixel. Except for the fourth and fifth groups, the reprojection errors are all within 0.751 pixel (twice the average). Obviously, the data of the fourth and fifth groups will affect the accuracy of the projection matrix.
S/N | 1 | 2 | 3 | 4 | 5 | 6 |
---|---|---|---|---|---|---|
Error/pixel | 0.572 4 | 0.049 9 | 0.426 9 | 0.848 8 | 1.423 1 | 0.117 6 |

S/N | 7 | 8 | 9 | 10 | 11 | 12 |
---|---|---|---|---|---|---|
Error/pixel | 0.196 3 | 0.274 4 | 0.186 7 | 0.196 9 | 0.108 5 | 0.104 3 |
Table 2. Reprojection errors before optimization
After removing the experimental data with serial numbers 4 and 5, the projection matrix is recalculated
The optimized projection matrix is used to recalculate the reprojection error. The average reprojection error after optimization is 0.045 9 pixel, which is about 0.33 pixel lower than before optimization. The experiments show that removing experimental data with large errors can further improve the accuracy.
By exhibiting the effect of point cloud projection onto images in different scenes, the proposed calibration method is shown intuitively to outperform the calibration method of Ref. [19]. The implementation of Ref. [19] is available at https://github.com/MegviiRobot/CamLaserCalibraTool; it uses a recorded bag file for calibration, and its calibration scenarios are shown in Fig. 10. For a better comparison, the calibration matrices of the different methods are normalized to unit scale and shown in Table 3. The reprojection errors of Ref. [19] are also shown in Table 4, and a line chart of the reprojection errors of the different methods is shown in Fig. 11.

Method | Matrix |
---|---|
Optimized method | |
Ref. [19] | |
Table 3. Matrices of different calibration methods

It can be seen from Table 4 and Fig. 11 that the reprojection errors of the proposed algorithm are smaller than those of Ref. [19] at nearly all of the 10 points. The reprojection errors of Ref. [19] are almost all greater than 0.1 pixel, with an average of 0.223 39 pixel, while the average error of the algorithm proposed in this paper after optimization is 0.045 9 pixel. Obviously, the proposed algorithm has higher accuracy.
S/N | 1 | 2 | 3 | 4 | 5 |
---|---|---|---|---|---|
This work | 0.025 9 | 0.078 8 | 0.052 8 | 0.017 8 | 0.054 3 |
Ref. [19] | 0.572 4 | 0.049 9 | 0.426 9 | 0.117 6 | 0.196 3 |

S/N | 6 | 7 | 8 | 9 | 10 |
---|---|---|---|---|---|
This work | 0.090 5 | 0.087 5 | 0.021 5 | 0.004 5 | 0.025 7 |
Ref. [19] | 0.274 4 | 0.186 7 | 0.196 9 | 0.108 5 | 0.104 3 |
Table 4. Reprojection error comparison
To verify the correctness of the proposed method, the data collected in the experiment were reprojected according to Eq. (3) using the projection matrices before and after optimization. All the point clouds in one scene are reprojected in Fig. 12(a) and Fig. 12(b), and the same scene processed with the method of Ref. [19] is shown in Fig. 12(c). It can be clearly seen from the change in color depth that the points with the deepest color on the pillar are the closest, and the point cloud does not extend beyond the edge of the pillar and corresponds well to the pillar in the image. Comparing the projection effects before and after optimization, the optimized result in Fig. 12(b) is better at the right edge of the pillar: the points on the wall are more complete and do not extend much beyond the pillar. In contrast, in the reprojection result of Ref. [19], part of the point cloud clearly protrudes beyond the right edge of the wall. Therefore, the experiments prove that the optimization improves the projection effect, and the algorithm proposed in this paper performs better than that of Ref. [19].

Furthermore, other verification scenes were collected separately to examine the reprojection effect and further illustrate the correctness of the results, as shown in Fig. 13. Fig. 13(a) and Fig. 13(b) are the projection effects of the matrix before optimization, and Fig. 13(c) and Fig. 13(d) are the projection effects after optimization. Scene 1 is more complex, with multiple objects arranged alternately. It can be seen that the point cloud has obvious color changes and position differences on the different objects, no point cloud extends beyond the edges of the objects, and all point clouds correspond well to the objects in the image. Scene 2 is a relatively regular scene basically constructed from different planes. It can be clearly seen that the color depth of the point cloud changes where the planes intersect. Furthermore, comparing the scenes before and after optimization, the effect of the point cloud at the edges of the objects is slightly improved.
Fig. 13(e) and Fig. 13(f) are the reprojection effects of Ref. [19]. At the left edge of the wall in Scene 1 and Scene 2, the background point cloud falls on the foreground wall more in Fig. 13(e) than in Fig. 13(c). At the same time, the fitting accuracy of the point cloud at the edge of the pillar in Fig. 13(f) is slightly worse than that in Fig. 13(d). Although the method of Ref. [19] achieves a good projection effect, it is still slightly worse than the method proposed here. In summary, the proposed calibration method has a good reprojection effect and can realize the data fusion of a 2D LIDAR and a color camera.

4 Conclusions
A calibration method combining internal and external parameters, which does not require a separate calibration object, is proposed in this paper. Indoor structured objects, such as walls with edges and corners, can be used as calibration targets. The Hough transform based on the Canny operator is adopted to extract the line features of the pillar image and obtain the parameters of the straight-line equations. Afterwards, by fitting straight lines to the point cloud on the pillar and calculating their intersection, the correspondences between the point cloud and the image are obtained. All the data are used to build the system of line equations and solve the initial matrix. The reprojection error is then calculated according to the initial matrix to remove experimental data with large noise, and the projection matrix is recalculated to achieve a better result. Experimental results show that the average reprojection error of the pillar feature points is reduced from 0.375 5 pixel to 0.045 9 pixel by adopting the optimized projection matrix. Compared with the improved version of the method of Ref. [19], the proposed calibration method has higher efficiency and accuracy, and there is no mismatching phenomenon such as point cloud misalignment at the boundaries of objects. Moreover, it is not necessary to make calibration objects with complex shapes or patterns.
References
[1] Tong QIN, Pei-liang LI, Shao-jie SHEN. VINS-Mono: a robust and versatile monocular visual-inertial state estimator. IEEE Transactions on Robotics, 34, 1004-1020 (2018).
[2] C R QI, Wei LIU, Chen-xia WU. Frustum PointNets for 3D object detection from RGB-D data (2017).
[3] Ting-ting CUI, Shun-ping JI, Jie SHAN. Line-based registration of panoramic images and LiDAR point clouds for mobile mapping. Sensors, 17, 70 (2016).
[4] Y PARK, S YUN. Calibration between color camera and 3D LIDAR instruments with a polygonal planar board. Sensors, 14, 5333-5353 (2014).
[5] P AN, T MA, K YU. Geometric calibration for LiDAR-camera system fusing 3D-2D and 3D-3D point correspondences. Optics Express, 28, 2122-2141 (2020).
[6] Zheng-you ZHANG. A flexible new technique for camera calibration. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22, 1330-1334 (2000).
[7] J CHEN, K BENZEROUAL, R S ALLISON. Calibration for high-definition camera rigs with marker chessboard (2012).
[8] A NAPIER, P CORKE, P NEWMAN. Cross-calibration of push-broom 2D LIDARs and cameras in natural scenes (2013).
[9] S WASIELEWSKI, O STRAUSS. Calibration of a multi-sensor system laser rangefinder/camera (1995).
[10] Ling-li YU, Meng PENG, Zuo YOU. Separated calibration of a camera and a laser rangefinder for robotic heterogeneous sensors. International Journal of Advanced Robotic Systems, 10, 56985 (2013).
[11] J Y BOUGUET. Camera calibration toolbox for Matlab. www.vision.caltech.edu/bouguetj
[12] Q ZHANG, R PLESS. Extrinsic calibration of a camera and laser range finder (improves camera calibration) (2004).
[13] Li-pu ZHOU. A new minimal solution for the extrinsic calibration of a 2D LIDAR and a camera using three plane-line correspondences. IEEE Sensors Journal, 14, 442-454 (2014).
[14] R UNNIKRISHNAN, M HEBERT. Fast extrinsic calibration of a laser rangefinder to a camera. Pennsylvania (2005).
[15] D G CHOI, I S KWEON. Extrinsic calibration of a camera and a 2D laser without overlap. Robotics & Autonomous Systems, 78, 17-28 (2016).
[16] A Y KHALIL, M BASSAM, A W KHALID. Extrinsic calibration of camera and 2D laser sensors without overlap. Sensors, 17, 2346-2369 (2017).
[17] Li-pu ZHOU, Zhi-dong DENG. A new algorithm for establishing data association between a camera and a 2-D LIDAR. Journal of Tsinghua University: Natural Sciences English Edition, 19, 314-322 (2014).
[18] R HARTLEY, A ZISSERMAN. Multiple view geometry in computer vision (2003).
