Calibration of the Relative Orientation between Multiple Depth Cameras Based on a Three-Dimensional Target

Depth cameras play a vital role in three-dimensional (3D) shape reconstruction, machine vision, augmented/virtual reality and other visual information-related fields. However, a single depth camera cannot obtain complete information about an object by itself due to the limitation of the camera’s field of view. Multiple depth cameras can solve this problem by acquiring depth information from different viewpoints. In order to do so, they need to be calibrated to be able to accurately obtain the complete 3D information. However, traditional chessboard-based planar targets are not well suited for calibrating the relative orientations between multiple depth cameras, because the coordinates of different depth cameras need to be unified into a single coordinate system, and the multiple camera systems with a specific angle have a very small overlapping field of view. In this paper, we propose a 3D target-based multiple depth camera calibration method. Each plane of the 3D target is used to calibrate an independent depth camera. All planes of the 3D target are unified into a single coordinate system, which means the feature points on the calibration plane are also in one unified coordinate system. Using this 3D target, multiple depth cameras can be calibrated simultaneously. In this paper, a method of precise calibration using lidar is proposed. This method is not only applicable to the 3D target designed for the purposes of this paper, but it can also be applied to all 3D calibration objects consisting of planar chessboards. This method can significantly reduce the calibration error compared with traditional camera calibration methods. In addition, in order to reduce the influence of the infrared transmitter of the depth camera and improve its calibration accuracy, the calibration process of the depth camera is optimized. A series of calibration experiments were carried out, and the experimental results demonstrated the reliability and effectiveness of the proposed method.


Introduction
In recent years, the number of depth camera applications in our daily lives has increased dramatically with the reduction of equipment cost [1]. Depth cameras play an important role in machine vision-related fields, such as human face recognition [2,3], three-dimensional scene reconstruction [4], and limb motion capture [5]. Due to the diversification of application requirements, higher demands are put forward for 3D information acquisition, such as more precise scanning of scenes and more rapid modeling of objects. A single depth camera no longer satisfies the practical requirements of some application scenarios. Multiple depth camera systems can potentially solve the above problems with accuracy and speed, and offer outstanding advantages compared to single depth cameras.
In order to apply multiple depth camera systems to different scenes and ensure the accuracy and efficiency of measurements, we need to calibrate these systems to obtain the key parameters. In recent years, many scholars have conducted investigations on camera system calibration, and have achieved remarkable progress [6][7][8][9][10][11][12][13][14][15]. Among them, the active vision-based camera calibration method allows the calculation of the internal camera parameters without a calibration object, with only performing specific motions with the camera [6]. The advantage of this method is its simple algorithm and robustness, but the difficulty of its application in multiple depth camera systems is higher due to the high cost of the system and the requirement of experimental setup. Traditional camera calibration methods need to use calibration plates with known dimensions. By establishing the corresponding relationship between feature points with known coordinates on the calibration plates and their pixel points, the relationship between the three-dimensional coordinates and the two-dimensional plane coordinates can be obtained, and the internal and external parameters of the camera model can be obtained using certain algorithms [7,8]. In addition, the most classical camera calibration method is to use plane targets for the calibration, as proposed by Zhang Z. [12]. This method uses a camera to capture a chessboard pattern in different directions to achieve calibration, which is easy to perform and provides high accuracy. Therefore, it is widely used for single camera calibration. Since the measurement range of the planar target is limited to one feature surface, and multiple depth camera systems have large viewing angles, the planar target method is not suitable for calibrating multiple depth camera systems. The merging of multiple depth camera systems with multiple coordinate systems into a single coordinate system and calibrating without overlapping fields of view is an important research topic. Thus, the integration of the coordinate system of multiple depth cameras based on their view field is the key issue for their calibration. The calibration method proposed by Manuel et al. can be defined as a hybrid solution, consisting of photogrammetry and self-calibration methods [13], which needs special markers for the calibration. Pedersini et al. used two overlapping planar calibration boards to calibrate the multiple camera systems [14]. Their method is based on a self-calibration approach, which allows the refinement of the prior knowledge of the world coordinates of the targets while estimating the parameters of the camera model. Shen et al. presented a complete calibration methodology using a non-planar calibration target with a sphere for rapid calibration [15] based on the precise fitting of ellipses. Avetisyan et al. presented a simple method for calibrating multiple cameras reliably using a tracking system with a trackable calibration target [16]. Beck et al. realized a method for the calibration and registration of multiple RGBD sensors by sweeping a tracked checkerboard through the capturing space in front of each sensor [17,18]. All methods require a tracked calibration target to eliminate the problems caused by the needs for overlapping views. However, the calibration process of these methods relies heavily on the manufacturing accuracy of calibration target. Huang et al. designed a 3D calibration object to calibrate multiple camera systems [19]. This method does not calibrate the 3D calibration object, but only relies on ideal world coordinates for subsequent calibration, and its error is difficult to measure. Current calibration methods have many limitations, such as complex algorithms, poor operability, vulnerability to the influence of multi-camera fields of view and other factors and are thus inapplicable for actual products to a satisfactory degree.
In addition, scholars have proposed many improved methods for the calibration of depth cameras [20]. The functionality of popular depth cameras on the market is mainly accomplished through infrared imaging and projectors, hence the depth image is "carried" by the infrared image [21]. So, in essence, the calibration of depth cameras is equivalent to calibrating an infrared camera. Herrera et al. proposed a calibration model of depth camera parameters based on the original disparity of the camera [22], and this method can obtain more accurate results through longer calculations.
Smisek et al. proposed to calibrate infrared cameras by shielding the infrared projector to avoid the speckle effect [23], as the calibration error of depth cameras is mainly caused by the speckle of the infrared projector. This method is simple and easy to apply. However, current methods are time consuming for high accuracy results, or the results are unstable.
In this paper, a method for the calibration of multiple depth camera systems using a 3D target at specific angles is proposed. The specific angle of the three-dimensional target is determined by the combination of cameras in the multiple depth camera systems. This angle can ensure that multiple cameras can simultaneously capture chessboard images in full camera field of view for calibration. This 3D target allows the calibration of multiple depth cameras at relative orientations with limited or no overlapping fields of view. In addition, the point cloud is obtained by scanning the 3D target with a high-precision lidar. These point clouds are used to calibrate the 3D target to ensure that the position and posture of the calibration plate are accurate and reliable. Moreover, the calibration process of the depth camera is optimized. The results of a series of comparative experiments demonstrate that the method proposed in this paper is efficient and accurate.
The paper is organized as follows. In Section 2, a 3D calibration target is designed, a method for precise calibration using 3D targets and a lidar is presented and a pin-hole depth camera calibration model is introduced. In addition, a global calibration method for multiple depth cameras is proposed. In Section 3, the experiment results we obtained from the application of the proposed method are compared against well-known methods. Finally, Section 4 concludes the work.

Principle
In order to calibrate the relative orientation between multiple depth cameras simultaneously, a 3D target has been designed. It can be used to integrate the coordinates of each depth camera into one coordinate system quickly and effectively. Each calibration plate on the 3D target can calibrate a camera independently, so it can calibrate multiple cameras simultaneously.

Design of 3D Target
The 3D target is as shown in Figure 1, and consists of three planar calibration boards to ensure that each camera can capture enough calibration information. The established coordinate system is shown in Figure 1, with the bottom of the middle calibration board as the origin of the unified coordinate. Since the location of each point is already known, the spatial coordinates of feature points can be easily obtained. The method for unifying the coordinates of the feature points on the 3D target to the same world coordinate system is described in Section 2.2. combination of cameras in the multiple depth camera systems. This angle can ensure that multiple cameras can simultaneously capture chessboard images in full camera field of view for calibration. This 3D target allows the calibration of multiple depth cameras at relative orientations with limited or no overlapping fields of view. In addition, the point cloud is obtained by scanning the 3D target with a high-precision lidar. These point clouds are used to calibrate the 3D target to ensure that the position and posture of the calibration plate are accurate and reliable. Moreover, the calibration process of the depth camera is optimized. The results of a series of comparative experiments demonstrate that the method proposed in this paper is efficient and accurate. The paper is organized as follows. In Section 2, a 3D calibration target is designed, a method for precise calibration using 3D targets and a lidar is presented and a pin-hole depth camera calibration model is introduced. In addition, a global calibration method for multiple depth cameras is proposed. In Section 3, the experiment results we obtained from the application of the proposed method are compared against well-known methods. Finally, Section 4 concludes the work.

Principle
In order to calibrate the relative orientation between multiple depth cameras simultaneously, a 3D target has been designed. It can be used to integrate the coordinates of each depth camera into one coordinate system quickly and effectively. Each calibration plate on the 3D target can calibrate a camera independently, so it can calibrate multiple cameras simultaneously.

Design of 3D Target
The 3D target is as shown in Figure 1, and consists of three planar calibration boards to ensure that each camera can capture enough calibration information. The established coordinate system is shown in Figure 1, with the bottom of the middle calibration board as the origin of the unified coordinate. Since the location of each point is already known, the spatial coordinates of feature points can be easily obtained. The method for unifying the coordinates of the feature points on the 3D target to the same world coordinate system is described in Section 2.2.  Figure 2. The distance between the camera and the 3D target is determined by the focal length of the camera and should be such that the calibration board covers the field of view of each camera as fully as possible, so that each camera can capture enough calibration information. Each camera must be able to obtain a clear image at this distance. Each camera corresponds to a calibration board and only needs to capture the corresponding board information; camera_up corresponds to the Right calibration board, camera_down corresponds to the Left calibration board and camera_mid corresponds to the Mid calibration board. As long as the calibration board is fully covered by the camera's field of view, it the angle between the calibration boards does not matter. There are no other special requirements for the tolerance of the 3D calibration target, because a high precision lidar will be used to accurately  Figure 2. The distance between the camera and the 3D target is determined by the focal length of the camera and should be such that the calibration board covers the field of view of each camera as fully as possible, so that each camera can capture enough calibration information. Each camera must be able to obtain a clear image at this distance. Each camera corresponds to a calibration board and only needs to capture the corresponding board information; camera_up corresponds to the Right calibration board, camera_down corresponds to the Left calibration board and camera_mid corresponds to the Mid calibration board. As long as the calibration board is fully covered by the camera's field of view, it the angle between the calibration boards does not matter. There are no other special requirements for the tolerance of the 3D calibration target, because a high precision lidar will be used to accurately calibrate the relative position of the 3D calibration target, as explained in Section 2.2. When calibrating the internal parameters of the cameras, Zhang's plane calibration method can be used to obtain the camera's internal parameters by slightly moving the camera to take some photos. When calibrating the external parameters of the camera, the only operation is to place the three depth cameras in front of the 3D target and obtain an image. Because the coordinates of the feature points are obtained from the same coordinate system, the three depth camera coordinate systems can be unified into a single system using three photos, thus the spatial position of the cameras can be finally obtained. calibrate the relative position of the 3D calibration target, as explained in Section 2.2. When calibrating the internal parameters of the cameras, Zhang's plane calibration method can be used to obtain the camera's internal parameters by slightly moving the camera to take some photos. When calibrating the external parameters of the camera, the only operation is to place the three depth cameras in front of the 3D target and obtain an image. Because the coordinates of the feature points are obtained from the same coordinate system, the three depth camera coordinate systems can be unified into a single system using three photos, thus the spatial position of the cameras can be finally obtained.

Calibration of the 3D Target
In this part, the calibration using the 3D target and the unification of the coordinate systems of the feature points are introduced in detail.
The calibration of the 3D target is the basis of the whole calibration process. Accurate calibration of the 3D target can reduce the error in the subsequent calibration process effectively. The point cloud is obtained by scanning the 3D target with a high precision lidar, and the point cloud is used to calibrate the target.
Firstly, the plane of the point cloud should be fitted. The plane equation can be described as: where (cos ∝, cos , cos ) is the normal vector direction cosine at point (x, y, z) on the plane. | p | is the distance from the origin to the plane. The distance from any data point to the plane is: In order to obtain the best fitting plane, Equation (3)'s value needs to be normalized: Equation (4) can be obtained by calculating the partial derivatives of for d, a, b and c respectively: Therefore, the normal vector of the plane can be expressed as: Figure 2. Structure of the designed multiple depth camera system.

Calibration of the 3D Target
In this part, the calibration using the 3D target and the unification of the coordinate systems of the feature points are introduced in detail.
The calibration of the 3D target is the basis of the whole calibration process. Accurate calibration of the 3D target can reduce the error in the subsequent calibration process effectively. The point cloud is obtained by scanning the 3D target with a high precision lidar, and the point cloud is used to calibrate the target.
Firstly, the plane of the point cloud should be fitted. The plane equation can be described as: x cos ∝ +y cos β + z cos γ + p = 0, where (cos ∝, cos β, cos γ) is the normal vector direction cosine at point (x, y, z) on the plane. | p | is the distance from the origin to the plane. The distance from any data point to the plane is: In order to obtain the best fitting plane, Equation (3)'s value needs to be normalized: Equation (4) can be obtained by calculating the partial derivatives of f for d, a, b and c respectively: Therefore, the normal vector of the plane can be expressed as: After obtaining the normal vectors of the three calibration plates, it is necessary to unify the three planes into the laser scanner's coordinate system. If we define the normal vector n 0 = (0, 0, 1), the rotation angle from each plane to the laser scanner is: The plane where the rotation angle is located is composed of vectors n le f t and n 0 . Then, the axis of rotation must be perpendicular to that plane according to the cross-product equation: where i, j, k are the unit vectors of the x, y, z axes respectively. So, the rotation axis c le f t (c 1 , c 2 , c 3 ) is: The unit vector of the rotation axis is obtained as follows: Given the rotation axis and its corresponding rotation angle, the rotation matrix can be obtained according to Rodrigues' rotation equation [24]: with where I is the 3 × 3 unit matrix, and ω le f t denotes the "cross-product matrix" for the unit vector c le f t . Similarly, we obtain the rotation matrices R mid , R right . After rotating the plane using the rotating matrix, the translation matrix can be obtained by calculating the distance between two points. Before the rotation, we first fit the plane, and we can decipher that the center point of the best fitting plane is p le f t . After the rotation, the plane is transferred to the coordinate system of the laser scanner, so we choose point p 0 = (0, 0, 0). And the translation matrix is: similarly, we can obtain the translation matrix T mid , T right . So far, the rotation and translation matrices of the three planar calibration targets have been obtained. The feature points' coordinates of the 3D target are unified into the same world coordinate system simultaneously.

Calibration of Multiple Depth Cameras
In this section, the calibration of three depth cameras using the 3D target is introduced. The depth camera consists of an infrared camera and an infrared projector. Thus, depth information is contained in infrared images. Because the infrared projector will interfere with the image taken by the infrared camera, the infrared projector is blocked during calibration. In order to ensure stable calibration results, halogen lamps are used to increase ambient brightness and to obtain proper infrared images.
The infrared camera is suitable for the pin-hole model, including radial correction and tangential correction. According to Zhang's calibration method, the transformation relationship between the camera coordinate system and world coordinate system is: where s is an arbitary scale factor, (u, v) is the coordinate of a point in the image coordinate system, A is called the camera's intrinsic matrix, (u 0 , v 0 ) represents the coordinate of the principal point, α and β represent the scale factors in the image's u and v axes, and γ is the parameter describing the skew of the two image axes. Normally, the default value of γ is 0. R is the rotation matrix and T is the translation matrix which are called the extrinsic parameters. X w , Y w , Z w are the world coordinates. In Section 2.2, the coordinates of the feature points on the 3D target X w , Y w , Z w were unified into the same world coordinate system. The corresponding u and v coordinates can be obtained by processing the calibrated image. By moving the camera to obtain several sets of photos, the corresponding internal parameters can be obtained. The R and T of each depth camera relative to the calibration board can also be obtained. The relative position relationship between depth cameras can be obtained from the following steps. Firstly, R and T from the Left calibration board to the Mid calibration board can be expressed as: Secondly, R and T from camera_down to Mid calibration board can be expressed as: Finally, the transformation relation between camera_mid and camera_down is described in Equation (17): Similarly, the transformation relation between camera_mid and camera_up can be obtained. The rotation matrix can be decomposed into: where R x , R y , R z are the rotation matrixes that rotate around the x, y and z axes, respectively. R x contains the angles between the cameras.

Hardware Setup of Multiple Depth Camears
In order to verify the effectiveness of the proposed method, a multiple depth camera system was designed as shown in Figure 3. The system consisted of a 3D target with three planar chessboards combined at a specific angle, three ORBBEC Astra infrared cameras with a pixel resolution of 640 × 480 and a Faro Focus 3D X330 lidar with a distance accuracy of ±1 mm, which has very low noise. The lidar's horizontal field of vision is 360 • and its vertical field of vision is 300 • . The vertical and horizontal step lengths of the lidar are both 0.09 • . The cameras and Faro lidar follow the right-hand coordinate system. The experimental computer was a laptop with a CPU clock speed of 2.2 GHz, and the software environment was MATLAB R2015a.
The theoretical angle between the middle calibration board and the other two calibration boards was 138 • , which is dependent on the angle between the multiple depth cameras. However, errors will be introduced in manufacturing and assembling, resulting in a deviation of the actual angle values. Our calibration using stereo targets solves these errors. Each planar chessboard had a size of 400 × 300 mm with 12 × 9 squares on it, and each square had a size of 30 × 30 mm. The Faro Focus 3D X330 lidar will automatically compensate during the scanning process to improve the accuracy. The theoretical angle between the middle calibration board and the other two calibration boards was 138°, which is dependent on the angle between the multiple depth cameras. However, errors will be introduced in manufacturing and assembling, resulting in a deviation of the actual angle values. Our calibration using stereo targets solves these errors. Each planar chessboard had a size of 400 × 300 mm with 12 × 9 squares on it, and each square had a size of 30 × 30 mm. The Faro Focus 3D X330 lidar will automatically compensate during the scanning process to improve the accuracy. The structure of the multiple depth cameras is shown in Figure 4. Each depth camera contains an infrared projector and an infrared camera. The theoretical value of the angle between each camera is 42°, but the actual value contains errors introduced during manufacturing and assembly, which require calibration to correct them.

Calibration
By placing the camera in a position where the 3D board covers the filled with field of view, the specific calibration steps are designed as follows: The structure of the multiple depth cameras is shown in Figure 4. Each depth camera contains an infrared projector and an infrared camera. The theoretical value of the angle between each camera is 42 • , but the actual value contains errors introduced during manufacturing and assembly, which require calibration to correct them. The theoretical angle between the middle calibration board and the other two calibration boards was 138°, which is dependent on the angle between the multiple depth cameras. However, errors will be introduced in manufacturing and assembling, resulting in a deviation of the actual angle values. Our calibration using stereo targets solves these errors. Each planar chessboard had a size of 400 × 300 mm with 12 × 9 squares on it, and each square had a size of 30 × 30 mm. The Faro Focus 3D X330 lidar will automatically compensate during the scanning process to improve the accuracy. The structure of the multiple depth cameras is shown in Figure 4. Each depth camera contains an infrared projector and an infrared camera. The theoretical value of the angle between each camera is 42°, but the actual value contains errors introduced during manufacturing and assembly, which require calibration to correct them.

Calibration
By placing the camera in a position where the 3D board covers the filled with field of view, the specific calibration steps are designed as follows:

Calibration
By placing the camera in a position where the 3D board covers the filled with field of view, the specific calibration steps are designed as follows: 1.
Initialize the lidar and select medium scan quality. The 3D target is placed approximately 1 m away in front of the lidar. The point cloud data are processed to extract the calibration plate's point cloud from the scanned point cloud. Calibration using the 3D target's processed point cloud data and the R, T of each calibration board relative to the lidar is performed.

2.
Block the infrared projector and use a halogen lamp to illuminate the 3D target. Move the camera around the 3D target repeatedly and ensure that images of the 3D target from different angles are obtained. Approximately 15 pictures are obtained. 3.
The images are processed, and the coordinates of feature points are extracted for calibration. The internal parameters of the depth camera and the external parameters of each camera relative to the 3D target are obtained.

4.
Each depth camera takes a picture of the corresponding calibration board filling its field of view. For example, camera_up photographs the Right calibration board. Process the captured image to obtain the R, T and angle between each depth camera, that is, the spatial relationship between each depth camera.
The external parameters of each calibration board are given in Table 1, and the intrinsic parameters are given in Table 2, where f x , f y are the equivalent focal length of the x and y axes, and u 0 , v 0 are the principal points of the x and y axes, respectively. The external parameters of each camera are given in Table 3. The distribution of the reprojection error is as shown in Figure 5.  Figure 5, each photograph contains the different reprojection errors of each camera at the fifteen locations. The errors are mainly distributed between −0.1 to 0.1 pixel in all three photographs. This demonstrates that the accuracy and stability of the method are both high. In Figure 5, each photograph contains the different reprojection errors of each camera at the fifteen locations. The errors are mainly distributed between -0.1 to 0.1 pixel in all three photographs. This demonstrates that the accuracy and stability of the method are both high.

Evaluation
In order to verify the effectiveness of the calibration method proposed in this paper, an additional experiment was carried out. Four common calibration methods were selected to compare with the method proposed in this paper. In method A, a single chessboard is used to calibrate the external parameters using Zhang Z.'s calibration method [12]. In method B, the external parameters are calibrated directly using an uncalibrated 3D target using Huang's method [19]. In method C, the 3D target is scanned using a camera with known internal parameters, the relative pose of the 3D target is calculated, and the feature point coordinates of different coordinate systems are unified into the same coordinate system. Then, the external parameters of the multiple depth cameras are calculated. In method D, a method for the calibration and registration of multiple RGBD sensors by sweeping a tracked checkerboard through the desired capturing space in front of each sensor is used [17].
The feature point coordinates of the three planar calibration boards were unified on the same plane after calibration. Theoretically, the normal vector of this plane is (0,0,1) . Therefore, we simulated an ideal point cloud plane with a normal vector (0,0, −1) , which corresponds to the middle calibration board of the 3D target. The coordinates of the ideal plane with the normal vectors of (0,0, −1) were unified using the rotation matrix of the middle calibration board obtained by the three calibration methods, respectively. The normal vectors of the two new planes obtained using the three calibration methods are shown in Table 4. It is clear that the angle between the actual plane and the ideal plane is significantly reduced when the laser is used to calibrate the 3D target. This means that our calibration method has a smaller than the other two methods, and there has been a significant improvement with respect to accuracy.
Then, further experimental comparisons on the calibration using the 3D target with different calibration methods were carried out. Five methods were applied to calibrate the multiple depth camera system. The angle between the cameras for the five methods is given in Table 5 and the RMS value of reprojection errors of the five calibration methods are given in Table 6.

Evaluation
In order to verify the effectiveness of the calibration method proposed in this paper, an additional experiment was carried out. Four common calibration methods were selected to compare with the method proposed in this paper. In method A, a single chessboard is used to calibrate the external parameters using Zhang Z.'s calibration method [12]. In method B, the external parameters are calibrated directly using an uncalibrated 3D target using Huang's method [19]. In method C, the 3D target is scanned using a camera with known internal parameters, the relative pose of the 3D target is calculated, and the feature point coordinates of different coordinate systems are unified into the same coordinate system. Then, the external parameters of the multiple depth cameras are calculated. In method D, a method for the calibration and registration of multiple RGBD sensors by sweeping a tracked checkerboard through the desired capturing space in front of each sensor is used [17].
The feature point coordinates of the three planar calibration boards were unified on the same plane after calibration. Theoretically, the normal vector of this plane is (0, 0, 1) T . Therefore, we simulated an ideal point cloud plane with a normal vector (0, 0, −1) T , which corresponds to the middle calibration board of the 3D target. The coordinates of the ideal plane with the normal vectors of (0, 0, −1) T were unified using the rotation matrix of the middle calibration board obtained by the three calibration methods, respectively. The normal vectors of the two new planes obtained using the three calibration methods are shown in Table 4. It is clear that the angle between the actual plane and the ideal plane is significantly reduced when the laser is used to calibrate the 3D target. This means that our calibration method has a smaller than the other two methods, and there has been a significant improvement with respect to accuracy.
Then, further experimental comparisons on the calibration using the 3D target with different calibration methods were carried out. Five methods were applied to calibrate the multiple depth camera system. The angle between the cameras for the five methods is given in Table 5 and the RMS value of reprojection errors of the five calibration methods are given in Table 6. As shown in Table 5, the single chessboard used in method A is limited by the field of view coincidence of the multi-camera system, which results in very large error in the calibration results, and the results fluctuate greatly in repeated calibrations. The 3D target used in method B is limited by the manufacturing accuracy of the device, and the calibration result error is large. In method C, although the relative pose of 3D target has been taken into account, the accuracy of the camera is limited, and the result is less accurate. The accuracy of calibration results is further improved in method D. The angle between the cameras obtained using the method proposed in this paper is very close to the ideal angle which shows that this method results in a high accuracy for calibrating the external parameters of multiple depth camera systems. As shown in Table 6, the method proposed in this paper greatly reduces RMS error which means our method has higher stability and accuracy compared with existing solutions.

3D Measurement Results
In order to verify the effectiveness of the 3D results in practical applications, the multiple depth cameras calibration data obtained using method C and our method were applied to the point cloud fusion of the scanned clouds. Method C was chosen for this comparison because it is the best-performing of the three compared methods. We scanned the real scene of Figure 6 using the multiple depth camera system. The results of the point cloud fusion are shown in Figure 7.
It is evident that the result of point cloud fusion using the calibration result obtained by method C is not ideal. The point cloud scanned using each of the two depth cameras systems cannot be perfectly fused, and the cracks are large. This shows that the external parameters of the calibrated depth cameras are inaccurate, and there is considerable discrepancy compared with the actual values. Conversely, there are no cracks between the point clouds when using the calibration result obtained by our method and better fusion is achieved, which means that the actual values of external parameters between the depth cameras used are very close. As shown in Figure 7d, the results obtained using the two methods are quite different. Using Equations (1)-(6), we calculated the angle between the actual ground and the ground obtained by two methods. The angle value of method C was 3.588 • , while the angle of our method is 0.342 • , which means that the point cloud fusion results obtained using method C were distorted greatly by the ground calculations, while our method is more accurate. This experiment shows that the calibration method using the 3D target and the multiple depth camera systems proposed in this paper has very high accuracy.  In addition, we used the same system to measure a car model at an auto show. Comparing the results of the point cloud fusion in Figure 8, the advantages of the calibration method presented in this paper are outstanding. The results obtained by method C show that there are huge cracks and unevenness on the ground, while the top of the vehicle model is distorted. In Figure 8d, the angle obtained using method C is 4.855°, and the angle of our method is 0.219°. The actual height and length of the car obtained from the official website are 163.50 cm and 441.50 cm respectively. The measured values by the two methods are shown in Table 7.
In contrast to method C, our method yields a flat ground and a vehicle model very close to the object. When applied to a large-scale model, the error of method C is greater, and our model gives better stability and accuracy.
In order to further compare the accuracy of method C and D with the proposed method, a closed regular room was used as a calibration object. The room was scanned by a Faro Focus 3D X330 lidar, as shown in Figure 9. The obtained Width, Height and Length are 243.80, 263.59, and 421.45 cm, regarded as the ground truth. The room was also measured by the three methods, as listed in Table 8.   In addition, we used the same system to measure a car model at an auto show. Comparing the results of the point cloud fusion in Figure 8, the advantages of the calibration method presented in this paper are outstanding. The results obtained by method C show that there are huge cracks and unevenness on the ground, while the top of the vehicle model is distorted. In Figure 8d, the angle obtained using method C is 4.855°, and the angle of our method is 0.219°. The actual height and length of the car obtained from the official website are 163.50 cm and 441.50 cm respectively. The measured values by the two methods are shown in Table 7.
In contrast to method C, our method yields a flat ground and a vehicle model very close to the object. When applied to a large-scale model, the error of method C is greater, and our model gives better stability and accuracy.
In order to further compare the accuracy of method C and D with the proposed method, a closed regular room was used as a calibration object. The room was scanned by a Faro Focus 3D X330 lidar, as shown in Figure 9. The obtained Width, Height and Length are 243.80, 263.59, and 421.45 cm, regarded as the ground truth. The room was also measured by the three methods, as listed in Table 8. In addition, we used the same system to measure a car model at an auto show. Comparing the results of the point cloud fusion in Figure 8, the advantages of the calibration method presented in this paper are outstanding. The results obtained by method C show that there are huge cracks and unevenness on the ground, while the top of the vehicle model is distorted. In Figure 8d, the angle obtained using method C is 4.855 • , and the angle of our method is 0.219 • . The actual height and length of the car obtained from the official website are 163.50 cm and 441.50 cm respectively. The measured values by the two methods are shown in Table 7.
In contrast to method C, our method yields a flat ground and a vehicle model very close to the object. When applied to a large-scale model, the error of method C is greater, and our model gives better stability and accuracy.
In order to further compare the accuracy of method C and D with the proposed method, a closed regular room was used as a calibration object. The room was scanned by a Faro Focus 3D X330 lidar, as shown in Figure 9. The obtained Width, Height and Length are 243.80, 263.59, and 421.45 cm, regarded as the ground truth. The room was also measured by the three methods, as listed in Table 8.         The experimental results show that the proposed method is able to significantly improve the calibration accuracy of the method C. The calibration accuracy of our method is below 3 cm in the range of 4m. To summary, all the test results show that our calibrated system has good stability and accuracy in measuring complex three-dimensional indoor scenes.