Relative pose estimation with a 2D lidar scanner

Zuqing Xie

Duration: 6 months
Completion: January 2024
Supervisor: Dr. Thomas Kuebler (Bosch Rexroth AG)
Examiner: Prof. Dr.-Ing. Norbert Haala

Motivation

Automated Guided Vehicles (AGVs) in large logistics warehouses have to perform tasks such as docking at their charging stations or handling goods such as Europallets. A common approach to navigating the warehouse is laser SLAM. SLAM works well at the scale of larger structures, such as finding the correct hall, shelf or waypoint, but there is a conflict between robustness to highly dynamic environments and the high precision required for direct interaction with target objects. This thesis investigates how an AGV can localize itself relative to a target object using a 2D lidar scanner.

Methodology

I investigate a feature-based perception pipeline. The FALKO keypoint detector extracts interest points from each scan. Descriptors such as BSC, CGH and a novel deep learning-based descriptor (DLD) encode the spatial context around each keypoint. A template of the target object is aggregated from multiple scans that view the target from different perspectives, using a clustering approach. RANSAC then matches the keypoints detected in a new scan against the template, and the resulting transformation yields the relative pose.
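The matching step can be illustrated with a minimal sketch. It assumes that descriptor matching has already produced putative keypoint correspondences between template and scan, and it estimates a 2D rigid transform with RANSAC followed by a least-squares refit on the inliers. The function names, the two-point minimal sample, the iteration count and the 5 cm inlier tolerance are illustrative assumptions, not the implementation used in the thesis.

```python
import numpy as np

def fit_rigid_2d(src, dst):
    """Least-squares rotation R and translation t such that dst ~ src @ R.T + t."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)          # 2x2 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    # Force a proper rotation (determinant +1), never a reflection.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t

def ransac_rigid_2d(src, dst, n_iter=200, inlier_tol=0.05, seed=0):
    """RANSAC over putative correspondences src[i] <-> dst[i] (hypothetical helper).

    src: (N, 2) template keypoints, dst: (N, 2) matched scan keypoints.
    inlier_tol is in the same unit as the points (assumed metres here).
    """
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(src), dtype=bool)
    for _ in range(n_iter):
        # Two correspondences are the minimal sample for a 2D rigid transform.
        idx = rng.choice(len(src), size=2, replace=False)
        R, t = fit_rigid_2d(src[idx], dst[idx])
        residuals = np.linalg.norm(src @ R.T + t - dst, axis=1)
        inliers = residuals < inlier_tol
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    if best_inliers.sum() < 2:
        raise ValueError("no consensus set found")
    # Refit on the whole consensus set to reduce the influence of noise.
    R, t = fit_rigid_2d(src[best_inliers], dst[best_inliers])
    return R, t, best_inliers
```

Under this convention the transform maps template coordinates into the sensor frame, so the target's position in the sensor frame is `t` and its heading is `np.arctan2(R[1, 0], R[0, 0])`. A failed detection would correspond to the consensus set being too small to trust.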


Experiments

The proposed approach is evaluated in multiple real-world experiments resembling the environment of large logistics warehouses. I evaluate scenarios of increasing complexity to demonstrate the degree of template-to-object variability that can be handled and how the system copes with harsh working conditions. Laser scans are recorded with a SICK nanoScan3 lidar, and the global AGV pose is recorded with an OptiTrack system. Experiment 1 focuses on the detection of Europallets, including tests to determine the main factors influencing the distinctiveness of keypoints. It investigates whether the algorithm can adapt to relatively aged and deformed Europallets and whether the modified FALKO is more stable than the original version. Experiment 2 is dedicated to the deep learning-based descriptor DLD, studying its performance under dimensionality reduction and comparing it with other descriptors. Experiment 3 tests the generalization capability of the pose estimator. It covers various target objects and assesses performance under conditions of high dynamics and occlusion.

Figure 6: Generalization: Relative pose estimation of different target objects in a dynamic environment.

Table 1: This table summarizes the average pose error statistics for scenarios with dynamic points filtering using MinkUnet+LPX and without dynamic points filtering. The comparison highlights the impact of dynamic points filtering on the accuracy of pose estimation.

Results and Conclusion

The proposed relative pose estimation pipeline includes visualization and evaluation functions. Localizing relative to a Europallet yielded a Euclidean error below 3.84 cm and an angular error below 1.79° in over 90% of the pose estimates, with a detection rate of 67%. Due to synchronization and reliability issues with the ground truth measurement system, these figures are only an upper bound on the actual error. The modified FALKO detects keypoints more stably than the original FALKO. Moreover, the approach can estimate the relative pose of different target objects with the same level of accuracy, even in environments with occlusion and high dynamics, but at the cost of a lower detection rate. The analysis and experiments also show that the integrated novel DLD descriptor is superior to the classical descriptors in discriminating between individual feature points, reaching 98.9%.
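For reference, error statistics of this kind can be computed as in the following sketch. It assumes poses are given as (x, y, heading) with positions in metres and headings in radians, and that each estimate is already paired with a time-synchronized ground-truth pose. The function names and the pose layout are illustrative assumptions rather than the thesis's evaluation code.

```python
import numpy as np

def pose_errors(est, gt):
    """Per-sample Euclidean and angular error between (x, y, theta) poses."""
    est, gt = np.asarray(est, dtype=float), np.asarray(gt, dtype=float)
    trans_err = np.linalg.norm(est[:, :2] - gt[:, :2], axis=1)
    dtheta = est[:, 2] - gt[:, 2]
    # Wrap the heading difference to [-pi, pi] before taking its magnitude.
    ang_err = np.abs(np.arctan2(np.sin(dtheta), np.cos(dtheta)))
    return trans_err, ang_err

def error_bound(errors, fraction=0.9):
    """Error value that the given fraction of samples stays below or at."""
    return float(np.percentile(errors, 100.0 * fraction))

def detection_rate(n_detections, n_scans):
    """Share of scans for which a pose estimate was produced."""
    return n_detections / n_scans
```

A statement such as "error below a given value in over 90% of the pose estimates" then corresponds to `error_bound(trans_err, 0.9)` evaluated over the successful detections only, with the detection rate reported separately.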

Bibliography

Gian Diego Tipaldi and Kai O. Arras. FLIRT - Interest regions for 2D range data. In 2010 IEEE International Conference on Robotics and Automation, pages 3616–3622. IEEE, 2010.

Fabjan Kallasi, Dario Lodi Rizzini, and Stefano Caselli. Fast keypoint features from laser scanner for robot localization and mapping. IEEE Robotics and Automation Letters, 1(1):176–183, 2016.

Muhammad Usman, Abdul Manan Khan, Ahmad Ali, Sheraz Yaqub, Khalil Muhammad Zuhaib, Ji Yeong Lee, and Chang-Soo Han. An extensive approach to features detection and description for 2-D range data using active B-splines. IEEE Robotics and Automation Letters, 4(3):2934–2941, 2019.

Florian Schroff, Dmitry Kalenichenko, and James Philbin. FaceNet: A unified embedding for face recognition and clustering. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 815–823, 2015.
