Stereo Camera Based Detection of Moving Objects as Basis for Collision Avoidance for an Autonomous Tower Crane

Yen-Teh Li

Duration: 6 months
Completion: August 2022
Supervisors: M.Sc. Lena Joachim, Prof. Dr.-Ing. Norbert Haala
Examiner: Prof. Dr.-Ing. Norbert Haala

Background

Moving object detection has long been an active area of research. Accurate detection of moving objects enables many applications, including the goal of this thesis: automated path planning for a tower crane based on a risk map. While the parts of the surrounding environment that change infrequently can be reconstructed by SfM (Structure from Motion), SfM is not fast enough to continuously track and update the position and extent of moving objects. The goal is therefore to develop an algorithm that detects moving objects and produces a risk map for subsequent collision-free path planning. The algorithm must run on the NVIDIA Jetson Xavier NX platform under ROS (Robot Operating System), with a ZED 2 stereo camera as the image source.

Figure 1: Platform of implementation – NVIDIA Jetson Xavier NX & ZED 2 stereo camera

Detection Methodology

The algorithm combines images and depth measurements from an RGB-D camera to perform over-segmentation on the resulting 2.5D image. The segments of the over-segmented image are treated as vertices of a weighted undirected graph and merged using a RAG (Region Adjacency Graph) according to their similarity to neighboring segments. Moving objects, refined from moving segments, are separated from the static background using ORB (Oriented FAST and Rotated BRIEF) feature detection. Finally, the extracted moving objects are marked on the 2D risk map using an encoding scheme that compactly represents their 3D position and extent.
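The merging step described above can be illustrated with a minimal sketch. It assumes that each segment is summarized by its mean depth and that the edge weight between two neighboring segments is the absolute difference of those depths. The function name `merge_segments`, the depth feature, and the threshold `max_depth_diff` are hypothetical choices for illustration and are not taken from the thesis. Unlike a full RAG merge, this simplified version does not recompute edge weights after each merge.

```python
def merge_segments(mean_depth, adjacency, max_depth_diff):
    """Greedily merge adjacent segments whose mean depths are similar.

    mean_depth:     dict segment id -> mean depth in metres (assumed feature)
    adjacency:      iterable of (id_a, id_b) pairs of neighbouring segments
    max_depth_diff: merge threshold in metres (hypothetical parameter)
    Returns a dict mapping each segment id to the id of its merged region.
    """
    parent = {s: s for s in mean_depth}

    def find(s):
        # Follow parent links to the region representative, shortening the path.
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    def weight(edge):
        # Edge weight: absolute difference of the two mean depths.
        return abs(mean_depth[edge[0]] - mean_depth[edge[1]])

    # Visit the most similar neighbouring pairs first.
    for a, b in sorted(adjacency, key=weight):
        if weight((a, b)) <= max_depth_diff:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a  # union the two regions
    return {s: find(s) for s in mean_depth}


# Four segments in a row: two near the camera, two farther away.
regions = merge_segments(
    mean_depth={0: 5.0, 1: 5.2, 2: 12.0, 3: 12.1},
    adjacency=[(0, 1), (1, 2), (2, 3)],
    max_depth_diff=0.5,
)
print(regions)
```

In this toy input, segments 0 and 1 end up in one region and segments 2 and 3 in another, because only the edge between segments 1 and 2 exceeds the threshold.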

Figure 2: Overlay of stationary features and segmentation result

Result

The algorithm is evaluated on both synthetic data generated with the CARLA simulator and real-world data to assess its accuracy and efficiency. The evaluation shows an overall accuracy of 95.83% and a producer’s accuracy of 89.98% for moving objects. For safety reasons, the algorithm has been tuned for high producer’s accuracy so that as few moving objects as possible are missed. Because false positives are deliberately tolerated, the low user’s accuracy of 15.26% for moving objects is the main drawback of the algorithm and requires further research and modification to overcome. Nonetheless, given the specific requirements of this thesis, the result is acceptable. The high producer’s accuracy means that most moving objects are detected and encoded on the risk map, so that collision-free path planning of the tower crane can keep the construction site safe.
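The three measures reported above can be computed from a binary confusion matrix for the class "moving". Producer’s accuracy corresponds to recall (the share of truly moving samples that were detected) and user’s accuracy to precision (the share of detections that are truly moving). The sketch below uses hypothetical counts, chosen only to show how a high overall accuracy and a high producer’s accuracy can coexist with a low user’s accuracy when moving samples are rare. The counts are not the thesis data.

```python
def accuracy_metrics(tp, fp, fn, tn):
    """Return (overall, producer's, user's) accuracy for the 'moving' class."""
    total = tp + fp + fn + tn
    overall = (tp + tn) / total   # all correct decisions over all samples
    producers = tp / (tp + fn)    # detected share of truly moving samples (recall)
    users = tp / (tp + fp)        # truly moving share of all detections (precision)
    return overall, producers, users


# Hypothetical counts for illustration only; not results from the thesis.
overall, producers, users = accuracy_metrics(tp=90, fp=500, fn=10, tn=9400)
print(f"overall={overall:.2%}  producer's={producers:.2%}  user's={users:.2%}")
```

With these made-up counts, the many false positives barely affect the overall accuracy because the static class dominates, while the user’s accuracy drops sharply.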

Although the throughput of 1-2 frames per second is far from real-time, it is sufficient for automatic tower crane path planning, because the relatively slow movement of the tower crane relaxes the efficiency requirement for the proposed algorithm.
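To make the relation between update rate and motion concrete, the distance covered between two consecutive risk-map updates is simply the speed divided by the update rate. The speed of 0.5 m/s used below is a hypothetical value for illustration; the crane’s actual speed is not stated here.

```python
def travel_per_update(speed_m_per_s, updates_per_s):
    """Distance in metres covered between two consecutive risk-map updates."""
    return speed_m_per_s / updates_per_s


# Hypothetical speed of 0.5 m/s; not a value taken from the thesis.
for fps in (1.0, 2.0):
    print(f"{fps:.0f} fps: {travel_per_update(0.5, fps):.2f} m between updates")
```

The slower the motion, the smaller the displacement per update, which is why a low frame rate can still be adequate for this task.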

Figure 3: Detection result and corresponding 2D risk map (point of view: zenith)

Further Work

Further improvement of the proposed method focuses on two points. First, increasing the user’s accuracy of moving object detection, which would improve the usability of the method, requires more robust depth estimation or further filtering of the result. Second, the efficiency of the method can probably be increased through parallel processing, since most processes in the algorithm are independent of each other. The ideal goal is a method that achieves both high producer’s accuracy and high user’s accuracy and is capable of real-time detection.
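The parallelization idea can be sketched with Python’s standard `concurrent.futures` module. The sketch assumes, purely for illustration, that feature detection and over-segmentation do not depend on each other; which stages of the actual algorithm are independent is not specified here. The functions `detect_features` and `over_segment` are hypothetical placeholders, not the thesis implementation.

```python
from concurrent.futures import ThreadPoolExecutor


def detect_features(frame):
    # Hypothetical stand-in for a feature-detection stage.
    return f"features({frame})"


def over_segment(frame):
    # Hypothetical stand-in for the over-segmentation stage.
    return f"segments({frame})"


def process_frame(frame):
    """Run two mutually independent stages concurrently on one frame."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        features = pool.submit(detect_features, frame)
        segments = pool.submit(over_segment, frame)
        # Both results are required before any dependent stage can start.
        return features.result(), segments.result()


print(process_frame("frame_0"))
```

Threads only yield a speed-up in Python when the stages spend their time in native code that releases the interpreter lock; for pure-Python stages, `ProcessPoolExecutor` from the same module would be the closer fit.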

References

LABAYRADE, R.; AUBERT, D.; TAREL, J.-P.: Real time obstacle detection in stereovision on non flat road geometry through "v-disparity" representation. In: Intelligent Vehicle Symposium, 2002. IEEE, Vol. 2. IEEE, 2002, pp. 646–651

FELZENSZWALB, P. F.; HUTTENLOCHER, D. P.: Efficient graph-based image segmentation. In: International Journal of Computer Vision 59 (2004), No. 2, pp. 167–181

RUBLEE, E.; RABAUD, V.; KONOLIGE, K.; BRADSKI, G.: ORB: An efficient alternative to SIFT or SURF. In: 2011 International Conference on Computer Vision. IEEE, 2011, pp. 2564–2571
