Monocular Distance Estimation for UAVs

Master's thesis at ifp - Thomas Heidelberger

Thomas Heidelberger


Duration: 6 months
Completion: August 2021
Supervisors: Dr. David Adjiashvili (Drone Harmony), Prof. Dr.-Ing. Norbert Haala
Examiner: Prof. Dr.-Ing. Norbert Haala



Unmanned aerial vehicles (UAVs) already play a major role in a broad range of tasks such as surveying, inspection and mapping. A high level of autonomy is a key capability for drones and an area of intensive research. Drone control software such as the Drone Harmony application is widely used and allows automated planning and execution of flight plans. These applications mainly rely on fixed-waypoint flight plans and global navigation satellite systems (GNSS) for drone positioning. Two of the most popular use cases for such applications are terrain mapping and tower inspection missions. In both cases, the mission is planned using maps, elevation models or 3D object models. Accurately planned missions and precise positioning of the drone are key to satisfactory results. The available digital elevation models (DEMs) are often too imprecise to ensure a constant height above the ground during terrain-following flights. Furthermore, inaccurate GNSS positioning during tower inspections leads to varying distances between tower and UAV, which should be avoided.
The aim of this work was to investigate how a drone’s standard hardware can be used to estimate the distance between the drone and the ground or an object. Only image data from the drone’s main camera and GNSS measurements are used.

Terrain Following

To measure the distance between a drone and the ground, a simulated stereo vision approach is used: two successive images from a single moving camera are treated as an axis-parallel stereo image pair. The distance between the two camera positions is called the ‘baseline’ and can be calculated from the velocity of the UAV and the time between the two recordings. Optical flow is used to track feature points in the scene, from which the pixel disparity is calculated. Triangulation then yields the distance to the ground. A combination of ORB features and the Kanade-Lucas-Tomasi (KLT) optical flow tracker was shown to provide the best results in both accuracy and computation time. The following diagrams show the measurements from different flight tests. The results show that the proposed method is sufficiently accurate and demonstrate its usability in a mobile control application.

Figure 1: Result 1 Terrain Following
Figure 2: Result 2 Terrain Following
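
The triangulation step of the terrain-following approach can be illustrated with a minimal sketch. It is not the implementation from the thesis. It assumes an ideal axis-parallel pair, a focal length given in pixels, and disparities that have already been measured by a feature tracker. The function name, the parameter names, the minimum-disparity cut-off and the use of the median are assumptions made for this illustration.

```python
from statistics import median


def estimate_height(disparities_px, speed_mps, frame_interval_s,
                    focal_length_px, min_disparity_px=0.5):
    """Estimate the camera-to-ground distance in metres.

    Hypothetical helper: treats two successive frames as an axis-parallel
    stereo pair and triangulates each tracked feature with Z = f * B / d.
    """
    # Baseline: distance flown between the two recordings.
    baseline_m = speed_mps * frame_interval_s

    # Skip features with almost no disparity, since Z = f * B / d
    # becomes unstable as d approaches zero (assumed cut-off).
    depths = [focal_length_px * baseline_m / d
              for d in disparities_px if d >= min_disparity_px]
    if not depths:
        raise ValueError("no feature has enough disparity for triangulation")

    # The median is an assumed choice to limit the effect of tracking outliers.
    return median(depths)


# Hypothetical values: 5 m/s, 0.2 s between frames, focal length 1000 px.
height_m = estimate_height([39.0, 40.0, 41.0, 0.1],
                           speed_mps=5.0, frame_interval_s=0.2,
                           focal_length_px=1000.0)
```

With a baseline of 1 m and a typical disparity of 40 px, the sketch returns a height of about 25 m. A slower flight or a shorter frame interval shrinks the baseline and with it the disparity, which makes the estimate more sensitive to tracking noise.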

Tower Inspections

Visually estimating the distance to an object such as a tower is more complex than estimating the distance to a planar surface such as the ground. The scene captured by the UAV’s camera consists of the inspection object and the background. The background is irrelevant for drone positioning and should not be considered. Furthermore, the complex motion of the UAV and the more challenging feature tracking during a tower inspection flight make a more advanced approach necessary. Simultaneous localisation and mapping (SLAM) systems are a prominent method for visual positioning in robotics. Therefore, the use of ORB-SLAM for drone position estimation is investigated for tower inspection missions. To ensure that only the tower is considered for positioning, a tower detection step is necessary to localise the tower in the scene. Modern deep-learning-based object detection methods require a large amount of training data, which is not available for tower detection. Therefore, a tower detection method based on disparity maps was developed. It uses the disparity between two successive frames of a moving camera to separate the foreground from the rest of the scene. Once a disparity map has been created, handcrafted features are used to select the foreground area in the image.

The following image illustrates the workflow:

Figure 3: Tower Detection using Disparity Maps
Figure 4: Result 1 Tower Detection
Figure 5: Result 2 Tower Detection
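
The idea of separating the foreground by its disparity can be illustrated with a simplified sketch. It does not reproduce the handcrafted features used in the thesis, which are not described here. Instead it applies a plain threshold, assuming that the tower is closer to the camera than the background (larger disparity) and that the background covers most of the image. The function name and the `ratio` parameter are assumptions made for this illustration, and the disparity map is a small nested list rather than a real image.

```python
from statistics import median


def foreground_bbox(disparity_map, ratio=2.0):
    """Return (row_min, col_min, row_max, col_max) of the foreground, or None.

    Hypothetical helper: a pixel counts as foreground when its disparity
    exceeds `ratio` times the median disparity of the whole map.
    """
    # The median approximates the background disparity, assuming the
    # background occupies more than half of the image.
    background = median(v for row in disparity_map for v in row)
    threshold = ratio * background

    hits = [(r, c)
            for r, row in enumerate(disparity_map)
            for c, value in enumerate(row)
            if value > threshold]
    if not hits:
        return None

    rows = [r for r, _ in hits]
    cols = [c for _, c in hits]
    return (min(rows), min(cols), max(rows), max(cols))


# Hypothetical 4 x 5 disparity map: a thin, close object in column 2.
example_map = [
    [2, 2, 9, 2, 2],
    [2, 2, 10, 2, 2],
    [2, 2, 11, 2, 2],
    [2, 2, 10, 2, 2],
]
bbox = foreground_bbox(example_map)
```

For the example map the bounding box covers column 2 over all four rows. A single global threshold is fragile when the background depth varies strongly, which is one reason a real system would rely on more elaborate features.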

After tower localisation, ORB-SLAM is used to estimate the position of the drone relative to the tower. With a monocular vision system, the SLAM system cannot recover the absolute scale. However, the relative scale can be used to detect varying distances between the drone and the tower and to adapt the flight path accordingly.
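
How a relative scale can still reveal distance changes can be shown with a short sketch. It is an illustration under assumptions, not the thesis implementation or any ORB-SLAM API: camera positions and a tower reference point are assumed to be available in the same arbitrary SLAM units, and the function name and the `tolerance` parameter are hypothetical.

```python
import math


def flag_distance_changes(camera_positions, tower_point, tolerance=0.1):
    """Return (frame_index, ratio) for frames whose distance to the tower
    deviates from the first frame by more than `tolerance` (relative).

    Hypothetical helper: all coordinates are in unscaled SLAM units, so
    only ratios of distances are meaningful, not metres.
    """
    distances = [math.dist(p, tower_point) for p in camera_positions]
    reference = distances[0]  # distance at the start of the orbit

    flagged = []
    for index, distance in enumerate(distances):
        ratio = distance / reference  # the unknown scale factor cancels out
        if abs(ratio - 1.0) > tolerance:
            flagged.append((index, ratio))
    return flagged


# Hypothetical orbit around a tower at the origin, in SLAM units.
positions = [(4.0, 0.0, 0.0), (0.0, 4.0, 0.0),
             (-5.0, 0.0, 0.0), (0.0, -4.2, 0.0)]
flagged = flag_distance_changes(positions, (0.0, 0.0, 0.0))
```

In the example only the third position is flagged: its distance is 25 % larger than at the start, while the 5 % deviation of the last position stays within the tolerance. Because the ratio is independent of the unknown scale, such a check works without metric information.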

The following images show the reconstruction of the flight path. They show that ORB-SLAM can achieve good results in estimating the position of a UAV during tower inspection flights. However, particularly when the inspection object is thin and texture-less, accurate position estimation is difficult and GNSS measurements need to be integrated to estimate the drone’s position correctly.

Figure 6: Drone Position Estimation using ORB-SLAM
Figure 7: Result ORB-SLAM (green), actual flight path (blue)


