DSM Inpainting using a Convolutional Neural Network

Christian Mayr

Duration of the Thesis: 6 months
Completion: October 2020
Supervisor & Examiner: Dr. Mathias Rothermel & Prof. Dr. Norbert Haala

Introduction

A digital surface model (DSM) is a virtual representation of the surface of a scanned area, for example a city. Such models are derived from 3D point clouds that are recorded by laser scanners or calculated from aerial images. Several factors, such as occlusions or problems during matching, can lead to incomplete data.

Data gaps are typically filled using some form of interpolation. This thesis investigates an alternative way to solve this problem: the goal is to improve the completion of DSMs by adapting a convolutional neural network (CNN) to the task of DSM interpolation.

Methodology

We build on previous work on RGB image inpainting and adapt the architecture to predict height values for data gaps in DSMs. To train the network efficiently, a data loader with data augmentation functionality was implemented, along with utility scripts that simplify running predictions. Besides the encoder-decoder architecture itself, the loss functions were adapted to improve prediction performance.
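As an illustration of the two ingredients mentioned above, the following is a minimal NumPy sketch, not the implementation used in the thesis. It assumes that each training sample consists of an input DSM tile, a validity mask (1 = data, 0 = gap) and a ground-truth tile. The function names `augment` and `hole_weighted_l1` and the weight `hole_weight` are hypothetical choices made for this example.

```python
import numpy as np

def augment(dsm_in, mask, dsm_gt, rng):
    """Apply the same random rotation/flip to input, mask and ground truth.

    Assumption: tiles are 2D arrays of identical shape. Using one shared
    transform keeps the three arrays pixel-aligned.
    """
    k = int(rng.integers(0, 4))      # number of 90 degree rotations
    flip = bool(rng.integers(0, 2))  # mirror left-right or not
    out = []
    for a in (dsm_in, mask, dsm_gt):
        a = np.rot90(a, k)
        if flip:
            a = np.fliplr(a)
        out.append(np.ascontiguousarray(a))
    return tuple(out)

def hole_weighted_l1(pred, gt, mask, hole_weight=6.0):
    """L1 loss with a separate (hypothetical) weight for gap pixels.

    mask is 1 where the input had data and 0 inside gaps.
    """
    abs_err = np.abs(pred - gt)
    hole = mask == 0
    valid_term = abs_err[~hole].mean() if (~hole).any() else 0.0
    hole_term = abs_err[hole].mean() if hole.any() else 0.0
    return float(valid_term + hole_weight * hole_term)
```

Weighting the gap pixels more strongly than the valid pixels is one common way to focus an inpainting loss on the region that actually has to be predicted. Which loss components the thesis adapted is not specified here.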

Figure 1: Demonstration of implemented scripts

The ground truth (GT) used for training was generated by dense image matching of a high-overlap block of airborne images. The input data was generated by matching a sparse version of the very same image block, obtained by removing single strips. The reduced redundancy results in data gaps in the DSMs, which then served as network input. The data used in the training, validation and test phases came from spatially separated areas.
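How a pair of sparse and dense DSM tiles could be turned into a training sample is sketched below. This is an assumption-laden example rather than the thesis pipeline: it assumes that gaps are stored as NaN and that heights are shifted by the tile mean so that the absolute terrain elevation does not matter. The function name `make_sample` is hypothetical.

```python
import numpy as np

def make_sample(dsm_sparse, dsm_gt):
    """Build (input, mask, target, offset) from a gappy tile and its GT tile.

    Assumptions: both tiles are 2D float arrays of the same shape and
    missing heights in dsm_sparse are encoded as NaN.
    """
    mask = np.isfinite(dsm_sparse).astype(np.float32)  # 1 = measured, 0 = gap
    # Remove the tile's mean height, computed from measured pixels only.
    offset = float(np.nanmean(dsm_sparse)) if mask.any() else 0.0
    # Gap pixels get a neutral value of 0; the mask tells the network where they are.
    inp = np.where(mask > 0, dsm_sparse - offset, 0.0).astype(np.float32)
    target = (dsm_gt - offset).astype(np.float32)
    return inp, mask, target, offset
```

The returned `offset` has to be added back to a prediction to obtain absolute heights again.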

Results

Comparing the predictions (Figure 2) to the GT, we made several observations:

Topology and geometry are mostly correct in the predictions
No systematic offset was observed
The prediction completes surfaces from only a few existing data points
The prediction performs a kind of background interpolation
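An observation such as "no systematic offset" can be checked numerically by looking at the mean signed height difference inside the gaps. The sketch below shows one way to do this. It is an illustrative example, not the evaluation code of the thesis, and the function name `hole_error_stats` is hypothetical.

```python
import numpy as np

def hole_error_stats(pred, gt, mask):
    """Error statistics between prediction and GT, restricted to gap pixels.

    mask is 1 where the input had data and 0 inside gaps. A bias close to
    zero indicates that there is no systematic height offset.
    """
    hole = (mask == 0) & np.isfinite(gt)
    if not hole.any():
        raise ValueError("no gap pixels with valid ground truth")
    diff = pred[hole] - gt[hole]
    return {
        "bias": float(diff.mean()),                  # mean signed error
        "mae": float(np.abs(diff).mean()),           # mean absolute error
        "rmse": float(np.sqrt((diff ** 2).mean())),  # root mean square error
        "n": int(hole.sum()),                        # number of evaluated pixels
    }
```

Restricting the statistics to gap pixels avoids diluting the result with pixels that were already present in the input.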

Figure 2: Comparison GT ↔ Prediction

On the other hand, we observed artifacts in the predictions for large patches of missing data in the input images. This effect is shown in Figure 3. The reason for these artifacts is that the network did not see a sufficient number of holes of comparable size during training.
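Whether holes of a given size are well represented in the training data can be checked by measuring the size of each connected gap region in the validity masks. The following is a minimal sketch under the assumption that the mask is a 2D array with 0 marking gaps. The function name `hole_sizes` is hypothetical and 4-connectivity is an arbitrary choice for this example.

```python
from collections import deque

import numpy as np

def hole_sizes(mask):
    """Return the pixel counts of all 4-connected gap regions, largest first."""
    hole = np.asarray(mask) == 0
    seen = np.zeros(hole.shape, dtype=bool)
    rows, cols = hole.shape
    sizes = []
    for r0, c0 in zip(*np.nonzero(hole)):
        if seen[r0, c0]:
            continue
        # Breadth-first flood fill starting from an unvisited gap pixel.
        seen[r0, c0] = True
        queue, size = deque([(r0, c0)]), 0
        while queue:
            r, c = queue.popleft()
            size += 1
            for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                if (0 <= nr < rows and 0 <= nc < cols
                        and hole[nr, nc] and not seen[nr, nc]):
                    seen[nr, nc] = True
                    queue.append((nr, nc))
        sizes.append(size)
    return sorted(sizes, reverse=True)
```

Comparing a histogram of these sizes for the training tiles with the sizes found in the test tiles would show whether large holes are underrepresented.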

Figure 3: Demonstration of artifacts

Figure 4 visualizes the differences between our CNN completion and data interpolated by the commercial software package SURE (denoted IP). Additionally, we indicate the overlap scenario used to generate the input data, e.g. 60/60 denotes 60% forward and 60% side overlap in the flight configuration. We observed the following points:

The house gorge (the narrow gap between buildings, marked in the top right image) is always better in the prediction than in the IP
The geometry reconstruction of the house (marked pink, center image) is better in the prediction
The reconstruction of isolated vegetation (pink, top center image) is better in the IP

Figure 4: Comparison predictions ↔ interpolated DSMs

Figure 5 shows that the prediction clearly outperforms the interpolation in cases such as courtyards. This holds for all regions in which the hole is surrounded by structures that are higher than the hole itself.
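A plausible explanation is that an interpolation only sees the heights on the border of a hole. The toy example below illustrates this with a one-dimensional height profile and plain linear interpolation. The numbers are made up, and the interpolation used by SURE is not necessarily linear, so this is only a sketch of the effect.

```python
import numpy as np

# Hypothetical height profile in metres across a courtyard:
# roofs at 10 m on both sides, courtyard ground at 0 m in between.
gt = np.array([10.0, 10.0, 10.0, 0.0, 0.0, 0.0, 0.0, 10.0, 10.0, 10.0])

# Assume the whole courtyard falls into the data gap.
valid = np.ones(gt.size, dtype=bool)
valid[3:7] = False

# Fill the gap by linear interpolation between the surrounding valid heights.
x = np.arange(gt.size)
filled = gt.copy()
filled[~valid] = np.interp(x[~valid], x[valid], gt[valid])

# The courtyard is filled at roof level, so the error equals the building height.
max_error = float(np.abs(filled - gt).max())
```

Because every valid neighbour lies at roof level, the interpolated surface bridges the courtyard at 10 m. A learned model can, in principle, exploit the context to predict a lower surface inside such a hole.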

Figure 5: Comparison predictions ↔ interpolated DSMs

Conclusion

The method investigated in this thesis for completing DSMs using a CNN looks very promising. However, to achieve results that are robust and accurate enough for commercial or other applications that demand real precision, further work is needed.
It should be noted that there is most likely a lot of untapped potential. For example, no hyperparameter optimization was performed, the influence of other loss components could be investigated and, as one of the most promising additions, color information could be taken into account as well.
