Facade Detection in Urban Scenes using Semantic Image Segmentation

André Wiedemann

Duration of the thesis: 6 months
Completion: May 2017
Supervisor: Dipl.-Ing. Patrick Tutzauer
Examiner: Prof. Dr.-Ing. Norbert Haala

Full text: https://goo.gl/7RrQ4C

Motivation

Over the past several years, realistic 3D city models have come into high demand. Producing such models requires large amounts of high-resolution façade imagery with little distortion, ideally captured from a street-level viewpoint. Current convolutional neural networks perform well but rely on millions of manually labelled, and therefore expensive, training images.

In this thesis, a working prediction pipeline based on methods of semantic image segmentation is built that recognizes and extracts facades from urban scenes containing facades and other typical objects. Supervised object classifiers and detectors that have proven to work efficiently on other objects, such as faces, vehicles or street signs, are tested. For training, a comparatively small set of 651 images is used.

The prediction task is split into two sub-problems that are addressed individually:

detection: Detect the location of façade information and highlight it with a surrounding rectangular bounding box
segmentation: Decide for every pixel whether it belongs to the object class ‘façade’
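
The two sub-problems are closely related: once a pixelwise façade mask exists, a bounding box can be read off it. The following is a minimal sketch of that relationship, not the method used in the thesis. The function name `bounding_box`, the list-of-lists mask format and the toy mask are assumptions for illustration, and the sketch encloses all façade pixels in a single box rather than one box per façade.

```python
from typing import List, Optional, Tuple

Mask = List[List[int]]  # assumed format: 1 = façade pixel, 0 = anything else


def bounding_box(mask: Mask) -> Optional[Tuple[int, int, int, int]]:
    """Return (top, left, bottom, right) of the smallest rectangle that
    encloses all façade pixels, or None if the mask contains none."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for row in mask for c, value in enumerate(row) if value]
    if not rows:
        return None
    return min(rows), min(cols), max(rows), max(cols)


# Toy 4x5 mask (hypothetical data, not taken from the thesis).
mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
box = bounding_box(mask)
print(box)
```

Coordinates are inclusive row and column indices. An image with several separate facades would need a connected-component step before this function is applied.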

Figure 1: Streetview Image (left) as used for training or validation and Ground Truth Label Image (right) representing areas of object class “façade” and other typical object classes from LabelMe Façade (Fröhlich, 2010)

Implementation

In this thesis, Cascade Object Detectors and Bag of Features (BoF) / Support Vector Machine (SVM) classifiers are trained and tested for their suitability for detecting and extracting façade information. The feature types used are Haar-like (HAAR), Histogram of Oriented Gradients (HOG) and Speeded Up Robust Features (SURF). To support the segmentation process, image segmentation tools such as Region Growing and Superpixels are also tested.
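
To make the Bag of Features idea concrete, the sketch below shows the encoding step only: each local descriptor is assigned to its nearest "visual word", and the image or region is represented by the normalized histogram of word counts. This is an illustrative sketch under assumptions, not the thesis code. In a real pipeline the vocabulary would typically come from k-means clustering of local descriptors (for example SURF) and the histogram would be passed to an SVM; here the vocabulary, the 2-D descriptors and the function names are hypothetical.

```python
from typing import List, Sequence

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_word(descriptor: Vector, vocabulary: List[Vector]) -> int:
    """Index of the visual word closest to the descriptor."""
    return min(range(len(vocabulary)),
               key=lambda i: squared_distance(descriptor, vocabulary[i]))


def bof_histogram(descriptors: List[Vector], vocabulary: List[Vector]) -> List[float]:
    """Normalized histogram of visual-word occurrences for one image or region."""
    counts = [0] * len(vocabulary)
    for descriptor in descriptors:
        counts[nearest_word(descriptor, vocabulary)] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(vocabulary)
    return [count / total for count in counts]


# Toy vocabulary of three visual words and four descriptors (hypothetical values).
vocabulary = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
descriptors = [(0.1, 0.0), (0.9, 1.2), (1.1, 0.8), (4.0, 6.0)]
histogram = bof_histogram(descriptors, vocabulary)
print(histogram)
```

The resulting fixed-length vector is what makes images with different numbers of keypoints comparable for a classifier.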

Three main products are derived:

Pixelwise Facade Segmentation: A pipeline for pixelwise façade segmentation is developed that achieves an F₁ score of 74.5%. It is based on two binary BoF/SVM classifiers, Superpixels, and Morphological Operations for image optimization.
Efficient, Accurate Facade Detection: A detector based on two binary BoF/SVM classifiers and Superpixels is provided. It finds bounding boxes that surround all facades within an image with a SuccessfulDetectionRate of 94.6% while also being time-efficient.
Real-time Façade Detection: A very fast, real-time capable detector is provided that detects windows using a Cascade detector with Haar-like features. It reduces detection time enormously while falling only slightly short of the detection quality of the BoF/SVM detection pipeline, which makes it a viable alternative for huge datasets or real-time applications.
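
Part of the speed of cascade detectors with Haar-like features comes from the integral image (summed-area table), which lets any rectangle sum be computed from four lookups. The sketch below illustrates that general technique with a single two-rectangle feature. It is not the detector from the thesis; the function names and the toy image are assumptions for illustration.

```python
from typing import List

Image = List[List[int]]


def integral_image(img: Image) -> List[List[int]]:
    """Summed-area table with one extra zero row and column at the top/left."""
    height, width = len(img), len(img[0])
    ii = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        for x in range(width):
            ii[y + 1][x + 1] = img[y][x] + ii[y][x + 1] + ii[y + 1][x] - ii[y][x]
    return ii


def rect_sum(ii: List[List[int]], top: int, left: int, height: int, width: int) -> int:
    """Sum of the pixels in a rectangle, using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def haar_two_rect_horizontal(ii: List[List[int]], top: int, left: int,
                             height: int, width: int) -> int:
    """Two-rectangle Haar-like feature: left half minus right half.
    The window width is assumed to be even."""
    half = width // 2
    return (rect_sum(ii, top, left, height, half)
            - rect_sum(ii, top, left + half, height, half))


# Toy 3x4 image with a bright left half and a dark right half (hypothetical data).
img = [
    [9, 9, 1, 1],
    [9, 9, 1, 1],
    [9, 9, 1, 1],
]
ii = integral_image(img)
feature = haar_two_rect_horizontal(ii, 0, 0, 3, 4)
print(feature)
```

A large absolute response indicates a strong vertical edge inside the window, which is the kind of contrast pattern a window frame against a wall tends to produce.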

Figure 2: RGB Image (left), Ground Truth (middle left); Result Segmentation (middle right, right)
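
A pixelwise F₁ score summarizes how well a result segmentation matches the ground truth. The sketch below uses the standard definition, the harmonic mean of precision and recall over façade pixels. How the thesis aggregates the score across images is not stated here, so this single-image version, the function name `f1_score` and the toy masks are assumptions.

```python
from typing import List

Mask = List[List[int]]  # assumed format: 1 = façade pixel, 0 = anything else


def f1_score(predicted: Mask, truth: Mask) -> float:
    """Pixelwise F1 score of a predicted mask against a ground-truth mask."""
    tp = fp = fn = 0
    for predicted_row, truth_row in zip(predicted, truth):
        for p, t in zip(predicted_row, truth_row):
            if p and t:
                tp += 1      # façade pixel correctly predicted
            elif p and not t:
                fp += 1      # predicted façade, actually background
            elif t and not p:
                fn += 1      # missed façade pixel
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy 2x3 masks (hypothetical data, not taken from the thesis).
truth = [[1, 1, 0],
         [1, 1, 0]]
predicted = [[1, 1, 1],
             [1, 0, 0]]
score = f1_score(predicted, truth)
print(round(score, 3))
```

In this toy case three façade pixels are found, one is missed and one background pixel is wrongly marked, so precision and recall are both 0.75.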

Figure 3: Results of Detection (red: ground truth, green: Detection using BoF Classifier and blue: Detection using Cascade Detector)
