Dense Image Matching in Close Range Applications
Image matching methods can be used to reconstruct 3D surfaces from images. By finding corresponding pixels between images taken from different viewpoints, the depth can be estimated by intersecting the viewing rays in space. Recently, methods have been developed that reconstruct 3D data without initial information by using feature points to find and describe pixels and their correspondences reliably. Structure and Motion reconstruction methods employ these feature points to determine a sparse 3D point cloud and the position and rotation of the camera in space for each image.
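The viewing ray intersection can be illustrated with a minimal sketch. It assumes each corresponding pixel has already been converted into a ray (camera centre plus direction) using the known orientation; because two rays from real measurements rarely meet exactly, the sketch returns the midpoint of the shortest segment between them. The function names are hypothetical and not taken from the project.

```python
def dot(a, b):
    """Dot product of two 3-vectors given as sequences."""
    return sum(x * y for x, y in zip(a, b))


def intersect_rays(c1, d1, c2, d2):
    """Midpoint of the shortest segment between two viewing rays.

    c1, c2: camera centres; d1, d2: ray directions (any non-zero length).
    """
    w = [p - q for p, q in zip(c1, c2)]
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are (nearly) parallel; depth is undefined")
    s = (b * e - c * d) / denom  # parameter along the first ray
    t = (a * e - b * d) / denom  # parameter along the second ray
    p1 = [ci + s * di for ci, di in zip(c1, d1)]
    p2 = [ci + t * di for ci, di in zip(c2, d2)]
    return [(u + v) / 2 for u, v in zip(p1, p2)]


# Two cameras one unit apart, both observing the same surface point.
point = intersect_rays([0, 0, 0], [0.5, 0.2, 2.0], [1, 0, 0], [-0.5, 0.2, 2.0])
```

The parallel-ray check reflects the geometry: the smaller the angle between the rays, the less well the depth is determined.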
This orientation information can be used to perform a dense image matching step, which determines a correspondence for almost every pixel in the image. This leads to a very dense point cloud. The key challenge of this dense image matching step is the resolution of ambiguities. Since grey values are usually not unique in an image, a method has to be found for the reliable determination of correspondences. One solution is the Semi Global Matching algorithm, proposed by Heiko Hirschmüller in 2005. It approximates a global smoothness constraint on the observed surface over the image. By enforcing smoothness along paths through the image in different directions, not only are ambiguities resolved, but small untextured gaps can also be filled. In addition, the noise in the point cloud is reduced.
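How smoothness along a path resolves ambiguities can be sketched for a single scanline and a single direction. The recurrence below follows the commonly cited Semi Global Matching formulation: a small penalty for a disparity change of one step and a larger penalty for any bigger jump. The penalty values `p1` and `p2`, the cost layout `cost[x][d]` and the function names are assumptions for illustration; a full implementation sums such path costs over several directions before selecting the disparity.

```python
INF = float("inf")


def aggregate_path(cost, p1=1.0, p2=8.0):
    """Aggregate matching costs along one left-to-right path.

    cost[x][d] is the matching cost of pixel x at disparity index d.
    Returns the path costs in the same layout.
    """
    n_disp = len(cost[0])
    agg = [list(cost[0])]  # the first pixel has no predecessor
    for x in range(1, len(cost)):
        prev = agg[-1]
        prev_min = min(prev)
        row = []
        for d in range(n_disp):
            best = min(
                prev[d],                                   # same disparity
                prev[d - 1] + p1 if d > 0 else INF,         # change by -1
                prev[d + 1] + p1 if d + 1 < n_disp else INF,  # change by +1
                prev_min + p2,                             # larger jump
            )
            # Subtracting prev_min keeps the values from growing along the path.
            row.append(cost[x][d] + best - prev_min)
        agg.append(row)
    return agg


def winner_takes_all(agg):
    """Pick the disparity index with the lowest aggregated cost per pixel."""
    return [min(range(len(row)), key=row.__getitem__) for row in agg]


# The middle pixel is ambiguous on its own (all costs equal);
# the path constraint pulls it to its neighbour's disparity.
costs = [[5, 0, 5], [2, 2, 2], [5, 0, 5]]
disparities = winner_takes_all(aggregate_path(costs))  # [1, 1, 1]
```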
However, the Semi Global Matching algorithm was initially developed for the processing of aerial imagery, where the depth variation is small in relation to the acquisition distance. This is not the case for close range applications, where large depth variations can occur. In order to process high resolution imagery from close range scenes, we modified the Semi Global Matching algorithm.
Usually, all possible correspondences are evaluated for each pixel and for each possible depth within a certain range. Since this range is very large for close range imagery, the requirements regarding computation time and physical memory are very high. Thus, we implemented a hierarchical approach, where the depth search range is reduced for each pixel individually using an image pyramid. At low resolutions the number of possible depths is also significantly smaller, which enables very fast computations. By using this result as initial information at the next higher level of resolution, the depth range can be narrowed down successively.
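The idea of passing a per-pixel search range from one pyramid level to the next can be sketched as follows. The sketch assumes a pyramid with a factor of two between levels, so that a disparity found at the coarse level doubles at the finer one, and it assumes a fixed safety `margin` around that prediction. Both choices, like the function name, are illustrative and not necessarily those of the implementation described here.

```python
def refine_search_ranges(coarse_disp, margin=2):
    """Per-pixel (min, max) disparity ranges for the next finer pyramid level.

    coarse_disp: list of rows with the disparity found at the coarse level.
    Each coarse pixel covers a 2x2 block of fine pixels, and its disparity
    doubles together with the image resolution.
    """
    fine = []
    for row in coarse_disp:
        fine_row = []
        for d in row:
            rng = (2 * d - margin, 2 * d + margin)
            fine_row.extend([rng, rng])          # two fine columns
        fine.extend([fine_row, list(fine_row)])  # two fine rows
    return fine


ranges = refine_search_ranges([[3, 10]], margin=2)
# Each fine pixel now tests 2 * margin + 1 = 5 candidates
# instead of the full disparity range of the scene.
```

With this scheme the cost per pixel depends on the margin rather than on the total depth extent of the scene, which is what makes large depth variations tractable.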
By matching across many images rather than a single stereo pair, redundant observations become available for each point on the object surface. These multiple image space observations of the same object point enable a triangulation with noise reduction and outlier rejection. Consequently, a reliable, low noise point cloud can be derived with quality information for each point.
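A triangulation from redundant observations can be sketched as a least-squares intersection of several viewing rays, followed by a simple rejection loop. The sketch assumes each observation is given as a ray (camera centre, direction), measures quality as the distance between the estimated point and each ray, and drops the worst ray until all remaining rays lie within a hypothetical threshold `max_dist`. This is one possible scheme, not necessarily the one used in the project.

```python
import math


def _unit(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]


def _solve3(a, b):
    """Solve a 3x3 linear system by Gauss-Jordan elimination with pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate ray configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][3] / m[i][i] for i in range(3)]


def _closest_point(rays):
    """Point minimising the sum of squared distances to all rays."""
    a = [[0.0] * 3 for _ in range(3)]
    b = [0.0] * 3
    for centre, direction in rays:
        d = _unit(direction)
        for i in range(3):
            for j in range(3):
                p = (1.0 if i == j else 0.0) - d[i] * d[j]  # I - d d^T
                a[i][j] += p
                b[i] += p * centre[j]
    return _solve3(a, b)


def _distance(point, ray):
    """Perpendicular distance from a point to a ray's supporting line."""
    centre, direction = ray
    d = _unit(direction)
    w = [p - c for p, c in zip(point, centre)]
    along = sum(x * y for x, y in zip(w, d))
    return math.sqrt(max(0.0, sum(x * x for x in w) - along * along))


def triangulate(rays, max_dist=0.01):
    """Return (point, number of rays kept, RMS ray distance)."""
    rays = list(rays)
    while True:
        point = _closest_point(rays)
        dists = [_distance(point, r) for r in rays]
        worst = max(range(len(rays)), key=dists.__getitem__)
        if dists[worst] <= max_dist or len(rays) <= 2:
            rms = math.sqrt(sum(x * x for x in dists) / len(dists))
            return point, len(rays), rms
        del rays[worst]  # reject the least consistent observation
```

The number of rays kept and the RMS distance can serve as per-point quality values: a point supported by many consistent rays is more trustworthy than one that survives on only two.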
Within a cultural heritage data recording project, this modified dense image matching method was used in combination with an extended Structure and Motion technique to acquire point clouds with a resolution and accuracy below 1 mm. The objects were the two tympana of the Royal Palace of Amsterdam, which have a complex relief surface covering an area of about 125 m². Within 10 days, 10,000 images were acquired using the multi-camera rig described above, which we specifically designed for acquiring such complex geometries at short distance. Finally, about 2 billion 3D points were computed.
|Figure 1: Dense image matching: finding corresponding pixels and intersecting their viewing rays in space.|
|Figure 2: Point cloud extract from the Amsterdam project. About 2 billion 3D points were derived from about 10,000 images with sub-mm resolution and accuracy.|
Wenzel, K., Abdel-Wahab, M., Cefalu, A. & Fritsch, D.: A Multi-Camera System for Efficient Point Cloud Recording in Close Range Applications. LC3D workshop, Berlin, December 2011.