In many domains, including remote sensing, obtaining high-quality ground truth data is both essential and resource-intensive. Paid crowdsourcing offers a scalable alternative by enabling the collection of large quantities of annotations from non-expert contributors. While individual annotations may be noisy or inconsistent, the collective intelligence of the crowd, often referred to as the wisdom of crowds, can be harnessed to produce reliable and accurate results, provided that robust integration methods are used. Crowdsourcing is particularly relevant for remote sensing tasks that require manual segmentation, such as object delineation or classification, where expert annotations are costly and time-consuming. By aggregating multiple independent segmentations from non-expert crowdworkers, we can mitigate individual errors, estimate the underlying consensus, and move toward high-quality data suitable for both training and evaluation. One such example is shown below.
On the left, individual segmentations provided by different contributors are displayed in yellow, illustrating the diversity and variability of the raw input data. The integration process uses a pixel-wise majority vote: each pixel is included in the final outline if it lies inside at least 50% of the input polygons. The resulting consensus segmentation is shown in red, providing a single outline that represents the integrated result of all crowdworkers. The plot on the right shows the quality of the integration, measured as the Intersection over Union (IoU) between the integrated outline and the ground truth, as the number of input segmentations n increases. Two key effects emerge. At first, the quality of the integration improves steadily as more acquisitions are added, because the redundancy helps correct individual errors and outliers. For higher values of n, the curve approaches a plateau, indicating that the result becomes increasingly stable and reaches a saturation point. Further additions provide only minor refinements.
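The pixel-wise majority vote and the IoU measure described above can be sketched in a few lines of Python. This is a minimal illustration, assuming the crowd polygons have already been rasterized into boolean masks of equal size. The function names `majority_vote` and `iou` and the toy 4x4 masks are hypothetical and are not taken from the processing pipeline behind the figure.

```python
import numpy as np


def majority_vote(masks, threshold=0.5):
    """Keep each pixel that is covered by at least `threshold` of the masks."""
    stack = np.asarray(masks, dtype=bool)
    if stack.ndim != 3 or stack.shape[0] == 0:
        raise ValueError("expected a non-empty sequence of 2D masks")
    votes = stack.sum(axis=0)  # number of masks covering each pixel
    return votes >= threshold * stack.shape[0]


def iou(mask_a, mask_b):
    """Intersection over Union of two boolean masks."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 1.0  # two empty masks are treated as identical
    return float(np.logical_and(a, b).sum() / union)


# Toy data (assumed for illustration): the true object is the central 2x2 block.
truth = np.zeros((4, 4), dtype=bool)
truth[1:3, 1:3] = True

exact = truth.copy()                  # a careful contributor
too_wide = np.zeros((4, 4), dtype=bool)
too_wide[1:3, 1:4] = True             # overshoots by one column
too_tall = np.zeros((4, 4), dtype=bool)
too_tall[0:3, 1:3] = True             # overshoots by one row

masks = [exact, too_wide, too_tall]
consensus = majority_vote(masks)

print("IoU of a single noisy mask:", iou(too_wide, truth))
print("IoU of the consensus:", iou(consensus, truth))
```

In this toy case the two overshooting masks err in different directions, so their extra pixels never reach the 50% threshold and the consensus recovers the true outline. The `threshold` argument is exposed only to show where a stricter or looser vote could be plugged in.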
Although it is only a simple example, this demonstration highlights the value of redundant outline collection and subsequent data integration. While the integrated result eventually plateaus at an IoU of around 0.90, the improvement in quality over the non-integrated results is substantial. Notably, the resulting shape remains coherent even though all acquisitions were included in the integration, including those containing noisy or spam-like input. Current research aims to further optimize this integration process by mitigating the saturation effect through adaptive filtering, variable thresholds, and other refinement strategies.

David Collmar
M.Sc., Research Associate