Bernhard, Maximilian (2024): Deep learning methods for image recognition in remote sensing. Dissertation, LMU München: Fakultät für Mathematik, Informatik und Statistik
Abstract
In recent years, the prevalence of remote sensing images has increased substantially due to technological advances in capturing devices such as satellites and unmanned aerial vehicles. Automatically processing and interpreting remote sensing images has many important applications, including urban planning, disaster management, and climate research. Deep learning methods are highly popular for this task, as they allow training powerful neural networks that excel at extracting valuable information from images. A common obstacle to the practical application of deep learning is the lack of large amounts of high-quality data annotations, which are typically needed for supervision during training. This issue is further exacerbated by the fact that remote sensing applications generally require expert annotators with domain knowledge. To reduce the need for manual labeling by experts in remote sensing, it is therefore crucial to optimize the use of available data and annotations. Moreover, it is essential to effectively leverage the characteristics of remote sensing images. For instance, remote sensing images can be combined with data from other sources or modalities based on their geolocation. This enables the utilization of external information, either as model input or for supervision. Furthermore, the geolocation and the spatiotemporal context of the images themselves can be leveraged. Taking spatiotemporal information into account is often worthwhile, as many real-world concepts, such as land cover, exhibit strong spatial and temporal correlations. In this dissertation, we present solutions to various image recognition problems in remote sensing. Throughout, we harness the characteristics of remote sensing images and address the specific challenges they pose. With regard to object detection, we propose a method for correcting imprecise point annotations.
In doing so, we tackle the problem of misalignments between images and annotations that often arise when images and annotations from different sources are merged based on their geolocation. As an extension, we present a novel method for robust training of object detectors with noisy and incomplete annotations. Next, we show that the appropriate use of image metadata, such as geolocation and capture time, enhances the quality of pseudo-labels and thus the overall model performance in semi-supervised image classification. For the task of change detection with bi-temporal remote sensing images, we propose to use existing land cover maps as additional model input to condition the prediction on the previous land cover type and improve the detection of changed areas. Finally, since a reliable evaluation of learning-based methods is critical in both research and practice, and different applications may focus on different aspects of prediction quality, we introduce a set of detailed and informative error metrics for evaluating semantic segmentation models. Overall, the methods presented in this dissertation cover the tasks of image classification, object detection, semantic segmentation, and change detection, as well as learning settings with full, incomplete, and noisy supervision.
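The abstract does not spell out how metadata enters the pseudo-labeling step. As a rough illustration only, the sketch below reweights a model's class probabilities with a metadata-derived class prior (e.g. from geolocation or capture time) before applying a confidence threshold; the function name, the form of the prior, and the threshold value are assumptions, not the dissertation's actual method:

```python
import numpy as np

def select_pseudo_labels(probs, prior, threshold=0.9):
    """Select confident pseudo-labels for unlabeled images.

    probs : (N, C) softmax outputs of the classifier.
    prior : (N, C) per-image class prior derived from metadata such as
            geolocation and capture time (hypothetical construction);
            a uniform prior leaves the predictions unchanged.
    Returns the selected labels and a boolean mask of kept images.
    """
    # Reweight the predicted distribution with the prior, then renormalize.
    weighted = probs * prior
    weighted /= weighted.sum(axis=1, keepdims=True)
    # Keep only predictions whose reweighted confidence clears the threshold.
    confidence = weighted.max(axis=1)
    labels = weighted.argmax(axis=1)
    keep = confidence >= threshold
    return labels[keep], keep
```

With a uniform prior, this reduces to plain confidence thresholding; an informative prior can promote a borderline prediction to a usable pseudo-label (or suppress an implausible one), which is one way metadata could improve pseudo-label quality.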
Document type: | Dissertation (LMU München) |
---|---|
Subject areas: | 000 General works, computer science, information science > 004 Computer science |
Faculties: | Fakultät für Mathematik, Informatik und Statistik |
Language of the thesis: | English |
Date of oral examination: | 9 December 2024 |
First referee: | Schubert, Matthias |
MD5 checksum of the PDF file: | 4b68f6c94b23b0d7257c71918da482ca |
Shelf mark of the printed edition: | 0001/UMC 30918 |
ID code: | 34627 |
Deposited on: | 19 Dec 2024 12:31 |
Last modified: | 19 Dec 2024 12:31 |