Rehms, Raphael (2025): Addressing uncertainty and complex data structures through Bayesian and classical approaches. Dissertation, LMU München: Faculty of Mathematics, Computer Science and Statistics
PDF: Rehms_Raphael.pdf (9 MB)
Abstract
Answering questions with real-world data can pose considerable challenges for analysts. The data are often of questionable quality, exhibit high uncertainty, and may originate from multiple sources. Such data are structurally complex with respect to the underlying data-generating process. This cumulative thesis addresses these issues in the context of selected research areas. The thesis is divided into two parts: the first introduces the necessary methodology, and the second presents the four contributing articles.
The methodological part opens with an introductory chapter on statistical inference and probabilistic modeling, presenting Bayesian inference using Markov chain Monte Carlo as a general approach and the generalized linear model (GLM) as a classical statistical method. It then provides three chapters of methodological background in selected areas of research. The first area is infectious disease modeling, with a focus on time-shifting operations that can be used to combine information from multiple time series. This lays the foundation for the first two contributions, which employ a Bayesian hierarchical approach to infectious disease modeling in the context of COVID-19 data. Next, an overview of measurement error theory is presented, followed by a discussion of how the Bayesian approach handles measurement error. The third contribution demonstrates the flexibility of the Bayesian approach by applying it to data from the Wismut cohort, which are considerably complex and require the use of multiple measurement error models. Finally, the last methodological chapter covers federated learning and privacy-preserving methods. The fourth contribution builds on this background to develop an algorithm that validates learned classification models through a GLM-based formulation of the ROC curve.
An underlying theme of this thesis is the notion of uncertainty. In the Bayesian approach, uncertainty is encoded through the formulation of prior distributions and the overall probabilistic model, which inherently propagates and quantifies the uncertainty in a posterior distribution. The fourth contribution leverages the concept of uncertainty to preserve individual privacy by adding calibrated noise.
| Item Type: | Theses (Dissertation, LMU Munich) |
|---|---|
| Keywords: | Bayes, Uncertainty, Probabilistic modeling, Epidemiology, Privacy |
| Subjects: | 300 Social sciences; 300 Social sciences > 310 General statistics |
| Faculties: | Faculty of Mathematics, Computer Science and Statistics |
| Language: | English |
| Date of oral examination: | 30 June 2025 |
| 1. Referee: | Küchenhoff, Helmut |
| MD5 Checksum of the PDF-file: | b0b9e44d32735d0648b150f16423694d |
| Signature of the printed copy: | 0001/UMC 31294 |
| ID Code: | 35512 |
| Deposited On: | 11 Jul 2025 11:46 |
| Last Modified: | 11 Jul 2025 12:33 |