Flexible distributed lag models and their application to geophysical data
Regression models with lagged covariate effects are widely used in biostatistical and geophysical data analysis. In the difficult and important field of earthquake research, strong, long-lasting rainfall is assumed to be one of many complex trigger factors that lead to earthquakes. Geophysicists attribute the rain effect to an increase in pore pressure caused by the infiltration of rain water over a long period. A sensible statistical regression model for the influence of rain on the number of earthquakes on day t must therefore contain rain information for day t and for the preceding days t − 1 to t − L. In the first part of this thesis, the specific shape of the lagged rain influence on the number of earthquakes is modeled. A novel penalty structure for interpretable and flexible estimates of lag coefficients based on spline representations is presented. The penalty structure enforces smoothness of the resulting lag course and shrinks the last lag coefficient towards zero via a ridge penalty. This additional ridge penalty also addresses a problem neglected in previous work: with it, a suboptimal choice of the lag length L is no longer critical. We propose the use of longer lags, as our simulations indicate that superfluous coefficients are correctly estimated close to zero. We provide a user-friendly implementation of our flexible distributed lag (FDL) approach that can be used directly in the established R package mgcv for the estimation of generalized additive models. This allows our approach to be included immediately in complex additive models for generalized responses, even in hierarchical or longitudinal data settings, making use of established, stable, and well-tested algorithms.
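The two ingredients described above, a design matrix holding rain on day t and on the L preceding days, and a penalty that combines smoothness with a ridge term on the last lag coefficient, can be illustrated with a minimal sketch. It is not the thesis implementation (which is provided for the R package mgcv and works on spline representations); it is a plain-Python illustration that, as a simplifying assumption, penalizes the lag coefficients directly. The function names, the second-order difference penalty, and the two smoothing parameters `lam_smooth` and `lam_ridge` are all hypothetical choices made for this example.

```python
def lag_matrix(x, L):
    """Row for day t holds x[t], x[t-1], ..., x[t-L]; days t < L are dropped."""
    return [[x[t - l] for l in range(L + 1)] for t in range(L, len(x))]


def difference_penalty(n, order=2):
    """Return D'D, where D is the order-th difference operator on n coefficients."""
    # Start from the identity and difference neighbouring rows `order` times.
    D = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(order):
        D = [[D[i + 1][j] - D[i][j] for j in range(n)] for i in range(len(D) - 1)]
    return [
        [sum(D[k][i] * D[k][j] for k in range(len(D))) for j in range(n)]
        for i in range(n)
    ]


def fdl_penalty(L, lam_smooth, lam_ridge, order=2):
    """Smoothness penalty on all L+1 lag coefficients plus a ridge on the last one.

    Assumption of this sketch: the penalty acts on the lag coefficients themselves.
    """
    n = L + 1
    P = difference_penalty(n, order)
    S = [[lam_smooth * P[i][j] for j in range(n)] for i in range(n)]
    S[n - 1][n - 1] += lam_ridge  # shrink the coefficient of lag L towards zero
    return S


def quadratic_form(S, beta):
    """Evaluate the penalty term beta' S beta."""
    Sb = [sum(S[i][j] * beta[j] for j in range(len(beta))) for i in range(len(beta))]
    return sum(b * s for b, s in zip(beta, Sb))
```

For a linearly decaying lag course such as `[3, 2, 1]`, the second differences vanish, so only the ridge term on the last coefficient contributes to the penalty. This shows how the ridge part pulls the end of the lag course towards zero independently of the smoothness part.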
We demonstrate the performance and utility of the proposed flexible distributed lag model in a case study on (micro-)earthquake data from Mount Hochstaufen, Bavaria, focusing on the specific shape of the lagged rain influence on the occurrence of earthquakes at different depths. The complex meteorological and geophysical data set was collected and provided by the Geophysical Observatory of the Ludwig-Maximilians University Munich. The benefit of flexible distributed lag modeling is shown in a detailed simulation study. In the second part of the thesis, the penalization concept is extended to lagged nonlinear covariate influence. Here, we extend the approach of Gasparrini et al. (2010), which has so far been unpenalized. Detailed simulation studies again illustrate the benefits of the penalty structure. The flexible distributed lag nonlinear model is applied to data from the volcano Merapi in Indonesia, collected and provided by the Geophysical Observatory in Fürstenfeldbruck. In this data set, the specific shape of the lagged rain influence on the occurrence of block-and-ash flows is examined.
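The step from the linear to the nonlinear lag model in the second part can be written schematically as follows. The notation is an assumption made for illustration and is not taken from the thesis: \eta_t denotes the linear predictor for day t, \alpha an intercept, x_{t-l} the rain amount l days before day t, \beta_l the lag coefficients, and f a smooth bivariate function of covariate value and lag in the spirit of Gasparrini et al. (2010).

```latex
% Linear distributed lag predictor (first part):
\eta_t = \alpha + \sum_{l=0}^{L} \beta_l \, x_{t-l}

% Distributed lag nonlinear predictor (second part):
\eta_t = \alpha + \sum_{l=0}^{L} f(x_{t-l},\, l)
```

The linear model is recovered as the special case f(x, l) = \beta_l x.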
earthquakes, flexible lags, generalized additive models, lagged linear covariates, penalized splines
Obermeier, Viola
2014
English
Universitätsbibliothek der Ludwig-Maximilians-Universität München
Obermeier, Viola (2014): Flexible distributed lag models and their application to geophysical data. Dissertation, LMU München: Fakultät für Mathematik, Informatik und Statistik