Brockhaus, Sarah (2016): Boosting functional regression models. Dissertation, LMU München: Faculty of Mathematics, Computer Science and Statistics 

PDF
Brockhaus_Sarah.pdf 6MB 
Abstract
In functional data analysis, the data consist of functions that are defined on a continuous domain. In practice, functional variables are observed on some discrete grid. Regression models are important tools to capture the impact of explanatory variables on the response and are challenging in the case of functional data. In this thesis, a generic framework is proposed that includes scalar-on-function, function-on-scalar and function-on-function regression models. Within this framework, quantile regression models, generalized additive models and generalized additive models for location, scale and shape can be derived by optimizing the corresponding loss functions. The additive predictors can contain a variety of covariate effects, for example linear, smooth and interaction effects of scalar and functional covariates. In the first part, the functional linear array model is introduced. This model is suited for responses observed on a common grid and covariates that do not vary over the domain of the response. Array models achieve computational efficiency by taking advantage of the Kronecker product in the design matrix. In the second part, the focus is on models without array structure, which are capable of capturing situations with responses observed on irregular grids and/or time-varying covariates. This includes in particular models with historical functional effects. For situations in which the functional response and covariate are both observed over the same time domain, a historical functional effect induces an association between response and covariate such that only past values of the covariate influence the current value of the response. In this model class, effects with more general integration limits, like lag and lead effects, can be specified. In the third part, the framework is extended to generalized additive models for location, scale and shape where all parameters of the conditional response distribution can depend on covariate effects.
The conditional response distribution can be modeled very flexibly by relating each distribution parameter through a link function to a linear predictor. For all parts, estimation is conducted by a component-wise gradient boosting algorithm. Boosting is an ensemble method that pursues a divide-and-conquer strategy for optimizing an expected loss criterion. This provides great flexibility for the regression models. For example, minimizing the check function yields quantile regression, and minimizing the negative log-likelihood yields generalized additive models for location, scale and shape. The estimator is updated iteratively to minimize the loss criterion along the direction of steepest descent. The model is represented as a sum of simple (penalized) regression models, the so-called base-learners, that separately fit the negative gradient in each step, where only the best-fitting base-learner is updated. Component-wise boosting allows for high-dimensional data settings and for automatic, data-driven variable selection. To adapt boosting for regression with functional data, the loss is integrated over the domain of the response and base-learners suited to functional effects are implemented. To enhance the availability of functional regression models for practitioners, a comprehensive implementation of the methods is provided in the R add-on package FDboost. The flexibility of the regression framework is highlighted by several applications from different fields. Some features of the functional linear array model are illustrated using data on curing resin for car production, heat values of fossil fuels and Canadian climate data. These require function-on-scalar, scalar-on-function and function-on-function regression models, respectively. The methodological developments for non-array models are motivated by biotechnological data on fermentations, modeling a key process variable by a historical functional model.
The motivating application for functional generalized additive models for location, scale and shape is a time series on stock returns where expectation and standard deviation are modeled depending on scalar and functional covariates.
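The component-wise boosting step described in the abstract can be illustrated with a minimal sketch. It assumes a scalar response, squared-error loss and unpenalized simple linear base-learners (one per centered covariate); the thesis instead integrates the loss over the domain of a functional response and uses penalized base-learners. The function name `componentwise_l2_boost` and its parameters `n_iter` and `nu` are hypothetical and are not part of the FDboost package.

```python
def componentwise_l2_boost(columns, y, n_iter=100, nu=0.1):
    """Component-wise gradient boosting with squared-error loss (sketch).

    columns: list of covariate columns (assumed centered), each a list of floats.
    y:       list of scalar responses.
    n_iter:  number of boosting iterations (hypothetical parameter name).
    nu:      step length in (0, 1] (hypothetical parameter name).
    Returns the offset and one aggregated coefficient per covariate.
    """
    n = len(y)
    offset = sum(y) / n                 # start from the mean of the response
    fit = [offset] * n
    coefs = [0.0] * len(columns)

    for _ in range(n_iter):
        # Negative gradient of 0.5 * (y - f)^2 with respect to f: the residuals.
        u = [yi - fi for yi, fi in zip(y, fit)]

        # Fit every base-learner separately to the negative gradient.
        best_j, best_beta, best_rss = None, 0.0, float("inf")
        for j, x in enumerate(columns):
            sxx = sum(xi * xi for xi in x)
            if sxx == 0.0:
                continue                # a constant-zero column carries no signal
            beta = sum(xi * ui for xi, ui in zip(x, u)) / sxx
            rss = sum((ui - beta * xi) ** 2 for xi, ui in zip(x, u))
            if rss < best_rss:
                best_j, best_beta, best_rss = j, beta, rss

        if best_j is None:
            break

        # Update only the best-fitting base-learner, shrunk by the step length.
        coefs[best_j] += nu * best_beta
        x = columns[best_j]
        fit = [fi + nu * best_beta * xi for fi, xi in zip(fit, x)]

    return offset, coefs
```

Covariates that are never selected keep a coefficient of exactly zero, which is how the data-driven variable selection mentioned in the abstract arises when boosting is stopped early.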
Item Type:  Thesis (Dissertation, LMU Munich) 

Keywords:  functional data analysis, functional regression, gradient boosting, variable selection 
Subjects:  000 Computers, Information and General Reference; 000 Computers, Information and General Reference > 004 Data processing computer science
Faculties:  Faculty of Mathematics, Computer Science and Statistics 
Language:  English 
Date Accepted:  31. August 2016 
1. Referee:  Greven, Sonja 
Persistent Identifier (URN):  urn:nbn:de:bvb:19-198685
MD5 Checksum of the PDF-file:  8c0945108291f9def51d98850b38e180
Signature of the printed copy:  0001/UMC 24113 
ID Code:  19868 
Deposited On:  27. Sep 2016 09:08 
Last Modified:  27. Sep 2016 09:08 