New approaches in network data analysis
This thesis introduces two extensions to statistical approaches that improve modeling and estimation in network data analysis. The first contributing publication focuses on cross-sectional networks based on Markov graphs, whereas the second takes the evolution of networks with dynamic structure into account. Analyzing network data is challenging in terms of both modeling and computation because the data sets are large and dependent. The dissertation starts with an overview of network data in general and introduces the well-known framework of exponential random graph models, with its dependence assumptions, estimation routines, challenges, and solution approaches. The introduction closes with the main ideas of dynamic network models, the profile likelihood approach for multivariate counting processes for network data, and the analogy between the Cox proportional hazards model and the Poisson model with semiparametric estimation.
The first part of this work proposes an extension for sampling Markov graphs, a subclass of exponential random graph models, in parallel in order to reduce computation time in simulation-based routines. Estimating network models, especially for large networks, is demanding and requires Markov chain Monte Carlo simulations. The publication recommends exploiting the conditional independence structure in networks so that draws can be made in parallel. The idea is applied to a large ego network of Facebook friendships, where an additional log transformation of the network statistics addresses degeneracy problems. The extension is implemented in the open source R package pergm, which is available on GitHub; the thesis gives a short introduction to its main functionalities.
The second part of this work focuses on dynamic networks. In contrast to the cross-sectional networks of the first part, the analysis of longitudinal network data concentrates on modeling changes in relations. To this end, a profile likelihood approach for modeling time-stamped event data is combined with a semiparametric approach that includes covariates built from the network history. This flexible semiparametric approach is applicable to large networks: because of the analogy between the Cox proportional hazards model and the Poisson model with an artificial data structure, standard software can be used for estimation. The extended method is applied to collaboration data on patents that inventors residing in Germany submitted jointly between 2000 and 2013. Based on penalized smoothing techniques, we include time-dependent network statistics and exogenous covariates to capture internal and external effects.
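The parallel-draw idea from the first part can be illustrated with a minimal Python sketch. It is not the pergm implementation and makes several assumptions: an undirected graph stored as a 0/1 adjacency matrix, a Markov graph model with edge, two-star, and triangle statistics, and a Gibbs sampler. Under Markov dependence, two dyads that share no node are conditionally independent given the rest of the graph, so all dyads of a node-disjoint matching can be drawn from the same current state. The function names (`change_stats`, `random_matching`, `parallel_gibbs_sweep`) are hypothetical.

```python
import math
import random


def change_stats(adj, i, j):
    """Change in (edges, two-stars, triangles) when dyad (i, j) goes from 0 to 1."""
    deg_i = sum(adj[i]) - adj[i][j]  # degree of i, not counting the tie to j
    deg_j = sum(adj[j]) - adj[i][j]  # degree of j, not counting the tie to i
    common = sum(
        1 for m in range(len(adj))
        if m != i and m != j and adj[i][m] and adj[j][m]
    )
    return (1, deg_i + deg_j, common)


def logistic(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def random_matching(n, rng):
    """Pair up shuffled nodes so that no two dyads share a node."""
    nodes = list(range(n))
    rng.shuffle(nodes)
    return [(nodes[k], nodes[k + 1]) for k in range(0, n - 1, 2)]


def parallel_gibbs_sweep(adj, theta, rng):
    """Update all dyads of one random matching from the same current state.

    theta holds the (assumed) coefficients for edges, two-stars, triangles.
    Because the dyads are node-disjoint, each conditional probability is
    unaffected by the other draws, so the loop bodies could run in parallel.
    """
    dyads = random_matching(len(adj), rng)
    probs = []
    for i, j in dyads:
        delta = change_stats(adj, i, j)
        eta = sum(t * d for t, d in zip(theta, delta))
        probs.append(logistic(eta))
    for (i, j), p in zip(dyads, probs):
        tie = 1 if rng.random() < p else 0
        adj[i][j] = adj[j][i] = tie
    return dyads
```

Repeated calls to `parallel_gibbs_sweep` with fresh matchings would give one possible sampler. The two loops are written sequentially here only for readability.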
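The "artificial data structure" behind the Cox and Poisson analogy can be sketched in the same hedged way. The Python example below assumes a piecewise constant baseline hazard and simple right-censored durations rather than the thesis's network event data. Each observation is split into one row per time interval, with a 0/1 response and the log of the time at risk as an offset. A Poisson regression on such rows with interval-specific intercepts reproduces the piecewise exponential likelihood. The function name, the row fields, and the input layout are hypothetical.

```python
import math


def expand_to_poisson_rows(subjects, cut_points):
    """Split each subject's follow-up into interval rows for a Poisson model.

    subjects: list of (subject_id, time, event, covariate) tuples, where
        event is 1 if the event was observed at `time` and 0 if censored.
    cut_points: increasing interval end points; the last one is assumed to
        be at least as large as every observed time.
    """
    rows = []
    for subject_id, time, event, covariate in subjects:
        start = 0.0
        for k, end in enumerate(cut_points):
            if time <= start:
                break  # subject is no longer at risk in this interval
            exposure = min(time, end) - start
            rows.append({
                "id": subject_id,
                "interval": k,  # would enter the model as a factor
                "y": int(event == 1 and time <= end),  # event in this interval?
                "offset": math.log(exposure),  # log time at risk
                "x": covariate,
            })
            start = end
    return rows
```

Rows of this shape could then be passed to any standard Poisson regression routine with `offset` as the offset term, which is the sense in which standard software becomes usable.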
Bauer, Verena
2019
English
Universitätsbibliothek der Ludwig-Maximilians-Universität München
Bauer, Verena (2019): New approaches in network data analysis. Dissertation, LMU München: Fakultät für Mathematik, Informatik und Statistik