Modeling spatio-temporal enhancer expression in Drosophila segmentation
Gene Regulation, Thermodynamic Models, Drosophila Segmentation, Enhancer Expression, Model Training
Reutern, Marc Nikolaus Max von
2017
English
Universitätsbibliothek der Ludwig-Maximilians-Universität München
Reutern, Marc Nikolaus Max von (2017): Modeling spatio-temporal enhancer expression in Drosophila segmentation. Dissertation, LMU München: Faculty of Chemistry and Pharmacy
PDF: Reutern_Marc_N_M_von.pdf (11MB)

Abstract

Thermodynamic models are a key tool for investigating transcription control in the segmentation of Drosophila. By modeling the binding of transcription factors (TFs) to DNA sequences and their effect on transcription initiation, thermodynamic models predict expression patterns directly from the enhancer sequence, given the binding motifs and concentrations of all relevant TFs. However, many parameters of the model are impossible to measure, e.g. the interaction strength between the TFs and the core promoter. Hence, it is necessary to estimate these parameters by training the thermodynamic model on known data, i.e. by fitting the model predictions to already measured expression patterns of known enhancers. The quality of the training result, evaluated on independent test data, indicates how well the model recapitulates the biological measurements, which can help us to improve our understanding of the underlying mechanisms of transcription control. Therefore, proper parameter training is a crucial step in the construction of thermodynamic models. In this thesis, I develop a thorough parameter training setup that uses the limited amount of available training data efficiently and reduces parameter overfitting significantly. This optimized training setup combines a global parameter training algorithm, data augmentation (a method to artificially increase the amount of training data), and parameter penalties (a technique to limit overfitting). I apply the novel training setup to expand the scope of thermodynamic models of Drosophila segmentation considerably by incorporating additional TFs into the model, and to investigate many aspects of transcription control in greater detail than was possible before. These aspects include the specificity of TF binding motifs, the nature of TF cooperativity, and DNA accessibility.
With the help of the impact score developed here, I assess the influence of all relevant TFs in silico, delineate the cooperativity range of the key TF bcd, and determine the importance of weak binding sites. Finally, I develop and discuss two alternative models of transcription control that lack the prediction quality of thermodynamic models but nevertheless give valuable insights into the architectural principles of enhancers. This project is part of a larger effort to advance our current understanding of transcription regulation by reconstructing the segmentation network of Drosophila in silico. The results of this thesis facilitate future modeling efforts by optimally leveraging the available data as well as by improving our understanding of thermodynamic models.
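
To make the idea of a thermodynamic model and a parameter penalty concrete, the following is a minimal sketch, not the model or training setup used in the thesis. It assumes independent binding sites, a single TF-promoter interaction factor per site, and a quadratic penalty on the logarithm of those factors; the names `expression_probability`, `penalized_loss`, `q_sites`, `omegas`, `q_promoter` and `strength` are all hypothetical.

```python
import math
from itertools import product


def expression_probability(q_sites, omegas, q_promoter):
    """Probability that the core promoter is occupied (toy model).

    q_sites    -- statistical weight of each bound site (concentration * affinity)
    omegas     -- assumed TF-promoter interaction factor per site
                  (>1 activates, <1 represses, 1 is neutral)
    q_promoter -- statistical weight of the occupied promoter on its own
    """
    z_off = 0.0  # summed weights of configurations with an empty promoter
    z_on = 0.0   # summed weights of configurations with an occupied promoter
    # Enumerate every bound/unbound configuration of the sites.
    for state in product((0, 1), repeat=len(q_sites)):
        w_off = 1.0
        w_on = q_promoter
        for bound, q, omega in zip(state, q_sites, omegas):
            if bound:
                w_off *= q
                w_on *= q * omega
        z_off += w_off
        z_on += w_on
    return z_on / (z_on + z_off)


def penalized_loss(predictions, targets, omegas, strength=0.1):
    """Sum of squared errors plus an assumed quadratic penalty on log(omega)."""
    sse = sum((p - t) ** 2 for p, t in zip(predictions, targets))
    # The penalty pulls each interaction factor toward the neutral value 1.
    penalty = strength * sum(math.log(w) ** 2 for w in omegas)
    return sse + penalty


if __name__ == "__main__":
    activator = expression_probability([1.0], [3.0], 1.0)
    repressor = expression_probability([1.0], [0.0], 1.0)
    print(activator, repressor)
```

In this sketch the interaction factors play the role of the unmeasurable parameters mentioned in the abstract: they would be fitted by minimizing `penalized_loss` over measured expression patterns, with `strength` controlling how strongly large interaction factors are discouraged.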