Analysing and quantitatively modelling nucleosome binding preferences
The main emphasis of my work as a PhD student was the analysis and prediction of nucleosome positioning, focusing on the role that sequence features play.

Part I gives a broad overview of nucleosomes and defines important technical terms. It then describes and reviews experiments that measure nucleosome positioning, as well as bioinformatic methods that learn the sequence preferences of nucleosomes in order to predict their positioning.

Part II describes a collaboration project with the Gaul lab, in which I analyzed MNase-Seq measurements of nucleosomes in Drosophila. The original intention was to investigate the extent to which experimental biases influence the measurements. We extended the analysis to categorize and explore fragile, average and resistant nucleosome populations. I focused on the relation between nucleosome fragility and the sequence landscape, especially at promoters and enhancers. Analyzing the partial unwrapping of nucleosomes genome-wide, I found that the G+C ratio is a determinant of asymmetric unwrapping. An analysis of histone modifications, which was also part of this collaboration, is excluded from this work because of its low relevance to the rest of the presented work.

Part III describes my main project, the development of a probabilistic nucleosome-position prediction method. I developed a maximum likelihood approach to learn a biophysical model of nucleosome binding. By including the low positional resolution of MNase-Seq and the sequence bias of CC-Seq in the likelihood, I could separate them from the nucleosome binding preferences and learn highly correlated nucleosome binding energy models. My analysis shows that nucleosomes have a position-specific binding preference and might be uninfluenced by G+C content or even disfavor it, contrary to the consensus in the literature.

Part IV describes further analyses I performed during my time as a PhD student that are not part of any planned publications. The main topics are: ancillary elements of my main project, unsuccessful attempts to correct experimental biases, an analysis of the quality of experimental measurements, and the adaptation of my probabilistic nucleosome-position prediction method to work with occupancy measurements.

Lastly, I give a general outlook that reflects on my results and discusses next steps, such as ways to improve my method further. I excluded two collaboration projects in which I participated from this thesis because they are still ongoing: a systematic analysis of how the core promoter sequence influences gene expression in Drosophila, and the development of an experiment to measure nucleosome occupancy more precisely.
Bioinformatics, biology, computer science, biochemistry, nucleosome, probabilistic model, binding preference, sequence features, nucleosome fragility, nucleosome positioning, thermodynamic model, MNase-seq
Heron, Mark Eric Leslie
2017
English
Universitätsbibliothek der Ludwig-Maximilians-Universität München
Heron, Mark Eric Leslie (2017): Analysing and quantitatively modelling nucleosome binding preferences. Dissertation, LMU München: Faculty of Chemistry and Pharmacy