Logo Logo
Help
Contact
Switch language to German
Recognition of short functional motifs in protein sequences
Recognition of short functional motifs in protein sequences
The main goal of this study was to develop a method for computational de novo prediction of short linear motifs (SLiMs) in protein sequences that would provide advantages over existing solutions for the users. The users are typically biological laboratory researchers, who want to elucidate the function of a protein that is possibly mediated by a short motif. Such a process can be subcellular localization, secretion, post-translational modification or degradation of proteins. Conducting such studies only with experimental techniques is often associated with high costs and risks of uncertainty. Preliminary prediction of putative motifs with computational methods, them being fast and much less expensive, provides possibilities for generating hypotheses and therefore, more directed and efficient planning of experiments. To meet this goal, I have developed HH-MOTiF – a web-based tool for de novo discovery of SLiMs in a set of protein sequences. While working on the project, I have also detected patterns in sequence properties of certain SLiMs that make their de novo prediction easier. As some of these patterns are not yet described in the literature, I am sharing them in this thesis. While evaluating and comparing motif prediction results, I have identified conceptual gaps in theoretical studies, as well as existing practical solutions for comparing two sets of positional data annotating the same set of biological sequences. To close this gap and to be able to carry out in-depth performance analyses of HH-MOTiF in comparison to other predictors, I have developed a corresponding statistical method, SLALOM (for StatisticaL Analysis of Locus Overlap Method). It is currently available as a standalone command line tool.
Motif search, protein sequences, positional data, statistical analysis
Prytuliak, Roman
2018
English
Universitätsbibliothek der Ludwig-Maximilians-Universität München
Prytuliak, Roman (2018): Recognition of short functional motifs in protein sequences. Dissertation, LMU München: Faculty of Biology
[img]
Preview
PDF
Prytuliak_Roman.pdf

2MB

Abstract

The main goal of this study was to develop a method for computational de novo prediction of short linear motifs (SLiMs) in protein sequences that would provide advantages over existing solutions for the users. The users are typically biological laboratory researchers, who want to elucidate the function of a protein that is possibly mediated by a short motif. Such a process can be subcellular localization, secretion, post-translational modification or degradation of proteins. Conducting such studies only with experimental techniques is often associated with high costs and risks of uncertainty. Preliminary prediction of putative motifs with computational methods, them being fast and much less expensive, provides possibilities for generating hypotheses and therefore, more directed and efficient planning of experiments. To meet this goal, I have developed HH-MOTiF – a web-based tool for de novo discovery of SLiMs in a set of protein sequences. While working on the project, I have also detected patterns in sequence properties of certain SLiMs that make their de novo prediction easier. As some of these patterns are not yet described in the literature, I am sharing them in this thesis. While evaluating and comparing motif prediction results, I have identified conceptual gaps in theoretical studies, as well as existing practical solutions for comparing two sets of positional data annotating the same set of biological sequences. To close this gap and to be able to carry out in-depth performance analyses of HH-MOTiF in comparison to other predictors, I have developed a corresponding statistical method, SLALOM (for StatisticaL Analysis of Locus Overlap Method). It is currently available as a standalone command line tool.