Piskol, Robert (2011): Structural and population genetic determinants of RNA secondary structure evolution. Dissertation, LMU München: Fakultät für Biologie



Since their discovery, RNA molecules have been shown to perform functions that extend far beyond their initially ascribed role as intermediates in protein biosynthesis. Noncoding RNAs (ncRNAs) are involved in fundamental cellular processes, including the regulation of gene expression and the maintenance of genome stability. In most cases, the biogenesis or function of an RNA molecule is only possible if the molecule folds into a characteristic two- and three-dimensional shape through the formation of intra-molecular base pairs. Mutations in the primary sequence that disrupt these paired regions can cause conformational changes that impair the molecule's ability to function correctly. However, compensatory mutations can restore the original conformation. Under the influence of evolutionary forces such as mutation and selection, a paired region (helix) will accumulate these nucleotide double-substitutions (covariations). The chance of a substitution, and thus the rate of evolution, depends on different properties of the helix. We developed a logistic regression approach to analyze the evolutionary dynamics of RNA secondary structures (Piskol and Stephan, 2008). This method was applied to a set of computationally predicted RNA secondary structures in vertebrate introns. Our aim was to discover structural and population genetic determinants of the compensatory mutation rate in RNA molecules. In agreement with Kimura’s (1985) model of compensatory evolution, we found that the physical distance between pairing nucleotides has a negative influence on the occurrence of covariations. Furthermore, we found that longer pairing regions tolerate more wobbles (GU base pairs) and mismatches, and ultimately also contain more covariations.
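To make the idea of a logistic regression on helix properties concrete, the following minimal sketch fits the probability that a helix contains a covariation as a function of two predictors. It is an illustration only and not the model of Piskol and Stephan (2008): the choice of predictors (log10 distance between pairing nucleotides, helix length in units of 10 bp), the function name fit_logistic, the gradient-ascent fitting procedure, and all numbers in TOY_CELLS are assumptions made up for this example.

```python
import math

# Hypothetical toy data, NOT results from the dissertation. Each tuple is
# (log10 distance between pairing nucleotides,
#  helix length in units of 10 bp,
#  number of helices observed,
#  number of those helices with at least one covariation).
TOY_CELLS = [
    (1.0, 0.5, 10, 6), (1.0, 1.0, 10, 8),
    (2.0, 0.5, 10, 4), (2.0, 1.0, 10, 6),
    (3.0, 0.5, 10, 2), (3.0, 1.0, 10, 4),
]


def sigmoid(z):
    """Logistic function mapping a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(cells, steps=20000, learning_rate=0.1):
    """Fit P(covariation) = sigmoid(b0 + b1*distance + b2*length).

    Uses plain batch gradient ascent on the binomial log-likelihood and
    returns [intercept, coef_distance, coef_length].
    """
    total = sum(n for _, _, n, _ in cells)
    beta = [0.0, 0.0, 0.0]
    for _ in range(steps):
        grad = [0.0, 0.0, 0.0]
        for dist, length, n, k in cells:
            p = sigmoid(beta[0] + beta[1] * dist + beta[2] * length)
            resid = k - n * p  # observed minus expected covariation count
            grad[0] += resid
            grad[1] += resid * dist
            grad[2] += resid * length
        beta = [b + learning_rate * g / total for b, g in zip(beta, grad)]
    return beta


intercept, coef_distance, coef_length = fit_logistic(TOY_CELLS)
print(f"distance coefficient: {coef_distance:.3f}")
print(f"length coefficient:   {coef_length:.3f}")
```

With these invented counts, the fitted distance coefficient is negative and the length coefficient is positive, which mirrors the direction of the effects reported above. The magnitudes carry no biological meaning.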
The position-wise analysis of all nucleotides in paired regions revealed that covariations occur preferentially at the helix ends, whereas wobbles and mismatches are more frequent in the middle of a helix. This pattern is largely determined by the GC content. We extended the study described above from structured regions in introns of vertebrate genes to folded RNA molecules that are scattered across the whole nuclear genomes of drosophilids (Drosophila melanogaster/D. simulans) and hominids (human/chimp). For these molecules we estimated genome-wide selective constraints (Piskol and Stephan, 2011). In comparison to neutrally evolving regions of the same genomes, we observed substantially reduced rates of substitution at paired and unpaired sites of folded molecules. We found that more than 90% of novel mutations in ncRNAs are removed from the sequence by purifying selection. These values exceed estimates that were previously obtained for amino-acid-changing positions of protein-coding genes. This points to the overall importance of many folded genomic regions, which carry quite diverse functions (correct splicing, splicing efficiency, protein localization, RNA editing). We did not find significant differences in constraints between folded molecules based on their genomic location (coding/noncoding, genic/intergenic, UTR/non-UTR). Therefore, the restricted evolution of ncRNAs seems to be driven mostly by the basic need of the molecule to remain in its original conformation through continuous maintenance of pairings between nucleotides, and only to a smaller extent by the location of the molecule in the genome. In addition, a comparison of selection coefficients between drosophilids and hominids provided evidence for the impact of the effective population size on RNA evolution: constraints were significantly higher in drosophilids than in hominids, and the difference between the two groups was larger at unpaired than at paired positions.
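The statement that more than 90% of novel mutations are removed by purifying selection corresponds to a selective constraint above 0.9. A common way to express constraint is one minus the ratio of the substitution rate in the region of interest to the rate in a neutrally evolving reference. The sketch below assumes this simple ratio form; the estimator actually used in the dissertation may include further corrections (for example for multiple hits or mutation rate variation), and the function name and example counts are hypothetical.

```python
def selective_constraint(focal_subs, focal_sites, neutral_subs, neutral_sites):
    """Return C = 1 - (focal rate / neutral rate).

    C is read as the fraction of new mutations removed by purifying
    selection, under the assumption that the neutral reference shares
    the mutation rate of the focal region.
    """
    if focal_sites <= 0 or neutral_sites <= 0:
        raise ValueError("site counts must be positive")
    if neutral_subs <= 0:
        raise ValueError("neutral reference needs at least one substitution")
    focal_rate = focal_subs / focal_sites
    neutral_rate = neutral_subs / neutral_sites
    return 1.0 - focal_rate / neutral_rate


# Hypothetical counts, for illustration only: 5 substitutions per 1000
# folded sites versus 50 substitutions per 1000 neutral sites.
example_constraint = selective_constraint(5, 1000, 50, 1000)
print(f"constraint = {example_constraint:.2f}")
```

In this made-up example the focal rate is one tenth of the neutral rate, so the constraint is 0.9.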
Motivated by the evidence for a potential role of the effective population size in the evolution of ncRNA molecules, we explored this topic in greater detail. The effective population size of a species (N_e) is a fundamental quantity in population genetics. Its impact on the efficacy of selection has been the focus of many theoretical and empirical studies in recent years. Yet the effect of N_e has mostly been investigated in connection with independently evolving sites in a genome, while its impact on the evolution of epistatic interactions is not well understood. Our previous work (described above) showed evidence for the role of N_e in the evolution of ncRNA molecules, which consist to a large extent of coevolving regions. To increase our knowledge of the impact of N_e on evolution at independently evolving and coevolving sites, we focused on transfer RNAs (tRNAs), a class of RNA molecules with well-studied structure and function. We compared the rates of evolution at paired and unpaired positions in orthologous tRNAs of various vertebrate and Drosophila species. To this end, we chose pairs of species that differ in their long-term effective population sizes and compared the level of selective constraint between them. These pairs included human/macaque, macaque/marmoset, dog/cat, chicken/zebra finch, mouse/rat, D. melanogaster/D. yakuba, and D. melanogaster/D. simulans. Indeed, we were able to detect differences in selective constraints between species pairs of different N_e. These differences can be explained well by theoretical predictions for the evolution of independently evolving and coevolving sites. Specifically, we found that constraints in orthologous tRNAs of a species pair increase with increasing long-term N_e. Moreover, the effect of N_e is stronger at unpaired (independently evolving) sites than at paired (coevolving) sites.
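Why constraint at independently evolving sites should rise with N_e can be illustrated with the standard diffusion approximation for the fixation probability of a new mutation. Relative to a neutral mutation, a mutation with selection coefficient s is substituted at rate S / (1 - exp(-S)), where S = 4 N_e s. The sketch below uses this textbook result for a single site; it does not model paired (coevolving) sites, it is not necessarily the theoretical framework used in the dissertation, and the N_e and s values are hypothetical.

```python
import math


def relative_substitution_rate(n_e, s):
    """Substitution rate relative to neutrality: S / (1 - exp(-S)), S = 4*n_e*s."""
    scaled = 4.0 * n_e * s
    if scaled == 0.0:
        return 1.0  # neutral limit
    if scaled < -700.0:
        return 0.0  # strongly deleterious: avoids overflow in exp
    # -expm1(-S) equals 1 - exp(-S) and stays accurate for small |S|.
    return scaled / -math.expm1(-scaled)


def expected_constraint(n_e, s):
    """Fraction of mutations with coefficient s that fail to reach fixation."""
    return 1.0 - relative_substitution_rate(n_e, s)


# Hypothetical values: the same weakly deleterious mutation (s = -1e-6)
# in a small and in a large population.
for n_e in (10_000, 1_000_000):
    print(n_e, round(expected_constraint(n_e, -1e-6), 3))
```

For a fixed negative s, the expected constraint is close to zero when |S| is much smaller than 1 and approaches 1 as N_e grows, which is the qualitative pattern described above for unpaired sites.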
Furthermore, for all species pairs we identified sets of orthologous tRNAs with high structural similarity to tRNAs from all major kingdoms of life ('core' sets), and tRNAs with lower similarity ('peripheral' sets). We found the core sets to be under strong overall constraints and subject to only a negligible effect of N_e. In the peripheral sets, however, we discovered a strong influence of N_e on constraints. We also investigated whether the difference in N_e between autosomes and the X chromosome, which arises because males carry only one copy of the X chromosome, has an effect on evolutionary rates. We were able to show that constraints are more relaxed in X-linked tRNAs.