Statistical inference of complex demographic models in Drosophila melanogaster and two wild tomato species
The aim of this thesis was to use the genealogical information contained in the genetic variation of natural populations to describe the evolutionary history of particular species.

In the first project we analysed the colonization process that brought Drosophila melanogaster from Africa to Asia. Southeast Asian populations of the fruit fly D. melanogaster differ from ancestral African and derived European populations in several morphological characteristics. It has been argued that this morphological differentiation could be the result of an early colonization of Southeast Asia that predated the migration of D. melanogaster to Europe after the last glacial period (around 10,000 years ago). To investigate the colonization of Southeast Asia, we collected nucleotide polymorphism data for 200 X-linked and 50 autosomal loci from a Malaysian population. We analysed this new SNP dataset jointly with existing data from an African and a European population using an Approximate Bayesian Computation (ABC) approach. Contrasting different demographic models of these three populations, we find no evidence for an early divergence between the African and the Asian populations. Rather, we show that Asian and European populations of D. melanogaster share a non-African most recent common ancestor (MRCA) that existed about 2,500 years ago.

The second project of my PhD thesis analyses the importance of seed dormancy at the population level in two wild tomato species. Seed banks, that is, plant seeds that remain in the soil for several generations before germinating, are of practical importance in conservation biology because they diminish the immediate ecological impact of habitat fragmentation and help prevent species extinction. From an evolutionary perspective, seed banks increase the genetic diversity of plant populations and buffer the effect of varying climatic conditions by magnifying the effects of good years and dampening the effects of bad years. In this study we estimate the germination rates of two wild tomato species (Solanum chilense and Solanum peruvianum), which are found in a wide range of habitats in western South America, by combining DNA sequence data and a coalescent model with ecological data. We develop an ABC framework that integrates ecological information on above-ground population census sizes in order to estimate seed bank and metapopulation parameters for each species. We provide the first evidence that it is possible to disentangle the effect of metapopulation structure from that of the seed bank on the effective population size, and to obtain accurate estimates of germination rates based on a coalescent model.

The third and last project of this thesis is the development of a computational tool that facilitates the analysis of nucleotide polymorphism datasets in an ABC framework. With the availability of whole-genome sequence data, biologists are able to test hypotheses regarding the demography of populations. Furthermore, advances in ABC methodology allow demographic inference to be performed in a simple framework using summary statistics. We present msABC, a coalescent-based program that facilitates the simulation of multi-locus data suitable for an ABC analysis. msABC is based on Hudson's ms algorithm, which is used extensively for simulating neutral demographic histories of populations. The original algorithm has been extended so that sample size may vary among loci, missing data can be incorporated in simulations and calculations, and a multitude of summary statistics for single or multiple populations are generated. The source code of msABC is available at http://bio.lmu.de/~pavlidis/msabc.
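The ABC idea that runs through all three projects (simulate data under the coalescent, reduce it to summary statistics, keep the parameter values whose simulations resemble the observed data) can be illustrated with a minimal rejection-sampling sketch. This is not the thesis pipeline and does not use msABC or ms: it assumes a toy simulator for a single constant-size population, a uniform prior on the scaled mutation rate theta, and a single summary statistic (the mean number of segregating sites per locus). All function names, the prior, and the tolerance are hypothetical choices made for illustration.

```python
import random
import statistics


def poisson(mean, rng):
    """Draw a Poisson variate by counting unit-rate arrivals before `mean`."""
    count, t = 0, rng.expovariate(1.0)
    while t < mean:
        count += 1
        t += rng.expovariate(1.0)
    return count


def simulate_segregating_sites(theta, sample_size, rng):
    """Segregating sites at one neutral, non-recombining locus (toy model).

    Time is measured in units of 2N generations, so k lineages coalesce at
    rate k*(k-1)/2 and mutations fall at rate theta/2 per lineage.
    """
    sites = 0
    for k in range(sample_size, 1, -1):
        t_k = rng.expovariate(k * (k - 1) / 2)  # time with k lineages
        sites += poisson(theta / 2 * k * t_k, rng)
    return sites


def summary(theta, sample_size, n_loci, rng):
    """Summary statistic: mean number of segregating sites across loci."""
    counts = [simulate_segregating_sites(theta, sample_size, rng)
              for _ in range(n_loci)]
    return statistics.fmean(counts)


def abc_rejection(observed, prior_max, n_draws, tolerance,
                  sample_size, n_loci, rng):
    """Keep prior draws whose simulated summary lies within `tolerance`."""
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(0.0, prior_max)  # assumed uniform prior
        simulated = summary(theta, sample_size, n_loci, rng)
        if abs(simulated - observed) <= tolerance:
            accepted.append(theta)
    return accepted


if __name__ == "__main__":
    rng = random.Random(2011)
    # Pseudo-observed data generated with a known theta, for illustration only.
    observed = summary(5.0, 10, 30, rng)
    posterior = abc_rejection(observed, 20.0, 1000, 1.0, 10, 30, rng)
    if posterior:
        print(len(posterior), "accepted; posterior mean theta:",
              statistics.fmean(posterior))
```

In a real analysis the toy simulator would be replaced by a multi-locus, multi-population coalescent simulator, and the single statistic by a vector of summary statistics; the accept/reject step stays the same.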
Demography, Population Genetics, Approximate Bayesian Computation, Drosophila melanogaster, Solanum peruvianum, Solanum chilense
Laurent, Stefan
2011
English
Universitätsbibliothek der Ludwig-Maximilians-Universität München
Laurent, Stefan (2011): Statistical inference of complex demographic models in Drosophila melanogaster and two wild tomato species. Dissertation, LMU München: Faculty of Biology