Measuring primate gene expression evolution using high throughput transcriptomics and massively parallel reporter assays
A key question in biology is how a single genome sequence can give rise to the great cellular diversity present in multicellular organisms. Enabled by the sequencing revolution, RNA sequencing (RNA-seq) has emerged as a central tool to measure transcriptome-wide gene expression levels. More recently, single-cell RNA-seq (scRNA-seq) was introduced and is becoming a feasible alternative to the more established bulk sequencing. While many different methods have been proposed, a thorough optimization of established protocols can improve robustness, sensitivity, scalability and cost effectiveness. Towards this goal, I contributed to optimizing the scRNA-seq method "Single Cell RNA Barcoding and sequencing" (SCRB-seq) and to publishing an improved version that uses optimized reaction conditions and molecular crowding (mcSCRB-seq). mcSCRB-seq achieves higher sensitivity at lower cost per cell and shows the highest RNA capture rate when compared with other published methods. We next sought a direct comparison with other scRNA-seq protocols within the Human Cell Atlas (HCA) benchmarking effort, in which we used mcSCRB-seq to profile a common reference sample that included heterogeneous cell populations from different sources. Transferring the knowledge acquired on scRNA-seq methods to bulk RNA-seq led to the development of prime-seq, a sensitive, robust and cost-efficient bulk RNA-seq protocol that can be performed in any molecular biology laboratory. Using power simulations, we compared data generated with prime-seq to data from the gold-standard method TruSeq and found that the statistical power to detect differentially expressed genes is comparable, at 40-fold lower cost. While gene expression is an informative phenotype, the regulation that leads to different phenotypes is still poorly understood.
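To illustrate what a power simulation estimates, the following is a minimal sketch and not the procedure used in the thesis. It assumes normally distributed log2 expression values for a single gene, a Welch t statistic, and a fixed critical value of 2.0 as a rough stand-in for a proper significance threshold; real RNA-seq power analyses typically use count-based models. The function names and all parameter values are hypothetical.

```python
import random
import statistics
from math import sqrt


def welch_t(a, b):
    """Welch's t statistic for two independent samples."""
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    return (statistics.mean(a) - statistics.mean(b)) / sqrt(
        var_a / len(a) + var_b / len(b)
    )


def estimate_power(n_per_group, log2_fold_change, sd,
                   n_sim=2000, t_crit=2.0, seed=0):
    """Fraction of simulated experiments in which the gene is called
    differentially expressed (|t| > t_crit).

    Assumption: log2 expression is Gaussian with equal spread in both
    groups; t_crit is a crude fixed threshold, not an exact quantile.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sim):
        control = [rng.gauss(0.0, sd) for _ in range(n_per_group)]
        treated = [rng.gauss(log2_fold_change, sd) for _ in range(n_per_group)]
        if abs(welch_t(control, treated)) > t_crit:
            hits += 1
    return hits / n_sim


if __name__ == "__main__":
    # Hypothetical settings: 6 replicates per group, noise sd of 0.5 log2 units.
    print("power at log2FC = 1:", estimate_power(6, 1.0, 0.5))
    print("false positive rate at log2FC = 0:", estimate_power(6, 0.0, 0.5))
```

Two protocols can be compared in this framework by running the simulation with the noise level estimated from each protocol's data and checking whether the resulting power is similar.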
Massively Parallel Reporter Assays (MPRAs) are a state-of-the-art method for measuring the activity of thousands of cis-regulatory elements (CREs) in parallel. A good way to approach the genotype-to-phenotype conundrum is to use evolutionary information. Cross-species comparisons of closely related species can help to understand how particular diverging phenotypes emerged and how conserved gene regulatory programs are encoded in the genome. Cell lines, particularly induced pluripotent stem cells (iPSCs), are a very useful tool for comparative studies. iPSCs can be reprogrammed from different primary somatic cells and are by definition pluripotent, meaning they can be differentiated into cells of all three germ layers. A main challenge for primate research is obtaining primary cells. To this end, I contributed to establishing a protocol to generate iPSCs from a non-invasive source of primary cells, namely urine. Using prime-seq, we characterized the primary urine-derived stem cells (UDSCs) and the reprogrammed iPSCs. Finally, I used an MPRA to measure the activity of putative regulatory elements of the gene TRNP1 across the mammalian phylogeny. We found co-evolution of one particular CRE with brain folding in Old World monkeys. To validate this finding, we looked for transcription factor binding sites within the identified CRE and intersected the resulting list with the transcription factors that prime-seq confirmed to be expressed in the cellular system. In addition, we found that changes in the protein-coding sequence of TRNP1 and the neural stem cell proliferation induced by TRNP1 orthologs correlate with brain size. In summary, within my doctorate I developed methods that enable measuring gene expression and gene regulation in a comparative genomics setting. I further applied these methods in a cross-mammalian study of the regulatory sequences of the gene TRNP1 and its association with brain phenotypes.
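The two analysis steps mentioned above, quantifying CRE activity from an MPRA and intersecting predicted binding factors with expressed ones, can be sketched as follows. This is an illustrative sketch under stated assumptions, not the thesis pipeline: activity is taken as the log2 ratio of depth-normalized RNA to DNA counts, which is one common convention, and the function names, the pseudocount, and all identifiers in the example data (`cre_a`, `TF_A`, and so on) are hypothetical placeholders.

```python
from math import log2


def mpra_activity(rna_counts, dna_counts, pseudocount=1.0):
    """Per-CRE activity as log2(RNA cpm / DNA cpm).

    Assumption: counts are already summed over barcodes per CRE;
    the pseudocount guards against division by zero.
    """
    rna_total = sum(rna_counts.values())
    dna_total = sum(dna_counts.values())
    activity = {}
    for cre, dna in dna_counts.items():
        rna_cpm = rna_counts.get(cre, 0) / rna_total * 1e6
        dna_cpm = dna / dna_total * 1e6
        activity[cre] = log2((rna_cpm + pseudocount) / (dna_cpm + pseudocount))
    return activity


def expressed_candidate_tfs(motif_hits, expressed_tfs):
    """Keep only factors that have a predicted binding site in the CRE
    and are also detected as expressed in the assayed cells."""
    return sorted(set(motif_hits) & set(expressed_tfs))


if __name__ == "__main__":
    # Hypothetical counts for two elements.
    rna = {"cre_a": 300, "cre_b": 100}
    dna = {"cre_a": 100, "cre_b": 100}
    for cre, value in mpra_activity(rna, dna).items():
        print(cre, round(value, 3))

    # Hypothetical factor names.
    print(expressed_candidate_tfs(["TF_A", "TF_B", "TF_C"],
                                  {"TF_B", "TF_C", "TF_D"}))
```

Restricting motif hits to expressed factors narrows the candidate list to regulators that could plausibly act on the element in the cell type where activity was measured.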
Transcriptomics, RNA-seq, Gene expression evolution, Massively parallel reporter assay
Wange, Lucas Esteban
2022
English
Universitätsbibliothek der Ludwig-Maximilians-Universität München
Wange, Lucas Esteban (2022): Measuring primate gene expression evolution using high throughput transcriptomics and massively parallel reporter assays. Dissertation, LMU München: Fakultät für Biologie