Development of machine learning and biostatistical models for cancer pharmacogenomics screens


Cancer is a complex genetic disease that emerges from the accumulation of somatic alterations that drive tumour growth. It is remarkably heterogeneous, comprising several subtypes driven by distinct mutational events and with individual response mechanisms. This complexity makes the disease hard to research and contributes to it being one of the deadliest worldwide. High-throughput drug screens have enabled numerous targeted and combination therapies for personalised patient treatment by revealing potentially relevant biomarkers. The application of large-scale genomic datasets, such as the Genomics of Drug Sensitivity in Cancer (GDSC) and the Cancer Therapeutics Response Portal (CTRP), has sparked the need for suitable bioinformatic tools to mine, model and analyse cancer biomarkers in the data. In this dissertation, I focused on three aims towards cancer biomarker discovery and developed a distinct algorithm for each: Aim 1, analysing drug resistance mechanisms using statistical frameworks; Aim 2, investigating synergistic drug combinations in cells with uncontrolled proliferation markers using curve-fitting methodologies; and Aim 3, identifying new cancer-specific driver genes using a network-based approach.

Aim 1: To investigate acquired resistance to a treatment in initially responsive cell lines, I developed an outlier statistical model that identifies unexpectedly resistant cell lines in the GDSC and CCLE drug screens. This method not only reproduced known biomarkers in lung adenocarcinoma, but also outperformed a standardised outlier detection method. Furthermore, the proposed hierarchical statistical framework was tested in terms of false discovery rate bounds.

Aim 2: I modelled drug responses with unexpectedly increased cell viability, which are missed by standard methodologies, and proposed leveraging drug-induced uncontrolled proliferation as a new synergistic combination therapy with drugs that act on fast-proliferating cells, e.g., DNA-damaging agents. Building on this, I developed two mathematical frameworks based on Gaussian and linear models to capture cancer-type biomarkers of increased viability. Promising candidates in lung cancer were tested in additional drug screen experiments, and potential synergistic drug combinations were hypothesised.

Aim 3: I proposed the weighted Protein-Protein Interaction (wPPI) tool, which is based on PPI networks combined with Gene Ontology and Human Phenotype Ontology datasets, to infer new tissue-specific genes closely related to cancer driver genes. Subsequently, the gene expression profiles of the highest-scoring candidates were used to develop machine learning models of drug response in breast cancer. The performance of these models was assessed and compared with models built from several other gene feature sets, namely unspecific tissue-specific genes and genes prioritised with other network-based methodology.

In summary, this dissertation introduces innovative and robust computational methodologies to advance tissue-specific cancer biomarker discovery. These approaches address multiple challenges associated with limited statistical power in precision oncology, including the investigation of rare phenomena and the insufficient understanding of key players in cancer progression. As an overarching goal, these methodologies are envisioned not only to enhance insights into the complex mechanisms underlying cancer, but also to contribute to the design of refined targeted therapeutic strategies.

Cancer, Pharmacogenomics Screens, Machine Learning, Biostatistical Models, Cell lines

Paulo Galhoz, Ana Cláudia

14 July 2025

2025

English

Universitätsbibliothek der Ludwig-Maximilians-Universität München

https://nbn-resolving.org/urn:nbn:de:bvb:19-356225

Paulo Galhoz, Ana Cláudia (2025): Development of machine learning and biostatistical models for cancer pharmacogenomics screens. Dissertation, LMU München: Fakultät für Biologie

PDF
Paulo_Galhoz_Ana_Claudia.pdf
12MB

DOI: 10.5282/edoc.35623

URN: urn:nbn:de:bvb:19-356225

Document type:	Dissertations (Dissertation, LMU München)
Keywords:	Cancer, Pharmacogenomics Screens, Machine Learning, Biostatistical Models, Cell lines
Subjects:	500 Natural sciences and mathematics; 500 Natural sciences and mathematics > 570 Life sciences, biology
Faculties:	Fakultät für Biologie (Faculty of Biology)
Language of the thesis:	English
Date of oral examination:	14 July 2025
First referee:	Menden, Michael
MD5 checksum of the PDF file:	4dc064275d64a021973b0cc9305e7870
Shelf mark of the printed edition:	0001/UMC 31401
ID Code:	35623
Deposited on:	19 Aug 2025 08:55
Last modified:	19 Aug 2025 09:05