Logo Logo
Help
Contact
Switch language to German
Representation learning for uncertainty-aware clinical decision support
Representation learning for uncertainty-aware clinical decision support
Over the last decade, there has been an increasing trend towards digitalization in healthcare, where a growing amount of patient data is collected and stored electronically. These recorded data are known as electronic health records. They are the basis for state-of-the-art research on clinical decision support so that better patient care can be delivered with the help of advanced analytical techniques like machine learning. Among various technical fields in machine learning, representation learning is about learning good representations from raw data to extract useful information for downstream prediction tasks. Deep learning, a crucial class of methods in representation learning, has achieved great success in many fields such as computer vision and natural language processing. These technical breakthroughs would presumably further advance the research and development of data analytics in healthcare. This thesis addresses clinically relevant research questions by developing algorithms based on state-of-the-art representation learning techniques. When a patient visits the hospital, a physician will suggest a treatment in a deterministic manner. Meanwhile, uncertainty comes into play when the past statistics of treatment decisions from various physicians are analyzed, as they would possibly suggest different treatments, depending on their training and experiences. The uncertainty in clinical decision-making processes is the focus of this thesis. The models developed for supporting these processes will therefore have a probabilistic nature. More specifically, the predictions are predictive distributions in regression tasks and probability distributions over, e.g., different treatment decisions, in classification tasks. The first part of the thesis is concerned with prescriptive analytics to provide treatment recommendations. Apart from patient information and treatment decisions, the outcome after the respective treatment is included in learning treatment suggestions. The problem setting is known as learning individualized treatment rules and is formulated as a contextual bandit problem. A general framework for learning individualized treatment rules using data from observational studies is presented based on state-of-the-art representation learning techniques. From various offline evaluation methods, it is shown that the treatment policy in our proposed framework can demonstrate better performance than both physicians and competitive baselines. Subsequently, the uncertainty-aware regression models in diagnostic and predictive analytics are studied. Uncertainty-aware deep kernel learning models are proposed, which allow the estimation of the predictive uncertainty by a pipeline of neural networks and a sparse Gaussian process. By considering the input data structure, respective models are developed for diagnostic medical image data and sequential electronic health records. Various pre-training methods from representation learning are adapted to investigate their impacts on the proposed models. Through extensive experiments, it is shown that the proposed models delivered better performance than common architectures in most cases. More importantly, uncertainty-awareness of the proposed models is illustrated by systematically expressing higher confidence in more accurate predictions and less confidence in less accurate ones. The last part of the thesis is about missing data imputation in descriptive analytics, which provides essential evidence for subsequent decision-making processes. Rather than traditional mean and median imputation, a more advanced solution based on generative adversarial networks is proposed. The presented method takes the categorical nature of patient features into consideration, which enables the stabilization of the adversarial training. It is shown that the proposed method can better improve the predictive accuracy compared to traditional imputation baselines.
Representation Learning, Clinical Decision Support, Uncertainty-Awareness
Wu, Zhiliang
2022
English
Universitätsbibliothek der Ludwig-Maximilians-Universität München
Wu, Zhiliang (2022): Representation learning for uncertainty-aware clinical decision support. Dissertation, LMU München: Faculty of Mathematics, Computer Science and Statistics
[thumbnail of Wu_Zhiliang.pdf]
Preview
PDF
Wu_Zhiliang.pdf

4MB

Abstract

Over the last decade, there has been an increasing trend towards digitalization in healthcare, where a growing amount of patient data is collected and stored electronically. These recorded data are known as electronic health records. They are the basis for state-of-the-art research on clinical decision support so that better patient care can be delivered with the help of advanced analytical techniques like machine learning. Among various technical fields in machine learning, representation learning is about learning good representations from raw data to extract useful information for downstream prediction tasks. Deep learning, a crucial class of methods in representation learning, has achieved great success in many fields such as computer vision and natural language processing. These technical breakthroughs would presumably further advance the research and development of data analytics in healthcare. This thesis addresses clinically relevant research questions by developing algorithms based on state-of-the-art representation learning techniques. When a patient visits the hospital, a physician will suggest a treatment in a deterministic manner. Meanwhile, uncertainty comes into play when the past statistics of treatment decisions from various physicians are analyzed, as they would possibly suggest different treatments, depending on their training and experiences. The uncertainty in clinical decision-making processes is the focus of this thesis. The models developed for supporting these processes will therefore have a probabilistic nature. More specifically, the predictions are predictive distributions in regression tasks and probability distributions over, e.g., different treatment decisions, in classification tasks. The first part of the thesis is concerned with prescriptive analytics to provide treatment recommendations. Apart from patient information and treatment decisions, the outcome after the respective treatment is included in learning treatment suggestions. The problem setting is known as learning individualized treatment rules and is formulated as a contextual bandit problem. A general framework for learning individualized treatment rules using data from observational studies is presented based on state-of-the-art representation learning techniques. From various offline evaluation methods, it is shown that the treatment policy in our proposed framework can demonstrate better performance than both physicians and competitive baselines. Subsequently, the uncertainty-aware regression models in diagnostic and predictive analytics are studied. Uncertainty-aware deep kernel learning models are proposed, which allow the estimation of the predictive uncertainty by a pipeline of neural networks and a sparse Gaussian process. By considering the input data structure, respective models are developed for diagnostic medical image data and sequential electronic health records. Various pre-training methods from representation learning are adapted to investigate their impacts on the proposed models. Through extensive experiments, it is shown that the proposed models delivered better performance than common architectures in most cases. More importantly, uncertainty-awareness of the proposed models is illustrated by systematically expressing higher confidence in more accurate predictions and less confidence in less accurate ones. The last part of the thesis is about missing data imputation in descriptive analytics, which provides essential evidence for subsequent decision-making processes. Rather than traditional mean and median imputation, a more advanced solution based on generative adversarial networks is proposed. The presented method takes the categorical nature of patient features into consideration, which enables the stabilization of the adversarial training. It is shown that the proposed method can better improve the predictive accuracy compared to traditional imputation baselines.