Predictive Uncertainty Quantification and Explainable Machine Learning in Healthcare

Gligorijevic, Djordje
Genre
Thesis/Dissertation
Date
2018
Department
Computer and Information Science
DOI
http://dx.doi.org/10.34944/dspace/1290
Abstract
Predictive modeling is an increasingly important part of decision making. Advances in Machine Learning predictive modeling have spread across many domains, bringing significant improvements in performance and providing unique opportunities for novel discoveries. Medicine and healthcare, which are concerned with people's well-being, are notably important among these domains. Although they are among the most developed and actively researched areas of science, there are many ways in which they can be improved. In particular, novel tools based on Machine Learning theory have brought benefits across many areas of clinical practice, pushing the boundaries of medical science and directly affecting the well-being of millions of patients. Additionally, the healthcare and medical domains require predictive modeling to anticipate and overcome many obstacles that the future may hold. These kinds of applications employ precise decision-making processes that require accurate predictions. However, a good prediction on its own is often insufficient. There has been no major focus on developing algorithms with good-quality uncertainty estimates. This thesis therefore aims to provide a variety of ways to address this gap, by learning high-quality uncertainty estimates or by providing model interpretability where needed, with the purpose of improving existing tools built in practice and allowing many other tools to be used where uncertainty is the key factor in decision making. The first part of the thesis proposes approaches for learning high-quality uncertainty estimates for both short- and long-term predictions in multi-task learning, developed on top of continuous probabilistic graphical models. In many scenarios, especially in long-term predictions, it may be of great importance for the models to provide a reliability flag in order to be accepted by domain experts.
To this end, we explored a widely applied structured regression model with the goal of providing meaningful uncertainty estimates on various predictive tasks. Our particular interest is in modeling uncertainty propagation when predicting far into the future. To address this important problem, our approach centers on providing an uncertainty estimate by modeling input features as random variables. This allows modeling uncertainty from noisy inputs. In cases when the model iteratively produces errors, it should propagate uncertainty over the predictive horizon, which may provide invaluable information for decision making based on predictions. In the second part of the thesis, we propose novel neural embedding models for learning low-dimensional embeddings of medical concepts, such as diseases and genes, show how they can be interpreted to allow assessing their quality, and show how they can be used to solve many problems in medical and healthcare research. We use electronic health record (EHR) data to discover novel relationships between diseases by studying their comorbidities (i.e., co-occurrences in patients). We trained our models on a large-scale EHR database comprising more than 35 million inpatient cases. To confirm the value and potential of the proposed approach, we evaluate its effectiveness on a held-out set. Furthermore, for select diseases we provide a candidate gene list for which disease-gene associations were not studied previously, allowing biomedical researchers to better focus their often very costly lab studies. We also examine how disease heterogeneity can affect the quality of learned embeddings and propose an approach for learning types of such heterogeneous diseases; in our study we primarily focus on learning types of sepsis. Finally, we evaluate the quality of low-dimensional embeddings on tasks of predicting hospital quality indicators such as length of stay, total charges, and mortality likelihood, demonstrating their superiority over other approaches.
In the third part of the thesis, we focus on decision making in the medical and healthcare domains by developing state-of-the-art deep learning models capable of exceeding human performance while maintaining good interpretability and uncertainty estimates.
ADA compliance
For Americans with Disabilities Act (ADA) accommodation, including help with reading this content, please contact scholarshare@temple.edu