Browsing Theses and Dissertations by Subject "Machine learning"
Now showing items 1-4 of 4
ADVANCED MACHINE LEARNING MODELS IN PREDICTION OF MEDICAL CONDITIONS
The primary goal of machine learning (ML) models in the prediction of medical conditions is to accurately predict (classify) the occurrence of a disease or a therapy. Many ML models, traditional and deep, have been used to predict disease diagnosis or the optimal therapeutic approach. Almost all categories of medical conditions have been subject to ML analysis. When creating predictive ML algorithms in medicine, it is pivotal to consider what problems are to be solved and how much and what types of training data are available. For challenging prediction (classification) problems, an understanding of disease pathogenesis makes the selection of an adequate ML model and accurate prediction more likely. The hypothesis of the research was that the optimal and adequate selection of model inputs, together with the selection and design of adequate ML methods, improves the accuracy of predicting the occurrence of diseases and their outcomes. The effectiveness and accuracy of the created deep learning and traditional methods were analyzed and compared. The impact of different medical conditions and different medical domains on the optimal selection and performance of ML models was also studied. The effectiveness of advanced ML models was tested on four different diseases: Alzheimer’s disease (AD), Diabetes Mellitus type 2 (DM2), Influenza, and Colorectal cancer (CRC). The objective of the first part of the thesis (AD study) was to determine whether prediction of AD from Electronic medical records (EMR) data alone could be significantly improved by applying domain knowledge in positive dataset selection rather than setting naïve filters. Selected Clinically Relevant Positive (SCRP) datasets were used as inputs to a Long Short-Term Memory (LSTM) Recurrent Neural Network (RNN) deep learning model to predict whether a patient will develop AD.
The LSTM RNN method performed significantly better when learning from the SCRP dataset than when datasets were selected naïvely. Accurate prediction of AD is significant for identifying patients for clinical trials and for better selecting patients who need imaging diagnostics. The objective of the DM2 research was to predict whether patients with DM2 would develop any of ten selected complications. RNN LSTM and RNN Gated Recurrent Units (GRU) models were designed and compared to the traditional Random Forest and Multilayer Perceptron models. The number of hospitalizations registered in the EMR data was an important factor for prediction accuracy. The prediction accuracy for complications decreases over time. The RNN GRU model was the best choice for EMR data, followed by the RNN LSTM model. Accurate prediction of the occurrence of DM2 complications is important in planning targeted measures to slow down or prevent their development. The objective of the third part of the thesis was to improve the understanding of the spatial spreading of complicated influenza cases that required hospitalization, by constructing social network models. A novel approach was designed, which included the construction of heatmaps for geographic regions in New York state and of power-law networks, to analyze the distribution of hospitalized flu cases. The methodology constructed in the study made it possible to identify critical hubs and routes of spreading of Influenza in specific geographic locations. The results obtained could enable better prediction of the distribution of complicated flu cases in specific geographic regions and of the resources required for prevention and treatment of hospitalized patients with Influenza.
The fourth part of the thesis proposes approaches to discover risk factors (comorbidities and genes) associated with the development of CRC, which can be used by future ML models to predict the influence of risk factors on the prognosis and outcomes of cancer and other chronic diseases. A novel social network and text mining model was developed to study specific risk factors of CRC. The identified associations between comorbidities, CRC, and shared genes can have important implications for the early discovery and prognosis of CRC, which can be the subject of predictive ML models in the future. Predictive ML models could help physicians select the most effective diagnostic, preventive, and therapeutic choices available. These ML models can provide recommendations for selecting suitable patients for clinical trials, which is very important when searching for medical solutions in health emergencies. Successful ML models can make medicine more efficient, improve outcomes, and decrease medical errors.
COMBINING CONVOLUTIONAL NEURAL NETWORKS AND GRAPH NEURAL NETWORKS FOR IMAGE CLASSIFICATION
Convolutional Neural Networks (CNNs) have dominated the task of image classification since 2012. Some key components of their success are that the underlying architecture integrates a set of inductive biases, such as translational invariance, and that the training computation can be significantly reduced by employing weight sharing. CNNs are powerful tools for generating new representations of images tailored to a particular task such as classification. However, because each image is passed through the network independently of other images, CNNs are not able to effectively aggregate information between examples. In this thesis, we explore the idea of using Graph Neural Networks (GNNs) in conjunction with CNNs to produce an architecture that has both the representational capacity of a CNN and the ability to aggregate information between examples. Graph Neural Networks apply the concept of convolutions directly on graphs. As a result, GNNs are able to learn from the connections between nodes. However, when working with image datasets, there is no obvious way to construct a graph. Certain heuristics, such as ensuring homophily, have been shown empirically to increase the performance of GNNs. In this thesis, we apply different schemes for constructing a graph from image data for the downstream task of image classification, and we experiment with settings such as using multiple feature spaces and enforcing a bipartite graph structure. We also propose a model that allows for end-to-end training using CNNs and GNNs with proxies and attention, and that improves classification accuracy in comparison to a regular CNN.
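As a rough illustration of the graph-construction problem described above, the sketch below builds a k-nearest-neighbour graph over feature vectors and measures edge homophily (the fraction of edges joining same-label nodes). This is only one plausible scheme under stated assumptions: the function names `knn_graph` and `edge_homophily`, the use of cosine similarity, and the toy feature vectors standing in for CNN embeddings are all hypothetical and are not taken from the thesis.

```python
import math

def knn_graph(features, k):
    """Directed edges (i, j) where j is among the k nodes most cosine-similar to i."""
    def cosine(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        return dot / (math.hypot(*u) * math.hypot(*v))

    edges = []
    for i, u in enumerate(features):
        # Rank every other node by similarity to node i (no self-loops).
        others = [j for j in range(len(features)) if j != i]
        others.sort(key=lambda j: cosine(u, features[j]), reverse=True)
        edges.extend((i, j) for j in others[:k])
    return edges

def edge_homophily(edges, labels):
    """Fraction of edges whose two endpoints share a class label."""
    return sum(labels[i] == labels[j] for i, j in edges) / len(edges)

# Toy stand-ins for CNN embeddings: two well-separated classes (assumed data).
features = [(5.0, 0.1), (5.0, -0.1), (4.9, 0.0),
            (0.1, 5.0), (-0.1, 5.0), (0.0, 4.9)]
labels = [0, 0, 0, 1, 1, 1]

edges = knn_graph(features, k=2)
homophily = edge_homophily(edges, labels)
```

On real embeddings, homophily would typically fall below 1, and the choice of k and of the feature space used for similarity becomes a design decision of the kind the abstract describes.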
IMPROVED SEGMENTATION FOR AUTOMATED SEIZURE DETECTION USING CHANNEL-DEPENDENT POSTERIORS
The electroencephalogram (EEG) is the primary tool used for the diagnosis of a variety of neural pathologies such as epilepsy. Identification of a critical event, such as an epileptic seizure, is difficult because the signals are collected by transducing extremely low voltages and, as a result, are corrupted by noise. Also, EEG signals often contain artifacts due to clinical phenomena such as patient movement. These artifacts are easily confused with seizure events. Factors such as slowly evolving morphologies make accurate marking of the onset and offset of a seizure event difficult. Precise segmentation, defined as the ability to detect start and stop times within a fraction of a second, is a challenging research problem. In this dissertation, we improve seizure segmentation performance by developing deep learning technology that mimics the human interpretation process. The central thesis of this work is that separating the seizure detection problem into a two-phase problem (epileptiform activity detection followed by seizure detection) should improve our ability to detect and localize seizure events. In the first phase, we use a sequential neural network algorithm known as a long short-term memory (LSTM) network to identify channel-specific epileptiform discharges associated with seizures. In the second phase, the feature vector is augmented with posteriors that represent the onset and offset of ictal activities. These augmented features are applied to a multichannel convolutional neural network (CNN) followed by an LSTM network. The multiphase model was evaluated on a blind evaluation set and was shown to detect 106 segment boundaries within a 2-second margin of error. Our previous best system, which delivers state-of-the-art performance on this task, correctly detected only 9 segment boundaries.
Our multiphase system was also shown to be robust by performing well on two blind evaluation sets. Seizure detection performance on the TU Seizure Detection (TUSZ) Corpus development set is 41.60% sensitivity with 5.63 false alarms/24 hours (FAs/24 hrs). Performance on the corresponding evaluation set is 48.21% sensitivity with 16.54 FAs/24 hrs. Performance on a previously unseen corpus, the Duke University Seizure (DUSZ) Corpus, is 46.62% sensitivity with 7.86 FAs/24 hrs. Our previous best system yields 30.83% sensitivity with 6.74 FAs/24 hrs on the TUSZ development set, 33.11% sensitivity with 19.89 FAs/24 hrs on the TUSZ evaluation set, and 33.71% sensitivity with 40.40 FAs/24 hrs on DUSZ. Improving seizure detection performance through better segmentation is an important step forward in making automated seizure detection systems clinically acceptable. For a real-time system, accurate segmentation will allow clinicians to detect a seizure as soon as it appears in the EEG signal. This will allow neurologists to act during the early stages of the event, which, in many cases, is essential to avoid permanent damage to the brain. Similarly, accurate offset detection will help with the delivery of therapies designed to mitigate symptoms in the postictal (after seizure) period. It will also help reveal the severity of a seizure and consequently provide guidance for medicating a patient.
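To make the reported quantities concrete, the following sketch shows one simplified way to compute sensitivity, false alarms per 24 hours, and the number of reference boundaries matched within a 2-second margin. It assumes a plain any-overlap scoring rule, which may differ from the scoring method actually used in the dissertation; the function names and the example annotations are hypothetical.

```python
def overlaps(a, b):
    """True if the (start, stop) intervals a and b share any time."""
    return a[0] < b[1] and b[0] < a[1]

def score_events(reference, hypothesis, total_seconds):
    """Any-overlap scoring: sensitivity (%) and false alarms per 24 hours."""
    # A reference seizure counts as detected if any hypothesis overlaps it.
    hits = sum(any(overlaps(r, h) for h in hypothesis) for r in reference)
    # A hypothesis is a false alarm if it overlaps no reference seizure.
    false_alarms = sum(not any(overlaps(h, r) for r in reference)
                       for h in hypothesis)
    sensitivity = 100.0 * hits / len(reference)
    fa_per_24h = false_alarms * 86400.0 / total_seconds
    return sensitivity, fa_per_24h

def boundaries_within(reference, hypothesis, margin=2.0):
    """Count reference onsets/offsets that have a hypothesis boundary within margin seconds."""
    ref_bounds = [t for seg in reference for t in seg]
    hyp_bounds = [t for seg in hypothesis for t in seg]
    return sum(any(abs(r - h) <= margin for h in hyp_bounds) for r in ref_bounds)

# Hypothetical annotations in seconds over a 12-hour recording.
reference = [(100.0, 160.0), (400.0, 430.0)]
hypothesis = [(101.5, 158.0), (700.0, 720.0)]

sens, fa = score_events(reference, hypothesis, total_seconds=43200.0)
matched = boundaries_within(reference, hypothesis)
```

In this toy case one of two seizures is detected (50% sensitivity), the single spurious detection in 12 hours scales to 2 FAs/24 hrs, and both boundaries of the detected seizure fall within the 2-second margin.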
Models for fitting correlated non-identical Bernoulli random variables with applications to an airline data problem
Our research deals with the problem of devising models for fitting non-identical dependent Bernoulli variables and using these models to predict future Bernoulli trials. We focus on modelling and predicting random Bernoulli response variables which meet all of the following conditions:
1. Each observed as well as future response corresponds to a Bernoulli trial.
2. The trials are non-identical, having possibly different probabilities of occurrence.
3. The trials are mutually correlated, with an underlying complex trial-cluster correlation structure. Trials within clusters may also be partitioned into groups, and within-cluster, group-level correlation is reflected in the correlation structure.
4. The probability of occurrence and correlation structure for both observed and future trials can depend on a set of observed covariates.
A number of proposed approaches meeting some of the above conditions are present in the current literature. Our research expands on existing statistical and machine learning methods. We propose three extensions to existing models that make use of the above conditions. Each proposed method brings specific advantages for dealing with correlated binary data. The proposed models allow for within-cluster trial grouping to be reflected in the correlation structure. We partition sets of trials into groups that are either explicitly estimated or implicitly inferred. Explicit groups arise from the determination of common covariates; inferred groups arise via imposing mixture models. The main motivation of our research is modelling and further understanding the potential of introducing binary trial group-level correlations. In a number of applications, it can be beneficial to use models that allow for these types of trial groupings, both for improved predictions and for a better understanding of the behavior of trials.
The first model extension builds on the Multivariate Probit model. This model makes use of covariates and other information from former trials to determine explicit trial groupings and predict the occurrence of future trials. We call this the Explicit Groups model. The second model extension uses mixtures of univariate Probit models. This model predicts the occurrence of current trials using estimators of parameters supporting mixture models for the observed trials. We call this the Inferred Groups model. Our third method extends WL2Boost, a gradient-descent-based boosting algorithm that allows for correlation of binary outcomes. We refer to our extension of this algorithm as GWL2Boost. Bernoulli trials are divided into observed and future trials, with all trials having associated known covariate information. We apply our methodology to the problem of predicting the set and total number of passengers who will not show up on commercial flights, using covariate information and past passenger data. The models and algorithms are evaluated with regard to their capacity to predict future Bernoulli responses. We compare the proposed models against a set of competing existing models and algorithms using available airline passenger no-show data. We show that our proposed algorithm extension GWL2Boost outperforms top existing algorithms and models that assume independence of binary outcomes in various prediction metrics.
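For readers unfamiliar with correlated, non-identical Bernoulli trials, the sketch below simulates them with a one-factor latent Gaussian (probit-style) construction: each trial fires when a latent normal variable falls below the threshold implied by its own probability, and a shared factor induces correlation within a cluster. This is a generic textbook construction assumed here for illustration only; it is not the Explicit Groups, Inferred Groups, or GWL2Boost model, and the no-show probabilities and the value of `rho` are invented.

```python
import math
import random
from statistics import NormalDist

def simulate_cluster(probs, rho, rng):
    """Draw one cluster of correlated Bernoulli outcomes.

    probs: marginal success probability of each trial (non-identical).
    rho:   correlation of the latent Gaussians within the cluster (0 <= rho < 1).
    """
    std_normal = NormalDist()
    shared = rng.gauss(0.0, 1.0)  # cluster-level factor shared by all trials
    outcomes = []
    for p in probs:
        own = rng.gauss(0.0, 1.0)  # trial-specific noise
        latent = math.sqrt(rho) * shared + math.sqrt(1.0 - rho) * own
        # latent is standard normal, so P(latent < inv_cdf(p)) equals p.
        outcomes.append(1 if latent < std_normal.inv_cdf(p) else 0)
    return outcomes

rng = random.Random(0)
probs = [0.05, 0.10, 0.30]  # hypothetical no-show probabilities
draws = [simulate_cluster(probs, rho=0.5, rng=rng) for _ in range(20000)]

# Empirical marginals should be close to probs; joint events exceed independence.
marginals = [sum(d[i] for d in draws) / len(draws) for i in range(len(probs))]
joint_01 = sum(d[0] and d[1] for d in draws) / len(draws)
```

The marginal frequencies recover the individual probabilities, while the joint frequency of the first two trials is larger than the product of their marginals, which is the dependence that independence-based models ignore.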