Knowledge Discovery Through Probabilistic Models

Ristovski, Kosta
Genre
Thesis/Dissertation
Date
2012
Department
Computer and Information Science
DOI
http://dx.doi.org/10.34944/dspace/2229
Abstract
Probabilistic models are dominant in many research areas. To learn these models, we need to determine the parameters of the distributions over the variables included in the model. The main focus of my research is on continuous variables; thus, the Gaussian distribution over variables is the dominant component of all models used in this document. I have worked on several important real-life problems: Uncertainty of Neural Network Based Aerosol Retrieval, Regression Learning with Multiple Noise Oracles, Model Predictive Control (MPC) for Sepsis Treatment, and Clustering Causes of Action in Federal Courts. These problems are discussed in the following chapters.
Aerosols, small particles emanating from natural and man-made sources, have been recognized, along with greenhouse gases, as very important factors in ongoing climate change. Accurate estimation of aerosol composition and concentration is one of the main challenges in current climate research. The aerosol prediction algorithm designed by domain scientists does not provide quantitative information about the uncertainty of its estimates. We deployed an algorithm that uses neural networks to determine both the aerosol estimate and its uncertainty. The uncertainty estimator was built under the assumption that uncertainty is a function of the variables used for aerosol prediction, and the uncertainty of predictions was computed as the variance of the conditional distribution of targets given the input data.
In regression learning, it is often difficult to obtain the true values of the label variables, while multiple sources of noisy, lower-quality estimates are readily available. To address this problem, I propose a new Bayesian approach that learns a regression model from data with noisy labels provided by multiple oracles. This method gives a closed-form solution for the model parameters and is applicable to both linear and nonlinear regression problems.
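The idea of learning a regression model from several noisy oracles can be illustrated with a minimal sketch. It is not the method of the thesis: it assumes a linear model, independent Gaussian noise with one unknown variance per oracle, and a simple alternating maximum-likelihood scheme in which each step has a closed form. The function name `fit_noisy_oracles` and the synthetic data are hypothetical.

```python
import numpy as np

def fit_noisy_oracles(X, Y, n_iter=50, ridge=1e-8):
    """Illustrative sketch. X: (n, d) inputs; Y: (n, K) labels, one column per oracle.

    Assumes y_ik = w^T x_i + e_ik with e_ik ~ N(0, var_k).
    """
    n, d = X.shape
    K = Y.shape[1]
    var = np.ones(K)                          # start: all oracles equally noisy
    A = X.T @ X + ridge * np.eye(d)           # tiny ridge term for numerical stability
    for _ in range(n_iter):
        prec = 1.0 / var
        y_bar = Y @ prec / prec.sum()         # precision-weighted consensus label
        w = np.linalg.solve(A, X.T @ y_bar)   # closed-form weights given the variances
        resid = Y - (X @ w)[:, None]
        var = np.maximum((resid ** 2).mean(axis=0), 1e-12)  # closed-form variances given w
    return w, var

# Synthetic example: three oracles with increasing noise levels.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
w_true = np.array([1.0, -2.0, 0.5])
noise_sd = np.array([0.1, 1.0, 3.0])
Y = (X @ w_true)[:, None] + rng.normal(size=(500, 3)) * noise_sd
w_hat, var_hat = fit_noisy_oracles(X, Y)
```

In this sketch the less reliable oracles receive larger estimated variances and therefore contribute less to the consensus label used to fit the weights.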
Sepsis is a medical condition characterized as a systemic inflammatory response to an infection. The high mortality rate (30-35%) among septic patients is usually caused by inadequate treatment. Thus, the development of tools that can aid clinicians in designing optimal strategies for inflammation treatment is of utmost importance. Toward this objective, I developed a data-driven approach for therapy optimization in which a predictive model of patients' behavior is learned directly from historical data. The predictive model is then incorporated into a model predictive control optimization algorithm to find the optimal therapy, which will lead the patient to a healthy state. More careful targeting of specific therapeutic strategies to more biologically homogeneous groups of patients is essential to developing effective sepsis treatment. We propose a kernel-based approach to characterize the dynamics of inflammatory response in a heterogeneous population of septic patients. The method utilizes Linear State Space Control (LSSC) models to take into account the dynamics of inflammatory response over time as well as the effect of the therapy applied to the patient. We use a similarity measure defined on kernels of LSSC models to find homogeneous groups of patients.
In addition to clustering the dynamics of inflammatory response, we also explored clustering civil litigation from its inception by examining the content of civil complaints. We apply spectral cluster analysis to a newly compiled federal district court dataset of causes of action in complaints to illustrate the relationship of legal claims to one another, the broader composition of lawsuits in trial courts, and the breadth of pleading in individual complaints. Our results shed light not only on the networks of legal theories in civil litigation but also on how lawsuits are classified and the strategies that plaintiffs and their attorneys employ when commencing litigation.
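Spectral cluster analysis of causes of action can be sketched on toy data. The example below is a generic two-way spectral partition, not the pipeline used in the thesis: the complaint matrix `B` is invented for illustration, the affinity is assumed to be the co-occurrence count of two causes of action in the same complaint, and the function name `spectral_bipartition` is hypothetical.

```python
import numpy as np

# Hypothetical toy data: rows are complaints, columns are causes of action.
# B[i, j] = 1 means complaint i pleads cause of action j.
B = np.array([
    [1, 1, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 1, 1, 0, 0],   # one complaint bridging the two groups
])

def spectral_bipartition(W):
    """Split a connected weighted graph in two using the normalized Laplacian."""
    deg = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    # Symmetric normalized Laplacian: I - D^{-1/2} W D^{-1/2}
    L = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    vals, vecs = np.linalg.eigh(L)            # eigenvalues in ascending order
    fiedler = d_inv_sqrt * vecs[:, 1]         # eigenvector of the second-smallest eigenvalue
    return (fiedler > 0).astype(int)          # its sign pattern defines the two groups

# Affinity between causes of action: number of complaints pleading both.
W = (B.T @ B).astype(float)
np.fill_diagonal(W, 0.0)
labels = spectral_bipartition(W)
```

Which group is labeled 0 or 1 is arbitrary, because an eigenvector is determined only up to sign. For more than two clusters, a common choice is to run k-means on several leading eigenvectors instead of thresholding one.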
ADA compliance
For Americans with Disabilities Act (ADA) accommodation, including help with reading this content, please contact scholarshare@temple.edu