• Disease Mapping on a Moving Population

      Vucetic, Slobodan; Latecki, Longin; Obradovic, Zoran; Mennis, Jeremy (Temple University. Libraries, 2011)
      Residential data are often used as a source of spatial information when representing the geographical distribution of disease in a population. Early examples of mapping the prevalence of a disease in a population include Seaman's analysis of yellow fever outbreaks in New York at the beginning of the 19th century and Snow's analysis of cholera in London in the mid-19th century. Modern epidemiology of non-infectious or environmental disease seeks to understand the spatial distribution of disease risk by determining the statistical relationship between the residential locations of a population and the presence of disease in that population. Residential location is often used as a proxy for unobserved or unavailable genetic, demographic, or environmental factors influencing each individual's disease risk. People, however, may be exposed to disease risk at locations other than their residence, for example at school or at work. Additionally, over the life course, people typically live at multiple residences. Both short-term and long-term mobility information is becoming increasingly available to epidemiologists. However, there are currently very few statistical approaches that can be used for disease mapping on moving populations. The goal of my research was to explore whether using movement data could improve the accuracy of disease maps. In my presentation, I will start with a brief overview of the existing methodology for creating disease maps from residential data, such as disease clustering and Bayesian disease mapping. Then, I will present a novel hierarchical Bayesian model for disease mapping from moving populations, which can cope with multiple spatial sources of disease risk factors. The presented hierarchical model consists of a logistic likelihood, a prior modeling the spatial distribution of risk, and a hyper-prior modeling the smoothness of spatial risk across the region of study.
Starting from the assumption that personal risk is an average of the spatial risk at visited locations, weighted by the amount of time spent at each location, I will show that disease mapping can be accomplished using a spatially regularized logistic regression. When more detailed estimates of spatial risk are desired, or when more complicated assumptions about personal risk are needed, a fully Bayesian approach offers an alternative for estimating spatial risk, at a higher computational cost. I evaluated the proposed method on a synthetic, but realistic, data set mimicking the daily movement patterns of the entire population of Portland, OR, containing 1.6 million residents and a variety of spatial risk scenarios ranging from very smooth to almost random. Results show that using movement information can significantly improve the accuracy of the generated disease maps. Influencing factors such as the spatial resolution of movement information, sample size, and smoothness of disease risk are analyzed in relation to the accuracy of the produced risk maps. Finally, I will also discuss some issues related to privacy preservation of movement information in disease mapping.
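
To make the time-weighted risk assumption concrete, the following is a minimal sketch of one possible spatially regularized logistic regression. It is an illustration only, not the model from the presentation: the abstract does not state the form of the regularizer, so the graph-Laplacian smoothness penalty, the rectangular grid of cells, the toy data, and all names and parameter values (`fit_risk_map`, `grid_laplacian`, `lam`, `lr`) are assumptions. Each person's log-odds of disease is taken to be the average of per-cell risk values `beta`, weighted by the fraction of time spent in each cell (a row of `W`).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def grid_laplacian(rows, cols):
    """Graph Laplacian of a rows x cols grid with 4-neighbour adjacency (assumed layout)."""
    n = rows * cols
    L = np.zeros((n, n))
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr < rows and cc < cols:
                    j = rr * cols + cc
                    L[i, i] += 1.0
                    L[j, j] += 1.0
                    L[i, j] -= 1.0
                    L[j, i] -= 1.0
    return L

def objective(beta, W, y, L, lam):
    """Mean logistic loss plus (lam / 2) * beta^T L beta, which penalizes
    squared differences in risk between neighbouring cells."""
    z = W @ beta
    log_loss = np.mean(np.logaddexp(0.0, z) - y * z)
    return log_loss + 0.5 * lam * beta @ (L @ beta)

def fit_risk_map(W, y, L, lam=0.1, lr=0.5, n_iter=2000):
    """Plain gradient descent on the objective above.
    W: (n_people, n_cells) time fractions, each row summing to 1.
    y: (n_people,) disease indicator, 0 or 1."""
    n, m = W.shape
    beta = np.zeros(m)
    for _ in range(n_iter):
        p = sigmoid(W @ beta)
        grad = W.T @ (p - y) / n + lam * (L @ beta)
        beta -= lr * grad
    return beta

# Toy data (entirely synthetic and hypothetical): a 5 x 5 grid whose true
# risk increases smoothly from west to east; each person visits 3 cells.
rng = np.random.default_rng(0)
rows, cols, n_people = 5, 5, 2000
m = rows * cols
true_beta = np.tile(np.linspace(-2.0, 2.0, cols), rows)

W = np.zeros((n_people, m))
for i in range(n_people):
    cells = rng.choice(m, size=3, replace=False)
    W[i, cells] = rng.dirichlet(np.ones(3))  # fractions of time, sum to 1

y = (rng.random(n_people) < sigmoid(W @ true_beta)).astype(float)

L = grid_laplacian(rows, cols)
beta_hat = fit_risk_map(W, y, L, lam=0.1)
risk_map = sigmoid(beta_hat).reshape(rows, cols)  # estimated per-cell risk
```

In this sketch, a residence-only analysis corresponds to the special case in which every row of `W` has a single entry equal to 1. A larger `lam` forces neighbouring cells toward similar risk values, which is one way to encode the smoothness that the hierarchical model assigns to a hyper-prior.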