New Hierarchical Nonlinear Modeling for Count Data: Estimation and Testing in The Presence of Overdispersion

Zheng, Shuo
Genre
Thesis/Dissertation
Date
2011
Department
Statistics
DOI
http://dx.doi.org/10.34944/dspace/3923
Abstract
In studies of traffic accidents, disease occurrence, mismatches in genetic code, and the impact of pollution on ecological communities, the key observational variable is often a count. For example, daily counts of accidents on a segment of highway will vary with weather and other traffic conditions; highway engineers seek to relate accident counts to these roadway conditions. Formal statistical analysis of this kind of data depends critically on using the correct model for the random count data. However, statistical analysis packages use only those random count models that are relatively easy to implement. This thesis shows that applying standard statistical procedures in situations where these limited models are not correct can lead to important errors in statistical inference. A more general and realistic count model is proposed and a statistical analysis is implemented using advanced statistical software.
The Poisson log-linear model is a common choice for modeling count data. However, in many applications variability in the observed counts is higher than predicted by the Poisson model. That is, the observed count data are overdispersed relative to the Poisson model. The most common probabilistic model used for analyzing overdispersed count data is the Poisson-gamma mixed model, which is identical to the negative binomial model. Since the negative binomial model is a member of the exponential family of distributions, maximum likelihood estimation can be carried out using widely available generalized linear model procedures. Often a more scientifically justifiable model for overdispersion is the Poisson-lognormal model. However, statistical methods based on the Poisson-lognormal model are seldom used in practice because of their computational complexity.
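As an illustration of what overdispersion relative to the Poisson model looks like, the following minimal sketch simulates plain Poisson counts and Poisson-lognormal counts and compares their variance-to-mean ratios. It is not taken from the thesis: the function names, the parameter values (`log_mean`, `sigma`), and the simple Poisson sampler are all assumptions chosen for illustration.

```python
import math
import random
import statistics


def poisson_draw(rng, lam):
    """Draw one Poisson(lam) variate by multiplying uniforms (Knuth's method).

    Adequate for the moderate rates used here; not intended for very large lam.
    """
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def simulate_counts(n, log_mean, sigma, seed=0):
    """Return (plain, mixed): n Poisson counts and n Poisson-lognormal counts.

    plain: rate exp(log_mean) for every observation.
    mixed: rate exp(log_mean + u) with u ~ Normal(0, sigma^2) per observation.
    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    plain, mixed = [], []
    for _ in range(n):
        plain.append(poisson_draw(rng, math.exp(log_mean)))
        u = rng.gauss(0.0, sigma)
        mixed.append(poisson_draw(rng, math.exp(log_mean + u)))
    return plain, mixed


def dispersion_index(counts):
    """Variance-to-mean ratio; close to 1 for Poisson data, above 1 if overdispersed."""
    return statistics.pvariance(counts) / statistics.fmean(counts)


def theoretical_index(log_mean, sigma):
    """Variance-to-mean ratio implied by the Poisson-lognormal moments."""
    mean = math.exp(log_mean + sigma ** 2 / 2)
    return 1.0 + mean * (math.exp(sigma ** 2) - 1.0)


if __name__ == "__main__":
    plain, mixed = simulate_counts(n=20000, log_mean=1.0, sigma=0.7)
    print("Poisson dispersion index:          ", round(dispersion_index(plain), 2))
    print("Poisson-lognormal dispersion index:", round(dispersion_index(mixed), 2))
    print("Theoretical Poisson-lognormal index:", round(theoretical_index(1.0, 0.7), 2))
```

Under these assumed settings the plain Poisson ratio should sit near 1, while the Poisson-lognormal ratio should be well above 1, which is the pattern the abstract calls overdispersion.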
This thesis addresses the following general question: What are the practical implications of using maximum likelihood procedures based on the negative binomial model when the correct distributional model is the Poisson-lognormal? To answer this question we investigate the robustness, bias, and confidence interval coverage of these procedures. A summary conclusion of these extensive studies is that the widely used negative binomial procedure underestimates the variability when the data are from a Poisson-lognormal distribution, leading to hypothesis tests and confidence intervals that are anti-conservative. To set this problem in a classical hypothesis-testing framework, a new hierarchical nonlinear model is developed that includes both the Poisson-lognormal and the Poisson-gamma model within the generalized model's parameter space. Estimation and hypothesis testing can then be carried out using nonlinear mixed procedures available in advanced computational packages.
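For orientation, the two overdispersion models named in the abstract can be written as one Poisson hierarchy that differs only in the distribution of a multiplicative random effect. The notation below ($\mu_i$, $\varepsilon_i$, $\alpha$, $\sigma^2$) is an assumed textbook parameterization, not the thesis's own; the abstract does not give the form of the generalized model that contains both, so none is shown.

```latex
% Shared hierarchy (assumed notation): Poisson counts with a multiplicative random effect
Y_i \mid \varepsilon_i \sim \mathrm{Poisson}(\mu_i \varepsilon_i),
\qquad \log \mu_i = x_i^{\top}\beta .

% Poisson-gamma (negative binomial): gamma random effect with mean 1 and variance \alpha
\varepsilon_i \sim \mathrm{Gamma}(\text{shape} = 1/\alpha,\ \text{scale} = \alpha)
\;\Longrightarrow\;
\mathrm{E}[Y_i] = \mu_i, \qquad
\mathrm{Var}(Y_i) = \mu_i + \alpha \mu_i^{2} .

% Poisson-lognormal: normal random effect on the log scale
\varepsilon_i = e^{u_i}, \quad u_i \sim \mathcal{N}(0, \sigma^{2})
\;\Longrightarrow\;
\mathrm{E}[Y_i] = \mu_i e^{\sigma^{2}/2}, \qquad
\mathrm{Var}(Y_i) = \mathrm{E}[Y_i] + \bigl(e^{\sigma^{2}} - 1\bigr)\,\mathrm{E}[Y_i]^{2} .
```

Both variances follow from $\mathrm{Var}(Y_i) = \mathrm{E}[\mathrm{Var}(Y_i \mid \varepsilon_i)] + \mathrm{Var}(\mathrm{E}[Y_i \mid \varepsilon_i])$, and both exceed the mean whenever $\alpha > 0$ or $\sigma^2 > 0$.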