STAT 8620 Advanced Statistical Applications I
Lecture Notes

The goal of this course is to teach methods for the analysis of discrete response data and to develop a general framework for the analysis of discrete data and other data types for which the assumptions of the classical linear model (CLM) do not hold. Such a framework is provided by a class of models known as generalized linear models, or GLMs.

GLMs extend the class of CLMs (a.k.a. normal-theory linear models, Gauss-Markov models, or, confusingly, general linear models).

CLMs can be extended in other important ways. One is the inclusion of both traditional (fixed) regression parameters and random parameters, better known as random effects. Such linear mixed models (or LMMs) are very useful for handling correlation and multiple sources of variability (covered in STAT 8630). Another is to allow more general forms of nonlinear regression functions (STAT 8230).

More general classes are possible: generalized linear mixed models (GLMMs; covered in STAT 8630), nonlinear mixed models (NLMMs; covered in STAT 8230), etc.

In this class, however, we concentrate on GLMs and extensions of GLMs suitable for the analysis of independent, discrete (or otherwise non-normal) data.

But before we can talk about modeling discrete data, we need to begin by introducing some pre-model ideas: descriptive statistics, measures of association, model-free inference in simple data tables, etc.

Non-model-based Concepts and Methods for Discrete Data:

What do we mean by discrete data?

A discrete random variable is a random variable that can take on a finite or countably infinite number of possible values.

In practice, all random variables are discrete, due to limitations in the precision of measurement. Typically, though, variables that theoretically have an underlying continuous scale are treated as continuous in statistical analyses - unless the scale of measurement is extremely coarse.
E.g., weight, height, time elapsed.

The practical basis of the distinction is whether the variable takes on enough values with positive probability to be well approximated by a continuous distribution.

Therefore, we're concerned with random variables that can assume only a small number of values. This includes many qualitative, or nominally scaled, categorical variables: Religion (Christian, Hindu, Jewish), Gender (male, female), etc. It also includes ordinally scaled categorical variables: Agreement (strongly agree, agree, neutral, disagree, strongly disagree), Pain (mild, moderate, severe), etc.

As presented, none of these characteristics (Religion, Agreement, etc.) are even, strictly speaking, random variables. They only become so, and become analyzable, when we assign numbers to their values....
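As a rough illustration of what "assigning numbers to their values" can look like, here is a minimal Python sketch. The particular codes, the sample responses, and the helper names `encode` and `indicator_columns` are illustrative assumptions, not taken from the notes. For a nominal variable the numbers are labels only, so 0/1 indicator columns are one common coding; for an ordinal variable, integer scores that preserve the ordering are one possible choice among many.

```python
# Hypothetical coding schemes; the specific numbers are illustrative assumptions.
RELIGION_CODES = {"Christian": 0, "Hindu": 1, "Jewish": 2}  # nominal: labels only
PAIN_SCORES = {"mild": 1, "moderate": 2, "severe": 3}       # ordinal: order matters


def encode(values, coding):
    """Map each category label to its assigned number."""
    return [coding[v] for v in values]


def indicator_columns(values, levels):
    """Build one 0/1 indicator per level for each observation."""
    return [[1 if v == level else 0 for level in levels] for v in values]


# Made-up responses, used only to show the mechanics.
religion = ["Hindu", "Christian", "Jewish", "Hindu"]
pain = ["severe", "mild", "moderate", "mild"]

religion_coded = encode(religion, RELIGION_CODES)
pain_coded = encode(pain, PAIN_SCORES)
religion_indicators = indicator_columns(religion, list(RELIGION_CODES))

print(religion_coded)
print(pain_coded)
print(religion_indicators)
```

Differences between the nominal codes carry no meaning (Jewish minus Christian equal to 2 says nothing), whereas the ordering of the pain scores does; the spacing between the scores is still an arbitrary choice.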
- Fall '11