02EDA - PubH8452 Longitudinal Data Analysis - Fall 2011...

PubH8452 Longitudinal Data Analysis - Fall 2011
Exploratory Data Analysis

Goals of EDA
  - Relationship between mean response and covariates (including time).
  - Variance, correlation structure, individual-level heterogeneity.

Guidelines for graphical displays of longitudinal data
  - Show relevant raw data, not just summaries.
  - Highlight aggregate patterns of scientific interest.
  - Identify both cross-sectional and longitudinal patterns.
  - Identify unusual individuals and observations.

General Techniques
  - Scatter plots, use connected lines to reveal individual profiles.
      - Displays of the responses against time.
      - Displays of the responses against a covariate (with/without time trend being removed).
  - Use smooth curves to reveal mean response profile, at the population level.
      - Kernel estimation
      - Smoothing spline
      - Lowess
  - Variograms, for checking variance/covariance structure.
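
The first technique above, connecting each individual's points to reveal profiles, could be sketched roughly as follows. This is only an illustration on simulated data: the data frame `sim`, its columns `id`, `time`, and `y`, and all parameter values are assumptions made for the example, not part of the course data.

```r
# Simulated longitudinal data (hypothetical; not the CD4 data set):
# 20 subjects, each measured at 5 time points.
set.seed(1)
n_subj <- 20
times  <- 0:4
sim <- expand.grid(time = times, id = seq_len(n_subj))

# Subject-specific intercepts and slopes create individual-level heterogeneity.
b0 <- rnorm(n_subj, mean = 10, sd = 2)
b1 <- rnorm(n_subj, mean = -1, sd = 0.5)
sim$y <- b0[sim$id] + b1[sim$id] * sim$time + rnorm(nrow(sim), sd = 0.5)

# Scatter plot of all responses against time (the raw data).
plot(y ~ time, data = sim, pch = 16, col = "grey40",
     xlab = "Time", ylab = "Response")

# One connected line per subject reveals the individual profiles.
for (i in unique(sim$id)) {
  lines(y ~ time, data = sim[sim$id == i, ], col = "grey70")
}

# Overlay the cross-sectional mean at each time point (aggregate pattern).
mean_profile <- tapply(sim$y, sim$time, mean)
lines(times, mean_profile, lwd = 3)
```

With balanced data like this, the pointwise mean is a reasonable summary; with irregular observation times a smooth curve would usually be drawn instead.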

Scatter-plots

  CD4 <- read.table("data/cd4.dat", header = TRUE)
  plot(CD4 ~ Time, data = CD4, pch = ".",
       xlab = "Years since seroconversion", ylab = "CD4+ cell number")

[Figure: scatter plot of CD4+ cell number (axis ticks 0 to 3000) against years since seroconversion (axis ticks -2 to 4).]

Fitting Smooth Curve to Longitudinal Data

Nonparametric regression models can be used to estimate the mean response profile as a function of time. Assuming that we have a single observation $y_i$ at time $t_i$, we want to estimate an unknown mean response curve $\mu(t)$ in the underlying model:

$$Y_i = \mu(t_i) + \epsilon_i, \quad i = 1, \ldots, m,$$

where the $\epsilon_i$ are independent errors with mean zero.

Common smoothing techniques include:
  1. Kernel smoothing.
  2. Smoothing spline.
  3. Loess (local regression).
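
As a rough illustration, the three smoothers listed above are all available in base R's `stats` package (`ksmooth`, `smooth.spline`, `lowess`). The sketch below applies them to simulated data; the mean curve `mu_true`, the sample size, the bandwidth, and the span `f` are arbitrary assumptions chosen for the example, not recommended settings.

```r
# Simulated data from Y_i = mu(t_i) + e_i (hypothetical mean curve and noise).
set.seed(2)
m       <- 200
t_obs   <- sort(runif(m, min = -2, max = 4))
mu_true <- function(t) 1000 - 150 * t + 40 * t^2
y_obs   <- mu_true(t_obs) + rnorm(m, sd = 100)

# Grid of time points at which to evaluate the fitted curves.
t_grid <- seq(-2, 4, length.out = 101)

# 1. Kernel smoothing with a Gaussian ("normal") kernel.
fit_kernel <- ksmooth(t_obs, y_obs, kernel = "normal",
                      bandwidth = 1, x.points = t_grid)

# 2. Smoothing spline; the smoothing parameter is chosen automatically.
fit_spline <- smooth.spline(t_obs, y_obs)

# 3. Lowess (locally weighted regression); f is the smoother span.
fit_lowess <- lowess(t_obs, y_obs, f = 2 / 3)

# Raw data with the three estimated mean curves overlaid.
plot(t_obs, y_obs, pch = ".", xlab = "Time", ylab = "Response")
lines(fit_kernel, lty = 1)
lines(predict(fit_spline, t_grid), lty = 2)
lines(fit_lowess, lty = 3)
legend("topright", lty = 1:3,
       legend = c("Kernel", "Smoothing spline", "Lowess"))
```

Note that the model above assumes independent errors, while repeated measures on the same subject are typically correlated, so curves like these are best read as exploratory summaries of the mean.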

Kernel Smoothing

  1. Select a window centered at time $t$.
  2. $\hat{\mu}(t)$ is the average of the $Y$ values of all points within that window.
  3. To obtain an estimator of the smooth curve at every time point, slide the window from the extreme left to the extreme right, calculating the average of the points within the window each time (a moving average).
  4. This "boxcar" approach is equivalent to computing $\hat{\mu}(t)$ as a weighted average of the $y_i$'s with weights equal to zero or one. This may yield curves that are not very smooth. Alternatively, we can use a smooth weighting function that gives more weight to the observations closer to $t$, e.g. the Gaussian kernel $K(u) = \exp(-u^2/2)$.

The kernel estimate is defined as:

$$\hat{\mu}(t) = \frac{\sum_{i=1}^{m} w(t, t_i, h)\, y_i}{\sum_{i=1}^{m} w(t, t_i, h)},$$

where $w(t, t_i, h) = K((t - t_i)/h)$ and $h$ is the bandwidth of the kernel. Larger values of
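
The kernel estimate defined above can be computed directly from its formula. The following is a minimal sketch, not the course's own code: the function names `gaussian_kernel`, `boxcar_kernel`, and `kernel_smooth`, the simulated data, and the bandwidth values are all assumptions made for illustration. The boxcar kernel is written here so that the window is $|t - t_i| \le h$.

```r
# Gaussian kernel K(u) = exp(-u^2 / 2), as in the definition above.
gaussian_kernel <- function(u) exp(-u^2 / 2)

# Boxcar kernel: weight 1 inside the window |t - t_i| <= h, 0 outside.
boxcar_kernel <- function(u) as.numeric(abs(u) <= 1)

# Kernel estimate of mu(t) at each element of t_new:
# a weighted average of y_obs with weights w = K((t - t_i) / h).
# Returns NaN wherever all weights are zero (possible with the boxcar kernel).
kernel_smooth <- function(t_new, t_obs, y_obs, h, K = gaussian_kernel) {
  sapply(t_new, function(t0) {
    w <- K((t0 - t_obs) / h)
    sum(w * y_obs) / sum(w)
  })
}

# Hypothetical simulated data for illustration.
set.seed(3)
t_obs  <- sort(runif(150, min = -2, max = 4))
y_obs  <- 800 + 200 * sin(t_obs) + rnorm(150, sd = 80)
t_grid <- seq(-2, 4, length.out = 121)

mu_box    <- kernel_smooth(t_grid, t_obs, y_obs, h = 0.5, K = boxcar_kernel)
mu_narrow <- kernel_smooth(t_grid, t_obs, y_obs, h = 0.2)
mu_wide   <- kernel_smooth(t_grid, t_obs, y_obs, h = 1)

# Compare the moving average with two Gaussian bandwidths.
plot(t_obs, y_obs, pch = ".", xlab = "Time", ylab = "Response")
lines(t_grid, mu_box, lty = 3)
lines(t_grid, mu_narrow, lty = 2)
lines(t_grid, mu_wide, lty = 1)
legend("topright", lty = c(3, 2, 1),
       legend = c("Boxcar, h = 0.5", "Gaussian, h = 0.2", "Gaussian, h = 1"))
```

Plotting the three curves side by side is one way to see how the choice of kernel and bandwidth affects the estimate, since a wider bandwidth averages over more observations at each time point.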