Bayesian Inference
Prior Distribution
In the Bayesian approach it is assumed that the investigator has some idea about the value of the parameter θ. This 'idea' is captured through a prior distribution π on the parameter space Θ. We will denote the prior density by π(θ).
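As a minimal sketch of how a prior is combined with data (hypothetical numbers, not from the slides): with a conjugate Beta(a, b) prior on a binomial success probability p and x successes in n trials, the posterior is Beta(a + x, b + n − x).

```python
# Hypothetical example: Beta prior on a binomial success probability p.
# By conjugacy, prior Beta(a, b) + data (x successes in n trials)
# gives posterior Beta(a + x, b + n - x).
a, b = 2.0, 2.0          # prior Beta(2, 2), centred at 1/2
n, x = 10, 7             # observed: 7 successes in 10 trials

post_a, post_b = a + x, b + (n - x)
posterior_mean = post_a / (post_a + post_b)   # posterior mean of p
```

The posterior mean (a + x)/(a + b + n) is a compromise between the prior mean a/(a + b) and the sample proportion x/n.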
Non-Parametric Regression & Time Series Analysis
Nadaraya-Watson Kernel Estimator
Let h > 0 be a positive number, called the bandwidth. The Nadaraya-Watson kernel estimator is defined by

r̂_n(x) = Σ_i ℓ_i(x) Y_i,

where K is a kernel and the weights are

ℓ_i(x) = K((x − X_i)/h) / Σ_j K((x − X_j)/h).
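A short sketch of the estimator, using a Gaussian kernel and made-up data points (both are assumptions for illustration, not part of the slides):

```python
import math

def nw_estimate(x, xs, ys, h):
    """Nadaraya-Watson estimate r_n(x) with a Gaussian kernel K."""
    k = [math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in xs]
    s = sum(k)
    # weights l_i(x) = K((x - X_i)/h) / sum_j K((x - X_j)/h)
    return sum(ki / s * yi for ki, yi in zip(k, ys))

xs = [0.0, 1.0, 2.0, 3.0]     # hypothetical design points X_i
ys = [0.0, 1.0, 4.0, 9.0]     # hypothetical responses Y_i
r_mid = nw_estimate(1.5, xs, ys, h=0.5)
```

Since the weights sum to 1, the estimate at any x is a convex combination of the Y_i, dominated by points within roughly one bandwidth of x.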
Logistic Regression
Simple Logistic Regression
Let Y = response variable and X = explanatory variable, where Y takes only two values, 0 (failure) and 1 (success). Let π(x) = P(Y = 1 | X = x) = probability of success when X = x. Define

logit(π(x)) = ln[ π(x) / (1 − π(x)) ].
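A small sketch of the logit and its inverse, with hypothetical coefficients β0, β1 (not fitted to any real data) standing in for a fitted model logit(π(x)) = β0 + β1 x:

```python
import math

def logit(p):
    """logit(p) = ln(p / (1 - p)) for 0 < p < 1."""
    return math.log(p / (1.0 - p))

def inv_logit(z):
    """Inverse logit: recover pi(x) from the linear predictor z."""
    return 1.0 / (1.0 + math.exp(-z))

beta0, beta1 = -1.0, 0.5      # hypothetical coefficients
x = 2.0
pi_x = inv_logit(beta0 + beta1 * x)   # P(Y = 1 | X = 2) under this model
```

At x = 2 the linear predictor is β0 + β1·x = 0, so the modelled success probability is exactly 1/2, illustrating that logit(π) = 0 corresponds to π = 0.5.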
Linear Regression
Normality of Errors Assumption
In the simple linear regression model it is often assumed that the errors ε_i are i.i.d. N(0, σ²). This implies (in the fixed-x model) Y_i ~ N(α + β x_i, σ²). We have already seen that if n is large, a …
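A quick simulation check of the implication (with hypothetical values of α, β, σ and a single fixed x_i): simulating Y_i = α + β x_i + ε_i with ε_i ~ N(0, σ²), the sample mean and variance of the Y_i should be close to α + β x_i and σ².

```python
import random

random.seed(0)
alpha, beta, sigma = 1.0, 2.0, 0.5    # hypothetical parameter values
xi = 3.0                              # a fixed design point

# simulate Y_i = alpha + beta*x_i + eps_i with eps_i ~ N(0, sigma^2)
ys = [alpha + beta * xi + random.gauss(0.0, sigma) for _ in range(100_000)]
mean_y = sum(ys) / len(ys)                              # approx alpha + beta*xi
var_y = sum((y - mean_y) ** 2 for y in ys) / len(ys)    # approx sigma^2
```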
ROBUST STATISTICS
OUTLIER RESISTANT METHODS
What is an Outlier?
An outlying observation, or outlier, is one that appears to deviate markedly from the other members of the sample in which it occurs. Example: 2, 3, 6, 2, 3, 100, 5, 2. The observation 100 is an outlier.
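The sample above makes the point numerically: the mean is dragged far from the bulk of the data by the single outlier, while the median is resistant to it.

```python
# The sample from the slide, including the outlier 100.
data = [2, 3, 6, 2, 3, 100, 5, 2]

mean = sum(data) / len(data)     # pulled far upward by the outlier

s = sorted(data)                 # [2, 2, 2, 3, 3, 5, 6, 100]
median = (s[len(s) // 2 - 1] + s[len(s) // 2]) / 2   # resistant to 100
```

Here the mean is 15.375, far above every observation except 100, while the median is 3.0, squarely inside the bulk of the sample.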
Goodness of Fit Tests
Multinomial Distribution
We say X = (X₁, …, X_k) follows a multinomial distribution with parameters n, p₁, …, p_k (p₁ + ⋯ + p_k = 1) if

P(X₁ = x₁, …, X_k = x_k) = [ n! / (x₁! ⋯ x_k!) ] · p₁^{x₁} ⋯ p_k^{x_k},

where x₁, …, x_k ≥ 0 are integers with x₁ + ⋯ + x_k = n.
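A direct transcription of the pmf above into code (the example counts and probabilities are made up for illustration):

```python
from math import factorial

def multinomial_pmf(xs, ps):
    """P(X1=x1,...,Xk=xk) = n!/(x1!...xk!) * p1^x1 * ... * pk^xk."""
    n = sum(xs)
    coeff = factorial(n)
    for x in xs:
        coeff //= factorial(x)          # multinomial coefficient
    prob = float(coeff)
    for x, p in zip(xs, ps):
        prob *= p ** x
    return prob

p_equal = multinomial_pmf([1, 1, 1], [1/3, 1/3, 1/3])   # 3! * (1/3)^3 = 2/9
p_binom = multinomial_pmf([2, 1], [0.5, 0.5])           # binomial special case
```

With k = 2 the formula reduces to the binomial pmf, as the second call illustrates.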
Permutation Tests
Permutation Tests
It is a non-parametric method for testing whether two distributions are the same. Suppose we have a random sample of size m from a population with cdf F_X and another independent random sample of size n from a population with cdf F_Y …
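A minimal sketch of a two-sample permutation test using the difference of means as the test statistic (the statistic, the number of permutations, and the example data are all choices made here for illustration):

```python
import random

def perm_test(x, y, n_perm=5000, seed=0):
    """Two-sample permutation test on the difference of means.

    Returns the fraction of random relabellings whose statistic is
    at least as extreme as the observed one (an approximate p-value).
    """
    rng = random.Random(seed)
    observed = abs(sum(x) / len(x) - sum(y) / len(y))
    pooled = x + y
    m = len(x)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)             # random relabelling of the pooled data
        xs, ys = pooled[:m], pooled[m:]
        if abs(sum(xs) / m - sum(ys) / len(ys)) >= observed:
            count += 1
    return count / n_perm

pval_null = perm_test([1, 2, 3, 4], [2, 3, 4, 5])   # overlapping samples
```

Under H0 (same distribution), every relabelling of the pooled data is equally likely, which is what justifies comparing the observed statistic with its permutation distribution.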
Hypothesis Testing and P-values II
Likelihood Ratio Test
Consider testing H₀ : θ ∈ Θ₀ versus H₁ : θ ∉ Θ₀. The likelihood ratio statistic is

λ = 2 ln [ sup_{θ∈Θ} L(θ) / sup_{θ∈Θ₀} L(θ) ] = 2 ln [ L(θ̂) / L(θ̂₀) ],

where θ̂ is the mle and θ̂₀ is the mle when θ is restricted to Θ₀.
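A worked sketch for one concrete case chosen here for illustration: testing H₀ : μ = μ₀ for a N(μ, σ²) sample with σ known, where the statistic simplifies to λ = n(x̄ − μ₀)²/σ².

```python
def lrt_normal_mean(xs, mu0, sigma):
    """Likelihood ratio statistic for H0: mu = mu0, data N(mu, sigma^2),
    sigma known.  Closed form: lambda = n * (xbar - mu0)^2 / sigma^2."""
    n = len(xs)
    xbar = sum(xs) / n               # unrestricted mle of mu
    # log-likelihoods up to the same additive constant:
    ll_hat = -sum((x - xbar) ** 2 for x in xs) / (2 * sigma ** 2)
    ll_0 = -sum((x - mu0) ** 2 for x in xs) / (2 * sigma ** 2)
    return 2 * (ll_hat - ll_0)

# hypothetical sample with xbar = 1, n = 4, sigma = 1:
lam = lrt_normal_mean([0.5, 1.5, 1.0, 1.0], mu0=0.0, sigma=1.0)
```

Here λ = 4 · (1 − 0)² / 1 = 4, matching the closed form, and under H₀ the statistic is approximately chi-squared with 1 degree of freedom.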
Parametric Inference I
Parametric Family of Distributions
Let ℱ = { f(x; θ) : θ ∈ Θ }, where Θ ⊆ ℝ^k. The set Θ is called the parameter space and θ = (θ₁, …, θ_k) is the parameter. ℱ is called a parametric family of distributions indexed by the parameter θ. Example: …
The Bootstrap
Introduction
The bootstrap is a computational method for obtaining an estimate of the standard error of an estimator. It can also be used to obtain confidence intervals. The bootstrap technique was invented by B. Efron in 1979.
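A minimal sketch of the basic bootstrap standard-error recipe (the data and the number of resamples are choices made here for illustration): resample the data with replacement, recompute the statistic on each resample, and take the standard deviation of the replicates.

```python
import random
import statistics

def bootstrap_se(xs, stat, n_boot=2000, seed=0):
    """Bootstrap estimate of the standard error of stat(xs)."""
    rng = random.Random(seed)
    reps = []
    for _ in range(n_boot):
        # draw a resample of the same size, with replacement
        resample = [rng.choice(xs) for _ in xs]
        reps.append(stat(resample))
    return statistics.stdev(reps)       # sd of the bootstrap replicates

data = [3.1, 2.4, 5.8, 4.4, 3.9, 2.7, 4.1, 3.3]   # hypothetical sample
se_mean = bootstrap_se(data, statistics.mean)
```

For the sample mean the bootstrap SE should land near the usual formula s/√n, which is roughly 0.39 for this sample; for statistics without a simple SE formula (medians, trimmed means) the same recipe applies unchanged.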
Empirical CDF and Statistical Functionals
The Empirical Distribution Function
The empirical distribution function F̂_n is the CDF that puts mass 1/n at each data point X_i:

F̂_n(x) = (no. of points ≤ x) / n = (1/n) Σ_{i=1}^n I(X_i ≤ x),

where I(X_i ≤ x) = 1 if X_i ≤ x and 0 otherwise.
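The definition translates directly into code (the sample is made up for illustration):

```python
def ecdf(xs):
    """Return F_n, the empirical CDF of the sample xs."""
    n = len(xs)
    def F_n(x):
        # (number of points <= x) / n
        return sum(1 for xi in xs if xi <= x) / n
    return F_n

F = ecdf([3, 1, 4, 1, 5])   # hypothetical sample of size 5
```

F is a right-continuous step function: it is 0 below the smallest observation, jumps by 1/n at each data point (2/n at the tied value 1), and equals 1 from the largest observation on.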
Generation of Random Variates
Probability Integral Transform
Suppose F is a cdf. Let F⁻(u) = inf{ x : F(x) ≥ u }. If U ~ Unif(0,1), then the random variable F⁻(U) has the distribution F.
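A small sketch of the generalized inverse for a discrete cdf, using a fair die as the target distribution (the example and the tabulated-cdf representation are choices made here for illustration):

```python
import random

def generalized_inverse(cdf_points, u):
    """F^-(u) = inf{x : F(x) >= u}, for a discrete cdf given as
    sorted (x, F(x)) pairs."""
    for x, Fx in cdf_points:
        if Fx >= u:
            return x
    raise ValueError("u exceeds the maximum of F")

# cdf of a fair die roll: F(k) = k/6 for k = 1, ..., 6
die_cdf = [(k, k / 6) for k in range(1, 7)]

rng = random.Random(1)
draws = [generalized_inverse(die_cdf, rng.random()) for _ in range(6000)]
frac_ones = draws.count(1) / len(draws)   # should be near 1/6
```

Feeding Unif(0,1) variates through F⁻ produces draws from F, so each face should appear with frequency close to 1/6.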
Exponential Distribution
If U is distributed as Uniform(0,1), then X = −ln(1 − U)/λ has the Exponential(λ) distribution; since 1 − U is also Uniform(0,1), X = −ln(U)/λ works equally well.
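A direct application of the probability integral transform for the exponential case, inverting F(x) = 1 − e^{−λx} (the value of λ and the sample size are choices made here for illustration):

```python
import math
import random

def exp_variate(lam, rng):
    """Inverse-transform draw: X = -ln(U)/lam ~ Exponential(lam)
    for U ~ Unif(0,1)."""
    return -math.log(rng.random()) / lam

rng = random.Random(42)
lam = 2.0
sample = [exp_variate(lam, rng) for _ in range(100_000)]
mean = sum(sample) / len(sample)   # should be near 1/lam = 0.5
```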
Parametric Inference II
Kullback-Leibler distance
If f and g are pdf's, the Kullback-Leibler distance between f and g is defined to be

D(f, g) = ∫ f(x) ln( f(x) / g(x) ) dx.

It can be shown that D(f, g) ≥ 0 and D(f, f) = 0. For θ, ψ ∈ Θ, we shall write D(θ, ψ) …
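As a numerical check of the definition (the choice of normal densities, grid, and integration range are assumptions made here for illustration): for two N(μ, σ²) densities with the same σ, the integral has the closed form (μ₁ − μ₂)²/(2σ²), which a simple midpoint-rule approximation should reproduce.

```python
import math

def normal_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def kl_numeric(mu1, mu2, sigma, lo=-10.0, hi=10.0, steps=20_000):
    """D(f, g) = integral of f(x) ln(f(x)/g(x)) dx, via the midpoint rule."""
    dx = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * dx
        f = normal_pdf(x, mu1, sigma)
        g = normal_pdf(x, mu2, sigma)
        total += f * math.log(f / g) * dx
    return total

d = kl_numeric(0.0, 1.0, 1.0)   # closed form: (0 - 1)^2 / (2 * 1^2) = 0.5
```

The result is near 0.5 as the closed form predicts, and taking g = f gives 0, illustrating D(f, g) ≥ 0 with D(f, f) = 0. Note D is not symmetric in f and g, so it is a "distance" only in an informal sense.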