... frequency of $D_i > d$. The more $i$'s that satisfy this, the more acceptable our $\hat{F}$ will be.
The following are examples of how to perform a KS test using R. The first
method uses R code to calculate a p-value.
#set up for the data of interest
#sorted random exponential with rate 1
x <- sort(rexp(10, rate=1))
#Plots the empirical CDF
plot(ecdf(x))
#Overlays the exponential CDF fitted with the estimated rate 1/mean(x)
lines(x, pexp(x, rate=1/mean(x)))
#Calculates the KS test against the fitted exponential
ks.test(x, "pexp", 1/mean(x))
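A small sketch of how the pieces of the ks.test result can be pulled out, in case the observed distance d is needed on its own. The names res, d_obs and p_val are illustrative, and set.seed is only there to make the run repeatable. R's documentation for ks.test states that the reported p-value assumes the distribution's parameters were specified in advance rather than estimated from the data, so with the rate 1/mean(x) it is best read as approximate.

```r
set.seed(340)                        # illustrative seed, not from the notes
x <- sort(rexp(10, rate = 1))

# Store the test result instead of only printing it
res <- ks.test(x, "pexp", 1/mean(x))

d_obs <- unname(res$statistic)       # observed KS distance d
p_val <- res$p.value                 # p-value reported by ks.test

print(c(d = d_obs, p.value = p_val))
```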
The following is the code for the KS test. As an exercise, create your own
code for the KS test.
#****************************************************
#KSEXP will perform a KS test and return the p-value
#  data: the observed data
#  N:    the number of datasets to generate
#  d:    the KS distance calculated from the observed data
KSEXP <- function(data, N, d)
{
  data <- sort(data)
  #Tells us how many data values there are.
  n <- length(data)
  #The MLE for the rate in an exponential.
  rate <- 1/mean(data)
  #ks is the set of all ks distances
  ks <- NULL
  #We create N datasets from an EXP
  # For each of these datasets we
  # calculate a KS distance
  for (i in 1:N)
  {
    #generate data from EXP(rate)
    #rdata is the random data that we know is EXPONENTIAL
    rdata <- sort(rexp(n, rate=rate))
    #The rate we would calculate from our EXPONENTIAL data
    rrate <- 1/mean(rdata)
    #Calculation of the KS distance
    F <- 1 - exp(-rrate*rdata)
    F_hat1 <- c(1:n)/n
    F_hat2 <- c(0:(n-1))/n
    D <- max(abs(F - F_hat1), abs(F - F_hat2))
    #Append to the vector a new KS distance.
    ks <- c(ks, D)
  } #end for
  #Outputs how often the KS distance is larger than the one you calculated
  # from your RAW data.
  Pvalue <- mean(ks > d)
  return(Pvalue)
}#end fn KSEXP
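One way the simulation might be used end to end, sketched under assumptions: the helper names ks_dist_exp and ks_pvalue_exp are hypothetical and not from the notes, and the simulation is a compact rewrite using replicate rather than an explicit for loop. The observed distance d is computed with the same formula as the simulated distances and is then compared against N of them.

```r
# Hypothetical helper: KS distance between a sample and the exponential
# CDF whose rate is estimated by 1/mean(x).
ks_dist_exp <- function(x) {
  x <- sort(x)
  n <- length(x)
  Fx <- 1 - exp(-x / mean(x))        # fitted exponential CDF at the data
  max(abs(Fx - (1:n)/n), abs(Fx - (0:(n - 1))/n))
}

# Hypothetical wrapper: simulate N exponential datasets of the same size
# and report how often their KS distance exceeds the observed one.
ks_pvalue_exp <- function(x, N = 1000) {
  d <- ks_dist_exp(x)
  rate <- 1/mean(x)
  sims <- replicate(N, ks_dist_exp(rexp(length(x), rate = rate)))
  list(d = d, p.value = mean(sims > d))
}

set.seed(1)                          # illustrative seed
x <- rexp(10, rate = 1)
out <- ks_pvalue_exp(x, N = 1000)
print(out)
```

The distance returned here should agree with the statistic reported by ks.test(x, "pexp", 1/mean(x)), which gives a quick check on a hand-written version.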
To calculate the p-value $P(D \geq d)$ we want the percentage of times $D_i$
is bigger than $d$ (that is, the percentage of the time that the generated data fits worse than our observed data).
\[
P(D \geq d) \approx \frac{\text{number of times } D_i \text{ is bigger than } d}{n} = \frac{1}{n} \sum_{i=1}^{n} I(D_i \geq d)
\]
Here $n$ is the number of generated datasets (N in the KSEXP code), not the sample size.
If our p-value is small, then $d$ is too large and we reject our hypothesis. We
use our usual 5% rejection rule. Since the estimated p-value depends on the
randomly generated sets of values, the answer will vary every time the test is done.
To reduce the variation, we generate a large number of sets. We will still see
variation in the estimate; however, the larger the number of generated sets,
the smaller the variation will be.
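The effect of the number of generated sets can be checked with a short experiment. This is a sketch under assumptions: the observed distance d = 0.25 and the sample size of 10 are made-up illustration values, and sim_dist and est_pvalue are hypothetical helper names. The p-value estimate is recomputed 20 times for a small and for a large number of generated sets, and the spread of the two groups of estimates is compared.

```r
# Hypothetical helper: one simulated KS distance for an exponential sample
# of size n, with the rate re-estimated from the simulated data.
# Rate 1 is used for generation; x/mean(x) is unchanged by rescaling x,
# so the choice of rate does not affect the distance.
sim_dist <- function(n) {
  x <- sort(rexp(n, rate = 1))
  Fx <- 1 - exp(-x / mean(x))
  max(abs(Fx - (1:n)/n), abs(Fx - (0:(n - 1))/n))
}

# Hypothetical helper: p-value estimate from N simulated distances.
est_pvalue <- function(N, n, d) mean(replicate(N, sim_dist(n)) > d)

set.seed(2012)                       # illustrative seed
n <- 10      # assumed sample size
d <- 0.25    # assumed observed KS distance (illustration only)

p_small <- replicate(20, est_pvalue(50, n, d))     # 50 generated sets
p_large <- replicate(20, est_pvalue(2000, n, d))   # 2000 generated sets

# Spread of the estimates for each number of generated sets
print(c(sd_N50 = sd(p_small), sd_N2000 = sd(p_large)))
```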
This note was uploaded on 09/27/2013 for the course STATS 340 taught by Professor Riley during the Winter '12 term at Waterloo.