MIE360 Computer Modeling and Simulation Lecture Notes
Daniel Frances © 2010

Lecture 18 – Testing if data is IID (“identically and independently” distributed)

In this lecture we only introduce some terminology and tests to determine whether there might be a problem. If the data is not IID we do not offer any constructive advice; dealing with non-IID data is beyond the scope of the course.

Identically distributed – for one variable

This means we must be able to assume that all the data was generated by the same underlying distribution, and that this distribution did not change over the period of data collection. In other words, the process generating the random data has to be assumed constant. For example, when collecting the actual data we need to make sure we collect it over a homogeneous period.

Suppose you collected data on two different weekdays, and you want an objective statistical test that might detect a lack of homogeneity. There is such a test: the Kruskal-Wallis (K-W) test for homogeneity.

Suppose we have k independent samples of size n_i, i = 1…k, with n = n_1 + … + n_k observations in total. Let R(X_ij) be the rank of observation X_ij when all n observations are pooled and sorted in increasing order.

Suppose the first 3 values 3.5, 4.2, 3.2 were collected on one day type and the other two, 5.1 and 2.8, were collected on a different day type.

The K-W test statistic is

T = [12/(n(n+1))] Σ_{i=1…k} (1/n_i) [Σ_j R(X_ij)]^2 − 3(n+1)

If all samples are from the same population then T is approximately χ^2 distributed with k−1 d.f.

k = 2 = no. of groups
n = total no. of observations = 5
n_1 = 3, n_2 = 2
X_1j = 3.5, 4.2, 3.2
X_2j = 5.1, 2.8
R(X_1j) = 3, 4, 2
R(X_2j) = 5, 1

T = [12/(5*6)]{ (1/3)(3+4+2)^2
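
The ranking and rank-sum steps of the K-W example can also be checked with a few lines of Python. This is only a minimal sketch, not part of the original notes: the function name kruskal_wallis_statistic is hypothetical, and the code assumes there are no tied values (ties would need average ranks and a correction term).

```python
def kruskal_wallis_statistic(samples):
    """Return the K-W statistic T for a list of samples.

    Assumption (not from the notes): no two observations are tied.
    """
    # Pool all observations, remembering which sample each one came from.
    pooled = sorted((x, i) for i, sample in enumerate(samples) for x in sample)
    n = len(pooled)

    # Sum of ranks per sample; rank 1 goes to the smallest pooled value.
    rank_sums = [0] * len(samples)
    for rank, (_, i) in enumerate(pooled, start=1):
        rank_sums[i] += rank

    # T = 12/(n(n+1)) * sum_i (rank sum of sample i)^2 / n_i - 3(n+1)
    total = sum(r * r / len(s) for r, s in zip(rank_sums, samples))
    return 12.0 / (n * (n + 1)) * total - 3 * (n + 1)


# The two day types from the example above.
day_type_1 = [3.5, 4.2, 3.2]
day_type_2 = [5.1, 2.8]
T = kruskal_wallis_statistic([day_type_1, day_type_2])
print(T)
```

The resulting T would then be compared with a χ^2 quantile with k−1 d.f., as described above; a large T suggests the samples do not come from the same population.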

