s340text-KStestOnly

# s340text-KStestOnly - Chapter 3 Validation In many...


## Chapter 3 Validation

Many textbook authors will say something like "The waiting times between buses arriving at a bus stop follow an exponential distribution". But how do we know this is true? It's all well and good to say that a set of data follows a particular distribution, but how do we show that it does?

As in most statistical tests, we cannot directly prove something to be true. It is important to remember that hypothesis tests are not conclusive. They test whether a particular trend is present given both the data and a margin of error. So how can we determine what distribution to use in modeling, knowing that we will never effectively be able to prove that a particular set of data has a given distribution?

Consider the following set of completely random data.

| Data Point | Value |
|---|---|
| 1 | 1/4 |
| 2 | 1/2 |
| 3 | 3/4 |
| 4 | 1 |

Pretend that it isn't painfully obvious that this data seems to follow a uniform distribution with parameters (0,1). How can we examine the data in such a way that we can pick a distribution that seems to fit? The best solution is to graph it and compare it to known distributions.

What is meant by the *Empirical CDF*? It is a cumulative distribution function that is determined entirely by the data set it represents. Notice that it is a step function taking values from 0 to 1.

Even though we cannot strictly prove that this is a Uniform(0,1) distribution, we can show that it is likely that it is. Given sufficiently many data points and a small enough margin of error, it becomes more reasonable to use certain models despite their not necessarily being the "right" one.

There are two major tests that we have available for our data. One assumes that our data is discrete and one assumes that the data is continuous.
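The empirical CDF described above can be sketched in a few lines of Python. This is only an illustration, not the text's own procedure: the function names `ecdf`, `uniform_cdf`, and `max_gap` are hypothetical, and the data values are read as 1/4, 1/2, 3/4, and 1. The sketch builds the step function from the four data points and measures the largest vertical gap between it and the Uniform(0,1) CDF, as one numeric stand-in for "graph it and compare".

```python
from bisect import bisect_right


def ecdf(data):
    """Return the empirical CDF F_n, where F_n(x) = (number of points <= x) / n."""
    ordered = sorted(data)
    n = len(ordered)

    def F_n(x):
        # bisect_right counts how many sorted values are <= x
        return bisect_right(ordered, x) / n

    return F_n


def uniform_cdf(x):
    """CDF of the Uniform(0,1) distribution, clamped to [0, 1]."""
    return min(max(x, 0.0), 1.0)


def max_gap(data, cdf):
    """Largest vertical distance between the empirical CDF and a candidate CDF.

    The empirical CDF jumps at each data point, so the gap is checked
    both just before and just after every jump.
    """
    ordered = sorted(data)
    n = len(ordered)
    gap = 0.0
    for i, x in enumerate(ordered, start=1):
        gap = max(gap, abs(i / n - cdf(x)), abs(cdf(x) - (i - 1) / n))
    return gap


# The four data points from the table (assumed to be 1/4, 1/2, 3/4, 1)
data = [0.25, 0.5, 0.75, 1.0]
F_n = ecdf(data)

for x in [0.0, 0.25, 0.4, 0.5, 0.75, 1.0]:
    print(f"x={x:.2f}  empirical={F_n(x):.2f}  uniform={uniform_cdf(x):.2f}")

print("largest gap:", max_gap(data, uniform_cdf))
```

At each data point the step function and the uniform CDF agree exactly. The gap appears just before each jump, where the step function still holds its previous value.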

### 3.1 χ² Test for Goodness of Fit

For discrete models we can use the χ² Test for Goodness of Fit (sometimes referred to as Pearson's χ² Test). If we have data that can be categorized (into groups called *bins*), then we can use this test to compare a theoretical distribution to that of a set of data. Consider the following example to illustrate the concept of a bin, among other aspects of the test.

A study compares two sub-groups of a population (males and females) and looks at whether or not the members of that population are currently engulfed in flames. The surveying team found 351 participants and recorded the following data:

| | Males | Females | Total |
|---|---|---|---|
| On Fire | 104 | 73 | 177 |
| Not on Fire | 89 | 85 | 174 |
| Total | 193 | 158 | 351 |

The interest of this study is to determine whether or not being on fire is independent of one's sex. Recall that in hypothesis tests it is common to assume that something is true and then see how probable the observed data would be under that assumption. For this experiment, we assume that the two are independent. Let the Null Hypothesis be that sex and being on fire are independent, and let the Alternative Hypothesis be that they are not independent.
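As a rough illustration of how the null hypothesis of independence turns into numbers, the sketch below applies the standard Pearson χ² computation to the four cell counts of the table. It is not taken from the text, and the variable names are hypothetical. Row and column totals are recomputed from the cells, each expected count under independence is (row total × column total) / grand total, and the statistic sums (observed − expected)² / expected over all cells.

```python
# Observed counts from the study: rows = (On Fire, Not on Fire),
# columns = (Males, Females)
observed = [
    [104, 73],
    [89, 85],
]

# Marginal totals, recomputed from the cells rather than copied from the table
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count in each cell if sex and being on fire were independent
expected = [
    [r * c / grand_total for c in col_totals]
    for r in row_totals
]

# Pearson chi-square statistic: sum of (O - E)^2 / E over all cells
chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# Degrees of freedom for an r x c table: (r - 1) * (c - 1)
dof = (len(observed) - 1) * (len(observed[0]) - 1)

print("row totals:", row_totals)
print("column totals:", col_totals)
print("expected counts:", [[round(e, 2) for e in row] for row in expected])
print("chi-square statistic:", round(chi2, 3), "with", dof, "degree(s) of freedom")
```

The statistic would then be compared against a χ² distribution with the printed degrees of freedom to decide whether the data are surprising under the Null Hypothesis.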