Lecture Chapter 1

Preliminaries
The class web site: http://my.wpi.edu
There you will find:
- Course rules and regulations.
- The course syllabus.
- My class lecture notes.
- Homework solutions.
- Links to sites relevant to your study of statistics.

Chapter 1: Introduction to Data Analysis
Preview:
- Data and its science, statistics.
- Stationary and nonstationary processes; displaying data from each.
- Assessing between and within variation.
What’s the IDEA?
- Data have variation.
- The variation has a pattern (the data distribution).
- By analyzing the pattern, we can tell something about the process or population the data came from.

What are data?
Data are facts that convey information. An example is the GLOBAL TEMP data set, which contains the average global surface air temperature in degrees Celsius and Fahrenheit for the years 1880-2005, as well as the anomalies (i.e., differences) of those temperatures from the baseline average computed for the years 1951-1980. (Source: http://data.giss.nasa.gov/gistemp/tabledata/GLB.Ts.txt)
Here is a portion of the data set:

Year   anomaly c   temp c    anomaly f   temp f
1880   -0.11750    13.8825   -0.2115     56.9885
1881   -0.13250    13.8675   -0.2385     56.9615
1882   -0.01333    13.9867   -0.0240     57.1760
1883   -0.04667    13.9533   -0.0840     57.1160
1884   -0.42583    13.5742   -0.7665     56.4335
1885   -0.23667    13.7633   -0.4260     56.7740
...    ...         ...       ...         ...
2001    0.57167    14.5717    1.0290     58.2290
2002    0.68750    14.6875    1.2375     58.4375
2003    0.67000    14.6700    1.2060     58.4060
2004    0.58833    14.5883    1.0590     58.2590
2005    0.76333    14.7633    1.3740     58.5740
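The columns of the excerpt are tied together by simple arithmetic, and a short check makes that concrete. The following is a minimal Python sketch, not part of the original notes. It assumes that an anomaly is the temperature minus the baseline, and it uses the standard Celsius-to-Fahrenheit conversion. The function names are hypothetical, and the baseline of about 14.0 degrees Celsius is only what the rows shown here imply; the notes do not state it.

```python
# Rows copied from the excerpt above:
# (year, anomaly_c, temp_c, anomaly_f, temp_f)
ROWS = [
    (1880, -0.11750, 13.8825, -0.2115, 56.9885),
    (1881, -0.13250, 13.8675, -0.2385, 56.9615),
    (1882, -0.01333, 13.9867, -0.0240, 57.1760),
    (1883, -0.04667, 13.9533, -0.0840, 57.1160),
    (1884, -0.42583, 13.5742, -0.7665, 56.4335),
    (1885, -0.23667, 13.7633, -0.4260, 56.7740),
    (2001, 0.57167, 14.5717, 1.0290, 58.2290),
    (2002, 0.68750, 14.6875, 1.2375, 58.4375),
    (2003, 0.67000, 14.6700, 1.2060, 58.4060),
    (2004, 0.58833, 14.5883, 1.0590, 58.2590),
    (2005, 0.76333, 14.7633, 1.3740, 58.5740),
]


def implied_baseline_c(anomaly_c, temp_c):
    """Baseline implied by one row, assuming anomaly = temperature - baseline."""
    return temp_c - anomaly_c


def anomaly_c_to_f(anomaly_c):
    """A temperature difference converts with the scale factor only (no +32)."""
    return anomaly_c * 9 / 5


def temp_c_to_f(temp_c):
    """An absolute temperature converts with the scale factor and the +32 offset."""
    return temp_c * 9 / 5 + 32


# Print the recomputed values next to the year so they can be compared
# with the corresponding columns of the excerpt.
for year, a_c, t_c, a_f, t_f in ROWS:
    print(
        year,
        round(implied_baseline_c(a_c, t_c), 3),
        round(anomaly_c_to_f(a_c), 4),
        round(temp_c_to_f(t_c), 4),
    )
```

Note that anomalies are differences, so they convert between scales without the 32-degree offset, while the temperatures themselves need it.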

What is statistics?
Statistics is the science of data. The word “statistics” entered the English language in the 1790s as a term to describe the measurement of characteristics of nations or states (hence the “stat” in “statistics”).
How is statistics used?
Today, statistics is used in many areas of human endeavor, such as:
- Designing and analyzing studies to certify the safety and efficacy of new drugs.
- Monitoring the performance of the economy.
- Designing processes to manufacture quality products at low cost.

Displaying a Static Pattern of Variation
A Frequency Histogram shows a Data Distribution, i.e., a static pattern of variation. A histogram of the temperature anomalies in degrees Fahrenheit from the global temperature data set is shown in Figure 1.
Figure 1: Histogram of anomalies, degrees Fahrenheit, global temperature data. (Horizontal axis: anomaly_f, from -1.2 to 1.5; vertical axis: Frequency, from 0 to 30.)
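A frequency histogram is built by counting how many values fall into each of a set of equal-width intervals. The following is a minimal Python sketch of that counting, not the software used to draw Figure 1. It uses only the eleven Fahrenheit anomalies listed in the data excerpt, so its counts are far smaller than those in the figure. The bin start (-1.2) and bin width (0.3) are assumptions taken from the axis labels, and the function name is hypothetical.

```python
import math

# Fahrenheit anomalies from the data excerpt (1880-1885 and 2001-2005 only).
ANOMALY_F = [
    -0.2115, -0.2385, -0.0240, -0.0840, -0.7665, -0.4260,
    1.0290, 1.2375, 1.2060, 1.0590, 1.3740,
]


def frequency_histogram(values, start, width, n_bins):
    """Count the values falling in each bin [start + k*width, start + (k+1)*width)."""
    counts = [0] * n_bins
    for v in values:
        k = math.floor((v - start) / width)
        if 0 <= k < n_bins:  # values outside the binned range are ignored
            counts[k] += 1
    return counts


# Assumed binning: nine bins of width 0.3 covering -1.2 to 1.5.
counts = frequency_histogram(ANOMALY_F, start=-1.2, width=0.3, n_bins=9)

# Text rendering: one row per bin, one star per observation.
for k, c in enumerate(counts):
    lo = -1.2 + 0.3 * k
    print(f"[{lo:5.1f}, {lo + 0.3:5.1f})  {'*' * c}")
```

The histogram records how often each range of values occurs, but it keeps no record of the year in which each value was observed.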

Question: Can this histogram tell us anything about the possible existence of global warming?
Displaying Data Taken Over Time
A Time Series Plot (or Line Plot) shows a pattern of variation evolving over time. Figure 2 displays a time series plot of the temperature anomalies in degrees Fahrenheit from the global temperature data set. Can it tell us anything about the possible existence of global warming?

Figure 2: Time series plot of anomalies, degrees Fahrenheit, global temperature data. (Horizontal axis: Year; vertical axis: anomaly_f.)
Figure 3 is a fake time series plot of the anomalies we have already plotted in Figures 1 and 2. It was formed by reordering the time sequence. Reordering changes when each value occurs but not the values themselves, so Figure 3 has the same histogram as the one shown in Figure 1.
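The claim that a reordered series has the same histogram can be checked directly. The following is a minimal Python sketch, not the procedure used to produce Figure 3. It uses only the eleven Fahrenheit anomalies from the data excerpt, which skips the years 1886-2000. The random shuffle, the seed, the bin settings, and the function name are all assumptions made for illustration.

```python
import math
import random
from collections import Counter

# (year, anomaly_f) pairs from the data excerpt, in time order.
SERIES = [
    (1880, -0.2115), (1881, -0.2385), (1882, -0.0240), (1883, -0.0840),
    (1884, -0.7665), (1885, -0.4260), (2001, 1.0290), (2002, 1.2375),
    (2003, 1.2060), (2004, 1.0590), (2005, 1.3740),
]


def bin_counts(values, start=-1.2, width=0.3):
    """Histogram as a mapping from bin index to count (assumed bin settings)."""
    return Counter(math.floor((v - start) / width) for v in values)


in_time_order = [anomaly for _, anomaly in SERIES]

# A "fake" series: the same values in a shuffled order (seed fixed for repeatability).
reordered = in_time_order[:]
random.Random(0).shuffle(reordered)

# The histogram depends only on which values occur, not on their order.
print("same histogram:", bin_counts(in_time_order) == bin_counts(reordered))
print("time order:", in_time_order)
print("reordered: ", reordered)
```

In time order, the negative anomalies in this excerpt all come from the 1880s and the positive ones from the 2000s. Shuffling scatters them, which is why a histogram alone cannot reveal a trend over time.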

