ORIE 270 Engineering Probability and Statistics
Lecture 2: Descriptive Statistics

Sidney Resnick
School of Operations Research and Information Engineering
Rhodes Hall, Cornell University
Ithaca NY 14853 USA
http://legacy.orie.cornell.edu/~sid
[email protected]
August 28, 2007

1. Graphical Methods and Data Summarization

Pictorial and graphical methods for data summarization.

Generalities:
• Large amounts of data are hard to interpret.
• Reduction of the full data set to either a picture or to a small set of numerical summaries (mean, standard deviation, ...).
• Often the full data are not recoverable from the pictorial or numerical summarization; exception: stem and leaf plot (maybe).
• Often some features not initially evident pop out immediately with the right kind of pictorial representation.

Examples of techniques:
1. Stem and leaf plots
2. Simple plots: scatter plots; time series plots.
3. Dot plots
4. Histograms
5. Density estimates.

Features we might find from simple plots like the time series plot:
• Is there a trend?
• Is some sort of seasonality present? (Diurnal? Monthly?)
• Is variability changing? (Is weather becoming more erratic or more violent?)

Features possibly revealed by stem & leaf plots, histograms or density estimates:
• What are the most typical or representative values (mean & median)? Are there several values that might be typical (modes)?
• Summarize the spread of the data: range and variability.
• Are there notable gaps in the data?
• Is there symmetry in the data around some typical or representative value?
• Are there outliers: values so atypical as to be possibly measurement error, or so noteworthy as to attract attention?

Notation

Univariate case: outcomes of the experiment or observational study are real numbers.
  $n$ = size of the data set; sample size
  $x_1, x_2, \dots, x_n$ = data values.
Example: observe voter behavior (R=0, D=1); with this coding the record becomes 0,0,1,0,...,0 (say).

Multivariate case: outcomes of the experiment or observational study are vectors of real numbers:
  $n$ = size of the data set; sample size
  $(x_1, y_1), (x_2, y_2), \dots, (x_n, y_n)$ = data values
...
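As a rough illustration of the ideas above, the following Python sketch computes the numerical summaries the slides mention (mean, median, modes, range, standard deviation) for a univariate sample and builds a basic stem and leaf display. The function names `summarize` and `stem_and_leaf`, the sample `data`, and the short `votes` record are all made up for this example and do not come from the lecture. The stem and leaf routine assumes non-negative integers, with the tens as the stem and the units digit as the leaf.

```python
from collections import defaultdict
import statistics


def summarize(xs):
    """Basic numerical summaries of a univariate sample x_1, ..., x_n."""
    return {
        "n": len(xs),                        # sample size
        "mean": statistics.mean(xs),
        "median": statistics.median(xs),
        "modes": statistics.multimode(xs),   # all most-frequent values
        "range": max(xs) - min(xs),
        "stdev": statistics.stdev(xs),       # sample version (n - 1 divisor)
    }


def stem_and_leaf(xs):
    """Rows of a stem and leaf display (assumes non-negative integers)."""
    rows = defaultdict(list)
    for x in sorted(xs):
        rows[x // 10].append(x % 10)         # stem = tens, leaf = units digit
    lo, hi = min(rows), max(rows)
    # Empty stems are kept so that gaps in the data remain visible.
    return [
        f"{stem:>2} | {''.join(str(leaf) for leaf in rows.get(stem, []))}"
        for stem in range(lo, hi + 1)
    ]


# Hypothetical sample, chosen only to show a mode, a gap and a possible outlier.
data = [12, 15, 21, 21, 23, 27, 34, 35, 58]

# Hypothetical voter record with the coding R=0, D=1.
votes = [0, 0, 1, 0, 0]

if __name__ == "__main__":
    for name, value in summarize(data).items():
        print(f"{name}: {value}")
    print("\n".join(stem_and_leaf(data)))
    # With a 0/1 coding, the sample mean is the fraction of records coded 1.
    print("fraction coded D:", statistics.mean(votes))
```

In this made-up sample the empty stem 4 shows a gap, and the lone value 58 sits apart from the rest, which is the kind of feature the display is meant to surface. Because each leaf records a units digit, the original values can be read back from the display, in line with the remark that a stem and leaf plot may be an exception to the loss of information in a summary.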
