ORIE 270 Engineering Probability and Statistics
Lecture 2: Descriptive Statistics

Sidney Resnick
School of Operations Research and Information Engineering
Rhodes Hall, Cornell University
Ithaca NY 14853 USA
http://legacy.orie.cornell.edu/~sid
sir1@cornell.edu

August 28, 2007

Overview; Scatter & Time Series; Freq & Hist; Numerical summaries; . . .

1. Graphical Methods and Data Summarization

Pictorial and graphical methods for data summarization.

Generalities:

• Large amounts of data are hard to interpret.
• The full data set is reduced either to a picture or to a small set of numerical summaries (mean, standard deviation, ...).
• Often the full data are not recoverable from the pictorial or numerical summary; the stem and leaf plot is an exception (maybe).
• Often features that are not initially evident pop out immediately with the right kind of pictorial representation.

Examples of techniques:

1. Stem and leaf plots
2. Simple plots: scatter plots; time series plots
3. Dot plots
4. Histograms
5. Density estimates

Features we might find from simple plots like the time series plot:

• Is there a trend?
• Is some sort of seasonality present? (Diurnal? Monthly?)
• Is variability changing? (Is weather becoming more erratic or more violent?)

Features possibly revealed by stem and leaf plots, histograms or density estimates:

• What are the most typical or representative values (mean and median)? Are there several values that might be typical (modes)?
• Summarize the spread of the data: range and variability.
• Are there notable gaps in the data?
• Is there symmetry in the data around some typical or representative value?
• Are there outliers: values so atypical as to be possibly measurement error, or so noteworthy as to attract attention?

Notation

Univariate case: outcomes of the experiment or observational study are real numbers.

$n$ = size of the data set (sample size)
$x_1, x_2, \ldots, x_n$ = data values

Example: observe voter behavior, coded R = 0 and D = 1; with this coding the record becomes 0, 0, 1, 0, ..., 0 (say).

Multivariate case: outcomes of the experiment or observational study are vectors of real numbers.

$n$ = size of the data set (sample size)
$(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ = data values

...
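The numerical summaries named above (mean, median, modes, range, standard deviation) can be sketched for a univariate sample $x_1, \ldots, x_n$ as follows. This is a minimal illustration using Python's standard `statistics` module, not code from the lecture; the function name `summarize` and both data lists are hypothetical and invented for the example.

```python
from statistics import mean, median, multimode, stdev


def summarize(xs):
    """Return basic numerical summaries of a univariate sample x_1, ..., x_n."""
    if len(xs) < 2:
        raise ValueError("need at least two data values for the sample standard deviation")
    return {
        "n": len(xs),                # sample size
        "mean": mean(xs),            # a typical value
        "median": median(xs),        # another typical value
        "modes": multimode(xs),      # most frequent value(s); may be several
        "range": max(xs) - min(xs),  # spread: largest minus smallest
        "stdev": stdev(xs),          # spread: sample standard deviation (divisor n - 1)
    }


# Hypothetical data, invented for illustration only.
sample = [2, 3, 3, 5, 7, 10]
summary = summarize(sample)

# Hypothetical 0/1 voter record (R = 0, D = 1), also invented.
# With this coding, the mean is the fraction of D voters.
votes = [0, 0, 1, 0, 1, 0, 0, 1]
share_d = mean(votes)
```

Note that such summaries discard information: many different samples share the same mean and range, which is the sense in which the full data are usually not recoverable from a numerical summary.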
