Overview Scatter & Time Series Freq & Hist Numerical summaries . . .

ENGRD 2700 Engineering Probability and Statistics
Lecture 2: Graphical Methods; Data Summarization

David S. Matteson
School of Operations Research and Information Engineering
Rhodes Hall, Cornell University
Ithaca NY 14853 USA
[email protected]
January 21, 2009

1. Graphical Methods and Data Summarization

Pictorial and graphical methods for data summarization.

Generalities:
• Large amounts of data are hard to interpret.
• The full data set is reduced either to a picture or to a small set of numerical summaries (mean, standard deviation, ...).
• Often the full data set cannot be recovered from the pictorial or numerical summary; the stem and leaf plot is an exception.
• Often features that are not initially evident pop out immediately with the right kind of pictorial representation.

Examples of techniques:
1. Stem and leaf plots
2. Simple plots: scatter plots; time series plots
3. Dot plots
4. Histograms
5. Density estimates

Features we hope to find from simple plots like the time series plot:
• Is there a trend?
• Is variability changing?

Features we hope are revealed by stem and leaf plots, histograms, or density estimates:
• What are the most typical or representative values (mean and median)? Are there several values that might be typical (modes)?
• Summarize the spread of the data: range and variability.
• Are there notable gaps in the data?
• Is there symmetry in the data around some typical or representative value?
• Are there outliers, that is, values so atypical as to be possibly measurement error, or so noteworthy as to attract attention?
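To make the stem and leaf idea and the numerical summaries concrete, here is a minimal Python sketch. The data values and the helper name `stem_and_leaf` are invented for illustration and are not from the lecture; the sketch assumes non-negative integers, with the tens part as the stem and the units digit as the leaf.

```python
from collections import defaultdict
from statistics import mean, median, multimode, stdev


def stem_and_leaf(values):
    """Group non-negative integers by stem (tens part) with sorted leaves (units digit)."""
    plot = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        plot[stem].append(leaf)
    return dict(plot)


# Hypothetical data set, invented for this example only.
data = [12, 15, 21, 21, 23, 34, 35, 58]

plot = stem_and_leaf(data)

# Print every stem between the smallest and largest, including empty ones,
# so that gaps in the data stay visible.
for stem in range(min(plot), max(plot) + 1):
    leaves = "".join(str(leaf) for leaf in plot.get(stem, []))
    print(f"{stem} | {leaves}")

# Unlike most summaries, the stem and leaf plot loses no information:
# the sorted data can be rebuilt from the stems and leaves.
recovered = sorted(10 * stem + leaf for stem, leaves in plot.items() for leaf in leaves)

# A small set of numerical summaries of the same data.
summary = {
    "mean": mean(data),
    "median": median(data),
    "modes": multimode(data),
    "range": max(data) - min(data),
    "stdev": stdev(data),  # sample standard deviation (divides by n - 1)
}
print(summary)
```

In this made-up data set the row for stem 4 prints with no leaves, which is the kind of gap the plot is meant to expose, and the lone value 58 sits apart from the rest as a candidate outlier.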
Notation

Univariate case: outcomes of the experiment or observational study are real numbers.
$n$ = size of the data set (sample size)
$x_1, x_2, \ldots, x_n$ = data values
Example: observe voter behavior, coded R = 0 and D = 1; the record becomes 0, 0, 1, 0, ..., 0 (say).

Multivariate case: outcomes of the experiment or observational study are vectors of real numbers.
$n$ = size of the data set (sample size)
$(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ = data values
Example: in an Internet study,
$x_i$ = size of the $i$th file downloaded
$y_i$ = time necessary to download the $i$th file
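The notation above maps directly onto simple data structures. The following Python sketch is only illustrative: the voter record, file sizes, download times, and their units are all made up, and the variable names are assumptions chosen to mirror the symbols n, x_i, and y_i.

```python
# Univariate case: code a hypothetical voter record as R = 0, D = 1.
votes = ["R", "R", "D", "R", "D"]   # invented record, not real data
coding = {"R": 0, "D": 1}
x = [coding[v] for v in votes]      # data values x_1, ..., x_n
n = len(x)                          # sample size

# For 0/1 coded data, the sample mean is the proportion of 1s (here, of D).
share_d = sum(x) / n

# Multivariate (here bivariate) case: each observation is a pair (x_i, y_i).
sizes = [120.0, 45.5, 300.0]        # hypothetical file sizes (units assumed, e.g. KB)
times = [1.9, 0.8, 4.6]             # hypothetical download times (units assumed, e.g. seconds)
pairs = list(zip(sizes, times))     # data values (x_1, y_1), ..., (x_n, y_n)
n_pairs = len(pairs)

print(x, n, share_d)
print(pairs, n_pairs)
```

Keeping each observation as a pair, rather than as two unrelated lists, preserves the link between a file's size and its own download time, which is what a scatter plot of the data relies on.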
This note was uploaded on 03/05/2009 for the course ENGRD 2700 taught by Professor Staff during the Spring '05 term at Cornell.