Lecture_2 - Lecture 2: Data Types: 1. Discrete...


Lecture 2:

Data Types:
1. Discrete Data: whole numbers, can be counted
2. Continuous Data: a real number (usually measured)
3. Categorical Data: non-numerical, usually from some pre-determined categories
4. Binary Data: categorical data with two categories
5. Ordinal Data: has an underlying order
6. Grouped or frequency data: data that has been reduced to the number of observations in particular categories

EXAMPLES

Notations:

Datatypes:

Dataset: A collection of data

Transformations of Data; Monotone Transformations:
• A transformation F is called monotone increasing if the ranks of {x1, x2, x3, ..., xn} are the same as the ranks of {F(x1), F(x2), F(x3), ..., F(xn)}
• If the ranks are reversed, it is called monotone decreasing.
• An affine transformation is one of the form y = Ax + B
• Examples
• Coding: Categorical to Numerical
• Ranking: Ordering the data from smallest to largest
• Example: Data {1, -0.5, 3, 100, -4, 6} -> Ranks {3, 2, 4, 6, 1, 5}
• Log transformation

Statistical Terminology: Revisiting PPDAC

Problem:
• Statements about Populations of individuals
• Individual members of the population = units
• Characteristic of a unit = a variate
• Functions defined on the units = attribute

Aspect of a Problem:
• Descriptive: The answer involves learning about some attribute of the population.
• Causative: involves the existence (or non-existence) of a causal link between variates
(i)
(ii) Changes in the explanatory variates "cause" a change in the response variates
• Predictive: involves predicting the value of a response variate for a given unit
• Examples:
• Correlation ≠ Causation

Response variates:

Explanatory variates

Units
• The target population: the set of units we set out to investigate
• The study population: the set of units which could have been included in the sample
• The sample: the set of units actually selected by the sampling protocol

Errors
[Diagram: Target population -> (study error?) -> Study population -> (sample error?) -> Sample; Analysis; Conclusions (Induction)]
• Suppose the attribute of interest is α(.), a function of the population. Then
Study error: α(P_Study) - α(P_Target)
Sample error: α(S) - α(P_Study)
• Examples:
• Errors are unavoidable

Plans: (usually for Causative Aspects)
• Experimental
• Observational
Examples

Data:
Things to remember:
(i) Inconsistent Observations
(ii) Extreme Observations: Outliers
(iii) Sources of Bias: Bias = Systematic Error
(iv) Missing Observations: 0, 99, *, NA

Analysis
Address the questions of interest using the data
• construct an appropriate model (STAT 230)
• use formal statistical methods (STAT 231)
• prepare appropriate numerical and graphical summaries
Examples:

Conclusion:
Address the questions of interest
• in contextual language using the output from the Analysis step
• discuss possible limitations and uncertainties

Roadmap: A look ahead.
(i) Summarization of Data
(ii) Recap of STAT 230 ...
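The ranking example, the rank-based definition of monotone transformations, and the study/sample error formulas from these notes can be checked with a short Python sketch. Everything beyond the data set {1, -0.5, 3, 100, -4, 6} is an assumption made for illustration: the helper names, the coefficients chosen for the affine maps, the use of the mean as the attribute α, and the small made-up populations. The ranking helper also assumes there are no tied values.

```python
import math
from statistics import mean


def ranks(values):
    """Rank each value from smallest (1) to largest (n); assumes no ties."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


def is_monotone_increasing(f, values):
    """True if f leaves the ranks of the given values unchanged."""
    return ranks([f(v) for v in values]) == ranks(values)


def is_monotone_decreasing(f, values):
    """True if f exactly reverses the ranks of the given values."""
    n = len(values)
    reversed_ranks = [n + 1 - r for r in ranks(values)]
    return ranks([f(v) for v in values]) == reversed_ranks


# Data set from the notes.
data = [1, -0.5, 3, 100, -4, 6]
print(ranks(data))  # [3, 2, 4, 6, 1, 5]

# Affine maps y = Ax + B; the coefficients are hypothetical.
affine = lambda x: 2 * x + 5      # A > 0 keeps the ranks
flipped = lambda x: -3 * x + 1    # A < 0 reverses the ranks
print(is_monotone_increasing(affine, data))            # True
print(is_monotone_decreasing(flipped, data))           # True
print(is_monotone_increasing(lambda x: x ** 2, data))  # False

# The log transformation needs positive data (hypothetical values).
positive_data = [1, 10, 100, 1000]
print(is_monotone_increasing(math.log, positive_data))  # True

# Study error and sample error, taking the attribute alpha to be the mean.
# The three sets of units below are made up for illustration only.
target_population = [2, 4, 6, 8, 10, 12]
study_population = [4, 6, 8, 10, 12]
sample = [4, 6, 8]

study_error = mean(study_population) - mean(target_population)
sample_error = mean(sample) - mean(study_population)
print(study_error, sample_error)  # 1 -2
```

In practice the target and study populations are not fully observed, so these two errors usually cannot be computed directly; the sketch only shows what the definitions measure.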

This note was uploaded on 01/27/2011 for the course STAT 231 taught by Professor Cantremember during the Winter '08 term at Waterloo.
