Lecture 2:

Data Types:
1. Discrete Data: whole numbers; can be counted
2. Continuous Data: a real number (usually measured)
3. Categorical Data: non-numerical, usually from some pre-determined categories
4. Binary Data: categorical data with two categories
5. Ordinal Data: data that has an underlying order
6. Grouped or Frequency Data: data that has been reduced to the number of observations in particular categories

EXAMPLES

Notations:
Datatypes:
Dataset: A collection of data

Transformations of Data:

Monotone Transformations:
• A transformation F is called monotone increasing if the ranks of {x1, x2, x3, ..., xn} are the same as the ranks of {F(x1), F(x2), F(x3), ..., F(xn)}.
• If the ranks are reversed, it is called monotone decreasing.
• An affine transformation is one of the form y = Ax + B.
• Examples
• Coding: Categorical to Numerical
• Ranking: Ordering the data from smallest to largest
• Example: Data {1, -0.5, 3, 100, -4, 6} → {3, 2, 4, 6, 1, 5}
• Log transformation

Statistical Terminology: Revisiting PPDAC

Problem:
• Statements about populations of individuals
• Individual members of the population = units
• Characteristic of a unit = a variate
• Function defined on the units = an attribute

Aspect of a Problem:
• Descriptive: the answer involves learning about some attribute of the population.
• Causative: involves the existence (or non-existence) of a causal link between variates
  (i)
  (ii) Changes in the explanatory variates “cause” a change in the response variates
• Predictive: involves predicting the value of a response variate for a given unit
• Examples:
• Correlation ≠ Causation

Response variates:
Explanatory variates:

Units
• The target population: the set of units we set out to investigate
• The study population: the set of units that could have been included in the sample
• The sample: the set of units actually selected by the sampling protocol

Errors
[Diagram: Target population, Study population, Sample, Analysis, Conclusions (Induction); "study error?" marked between the target and study populations, "sample error?" between the study population and the sample]
• Suppose the attribute of interest is α(.), a function of the population. Then
  Study error: α(P_Study) - α(P_Target)
  Sample error: α(S) - α(P_Study)
• Examples:
• Errors are unavoidable

Plans: (usually for Causative Aspects)
• Experimental
• Observational
Examples

Data:
Things to remember:
(i) Inconsistent Observations
(ii) Extreme Observations: Outliers
(iii) Sources of Bias: Bias = Systematic Error
(iv) Missing Observations: 0, 99, *, NA

Analysis
Address the questions of interest using the data
• construct an appropriate model (STAT 230)
• use formal statistical methods (STAT 231)
• prepare appropriate numerical and graphical summaries
Examples:

Conclusion:
Address the questions of interest
• in contextual language using the output from the Analysis step
• discuss possible limitations and uncertainties

Roadmap: A look ahead.
(i) Summarization of Data
(ii) Recap of STAT 230 ...
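
Worked sketch for the Monotone Transformations notes: the definition of monotone increasing and decreasing in terms of ranks can be checked numerically. The Python sketch below is an illustration only, not part of the lecture. It assumes the ranking example's data set is {1, -0.5, 3, 100, -4, 6}, that is, with minus signs on 0.5 and 4 that the text extraction appears to have dropped. The helper name `ranks`, the affine coefficients, and the shift applied before the log are all hypothetical choices made for this example.

```python
import math


def ranks(values):
    """Return the rank of each value (1 = smallest), in input order.

    Assumes there are no ties in `values`.
    """
    # Indices of the data, ordered from smallest value to largest.
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


# Assumed reading of the ranking example (minus signs restored).
data = [1, -0.5, 3, 100, -4, 6]
data_ranks = ranks(data)

# Affine transformation y = Ax + B with A > 0: ranks are unchanged,
# so the transformation is monotone increasing.
affine_up = [2 * x + 5 for x in data]

# Affine transformation with A < 0: ranks are reversed,
# so the transformation is monotone decreasing.
affine_down = [-2 * x + 5 for x in data]
reversed_ranks = [len(data) + 1 - r for r in data_ranks]

# The log is only defined for positive numbers, so shift the data first
# (the shift is itself affine with A = 1, hence monotone increasing).
shifted = [x + 5 for x in data]
logged = [math.log(x) for x in shifted]

print(data_ranks)
print(ranks(affine_up) == data_ranks)
print(ranks(affine_down) == reversed_ranks)
print(ranks(logged) == data_ranks)
```

Under these assumptions, the affine map with positive A and the log of the shifted data leave the ranks as they were, while the affine map with negative A reverses them.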
 Winter '08