STA 4107/5107 Chapter 1

1 Introduction

1.1 Key Terms

This section will not be covered in class, but you are expected to know it thoroughly. Please ask questions in class or come to office hours if you need help.

2 What is Multivariate Analysis?

This is the information age. What has become a truism in popular culture poses real difficulties for the researcher. Data have become so numerous and so complex that even the best human minds often have difficulty processing the results of a single experiment or study. When a large number of variables are measured, it is tempting to perform a large number of pairwise comparisons to make sense of the data, or to ignore the experimental design and treat the data as if they came from several experiments rather than one. However, we then end up with so many comparisons and so many different analyses that we have merely delayed, rather than avoided, the task of understanding the complexity in our data.

Multivariate analysis is a set of analytical tools for summarizing and simplifying data in which several variables have been measured on each observational unit, so that we can detect patterns and significant relationships and thereby comprehend the results of our studies and experiments.

3 Basic Concepts of Multivariate Analysis

3.1 The Variate

A key concept in multivariate analysis is taking linear combinations of the data that simplify or highlight important patterns. These linear combinations are referred to by your text as variates. They are also often referred to as components or factors in the literature. A linear combination of p variables can be represented by

$$V = w_1 X_1 + w_2 X_2 + \cdots + w_p X_p \tag{1}$$

The weights, $w_i$, are chosen depending on the goal of the analysis.
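As a small illustration of equation (1), the sketch below computes one variate score per observational unit. The data matrix `X`, the weights `w`, and the helper name `variate` are all hypothetical, made up for this example; they are not taken from the course text or from any real analysis.

```python
# Hypothetical data: 3 observational units, p = 3 variables measured on each.
X = [
    [2.0, 1.0, 4.0],
    [0.0, 3.0, 1.0],
    [5.0, 2.0, 2.0],
]

# Illustrative weights w_1, ..., w_p (assumed values, chosen by hand).
w = [0.5, -1.0, 0.25]


def variate(row, weights):
    """Return V = w_1*X_1 + ... + w_p*X_p for one observational unit."""
    if len(row) != len(weights):
        raise ValueError("need exactly one weight per variable")
    return sum(wi * xi for wi, xi in zip(weights, row))


# One variate score per observational unit.
scores = [variate(row, w) for row in X]
print(scores)  # [1.0, -2.75, 1.0]
```

Each unit's p measurements collapse to a single number. The weights here are arbitrary; in an actual analysis they are determined by the multivariate method being used.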
For example, in cluster analysis the weights are chosen (algorithmically) so as to maximally differentiate among the different groups; in principal components, the weights are chosen such that the variate ‘explains’ as much of the variation as possible (more later). In any case, the nature of the variate is both determined by, and is the defining characteristic of, the multivariate analysis being used.

3.2 Measurement scales

There are two main types of measurements that we typically encounter: metric and non-metric. However, classifying any given set of measurements as either metric or non-metric is not always straightforward. We will consider several types of variables below.

3.2.1 Nominal or categorical variables

These are considered non-metric variables. For nominal or categorical variables, the ‘measurement’ on the experimental unit consists of identifying the class to which the experimental unit belongs. There is no inherent scale to these observations; that is, we cannot say that one category is, in any mathematical sense, less than or greater than another. For example, ‘sex’ is a categorical variable. In most analyses we will assign a numeric value such as 0 = male, 1 = female, or 1 = dead, 0 = alive, etc. We could use any number to identify the classes.
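The point that the numeric labels of a nominal variable are arbitrary can be sketched as follows. The sample values, both codings, and the helper names `encode` and `partition` are hypothetical, introduced only for this illustration.

```python
# Hypothetical observations of the nominal variable 'sex'.
sex = ["male", "female", "female", "male", "female"]

# Two equally valid codings: the numbers only name the classes.
coding_a = {"male": 0, "female": 1}
coding_b = {"male": 7, "female": 3}  # arbitrary labels, assumed for illustration


def encode(values, coding):
    """Replace each class name with its numeric label."""
    return [coding[v] for v in values]


def partition(codes):
    """Group observation indices by label, ignoring the label values themselves."""
    groups = {}
    for i, c in enumerate(codes):
        groups.setdefault(c, []).append(i)
    return sorted(groups.values())


codes_a = encode(sex, coding_a)
codes_b = encode(sex, coding_b)

# Both codings split the observations into exactly the same classes.
print(partition(codes_a) == partition(codes_b))  # True
```

Because only class membership carries information, the two codings describe the same data, and comparisons such as "7 is greater than 3" say nothing about the categories.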
This note was uploaded on 07/14/2011 for the course STA 4702 taught by Professor Staff during the Spring '08 term at University of Florida.