Factor analysis is a statistical data reduction technique that seeks to explain correlations
among multiple outcomes as the result of one or more underlying explanations, or factors. It
attempts to discover the unobserved factors that influence the co-variation among multiple
observations. These factors represent underlying concepts that cannot be adequately measured by a
single variable. Factor analysis is especially popular in survey research (psychological,
mathematical, and economic), where the responses to each question represent an outcome and where
there appear to be dozens or even hundreds of variables affecting operations. By analyzing the
variables statistically, factor analysis can separate out a few core variables, known as factors.
Because multiple questions often are related, their responses can frequently be traced to a small
number of underlying factors.
Interdependency Technique
Factor analysis is an interdependency technique: it seeks the latent factors that account for the
patterns of collinearity among multiple metric variables. It reduces the number of variables by
combining two or more variables into a single factor. For example, performance at running, ball
throwing, batting, jumping and weight lifting could be combined into a single factor such as
general athletic ability. Usually, in an item-by-people matrix, factors are selected by grouping
related items. In the Q factor analysis technique, the matrix is transposed and factors are created
by grouping related people. For example, liberals, libertarians, conservatives and socialists could
form separate groups.
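The athletic-ability example can be illustrated with a small simulation. The sketch below is not a fitted factor model: it assumes a hypothetical latent "ability" score for each person, generates five event scores from it with made-up noise, and then checks that the events are strongly correlated and that a plain average of the five scores tracks the latent ability closely. The sample size, noise level, seed, and all names are illustrative assumptions.

```python
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


random.seed(0)  # fixed seed so the sketch is repeatable
n_people = 500  # assumed sample size
events = ["running", "ball_throwing", "batting", "jumping", "weight_lifting"]

# Hypothetical latent factor: one unobserved "athletic ability" per person.
ability = [random.gauss(0, 1) for _ in range(n_people)]

# People-by-items matrix: each event score is ability plus event-specific noise.
scores = [[a + random.gauss(0, 0.5) for _ in events] for a in ability]

# Correlate every pair of items (columns of the matrix).
columns = list(zip(*scores))
item_corr = {
    (events[i], events[j]): pearson(columns[i], columns[j])
    for i in range(len(events))
    for j in range(i + 1, len(events))
}
# Theoretical value for every pair is 1 / (1 + 0.5**2) = 0.8.
min_item_corr = min(item_corr.values())

# Simple stand-in for a single factor: the average of the five event scores.
composite = [sum(row) / len(row) for row in scores]
composite_vs_ability = pearson(composite, ability)

print(f"lowest pairwise item correlation: {min_item_corr:.2f}")
print(f"composite vs. latent ability:     {composite_vs_ability:.2f}")
```

Because all five items share one source of variation, a single composite carries most of the information in the five columns, which is the sense in which several variables are "combined into a single factor".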
Why is factor analysis used?
Factor analysis originated in psychometrics and is used in behavioral sciences, social sciences,
marketing, product management, operations research, and other applied sciences that deal with large
quantities of data.
It may be, for example, that variations in three or four observed variables mainly reflect the
variations in a single unobserved variable, or in a reduced number of unobserved variables. Factor
analysis searches for such joint variations in response to unobserved latent variables. The observed
variables are modeled as linear combinations of the potential factors, plus error terms.
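A minimal simulation can make the linear model concrete. The sketch below assumes a single unobserved factor and four observed variables, each generated as a loading times the factor plus independent noise. The loadings, sample size, and seed are hypothetical values chosen for illustration. The noise variance is set to 1 minus the squared loading so that every observed variable has unit variance, in which case the correlation implied by the model between two observed variables is the product of their loadings.

```python
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


random.seed(1)  # fixed seed so the sketch is repeatable
n = 5000  # assumed number of observations
loadings = [0.9, 0.8, 0.7, 0.6]  # hypothetical factor loadings

# One unobserved (latent) factor with mean 0 and variance 1.
factor = [random.gauss(0, 1) for _ in range(n)]

# Each observed variable = loading * factor + independent error term.
observed = []
for loading in loadings:
    noise_sd = (1 - loading ** 2) ** 0.5  # keeps total variance at 1
    observed.append([loading * f + random.gauss(0, noise_sd) for f in factor])

# Compare sample correlations with the correlations implied by the model.
max_gap = 0.0
for i in range(len(loadings)):
    for j in range(i + 1, len(loadings)):
        sample = pearson(observed[i], observed[j])
        implied = loadings[i] * loadings[j]
        max_gap = max(max_gap, abs(sample - implied))
        print(f"x{i + 1}, x{j + 1}: sample={sample:.3f} implied={implied:.3f}")
```

Factor analysis works in the opposite direction from this simulation: it starts from the observed correlations and estimates loadings that would reproduce them.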
Example
The following example is a fictionalized simplification for expository purposes, and should not be taken as being
realistic. Suppose a psychologist proposes a theory that there are two kinds of intelligence, "verbal
intelligence" and "mathematical intelligence", neither of which is directly observed. Evidence for the
theory is sought in the examination scores of 1000 students in each of 10 different academic fields. If
each student is chosen randomly from a large population, then each student's 10 scores are random
variables. The psychologist's theory may say that for each of the 10 academic fields, the score averaged
over the group of all students who share some common pair of values for verbal and mathematical
"intelligences" is some constant times their level of verbal intelligence plus another constant times their
level of mathematical intelligence, i.e., it is a linear combination of those two "factors". The numbers for a