Employed 0.456 0.411 0.199 0.749 -0.125

i. Construct the scree plot and determine the number of components that would be sufficient for the PCA. Give the percentage of variation that would be explained by your chosen number of components. (4)
ii. Write out the equations for the principal components selected above. (3)

(c) Use the PCA biplot below to answer the following questions:
i. In which year was GNP the highest? (1)
ii. Population is mostly correlated with which variable? (1)
iii. Which variable is uncorrelated with armed forces? (1)
iv. Read off the value for population in year 1961. (1)

Question 3 [12 marks]

In an investigation of drug-taking behaviour, data were collected on drug usage rates for 1634 students in the seventh to ninth grades in a province. Each respondent completed a questionnaire about the number of times each of the following substances had ever been used: cigarettes, beer, wine, liquor, cocaine, tranquillisers, drug store medication, heroin, marijuana, hashish, inhalants (glue), hallucinogenics and amphetamines. Responses were recorded on a five-point scale: (1) Never tried, (2) Only once, (3) A few times, (4) Many times, (5) Regularly. After a Factor Analysis, two primary factors were extracted from the correlation structure of the data. The investigator named the factors as follows: Legal drugs, with loadings on cigarettes, beer, wine and liquor; and Illegal drugs, with loadings on cocaine, tranquillisers, drug store medication, heroin, marijuana, hashish, inhalants (glue), hallucinogenics and amphetamines.

(a) Give an example of a manifest variable in the scenario sketched above. (1)
(b) Explain what a latent variable is. (1)
(c) Give four assumptions made under the general factor analysis linear model. Hint: $E(F_i)$, $\mathrm{var}(F_i)$, $\mathrm{cov}(F_i, F_j)$, $E(\varepsilon_i)$, $\mathrm{var}(\varepsilon_i)$, $\mathrm{cov}(\varepsilon_i, \varepsilon_j)$, $\mathrm{cov}(F_i, \varepsilon_j)$ (4)
(d) After performing a singular value decomposition on the correlation matrix of the data, the investigator obtained estimates of the factor loadings which were tricky to interpret. What can be done to make the interpretation easier? Be specific, naming a method that can be used. (2)
(e) Given the factor model below, how will you calculate the communality of $X_i$? (2)
$X_i = \gamma_{i1} F_1 + \gamma_{i2} F_2 + \varepsilon_i$
(f) What does the communality represent? Explain, giving an example. (2)

Question 4 [14 marks]

Consider the following contingency table and associated R output. In the table below, 300 people were

