Week 4. Assignment - Student Survey

As a data science intern with newly learned skills in statistical correlation and R programming, you will analyze the results of a survey recently given to college students. You learn that the research question being investigated is: "Is there a significant relationship between the amount of time spent reading and the time spent watching television?" You are also interested in whether other significant relationships can be discovered. The survey data is located in the StudentSurvey.csv file.

a. Examine the Survey data variables.

1. What measurement is being used for the variables?

2. Explain what effect changing the measurement being used for the variables would have on the covariance calculation.

3. Would this be a problem?

4. Explain and provide a better alternative if needed.

b. Choose the type of correlation test to perform:

1. Explain why you chose this test.

2. Predict whether the test will yield a positive or negative correlation.

c. Perform a correlation analysis of:

1. All variables

2. A single correlation between a pair of the variables

3. Repeat your correlation test in step 2 but set the confidence level to 99%

4. Describe what the calculations in the correlation matrix suggest about the relationship between the variables. Be specific with your explanation.

d. Calculate the correlation coefficient and the coefficient of determination, then describe what you conclude about the results.

e. Based on your analysis, can you say that watching more TV caused students to read less? Explain.

f. Pick three variables and perform a partial correlation, documenting which variable you are "controlling". Explain how this changes your interpretation and explanation of the results.
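
As a rough illustration of what part f asks for, the sketch below computes a first-order partial correlation in base R by correlating regression residuals, then checks it against the textbook formula. The data are simulated and the names x, y, and z are hypothetical placeholders, not columns from StudentSurvey.csv; substitute the three survey variables you choose. Packages such as ppcor provide a ready-made pcor.test function, which may be a more convenient option for the report.

```r
# Simulated stand-in data (an assumption for illustration only;
# replace x, y, z with three columns from the survey data).
set.seed(42)
n <- 50
z <- rnorm(n)               # the variable being controlled for
x <- 0.8 * z + rnorm(n)     # depends partly on z
y <- -0.5 * z + rnorm(n)    # also depends partly on z

# Zero-order (ordinary) Pearson correlation between x and y
zero_order <- cor(x, y)

# Partial correlation of x and y controlling for z:
# remove the linear effect of z from each variable, then correlate what is left.
res_x <- resid(lm(x ~ z))
res_y <- resid(lm(y ~ z))
partial_resid <- cor(res_x, res_y)

# The same quantity from the closed-form first-order formula
r_xy <- cor(x, y)
r_xz <- cor(x, z)
r_yz <- cor(y, z)
partial_formula <- (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2))

cat("Zero-order r:      ", round(zero_order, 3), "\n")
cat("Partial r (resid): ", round(partial_resid, 3), "\n")
cat("Partial r (formula):", round(partial_formula, 3), "\n")
```

Comparing the zero-order and partial values shows how much of the apparent relationship between two variables is shared with the third one.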

Include all of your answers in a R Markdown report. Refer to the example template presented as a guide.
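
For parts a through d, a minimal sketch along the following lines may help when structuring the R code chunks. It uses made-up numbers and hypothetical variable names (reading_hours, tv_minutes), not the actual survey columns or units. It shows that covariance changes when a variable's unit changes while correlation does not, and how to request a 99% confidence level from cor.test.

```r
# Made-up stand-in data (an assumption; the real columns and units
# in StudentSurvey.csv may differ).
set.seed(1)
reading_hours <- runif(30, min = 0, max = 10)
tv_minutes <- 300 - 20 * reading_hours + rnorm(30, sd = 30)

# Covariance depends on the measurement units:
# rescaling minutes to hours divides the covariance by 60.
cov_minutes <- cov(reading_hours, tv_minutes)
cov_hours <- cov(reading_hours, tv_minutes / 60)

# Correlation is standardized, so the same rescaling leaves it unchanged.
r_minutes <- cor(reading_hours, tv_minutes)
r_hours <- cor(reading_hours, tv_minutes / 60)

# Correlation matrix across all variables in a data frame
survey <- data.frame(reading_hours, tv_minutes)
cor_matrix <- cor(survey)

# Pearson correlation test for one pair, with a 99% confidence interval
test_99 <- cor.test(reading_hours, tv_minutes,
                    method = "pearson", conf.level = 0.99)

# Coefficient of determination: the squared correlation coefficient
r_squared <- unname(test_99$estimate)^2

print(cor_matrix)
print(test_99)
cat("R squared:", round(r_squared, 3), "\n")
```

Pearson is used here only as an example; if the survey variables are ordinal or clearly non-normal, method = "spearman" or method = "kendall" may be the more defensible choice.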