# 3.1.5.5 Lab - Correlation Analysis in Python


October 11, 2021

## 1 Lab - Correlation Analysis in Python

#### 1.0.1 Objectives

* Part 1: The Dataset
* Part 2: Scatterplot Graphs and Correlatable Variables
* Part 3: Calculating Correlation with Python
* Part 4: Visualizing

#### 1.0.2 Scenario/Background

Correlation is an important statistical relationship that can indicate whether the variable values are linearly related.

In this lab, you will learn how to use Python to calculate correlation. In Part 1, you will set up the dataset. In Part 2, you will learn how to identify if the variables in a given dataset are correlatable. Finally, in Part 3, you will use Python to calculate the correlation between two sets of variables.

#### 1.0.3 Required Resources

* 1 PC with Internet access
* Raspberry Pi version 2 or higher
* Python libraries: pandas, numpy, matplotlib, seaborn
* Datafiles: brainsize.txt

### 1.1 Part 1: The Dataset

You will use a dataset that contains a sample of 40 right-handed Anglo Introductory Psychology students at a large Southwestern university. Subjects took four subtests (Vocabulary, Similarities, Block Design, and Picture Completion) of the Wechsler (1981) Adult Intelligence Scale-Revised. The researchers used Magnetic Resonance Imaging (MRI) to determine the brain size of the subjects. Information about gender and body size (height and weight) is also included. The researchers withheld the weights of two subjects and the height of one subject for reasons of confidentiality.

Two simple modifications were applied to the dataset:

1. Replace the question marks used to represent the withheld data points described above with the 'NaN' string. The substitution was done because pandas does not treat a question mark as a missing value by default.
2. Replace all tab characters with commas, converting the dataset into a CSV dataset.

The prepared dataset is saved as `brainsize.txt`.

Step 1: Loading the Dataset From a File. Before the dataset can be used, it must be loaded into memory.

In the code below, the first line imports the `pandas` module and defines `pd` as an alias that refers to the module. The second line stores the path to the dataset CSV file in a variable called `brainFile`. The third line uses `read_csv()`, a `pandas` function, to read the CSV file at the path stored in `brainFile` and convert it into a dataframe. The dataframe is then stored in the `brainFrame` variable.

Run the cell below to execute the described functions.

    # Code cell 1
    import pandas as pd
    brainFile = './Data/brainsize.txt'
    brainFrame = pd.read_csv(brainFile)

Step 2: Verifying the dataframe. To make sure the dataframe has been correctly loaded and created, use the `head()` method. Another pandas method, `head()` displays the first five rows of a dataframe.
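As a rough illustration of the verification step, the sketch below calls `head()` on a small stand-in dataframe. Because `brainsize.txt` is not reproduced here, the CSV text is built in memory; the column names (`Gender`, `Weight`, `Height`), the values, and the names `sample_csv`, `sampleFrame`, and `firstRows` are all invented for this example and should not be read as the contents of the real dataset.

```python
import io

import pandas as pd

# Hypothetical stand-in for brainsize.txt: the column names and values below
# are invented for illustration and are NOT taken from the real dataset.
sample_csv = io.StringIO(
    "Gender,Weight,Height\n"
    "Female,118,64.5\n"
    "Male,NaN,72.5\n"
    "Male,143,73.3\n"
    "Female,172,68.8\n"
    "Female,147,65.0\n"
    "Male,146,69.0\n"
)

# read_csv() accepts a file path or a file-like object such as StringIO.
sampleFrame = pd.read_csv(sample_csv)

# head() returns the first five rows by default; pass n for a different count.
firstRows = sampleFrame.head()
print(firstRows)

# The 'NaN' string is parsed as a missing value, so Weight stays numeric.
print(sampleFrame["Weight"].isna().sum())
```

If a file still marked withheld values with question marks, passing `na_values='?'` to `pd.read_csv()` would be one way to have them parsed as missing values without editing the file first.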

