# Data Wrangling Report .docx - Data Wrangling Report 1)A...

• 5

This preview shows page 1 - 3 out of 5 pages.

Data Wrangling Report: 1)A) Dataset Variable Level of Measurement PersonalIcome State Nominal PersonalIcome 2019_PersonalIncome Ratio TheftData Year Interval TheftData STATE_NAME Nominal TheftData OFFENSE_NAME Nominal TheftData LOCATION_NAME Nominal TheftData PROP_DESC_NAME Nominal TheftData STOLEN_VALUE Ratio TheftData RECOVERED_VALUE Ratio Level of Measurements of Variables Present in Both Datasets Levels of Measures are based on the variable type and scale: Nominal scales contain the least amount of information. In nominal scales, the numbers assigned to each variable or observation are only used to classify the variable or observation. For example, States is a qualitative variable that is based on categories. Similarly, the location or offense names are just different categories of this variable. Ordinal scales present more information than nominal scales and are, therefore, a higher level of measurement. In ordinal scales, there is an ordered relationship between the variable’s observations. Here, there is no present variable that reflects this measurement. Interval scales present more information than ordinal scales, in that they provide assurance that the differences between values are equal. In other words, interval scales are ordinal scales but with equivalent scale values from low to high interval. The evident example here is the Year because the difference between year 2020 and 2021 is the same as that between 2019 and 2020. Also, there is no 0-starting point. Ratio scales are the most informative scales. Ratio scales provide rankings, assure equal differences between scale values, and have a true zero point. In essence, a ratio scale can be thought of as nominal, ordinal, and interval scales combined as one. This is why I considered stolen values, recovered values, and personal income as evident examples. Note: although year is time, but time isn’t one of the levels of measurement, thus time in days or years, are on interval scales. 1)B)
1. Considering the problem at hand: “In 2019 , did personal income influence the value of the thefts across all US states ?”, start by identifying the keywords that would help you come up with the required variables.