# Assignment#2_RT_FQ2021.doc - DSC 441 Fall 2020-2021...

DSC 441: Fall 2020-2021, Assignment #2

Due Date: Thursday, October 15th, 2020, by midnight

Total number of points: 35 points

Problem 1 (10 points): This problem is an example of the data preprocessing needed in a data mining process. Suppose that a hospital tested the age and body fat data for 18 randomly selected adults with the following results:

| Age  | 26   | 26   | 29  | 29   | 40   | 45   | 50   | 55   | 60   |
|------|------|------|-----|------|------|------|------|------|------|
| %fat | 10.5 | 30.5 | 8.8 | 20.8 | 32.4 | 26.9 | 30.4 | 30.2 | 33.2 |

| Age  | 55   | 45   | 60   | 55   | 61   | 62   | 63   | 75   | 66   |
|------|------|------|------|------|------|------|------|------|------|
| %fat | 36.6 | 44.5 | 30.8 | 35.4 | 33.2 | 36.1 | 37.9 | 43.2 | 37.7 |

a. (2 points) Draw the box plots for age and %fat. Interpret the distribution of the data.

b. (2 points) Normalize the two attributes using z-score normalization.

c. (2 points) Regardless of the original ranges of the variables, normalization techniques transform the data into new ranges so that variables can be compared and used on the same scale. What are the value ranges of the following normalization methods (for this data set and in general)? Explain and back up your answer.

   i. Min-max normalization

   ii. Z-score normalization

   iii. Normalization by decimal scaling

d.
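As a reference for the three normalization methods named in parts b and c of Problem 1, here is a minimal Python sketch applied to the age and %fat values from the table. It is an illustration under stated assumptions, not the assignment's expected solution: the function names `z_score`, `min_max`, and `decimal_scaling` are hypothetical, the z-score uses the population standard deviation (swap in `statistics.stdev` if the sample version is intended), and min-max targets the interval [0, 1] by default.

```python
import statistics

# Data copied from the Problem 1 table (18 adults).
age = [26, 26, 29, 29, 40, 45, 50, 55, 60, 55, 45, 60, 55, 61, 62, 63, 75, 66]
fat = [10.5, 30.5, 8.8, 20.8, 32.4, 26.9, 30.4, 30.2, 33.2,
       36.6, 44.5, 30.8, 35.4, 33.2, 36.1, 37.9, 43.2, 37.7]


def z_score(values):
    """v' = (v - mean) / std. Assumption: population standard deviation."""
    mu = statistics.mean(values)
    sigma = statistics.pstdev(values)
    return [(v - mu) / sigma for v in values]


def min_max(values, new_min=0.0, new_max=1.0):
    """v' = (v - min) / (max - min) * (new_max - new_min) + new_min."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) * (new_max - new_min) + new_min for v in values]


def decimal_scaling(values):
    """v' = v / 10**j, with j the smallest integer such that max(|v'|) < 1."""
    max_abs = max(abs(v) for v in values)
    j = 0
    while max_abs / 10 ** j >= 1:
        j += 1
    return [v / 10 ** j for v in values]


# Print the observed range of each normalized attribute.
for name, data in (("age", age), ("%fat", fat)):
    for method in (z_score, min_max, decimal_scaling):
        out = method(data)
        print(f"{name:5s} {method.__name__:16s} min={min(out):.3f} max={max(out):.3f}")
```

In general, min-max normalization maps values exactly onto the chosen target interval, decimal scaling maps them into the open interval (-1, 1), and z-score normalization has no fixed bounds: its output has mean 0 and standard deviation 1, with the range depending on the data.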