Assignment#2 - Nina Kazarov.docx - DSC 441 Fall 2018-2019...

DSC 441: Fall 2018-2019 Assignment 2, Page 1 of 7

Assignment 2 – Nina Kazarov

Due Date: Saturday, October 6th, 2018, by midnight
Total number of points: 35 points

Problem 1 (10 points): This problem is an example of data preprocessing needed in a data mining process. Suppose that a hospital tested the age and body fat data for 18 randomly selected adults with the following results:

Age    26    26    29    29    40    45    50    55    60
%fat   10.5  30.5  8.8   20.8  32.4  26.9  30.4  30.2  33.2

Age    55    45    60    55    61    62    63    75    66
%fat   36.6  44.5  30.8  35.4  33.2  36.1  37.9  43.2  37.7

a. (2 points) Draw the box-plots for age and %fat. Interpret the distribution of the data.
Based on the descriptive statistics and boxplot for the Age variable, we can conclude that Age is skewed to the left.
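The numbers behind a box plot can be checked with a short Python sketch. This is only an illustration, not the method used in the original answer: the helper name `five_number_summary` is hypothetical, and the quartile convention (`method="inclusive"`) is an assumption, since statistical packages differ on how quartiles are interpolated. A mean that falls below the median is one common indication of left skew.

```python
import statistics

# Data from the problem statement (18 adults).
age = [26, 26, 29, 29, 40, 45, 50, 55, 60,
       55, 45, 60, 55, 61, 62, 63, 75, 66]
fat = [10.5, 30.5, 8.8, 20.8, 32.4, 26.9, 30.4, 30.2, 33.2,
       36.6, 44.5, 30.8, 35.4, 33.2, 36.1, 37.9, 43.2, 37.7]


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max), the values a box plot displays.

    Assumption: quartiles use the 'inclusive' interpolation method;
    other tools may report slightly different Q1 and Q3.
    """
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, q2, q3, max(values)


for name, values in (("Age", age), ("%fat", fat)):
    summary = five_number_summary(values)
    mean = statistics.mean(values)
    print(name, "five-number summary:", summary, "mean:", round(mean, 2))
    # A mean below the median hints at a longer left tail.
    print(name, "mean < median:", mean < summary[2])
```

The same lists can be passed to a plotting library to draw the actual box plots; that step is omitted here to keep the sketch dependency-free.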
b. (2 points) Normalize the two attributes based on z-score normalization.
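As a rough sketch of z-score normalization, each value is replaced by (value - mean) / standard deviation. The function name `z_scores` and its `sample` flag are hypothetical, introduced only for this illustration. Whether the course expects the sample or the population standard deviation is not stated in the assignment, so the sketch supports both and defaults to the sample version as an assumption.

```python
import statistics

# Data from the problem statement (18 adults).
age = [26, 26, 29, 29, 40, 45, 50, 55, 60,
       55, 45, 60, 55, 61, 62, 63, 75, 66]
fat = [10.5, 30.5, 8.8, 20.8, 32.4, 26.9, 30.4, 30.2, 33.2,
       36.6, 44.5, 30.8, 35.4, 33.2, 36.1, 37.9, 43.2, 37.7]


def z_scores(values, sample=True):
    """Return z-score normalized values: (v - mean) / standard deviation.

    Assumption: sample=True uses the sample standard deviation (n - 1);
    sample=False uses the population standard deviation (n).
    """
    mean = statistics.mean(values)
    sd = statistics.stdev(values) if sample else statistics.pstdev(values)
    return [(v - mean) / sd for v in values]


age_z = z_scores(age)
fat_z = z_scores(fat)

# Print the original and normalized values side by side.
for a, az, f, fz in zip(age, age_z, fat, fat_z):
    print(f"{a:>3}  {az:>7.3f}  {f:>5.1f}  {fz:>7.3f}")
```

After normalization each attribute has mean 0 and standard deviation 1 (under whichever convention was chosen), so the two columns can be compared on the same scale.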
c. (2 points) Regardless of the original ranges of the variables, normalization techniques transform the data into new ranges that allow variables to be compared and used on the same scale. What are the value ranges of the following normalization methods? Explain your answer.