Assignment#2_RT_SQ1920.pdf - DSC 441 Spring 2019-2020...


DSC 441: Spring 2019-2020
Assignment #2, Page 1 of 2

Assignment #2
Due Date: Monday, May 4th, 2020, by midnight
Total number of points: 35 points

Problem 1 (10 points): This problem is an example of the data preprocessing needed in a data mining process. Suppose that a hospital tested the age and body fat data for 18 randomly selected adults with the following results:

Age   26 26 29 29 40 45 50
%fat  10.5 30.5 8.8 20.8 32.4 26.9 30.4 30.2
Age   55 45 60 55 61 62 63
%fat  36.6 44.5 30.8 35.4 33.2 36.1 37.9 43.2

a. (2 points) Draw the box plots for age and %fat. Interpret the distribution of the data.
b. (2 points) Normalize the two attributes based on z-score normalization.
c. (2 points) Regardless of the original ranges of the variables, normalization techniques transform the data into new ranges so that variables can be compared and used on the same scale. What are the value ranges of the following normalization methods (for this data set and in general)? Explain your answer.
   i. Min-max normalization
   ii. Z-score normalization
   iii. Normalization by decimal scaling
d. 55 60 33.2 75 66 37.7
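The three normalization methods named in part (c) can be sketched in a few lines of Python. This is a minimal illustration, not the assignment's expected solution: the function names and the `sample` list are hypothetical and are not taken from the hospital data above, and the z-score version assumes the population standard deviation (the sample standard deviation is an equally common choice).

```python
import statistics


def min_max(values, new_min=0.0, new_max=1.0):
    """Rescale values linearly so the smallest maps to new_min and the largest to new_max."""
    lo, hi = min(values), max(values)
    # Assumes hi != lo; a constant column would cause a division by zero.
    return [(v - lo) / (hi - lo) * (new_max - new_min) + new_min for v in values]


def z_score(values):
    """Center values on the mean and scale by the standard deviation."""
    mean = statistics.mean(values)
    # Assumption: population standard deviation; use statistics.stdev for the sample version.
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


def decimal_scaling(values):
    """Divide by the smallest power of 10 that brings every absolute value below 1."""
    max_abs = max(abs(v) for v in values)
    j = 0
    while max_abs / 10 ** j >= 1:
        j += 1
    return [v / 10 ** j for v in values]


# Hypothetical sample, used only to show the resulting ranges.
sample = [20, 30, 40, 50, 60]

print(min_max(sample))          # always within [new_min, new_max]
print(z_score(sample))          # unbounded in general; mean 0, standard deviation 1
print(decimal_scaling(sample))  # always strictly inside (-1, 1)
```

Min-max output is bounded by the target interval by construction, decimal scaling is bounded by (-1, 1), and z-scores have no fixed bounds because they depend on how far the extremes lie from the mean.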
