Question

# I need help with these two questions, please

This is from my data mining class.

4. Suppose that a hospital tested the age and body fat data for 18 randomly selected adults with the following results:

| age | 23 | 23 | 27 | 27 | 39 | 41 | 47 | 49 | 50 | 52 | 54 | 54 | 56 | 57 | 58 | 58 | 60 | 61 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| fat | 9.5 | 26.5 | 7.8 | 17.8 | 31.4 | 25.9 | 27.4 | 27.2 | 31.2 | 34.6 | 42.5 | 28.8 | 33.4 | 30.2 | 34.1 | 32.9 | 41.2 | 35.7 |

(a) Calculate the five-number summary for the age attribute, and use this to draw a boxplot for age. (2)

(b) List a possible source of noise or errors in this dataset. (1)

(c) Suppose that the last two values for the body fat attribute were missing. Choose a data cleaning method for handling the missing data problem in this scenario. Justify your choice. (1)

5. Suppose that a data warehouse consists of the four dimensions date, spectator, location, and game, with the data cube measure charge, where charge is the fare that a spectator pays when watching a baseball game on a given date. Spectators may be students, adults, or seniors, with each category having its own charge rate.

(a) Design concept hierarchies for each of these dimensions that could be used to encode the levels of aggregation needed for typical analyses of this collection of data. (2)

(b) How many cuboids are there in the data cube with the concept hierarchies that you’ve specified (including the base and apex cuboids)? (2)

(c) Starting with the base cuboid [date, spectator, location, game], what specific OLAP operations should you perform in order to list the total charge paid by student spectators at Camden Yards in 2015? (2)
Total: 10
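For question 4(a), a minimal Python sketch of the five-number summary may help as a starting point. It assumes the median-of-halves convention for the quartiles (other conventions, such as linear interpolation, give slightly different Q1 and Q3), and it reads the sixteenth age value as 58. The function name `five_number_summary` is only illustrative.

```python
from statistics import median

# Ages copied from question 4; the sixteenth value is assumed to be 58.
ages = [23, 23, 27, 27, 39, 41, 47, 49, 50,
        52, 54, 54, 56, 57, 58, 58, 60, 61]

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]            # values below the median position
    upper = data[half + n % 2:]    # values above it (middle value skipped if n is odd)
    return data[0], median(lower), median(data), median(upper), data[-1]

summary = five_number_summary(ages)
print(summary)  # (23, 39, 51.0, 57, 61)
```

Under this convention the boxplot would have its box from 39 to 57 with the median line at 51. The interquartile range is 18, so no age lies more than 1.5 × IQR beyond the quartiles, and the whiskers would extend to the minimum (23) and maximum (61).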
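For question 5(b), the number of cuboids depends entirely on the hierarchies chosen in 5(a). The usual counting rule is the product over all dimensions of (number of levels + 1), where the extra 1 accounts for the top level "all". The sketch below applies that rule to one hypothetical set of hierarchies; the level names are assumptions made for illustration, not the expected answer.

```python
from math import prod

# Hypothetical concept hierarchies, lowest level first.
# The virtual top level "all" is implicit and is not listed.
hierarchies = {
    "date": ["day", "month", "quarter", "year"],
    "spectator": ["spectator_id", "category"],   # category: student, adult, senior
    "location": ["stadium", "city", "state", "country"],
    "game": ["game_id", "league"],
}

def count_cuboids(hierarchies):
    """Each dimension contributes its listed levels plus the implicit 'all'."""
    return prod(len(levels) + 1 for levels in hierarchies.values())

total = count_cuboids(hierarchies)
print(total)  # 5 * 3 * 5 * 3 = 225
```

With a different design, for example four levels below "all" in every dimension, the same rule would give 5 × 5 × 5 × 5 = 625 cuboids.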
