CS404_Exam_1_100102_bob

CSc 404 – Data Mining
Exam #1 October 1, 2002

Name: __________________________
Score: _________________/100

Directions: Carefully answer each of the following questions. This is an open-book, open-note exam. You may use calculators. You are NOT to get help from others. Points will be assigned on answer quality as well as answer correctness. CLEARLY show all work.

1. Briefly characterize the difference between descriptive and predictive data mining. [5 pts.]

Descriptive mining tasks characterize the general properties of the data in the database. Predictive mining tasks perform inference on the current data in order to make predictions.

2. Consider an attribute, A1, which has the following set of values: {tall, medium, short}. [12 pts.]

a.) Convert this set of values to a set of ordinal values.

Aren’t these already ordinal? I wonder if this was an oversight that was found during the exam, but then not updated on the archive exam.

b.) The above conversion causes additional properties to be added to the set of attributes. One such property is that you can now determine the distance between two attributes. Name another property.

(based on above)
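For question 2, one way to make the ordering explicit is to map each label to an integer rank. The sketch below assumes the natural order short < medium < tall; the names `RANK` and `to_ordinal` are illustrative and do not come from the exam. Once encoded, values can be compared with < and >, and a rank difference gives a rough distance, although equal spacing between ranks is an assumption the labels themselves do not guarantee.

```python
# Hypothetical rank assignment; assumes the order short < medium < tall.
RANK = {"short": 1, "medium": 2, "tall": 3}


def to_ordinal(values):
    """Map each label of attribute A1 to its integer rank."""
    return [RANK[v] for v in values]


encoded = to_ordinal(["tall", "medium", "short"])
print(encoded)                            # [3, 2, 1]
print(RANK["short"] < RANK["tall"])       # True: ranks can be compared
print(abs(RANK["tall"] - RANK["short"]))  # 2: rank difference as a rough distance
```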

3. Consider the data cube whose measures are the number of available automobiles and their cost. Dimensions of the data cube are: time, location, and car-type. The time dimension contains the lattice structure for day, month, week, quarter, and year discussed in class (Fig. 2.8b). Location contains the city and state hierarchy. [15 pts.]

a. Compute the number of possible cuboids.

2^(number of dimensions) = 2^3 = 8 cuboids (not sure how to break it down by sub-categories). Emailed Merz about this.

b. Using the OLAP operations slice, roll-up, drill-down, and dice, construct a query to provide the number of cars available by state during the second quarter of the year. Organize the operations to minimize the number of addition operations that have to be performed.
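For question 3(a), the 2^n count covers only cubes whose dimensions have no concept hierarchies. When dimension i has L_i levels, the usual count is the product of (L_i + 1) over all dimensions, where the extra one is the virtual top level "all". The sketch below applies that rule under assumed level counts: five for time (day, week, month, quarter, year), two for location (city, state), and one for car-type, which the question does not spell out. The function name `count_cuboids` is illustrative.

```python
from math import prod


def count_cuboids(levels_per_dimension):
    """Product of (L_i + 1) over dimensions; the +1 is the 'all' level."""
    return prod(levels + 1 for levels in levels_per_dimension.values())


# Assumed level counts; a single level for car-type is a guess.
levels = {"time": 5, "location": 2, "car-type": 1}

with_hierarchies = count_cuboids(levels)
no_hierarchies = count_cuboids({dim: 1 for dim in levels})

print(with_hierarchies)  # 36 = (5 + 1) * (2 + 1) * (1 + 1)
print(no_hierarchies)    # 8, the 2^3 case with one level per dimension
```

With one level per dimension the rule reduces to 2^3 = 8, which matches the count written in the answer above.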
