In the following, assume the latency/transfer-rate model of disk performance, where we estimate disk access times by allowing blocks that are consecutive on disk to be fetched with a single seek time and rotational latency cost (as shown in class). Also, we use the term RID (Record ID) to refer to an 8-byte "logical pointer" that can be used to locate a record (tuple) in a table. You are given the following very simple database schema that models a movie theatre chain: theatres, movie screenings, and customers buying movie tickets. We store the city where each customer lives, along with information about theatres, movie screenings, and tickets. The details of the schema are shown below:
Customer (cid, cname, ccity)
Theatre (tid, tname, tcity)
MovieScreening (mid, tid, mname, mduration, mtype)
MovieTicket (cid, mid, mtimestamp, price, qty)
Assume there are 100 million customers, 0.5 million movies, and 5 billion movie tickets purchased over a period of 100 days. Each "movie ticket" tuple is 40 bytes in size, and all other tuples are 100 bytes. Assume that 1% of customers live in San Francisco and 0.1% of the movies are Horror. Otherwise, assume that data is evenly and independently distributed; e.g., 50 million movie tickets were purchased on June 26, 2017 in the whole USA. Consider the following queries:
1. select cid, cname from customer where ccity = 'San Francisco'
2. select c.cid, c.cname from customer c, movieticket m where c.cid = m.cid and c.ccity = 'San Francisco' and m.mtimestamp = '2017-11-10'
3. select m.mid, m.mname from moviescreening m, movieticket mt where m.mid = mt.mid and m.mtype = 'Horror' and mt.mtimestamp = '2017-10-11'
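The table sizes and selectivities implied by the assumptions above are worth working out before comparing evaluation plans. The following is a minimal sketch of that arithmetic, not part of the original problem. It assumes decimal units (1 GB = 10^9 bytes) and ignores page headers and fill factors; the variable names are illustrative only.

```python
# Cardinalities and tuple sizes taken from the problem statement.
NUM_CUSTOMERS = 100_000_000
NUM_MOVIES = 500_000
NUM_TICKETS = 5_000_000_000
NUM_DAYS = 100
TICKET_TUPLE_BYTES = 40
OTHER_TUPLE_BYTES = 100

# Assumption: decimal gigabytes, no per-page overhead.
GB = 10**9

# Raw table sizes in bytes.
customer_bytes = NUM_CUSTOMERS * OTHER_TUPLE_BYTES
movie_bytes = NUM_MOVIES * OTHER_TUPLE_BYTES
ticket_bytes = NUM_TICKETS * TICKET_TUPLE_BYTES

# Selectivities: 1% of customers in San Francisco, 0.1% of movies are Horror,
# tickets spread evenly over the 100 days.
sf_customers = NUM_CUSTOMERS // 100
horror_movies = NUM_MOVIES // 1000
tickets_per_day = NUM_TICKETS // NUM_DAYS

# Under the independence assumption, ticket tuples on one day that also
# match the customer-side or movie-side filter.
sf_tickets_on_day = tickets_per_day // 100       # query 2 join matches
horror_tickets_on_day = tickets_per_day // 1000  # query 3 join matches

print(f"Customer table:    {customer_bytes / GB:.2f} GB")
print(f"Movie table:       {movie_bytes / GB:.2f} GB")
print(f"MovieTicket table: {ticket_bytes / GB:.2f} GB")
print(f"San Francisco customers: {sf_customers:,}")
print(f"Horror movies:           {horror_movies:,}")
print(f"Tickets per day:         {tickets_per_day:,}")
print(f"SF tickets on one day:     {sf_tickets_on_day:,}")
print(f"Horror tickets on one day: {horror_tickets_on_day:,}")
```

Note that queries 2 and 3 do not use `distinct`, so each matching ticket tuple contributes one output row.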
(c) Suppose that for each query, you could create up to two index structures to make the query faster. What index structures would you create, and how would this change the evaluation plans and running times?
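As a starting point for part (c), it can help to estimate how many RIDs an index lookup on each selection attribute would return. The sketch below is only an illustration under stated assumptions: it considers hypothetical single-attribute indexes on ccity, mtype, and mtimestamp, counts only the 8-byte RIDs (not keys or internal index nodes), and does not compute running times, since the seek and transfer parameters are not given here.

```python
RID_BYTES = 8  # RID size from the problem statement

# Matching tuple counts derived from the problem's assumptions.
sf_customers = 100_000_000 // 100      # 1% of 100 million customers
horror_movies = 500_000 // 1000        # 0.1% of 0.5 million movies
tickets_per_day = 5_000_000_000 // 100 # 5 billion tickets over 100 days


def rid_list_bytes(num_matches: int) -> int:
    """Bytes needed to hold one RID per matching tuple."""
    return num_matches * RID_BYTES


# Hypothetical indexes, one per selection predicate in queries 1 to 3.
candidates = {
    "Customer(ccity) = 'San Francisco'": sf_customers,
    "MovieScreening(mtype) = 'Horror'": horror_movies,
    "MovieTicket(mtimestamp) = one day": tickets_per_day,
}

for predicate, matches in candidates.items():
    size_mb = rid_list_bytes(matches) / 10**6  # decimal megabytes
    print(f"{predicate}: {matches:,} RIDs, about {size_mb:.3f} MB")
```

Whether following these RIDs beats a sequential scan depends on whether the index is clustered: with an unclustered index, each RID may cost a separate seek and rotational latency under the model stated at the top.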