chapter 4 - part 1 - COP 4710: Database Systems Fall 2007...

Info iconThis preview shows page 1. Sign up to view the full content.

View Full Document Right Arrow Icon
This is the end of the preview. Sign up to access the rest of the document.

Unformatted text preview: COP 4710: Database Systems Fall 2007 Chapter 4 – Relational Query Languages – Part 1 Instructor : Dr. Mark Llewellyn markl@cs.ucf.edu HEC 236, 823-2790 http://www.cs.ucf.edu/courses/cop4710/fall2007 School of Electrical Engineering and Computer Science University of Central Florida COP 4710: Database Systems (Chapter 4) Page 1 © Mark Llewellyn Mapping E-R Diagrams to Relational Schemas PRACTICE problem from previous notes customer-id customer cust-name has account-id card account card-type = d ‘D’ debit card ‘C’ credit card has charges amount bank-num current-bal date merchant card-type exp-date merchant-id address address COP 4710: Database Systems (Chapter 4) Page 2 © Mark Llewellyn Mapping E-R Diagrams to Relational Schemas SOLUTION to practice problem customer customer-id card account account-id expr-date acct-type customer-id customer-name customer-address debit card d-acct-id bank-num credit card c-acct-id current-balance charges c-acct-id merchant-id date amount merchants merchant-id merchant-address COP 4710: Database Systems (Chapter 4) Page 3 © Mark Llewellyn Query Languages • – A query language is a language in which a database user requests information from the database. Most query languages are on a higher-level than standard programming languages like C and Java. Query languages fall into a category of languages known as 4GL. • Query languages can be broadly categorized into two groups: procedural languages and nonprocedural languages. – A procedural query language requires the user to specify a sequence of operations on the db to compute the desired result. (User specifies how and what.) A nonprocedural query language requires the user to describe the desired result without needing to specify the sequence of operations required to obtain the result. (User specifies only what.) Page 4 © Mark Llewellyn – COP 4710: Database Systems (Chapter 4) Query Languages (cont.) • Most commercially available relational database systems offer a query language which is categorized as a hybrid query language. Hybrid query languages include elements of both the procedural and nonprocedural approaches to query languages. For the time being we are going to examine “pure” relational query languages. These languages are terse and formal and lack many of the syntactic elements available in commercial languages, but they illustrate the fundamental techniques utilized by all query languages for extracting data from the database. Page 5 © Mark Llewellyn • • COP 4710: Database Systems (Chapter 4) Query Languages (cont.) • As we examine these pure languages, bear in mind that, although the pure languages do not contain such features, a complete query language has facilities for inserting and deleting tuples from relations as well as for modifying existing tuples. Procedural language: 1. relational algebra • • Nonprocedural languages: 1. 2. relational tuple calculus relational domain calculus COP 4710: Database Systems (Chapter 4) Page 6 © Mark Llewellyn Relational Algebra • The relational algebra is a procedural query language. It consists of set operations which are either unary or binary, meaning that either one or two relations are operands to the set operations. Each of the set operations produces a relation as its output. There are five fundamental operations in the relational algebra and several additional operations which are defined in terms of the five fundamental operations. There is also a rename operation which is sometimes referred to as a fundamental operation, we’ll save this one for a little while. The five fundamental operations are: select, project, union, set difference, and Cartesian product. We will examine each operation individually before combining operations into more powerful expressions. • • • • COP 4710: Database Systems (Chapter 4) Page 7 © Mark Llewellyn Relational Algebra (cont.) • • The five fundamental operations are: select, project, union, set difference, and Cartesian product. There are several additional (redundant) operations that have been defined in the relational algebra. The most common of these include: intersection, natural join, division, semi-join, and outer join. We will examine each operation individually before combining operations into more powerful expressions. • COP 4710: Database Systems (Chapter 4) Page 8 © Mark Llewellyn Selection Operator Type: unary Symbol: Greek letter sigma, σ General form: σ(predicate)(relation instance) Schema of result relation: same as operand relation Size of result relation (tuples): ≤ ⏐operand relation⏐ Examples: σ(major = “CS”)(students) σ(major = “CS” and hair-color = “brown”)(students) σ(hours-attempted > hours-earned)(students) • • The select operation selects tuples from a relation instance which satisfy a specified predicate. In general, a predicate, may contain any of the logical comparative operators, which are =, ≠, <, ≤, >, ≥. Furthermore, several predicates may be combined using the connectives and (∧), or (∨), and not (¬). The select operation may be thought of as providing a horizontal crosssection of the operand relation. Page 9 © Mark Llewellyn • COP 4710: Database Systems (Chapter 4) Selection Operator Examples R r = σ(A = ‘a’)(R) A a b c a a b c d h e a B a d f d c b a d d c a C yes no yes no no no yes yes yes no yes D 1 7 34 6 7 69 24 47 34 26 5 A a a a a B a d c a C yes no no yes D 1 6 7 5 r = σ(A = ‘a’ ^ C = “yes”)(R) A a a B a a C yes yes D 1 5 an empty relation r = σ(B = ‘m’)(R) A B C D COP 4710: Database Systems (Chapter 4) Page 10 © Mark Llewellyn Projection Operator Type: unary Symbol: Greek letter pi, π General form: π(attribute-list)(relation instance) Schema of result relation: specified by <attribute-list> Size of result relation (tuples): ≤⏐ operand relation⏐ Examples: π(student-id, name, major)(students) π(name, advisor)(students) π(name, gpa, hours-attempted)(students) • • The project operation can be viewed as producing a vertical cross-section of the operand relation. If the operation produces duplicate tuples, these are typically removed from the result relation in keeping with its set-like characteristics. COP 4710: Database Systems (Chapter 4) Page 11 © Mark Llewellyn Projection Operator Examples R r = π(A, C)(R) r = π(A, D)(R) r = π(C)(R) A a b c a a b c d h e a B a d f d c b a d d c a C yes no yes no no no yes yes yes no yes D 1 7 34 6 7 69 24 47 34 26 5 A a b c a d h e C yes no yes no yes yes no A a b c a a b c d h e a D 1 7 34 6 7 69 24 47 34 26 5 C yes no COP 4710: Database Systems (Chapter 4) Page 12 © Mark Llewellyn Union Operator Type: binary Symbol: union symbol, ∪ General form: r ∪ s, where r and s are union compatible Schema of result relation: schema of operand relations Size of result relation (tuples): ≤ max{⏐r⏐+⏐s⏐} Examples: r∪s π(a, b)(r) ∪ π(a, b)(s) • The union operation provides a means for extracting information which resides in two operand relations which must be union compatible. Union compatibility requires that two conditions hold: 1. 2. Relations r(R) and s(S) in the expression r ∪ s must be of the same degree (arity). That is, they must have the same number of attributes. The domains of the ith attribute of r(R) and the ith attributes of s(S) must be the same, for all i. COP 4710: Database Systems (Chapter 4) Page 13 © Mark Llewellyn Union Operator Examples R T r=R∪T r=R∪S A a b c a a S B a d f d c D 1 7 34 6 7 A a b c a a B a d f d c not valid – R and T are not union compatible E a b c a a F a d f d c m c d G 1 7 34 6 7 4 22 16 r=T∪X A X a b a a Y m c d c Z 4 22 16 7 X B a d f d c a b a a A a b a B a d c b c a a COP 4710: Database Systems (Chapter 4) Page 14 © Mark Llewellyn Set Difference Operator Type: binary Symbol: − General form: r − s, where r and s are union compatible Schema of result relation: schema of operand relation Size of result relation (tuples): ≤⏐relation r⏐ Examples: r−s • The set difference operation allows for the extraction of information contained in one relation that is not contained in a second relation. As with the union operation, the set difference operation requires that the two operand relations be union compatible. COP 4710: Database Systems (Chapter 4) Page 15 © Mark Llewellyn Set Difference Operator Examples R T r=R−T r=R−S A a b c a a S B a d f d c D 1 7 34 6 7 A a b c a a B a d f d c not valid – R and T are not union compatible E a b F a d f d G 1 7 34 6 r=T−X A c a r=X−T B f d c a r=S−R E B a b a F m c d G 4 22 16 X a b a a Y m c d c Z 4 22 16 7 X A a b a B a d c A empty relation COP 4710: Database Systems (Chapter 4) Page 16 © Mark Llewellyn Cartesian Product Operator Type: binary Symbol: × General form: r × s (no restrictions on r and s) Schema of result relation: schema r × schema s with renaming Size of result relation (tuples): >⏐relation r ⏐and >⏐relation s⏐ Examples: r×s • • The Cartesian product operation allows for the combining of any two relations into a single relation. Recall that a relation is by definition a subset of a Cartesian product of a set of domains, so this gives you some idea of the behavior of the Cartesian product operation. COP 4710: Database Systems (Chapter 4) Page 17 © Mark Llewellyn Cartesian Product Operator Examples T A a b B a d r=T×X T.A a a a a T.B a a a a d d d d X.A a b a c a b a c X.B a d c a a d c a X A a b a c B a d c a b b b b COP 4710: Database Systems (Chapter 4) Page 18 © Mark Llewellyn Cartesian Product Operator Examples R r=R×S A A a b c B a d f C 1 7 34 D yes yes no a a a a b b b X a b a a Y m c d c Z 4 22 16 7 b c c c c B a a a a d d d d f f f f C 1 1 1 1 7 7 7 7 34 34 34 34 D yes yes yes yes yes yes yes yes no no no no X a b a a a b a a a b a a Y m c d c m c d c m c d c Z 4 22 16 7 4 22 16 7 4 22 16 7 S COP 4710: Database Systems (Chapter 4) Page 19 © Mark Llewellyn Relational Algebra Expressions • While each of the five fundamental relational algebra operators can be used individually to form a query, their expressive power is tremendously enhanced when they are combined together to form query expressions. Before we introduce the redundant operations in relational algebra we’ll look at forming more complicated combinations of the five fundamental operations. [This will also make you appreciate the redundant operations all the more.] To form meaningful queries we need to be able to pose them against a database. For all of the examples that follow, we’ll use the following database: Page 20 © Mark Llewellyn • • COP 4710: Database Systems (Chapter 4) Relational Algebra Expressions (cont.) dept term course-num courses enrollment name name prof-num teaches takes professors dept yrs-teaching area students student-num name age hours gpa major COP 4710: Database Systems (Chapter 4) Page 21 © Mark Llewellyn Relational Algebra Expressions (cont.) • Using the techniques for converting an ERD into a set of relational schemas we have the following resulting schemas: S = STUDENTS(s#, name, age, major, gpa, hours_completed) C = COURSES(c#, term, name, dept, enrollment) P = PROFESSORS(p#, name, dept, yrs_teaching, area) TA = TAKES(s#, c#, term, grade) TE = TEACH(p#, c#, term) • When you first begin to write queries in a new query language, it is sometimes helpful to actually visualize the data that might be in one of the operand (argument) relations upon which you are operating. To this end, the last two pages of this set of notes provides an instance of each of the relations above so that you can perform this visualization, however, this is something that you will need to move away from as you get more advanced in your query composition, because you do not want to influence the design of your query by visualizing a relation instance that may not contain all possible tuples that your query will encounter during execution. COP 4710: Database Systems (Chapter 4) Page 22 © Mark Llewellyn Relational Algebra Expressions (cont.) Example Query 1: Find the names of all the students who are Computer Science majors. Approach: – First select all of the students who are CS majors. r = σ(major = “Computer Science”)(S) – Next project only the name attribute from the previous result. result = π(name)(r) Complete Query Expression: result = π(name)(σ(major = “Computer Science”)(S)) COP 4710: Database Systems (Chapter 4) Page 23 © Mark Llewellyn Relational Algebra Expressions (cont.) Example Query 2: Find the student-num (s#) and name of all the students who have completed more than 90 hours. Approach: – First select all of the students who have completed more than 90 hours. r = σ(hours_completed > 90)(S) – Next project the student-num and name attributes from the previous result. result = π(s#, name)(r) Complete Query Expression: result = π(s#, name)(σ(hours_completed > 90)(S)) COP 4710: Database Systems (Chapter 4) Page 24 © Mark Llewellyn Relational Algebra Expressions (cont.) Example Query 3: Find the names of all those students who are less than 20 years old who have completed more than 80 hours. Approach: – First select all of the students who have completed more than 80 hours and are less than 20 years old. r = σ((hours_completed > 80) AND (age < 20))(S) – Next project the name attribute from the previous result. result = π(name)(r) Complete Query Expression: result = π(name)(σ((hours_completed > 80) AND (age < 20))(S)) COP 4710: Database Systems (Chapter 4) Page 25 © Mark Llewellyn Relational Algebra Expressions (cont.) Example Query 4: Find the names of all the courses that are offered by either Computer Science or Physics. Approach: – First select all of the courses that are offered by either CS or Physics. r = σ((dept = Computer Science) or (dept = Physics))(C)) – Next project the name attribute from the previous result. result = π(name)(r) Complete Query Expression: result = π(name)(σ((dept = Computer Science) or (dept = Physics))(C)) COP 4710: Database Systems (Chapter 4) Page 26 © Mark Llewellyn Relational Algebra Expressions (cont.) Example Query 5: Find the name of every professor who taught a course in the Fall 2006 term. Approach: – – – First put the professor information together with the course information. Next, select only related professors and courses from previous result. Finally, select only the students name from the previous result. Complete Query Expression: result = π(P.name)(σ((TE.term = Fall 2006) AND (P.p# = TE.p#))(P × TE)) COP 4710: Database Systems (Chapter 4) Page 27 © Mark Llewellyn Relational Algebra Expressions (cont.) Example Query 6: Find the names of all the students who took a course in the Fall 2006 term that was taught by a professor who had more than 20 years of teaching experience. Approach: – – – First put the professor information together with the course information together with the teaches information together with the takes information. Next, select only related students, professors and courses from previous result. Finally, select only the students name from the previous result. Complete Query Expression: result = π(S.name)(σ((TA.term = Fall 2006) AND (P.yrs_teaching > 20) AND (S.s# = TA.s#) AND (P.p# = TE.p#) AND (TA.c# = TE.c#) AND (TA.term = TE.term))(S × P × TA × TE)) COP 4710: Database Systems (Chapter 4) Page 28 © Mark Llewellyn Relational Algebra Expressions (cont.) Example Query 7: Find the names of all the professors who are either in the Computer Science department or have more than 20 years of teaching experience. Complete Query Expression: result = [π(name)(σ(dept = Computer Science)(P))] ∪ [π(name)(σ(yrs_teaching > 20)(P))] or: result = π(name)(σ((dept = Computer Science) OR (yrs_teaching > 20))(P)) COP 4710: Database Systems (Chapter 4) Page 29 © Mark Llewellyn Relational Algebra Expressions (cont.) Example Query 8: Find the student numbers for those students who were enrolled only in the Spring 2007 term. Complete Query Expression: result = [π(TA.s#)(σ(TA.term = Spring 2007)(TA))] − [π(TA.s#)(σ(TA.term ≠ Spring 2007))(TA))] Note: The following query expression is not correct for this query!!! Why? result =π(TA.s#)(σ(TA.term = Spring 2007))(TA)) COP 4710: Database Systems (Chapter 4) Page 30 © Mark Llewellyn Sample Relation Instances: S relation s# name age major gpa hrs_completed S1 Michael Schumacher 19 Computer Science 4.00 45 S5 Jean Alesi 20 Physics 3.46 78 S3 Rubens Barrichello 21 Math 3.82 33 S2 Giancarlo Fisichella 18 Math 2.73 23 S4 Jarno Trulli 18 Computer Science 1.48 99 S7 Bernd Schneider 19 Computer Science 2.29 45 S6 Mika Hakkinen 20 English 2.37 33 COP 4710: Database Systems (Chapter 4) Page 31 © Mark Llewellyn Sample Relation Instances: C relation c# term name dept enrollment C1 Fall 2006 CS1 CS 120 C1 Spring 2007 CS1 CS 100 C4 Fall 2007 Architecture CS 97 C3 Fall 2006 Database CS 86 C5 Spring 2005 Physics I Physics 135 C5 Fall 2006 Physics I Physics 125 C6 Summer 2007 Calculus III Math 67 COP 4710: Database Systems (Chapter 4) Page 32 © Mark Llewellyn Sample Relation Instances: P relation p# name dept yrs_teaching P1 Wilson CS 5 P2 Davis Math 32 P3 deMoser CS 17 P4 Roberts Physics 14 COP 4710: Database Systems (Chapter 4) Page 33 © Mark Llewellyn Sample Relation Instances: TA relation s# S1 S3 S4 S5 S5 S5 S5 S5 S3 S3 S2 c# C3 C4 C6 C5 C1 C3 C6 C4 C5 C1 C4 term Fall 06 Fall 07 Summer 06 Spring 05 Fall 06 Fall 04 Summer 05 Fall 07 Spring 06 Fall 06 Fall 05 grade A B C D A C C A C A D COP 4710: Database Systems (Chapter 4) Page 34 © Mark Llewellyn Sample Relation Instances: TE relation p# c# term P1 C3 Fall 2006 P3 C4 Fall 2007 P4 C6 Summer 2007 P2 C5 Spring 2006 P2 C1 Spring 2005 P1 C4 Fall 2005 P3 C1 Fall 2006 COP 4710: Database Systems (Chapter 4) Page 35 © Mark Llewellyn ...
View Full Document

This note was uploaded on 02/22/2009 for the course COP 4710 taught by Professor Dr. mark llewellyn during the Fall '07 term at University of Central Florida.

Ask a homework question - tutors are online