ER model.pdf - Conceptual Design with ER Model INF 551 Wensheng Wu 1 Lecture Outline \u2022 Steps in building a database application ER Entity Set

ER model.pdf - Conceptual Design with ER Model INF 551...

This preview shows page 1 out of 58 pages.

You've reached the end of your free preview.

Want to read all 58 pages?

Unformatted text preview: Conceptual Design with ER Model INF 551 Wensheng Wu 1 Lecture Outline • Steps in building a database application ER - Entity Set Relationship • Conceptual design with ER model Difference b/w set and bag/multiset - multiset contains duplicates IR - bag of words s = {1,2,3} - set b = {1,3,1,2} - bag 2 Steps in Building a DB Application • Step 0: pick an application domain – E.g., course management • Step 1: conceptual design – Decide on what to model in the application domain • E.g., instructors, students, courses, etc. – need a modeling language to express what you want – ER model is the most popular such language – output: an ER diagram of the app. domain 3 Steps in Building a DB Application – Here we use relational DBMS ER | S: relational schema (tables) | (refine using normal form (3NF. BCNF) S: refined schema FD: functional dependency is a constraint table name, attribute name - meta data • Step 2: pick a type of DBMS • Step 3: translate ER design to a relational schema – use a set of rules to translate ER to rel. schema – use a set of schema refinement rules to transform the above rel. schema into a good rel. schema • At this point – you have a good relational schema on paper Example: Table with columns sid, name, depart_name, depart_addr sid -> name, depart_name: Functional dependency: sid -> name, depart_name and depart_addr depart_name - > depart_addr : many to one relationship depart_name - > depart_addr So, there is a redundancy of data here when depart_addr is being repeated for every sid. In BCNF, it will check if the left hand side of FD is a table id. If it is not, it will remove the table and split the table. In this example, it will split the table into sid, name, depart_name and depart_name, depart_addr. (Step of REFINING) 4 Steps in Building a DB Application • Subsequent steps include – implement your relational DBMS using a "database programming language" called SQL – ordinary users cannot interact with the database directly – and the database also cannot do everything you want – hence write your application program in Php, C++, Java, Python, etc. to handle the interaction and take care of things that the database cannot do • So, the first thing we should start with is to learn ER model ... 5 ER Model • Gives us a language to specify – what information the db must hold – what are the relationships among components of that information • Proposed by Peter Chen in 1976 • What we will cover – – – – – basic stuff subclasses constraints weak entity sets design principles 6 Basic Concepts name category Oval - Attribute Rectangle - Entity Set Diamond - Relationship price Company can make many product. makes Every product will be by atmost 1 company. Arrow -> implies many -> one name Company Product stockprice Many to many buys employs Person address name ssn 7 Entities and Attributes • Entities – real-world objects distinguishable from other objects – described using a set of attributes price name Product name category Company • Attributes stockprice – each has an atomic domain: string, integers, reals, etc. • Entity set: a collection of similar entities 8 Relationships • A mathematical definition: – if A, B are sets, then a relationship R is a subset of A xB one to many Cartesian Product |A*B| = 3* 4 = 12 1 • A={1,2,3}, B={a,b,c,d}, R = {(1,a), (1,c), (3,b)} A= Cartesian Product: Select * from A, B = all values from both in tuples. Number of tuples = |A| * |B| A <— R — B a b 2 c 3 B= d makes is a subset of Product x Company: makes Company Product 9 More about relationships ... 10 Multiplicity of E/R Relationships • one-one: a b c d 1 2 3 – One = at most one 1 2 3 • many-one a b c d – Here left side = many 1 2 3 • many-many makes a b c d Company Product 11 Multiway Relationships How do we model a purchase relationship between buyers, products and stores? Product Purchase Store Person Can still model as a mathematical set (how ?) 12 Arrows in Multiway Relationships Q: what does the arrow mean? Invoice VideoStore Rental Movie Person A: if I know the store, person, invoice, I know the movie too 13 Arrows in Multiway Relationships Q: what do these arrows mean? Invoice VideoStore Rental Movie Person A: store, person, invoice determines movie and store, invoice, movie determines person 14 Arrows in Multiway Relationships Q: how do I say: “invoice determines store” ? A: no good way; best approximation: Invoice in this we can say, that by knowing the invoice, movie and person, we can know the store. VideoStore Rental Person Movie rental(store_id, in_id, p_id, m_id) - since many to one relationship, we know: 1) by knowing invoice, person and movie, we can know the store 2) since we need three attributes to know the output, the key are all three attributes combined. FD: many to one. If you know the input, you know the output. Relational model captures many-one relationships in functional dependencies, e.g., invoice store 15 Roles in Relationships What if we need an entity set twice in one relationship? Product Purchase atmost one salesperson multiple buyer salesperson Person Store buyer Person 16 Roles in Relationships What if we need an entity set twice in one relationship? Product Purchase Store buyer salesperson Person 17 Attributes on Relationships date Product purchase: pid, sid, personid, date first three combined together are the unique id. Purchase Store Person 18 made the relationship into entity and separated each relationship with other entities into different parts. Converting Multiway Relationships to Binary purchase: -> p1, s1, person1, d1 we can multiple products in total, but within purchase for the key tuple, we need atmost -> p1, s1, person2, d2 one product, one store and one person. That’s why we have atmost relationships. since pi, si, personi can not be null. The relationships will change to exactly one from atmost one. date ProductOf Product StoreOf Store BuyerOf Person Purchase 19 Relationships: Summary • Modeled as a mathematical set • Binary and multiway relationships • Converting a multiway one into many binary ones • Constraints on the degree of the relationship in this case, atmost one – many-one, one-one, many-many – limitations of arrows • Attributes of relationships – not necessary, but useful 20 Roadmap • What we will cover – – – – – basic stuff subclasses constraints weak entity sets design principles 21 Subclasses in ER Diagrams name category price Product isa Software Product platforms isa Educational Product 22 Age Group Subclasses in ER Diagrams • Assume subclasses form a tree. – I.e., no multiple inheritance. • Isa triangles indicate the subclass relationship. – Point to the superclass. 23 Subclasses • Subclass = special case = fewer entities = more properties. • Example: Ales are a kind of beer. – Not every beer is an ale, but some are. – Let us suppose that in addition to all the properties (attributes and relationships) of beers, ales also have the attribute color. 24 Example name Beers manf isa color Ales 25 ER vs. Object Oriented Subclasses • In the object-oriented world, objects are stored in one class only. – Subclasses inherit all properties from superclasses. – All properties of the object are stored together. • In contrast, in the E/R view, entities may have components in all subclasses to which they belong. – Matters when we convert to relations. 26 Example Beers(name, manf) Ales(name, color) Ales is extension of class Beer In ER, extension of classes keep the primary key only. Not other attributes (how it is in java) name Beers manf Pete's Ale isa color Ales 27 Roadmap • What we will cover – – – – – basic stuff subclasses constraints weak entity sets design principles 28 Constraints Constraints - Metadata • A constraint = an assertion about the data in the database that must be true at all times • Part of the database schema • Very important in database design – To ensure data integrity 29 Modeling Constraints Finding constraints is part of the modeling process. Commonly used constraints: Keys: social security number uniquely identifies a person. Single-value constraints: a person can have only one spouse. Referential integrity constraints have to be equal to 1. Foreign key relationship to the other table has to be equal to 1. Referential integrity constraints: if you work for a company, it must exist in the database. Domain constraints: peoples' ages are between 0 and 150. General constraints: all others (e.g., at most 50 students can enroll in a class) 30 Why Constraints are Important? • Give more semantics to the data – help us better understand it • Allow us to refer to entities (e.g, using keys) • Enable efficient storage – E.g., store ages as tiny integer (1 byte for example) • Enable efficient lookup – E.g., creating an index on key 31 Keys in E/R Diagrams Indicated by underlines: name category price No formal way to specify multiple keys in E/R diagrams Product Person address name ssn 32 More about Keys • Every entity set must have a key – why? • A key can consist of more than one attribute • There can be more than one key for an entity set – one key will be designated as primary key • Requirement for key in an isa hierarchy – Root entity set has all attributes needed for a key 33 Subclasses in ER Diagrams name category price Product isa Software Product platforms isa Educational Product 34 Age Group Single Value Constraint • An entity has at most one value for a given attribute or relationship • An attribute of an entity set has a single value or NULL – i.e., the value may be missing • A many-one relationship also implies a single value constraint makes Company Product 35 Referential Integrity Constraint • Ref. int. constraint: exactly one value exists in a given role • An attribute has a non-null, single value – this can be considered a kind of ref. int. constraint • However, we more commonly use such constraints to refer to relationships 36 Referential Integrity Constraints • In some formalisms we may refer to other object but get garbage instead – e.g. a dangling pointer in C/C++ • The Referential Integrity Constraint on relationships explicitly requires a reference to exist 37 Referential Integrity Constraints • This Product makes Product makes Atmost 1 Company Exactly 1 Company will be even clearer once we get to relational databases 38 Roadmap • What we will cover – – – – – basic stuff subclasses constraints weak entity sets design principles 39 Weak Entity Sets Question (could be asked): drawing entity sets? Entity sets are weak when (some or all of) their key attributes come from other entity sets to which they are related. entity sets can have multiple keys keys can have multiple attributes This happens when: - part-of relationships - splitting n-ary relationships to binary. Can’t figure out player without university name. Hence, player is a weak entity set. double rectangle: weak entity sets affiliation Player sports team player entity is dependent on university. number ) University name addr underlined item = key of the table 40 Converting Multiway Relationships to Binary When we look at one purchase tuple, we need one product, store and person exactly for it to be a key. date ProductOf ) Product StoreOf ) Store BuyerOf ) Person Purchase Relationship can have multiple entity sets. For example: Purchase has with product, store and person. Purchase needs all three info. Purchase is weak because it gets info from other entity sets (product, store, person) 41 Now, about design techniques ... 42 Design Principle 1: Be Faithful Example of choosing bad semantics A) Product can be purchased by multiple person. Product Purchase Person B) Country can have many persons as president. A person can be president of many countries. Country President Person Same thing as b Instructor Teaches Course 43 Design Principle 2: Avoid Redundancy date Product Purchase Store Person address will be repeated multiple times. So, to avoid redundancy, make new table for person. Person name will be repeated too, but that is okay because it’s used to identify. personAddr personName 44 Avoiding Redundancy • Redundancy occurs when we say the same thing in more than one way. • Redundancy wastes space and (more importantly) encourages inconsistency. – Multiple instances of the same fact may become inconsistent if we change one and forget to change the other, related version. 45 Example: Good name Beers name ManfBy addr Manfs This design gives the address of each manufacturer exactly once. 46 Example: Bad name Beers name ManfBy addr Manfs manf This design states the manufacturer of a beer twice: as an attribute and as a related entity. 47 Example: Bad name manf manfAddr Beers This design repeats the manufacturer's address once for each beer; loses the address if there are temporarily no beers for a manufacturer. 48 Design Principle 3: KISS That is why it is complicated. No need to have entity set for Dates date Dates Product Purchase Store Person 49 More on Design Techniques 1. Don't use an entity set when an attribute will do. 2. Limit the use of weak entity sets. 50 Entity Sets Versus Attributes • An entity set should satisfy at least one of the following conditions: – – It is more than the name of something; it has at least one nonkey attribute. or It is the “many” in a many-one or many-many relationship. (why?) 51 Example: Good name Beers name ManfBy addr Manfs If we remove the beer entity and add it to the manf entity set: 1) we will have three attributes for manf BUT this is bad, because the address will be repeated multiple times. •Manfs deserves to be an entity set because of the nonkey attribute addr. •Beers deserves to be an entity set because it is the “many” of the many-one relationship ManfBy. 52 Example: Good name manf Beers There is no need to make the manufacturer an entity set, if we record nothing about manufacturers besides their name. 53 Example: Bad name Beers name ManfBy Manfs Don’t have a non-key for manfs. This will be same as having two attributes for beers, but with an extra entity. So, bad. Since the manufacturer is nothing but a name, and is not at the “many” end of any relationship, it should not be an entity set. 54 Don't Overuse Weak Entity Sets • Beginning database designers often doubt that anything could be a key by itself. – They make all entity sets weak, supported by all other entity sets to which they are linked. • In reality, we usually create unique ID's for entity sets. – Examples include social-security numbers, automobile VIN's etc. 55 When Do We Need Weak Entity Sets? • The usual reason is that there is no global authority capable of creating unique ID's. • Example: it is unlikely that there could be an agreement to assign unique player numbers across all football teams in the world. 56 ER Review • Basic stuff – entity, attribute, entity set entity set has to be unique (because it is a set). so, needs to have a key. – relationship: binary, multiway, converting from multiway – relationship roles, attributes on relationships • Subclasses (is-a) • Constraints – multiplicity of relationships • many-one, one-one, many-many • limitations of arrows not null - integrity constraints tinyint (0-255) rather than int (4KB) - domain contraint. – keys, single-valued, ref integrity, domain & general constraints 57 ER Review • Weak entity set doesn’t have complete key attribute by itself. • Design principles – be faithful – avoid redundancy – KISS 58 ...
View Full Document

  • Fall '14
  • Entity-relationship model, Relational model, multiway, Multiway Relationships

  • Left Quote Icon

    Student Picture

  • Left Quote Icon

    Student Picture

  • Left Quote Icon

    Student Picture

Stuck? We have tutors online 24/7 who can help you get unstuck.
A+ icon
Ask Expert Tutors You can ask You can ask You can ask (will expire )
Answers in as fast as 15 minutes