This preview shows page 1. Sign up to view the full content.
Unformatted text preview: NORMALIZATION
1 Decomposition and FDs
x Functional dependencies: can be used in designing a relational database to remove the undesired properties
x Normalization using FDs:
BoyceCodd Normal Form (BCNF)
3rd Normal Form (3NF) Farkas CSCE 520 2 BoyceCodd Normal Form
x A relation is in BoyceCodd Normal Form if for all FDs X A in S+ over R at least one of the followings hold:
X A is a trivial FD
if X A is a nontrivial FD then X is a superkey for schema R x Example: R(Name,Breed,Date, Kennel)
FD: Name,Breed,Date Kennel
R is in BCNF Farkas CSCE 520 3 Decomposition into BCNF
x Compute S+ (for FDs in S);
x if there is a R that is not BCNF then
let XA a nontrivial FD on R s.t. XR is not in S+ and X ∩ A = ∅;
Decompose R into: – R1(RA)
– R2(X,A) Farkas CSCE 520 4 BCNF x Not every BCNF decomposition is dependency preserving
x Example: FDs: AB C and C B
Keys: {A,B} and {A,C}
C B is BCNF violation, therefore need to decompose to R1=(AC) and R2=(BC)
Decomposition cannot enforce AB C! Farkas CSCE 520 5 Third Normal Form
x Modifies BCNF conditions so no need to decompose in this problem situation
x An attribute is prime if it is member of any key
x X A violates 3NF if and only if X is not a superkey and also A is not a prime. Farkas CSCE 520 6 3NF Conditions
x A relation is in 3NF if for all FDs X A in S+ over R at least one of the followings hold:
X A is a trivial FD
if X A is a nontrivial FD then X is a superkey for schema R
Each attribute B in AX is contained in a candidate key for R Farkas CSCE 520 7 3NF and BCNF
x BCNF gives Losslessjoin
Noredundancy
Dependency preservation not always possible x 3NF gives Losslessjoin
Dependency preservation
May have null values (transitive dependencies) Farkas CSCE 520 8 Multivalued Dependencies
Fourth Normal Form
Reasoning About FD’s + MVD’s
This slides are from J. Ullman’s CS145 Introduction to Databases web site at http://infolab.stanford.edu/~ullman/dscb.html#slides 9 Definition of MVD
x A multivalued dependency (MVD) on R, X >>Y , says that if two tuples of R agree on all the attributes of X, then their components in Y may be swapped, and the result will be two tuples that are also in the relation.
x i.e., for each value of X, the values of Y are independent of the values of RXY.
10 Example: MVD
Drinkers(name, addr, phones, beersLiked)
x A drinker’s phones are independent of the beers they like.
name>>phones and name >>beersLiked. x Thus, each of a drinker’s phones appears with each of the beers they like in all combinations.
x This repetition is unlike FD redundancy.
name>addr is the only FD. 11 Tuples Implied by name>>phones
If we have tuples:
name
sue
sue
sue
sue addr
a
a
a
a phones beersLiked
p1 b1
p2 b2
p2 b1
p1 b2 Then these tuples must also be in the relation. 12 Picture of MVD X >>Y
X Y others equal exchange 13 MVD Rules
x Every FD is an MVD (promotion ). If X >Y, then swapping Y ’s between two tuples that agree on X doesn’t change the tuples.
Therefore, the “new” tuples are surely in the relation, and we know X >>Y. x Complementation : If X >>Y, and Z is all the other attributes, then X >>Z.
14 Splitting Doesn’t Hold
x Like FD’s, we cannot generally split the left side of an MVD.
x But unlike FD’s, we cannot split the right side either sometimes you have to leave several attributes on the right side. 15 Example: Multiattribute Right Sides
Drinkers(name, areaCode, phone, beersLiked, manf)
x A drinker can have several phones, with the number divided between areaCode and phone (last 7 digits).
x A drinker can like several beers, each with its own manufacturer.
16 Example Continued
x Since the areaCodephone combinations for a drinker are independent of the beersLikedmanf combinations, we expect that the following MVD’s hold:
name >> areaCode phone
name >> beersLiked manf
17 Example Data
Here is possible data satisfying these MVD’s:
name
Sue
Sue
Sue
Sue areaCode
650
650
415
415 phone
5551111
5551111
5559999
5559999 beersLiked
Bud
WickedAle
Bud
WickedAle manf
A.B.
Pete’s
A.B.
Pete’s But we cannot swap area codes or phones by themselves.
That is, neither name>>areaCode nor name>>phone
holds for this relation.
18 Fourth Normal Form
x The redundancy that comes from MVD’s is not removable by putting the database schema in BCNF.
x There is a stronger normal form, called 4NF, that (intuitively) treats MVD’s as FD’s when it comes to decomposition, but not when determining keys of the relation.
19 4NF Definition
x A relation R is in 4NF if: whenever X >>Y is a nontrivial MVD, then X is a superkey.
Nontrivial MVD means that: 1. Y is not a subset of X, and
2. X and Y are not, together, all the attributes. Note that the definition of “superkey” still depends on FD’s only.
20 Decomposition and 4NF
x If X >>Y is a 4NF violation for relation R, we can decompose R using the same technique as for BCNF. 1. XY is one of the decomposed relations.
2. All but Y – X is the other. 21 Example: 4NF Decomposition
Drinkers(name, addr, phones, beersLiked)
FD: name > addr
MVD’s: name >> phones
name >> beersLiked
x Key is {name, phones, beersLiked}.
x All dependencies violate 4NF.
22 Example Continued
x Decompose using name > addr:
1. Drinkers1(name, addr) x In 4NF; only dependency is name > addr. 1. Drinkers2(name, phones, beersLiked) x Not in 4NF. MVD’s name >> phones and name >> beersLiked apply. No FD’s, so all three attributes form the key.
23 Example: Decompose Drinkers2
x Either MVD name >> phones or name >> beersLiked tells us to decompose to:
Drinkers3(name, phones)
Drinkers4(name, beersLiked) 24 3NF, BCNF, 4NF
3NF
BCNF
4NF 25 Properties
Property 3NF BCNF 4NF Eliminates redundancy due to FDs
Eliminates redundancy due to MVDs
Preserves FDs No Yes Yes No No Yes Yes No No Preserves MVDs No No No 26 ...
View
Full
Document
This note was uploaded on 12/13/2011 for the course CSCE 520 taught by Professor Farkas during the Spring '11 term at South Carolina.
 Spring '11
 Farkas

Click to edit the document details