BCNF and Normalization
Relational Schema Design
Goal of relational schema design is to avoid
redundancy and anomalies.
Chapters 11 & 12 in Textbook
Steps in building a database for an application:
1. Understand real-world domain being captured
2. Specify it using a database conceptual model (E/R, OO)
3. Translate specifica
Structured Query Language
Originally developed in the System-R
project of IBM (1974)
Industry standard for relational
databases (SQL92 is an ANSI/ISO
Structured Query Language
Data Definition Language for defining
relations, views, integrity
Extensible Markup Language
Document Type Definitions
1. Information Integration: Making
databases from various places work as
2. Semistructured Data: A new data
model designed to cope with problems
of information integra
Exponential Random Graph Models
A brief introduction
Statnet Development Team
Mark Handcock (UW)
Martina Morris (UW)
Carter Butts (UCI)
Dave Hunter (PSU)
Steven Goodreau (UW)
Jim Moody (Duke)
Skye Bender-deMoll (at Large)
Weak Entity Sets
Converting E/R Diagrams to Relations
Purpose of E/R Model
The E/R model allows us to sketch
database schema designs.
Includes some constraints, but not
Designs are pictures called en
Introduction to SQL
SQL is a very-high-level language.
Say what to do rather than how to do
Avoid a lot of data-manipulation details
needed in procedural languages like C+
Extended Relational Algebra
The Extended Algebra
δ = eliminate duplicates from bags.
τ = sort tuples.
γ = grouping and aggregation.
Outerjoin: avoids dangling tuples =
tuples that do not join
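The extended operators listed above can be sketched on bags represented as Python lists of tuples. This is only an illustration under stated assumptions: the function names, the sample bags R and S, and the choice of SUM as the aggregate are hypothetical, and only a left outerjoin on the first attribute is shown.

```python
from collections import defaultdict

def eliminate_duplicates(bag):
    """Duplicate elimination: keep the first copy of each tuple."""
    seen, out = set(), []
    for t in bag:
        if t not in seen:
            seen.add(t)
            out.append(t)
    return out

def sort_tuples(bag, index):
    """Sorting: order tuples by the attribute at position `index`."""
    return sorted(bag, key=lambda t: t[index])

def group_sum(bag, group_index, value_index):
    """Grouping and aggregation: SUM one attribute per group."""
    totals = defaultdict(int)
    for t in bag:
        totals[t[group_index]] += t[value_index]
    return sorted(totals.items())

def left_outerjoin(r, s):
    """Join on the first attribute; pad dangling tuples of r with None."""
    pad = (None,) * (len(s[0]) - 1) if s else ()
    out = []
    for a in r:
        matches = [b for b in s if b[0] == a[0]]
        if matches:
            out.extend(a + b[1:] for b in matches)
        else:
            out.append(a + pad)  # dangling tuple is kept, not dropped
    return out

# Hypothetical sample bags (a bag may contain the same tuple twice)
R = [("a", 1), ("b", 2), ("a", 1), ("a", 3)]
S = [("a", "x")]
```

With these bags, the tuple ("b", 2) has no partner in S, so an ordinary join would drop it while the outerjoin keeps it padded with None.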
Algebra of Bags
What is an Algebra
Mathematical system consisting of:
Operands - variables or values from
which new values can be constructed.
Operators - symbols denoting
procedures that construct new values
Local and Global Constraints
Constraints and Triggers
A constraint is a relationship among
data elements that the DBMS is
required to enforce.
Example: key constraints.
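A key constraint being enforced by the DBMS can be seen in a minimal sketch using Python's built-in sqlite3 module. The table layout and the sample rows are assumptions made for illustration only.

```python
import sqlite3

# In-memory database; nothing is written to disk
conn = sqlite3.connect(":memory:")

# Declaring Id as PRIMARY KEY asks the DBMS to enforce a key constraint
conn.execute("CREATE TABLE Students (Id INTEGER PRIMARY KEY, Name TEXT)")
conn.execute("INSERT INTO Students VALUES (1, 'Ann')")

try:
    # Second row with the same key value violates the constraint
    conn.execute("INSERT INTO Students VALUES (1, 'Bob')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

rows = conn.execute("SELECT Id, Name FROM Students").fetchall()
```

The application code never checks for the duplicate itself; the rejection comes from the database.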
Triggers are only executed when a
Paths in XML Documents
XPath is a language for describing
paths in XML documents.
Really think of the semistructured data
graph and its paths.
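As a rough illustration of path expressions, the sketch below uses Python's standard xml.etree.ElementTree module, which supports only a limited subset of XPath. The document and its element names (SHOPS, SHOP, PRICE) are hypothetical and are not taken from the DTD in these notes.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: a tree whose paths the expressions below follow
doc = ET.fromstring("""
<SHOPS>
  <SHOP name="Corner"><PRICE item="tea">2.50</PRICE></SHOP>
  <SHOP name="Plaza"><PRICE item="tea">3.00</PRICE><PRICE item="juice">4.00</PRICE></SHOP>
</SHOPS>
""")

# Path from the root element: every PRICE child of every SHOP child
prices = [p.text for p in doc.findall("./SHOP/PRICE")]

# Descendant search with an attribute condition
tea = [p.text for p in doc.findall(".//PRICE[@item='tea']")]

# Attribute values reached along a one-step path
names = [s.get("name") for s in doc.findall("SHOP")]
```

Each expression describes a set of paths through the document tree, and the result is the list of nodes reached at the end of those paths, in document order.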
<!DOCTYPE CFESHPS [
<!ELEMENT CFESHPS (CFESHP*, DRINK*)>
Third Normal Form (3NF)
Third Normal Form - Motivation
R (A, B, C)
AB -> C and C -> B.
Example: A = street address, B = city, C = zipcode.
What is the key?
There are two keys, {A,B} and {A,C}.
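The two keys can be checked mechanically with attribute closure. The following is a minimal sketch, assuming single-letter attribute names and FDs written as (left side, right side) string pairs; the function names are illustrative.

```python
from itertools import combinations

def closure(attrs, fds):
    """Return all attributes functionally determined by `attrs`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply an FD when its left side is already covered
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(schema, fds):
    """Return the minimal attribute sets whose closure is the whole schema."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            candidate = set(combo)
            if any(k <= candidate for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) == set(schema):
                keys.append(candidate)
    return keys

schema = {"A", "B", "C"}
fds = [("AB", "C"), ("C", "B")]
keys = candidate_keys(schema, fds)  # [{'A', 'B'}, {'A', 'C'}]
```

No single attribute determines all of R, which is why both keys have two attributes.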
These (and similar) structures of FDs cause
are building blocks
that enable the analysis of data redundancies, and
the elimination of anomalies caused by them
(through the process of normalization).
Convert to relations:
- Students(Id, Name)
Definition of Functional Dependency
If t is a tuple in a relation R and A is an attribute of R, then t[A] is
the value of attribute A in tuple t.
The FD AdvisorId -> AdvisorName holds in R if in every
instance of R, for e