What Does a DBMS Manage?
1. Data organization
2. Data Retrieval
3. Data Integrity
Updates in SQL
Storage and File Organization
General Overview - rel. model
Relational model - SQL
Formal & commercial query languages
Database System Concepts
DBMSs store data on d
Review: The ACID properties
Atomicity: All actions in the Xaction happen, or none happen.
Consistency: If each Xaction is consistent, and the DB starts
consistent, it ends up consistent.
Isolation: Execution of one
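As a minimal sketch of the atomicity property listed above, the following uses Python's standard `sqlite3` module. The `account` table, the `transfer` helper, and the balances are hypothetical and only serve to show that a failed transaction leaves no partial update behind.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; either both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes the block
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?", (amount, src))
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?", (amount, dst))
            (bal,) = conn.execute(
                "SELECT balance FROM account WHERE name = ?", (src,)).fetchone()
            if bal < 0:
                # Abort: the two updates above are undone by the rollback.
                raise ValueError("insufficient funds")
        return True
    except ValueError:
        return False

# Hypothetical two-account database held in memory.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("a", 100), ("b", 50)])
conn.commit()

ok = transfer(conn, "a", "b", 30)       # succeeds and commits
failed = transfer(conn, "a", "b", 500)  # would overdraw "a", so it rolls back
balances = dict(conn.execute("SELECT name, balance FROM account"))
print(ok, failed, balances)
```

The second call performs both updates before it detects the overdraft, yet neither survives, which is the "all or none" behavior the definition describes.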
Complexity of Running a Data
is an example.
3 he Internet
Simplified Data Processing on
CS505 Intermediate Topics in Database Systems
Meeting Time and Place: Mon, Wed, and Fri 9:00-9:50 in Anderson Tower Rm.263
Instructor: Tingjian Ge
Instructor Email: [email protected]
Course Web Page: http://protocols.netlab.uky.edu/~ge/teaching.html
MapReduce: Simplified Data Processing on Large Clusters
Jeffrey Dean and Sanjay Ghemawat
[email protected], [email protected]
MapReduce is a programming model and an associated implementation for processing and generating large data sets
Data Analysis (a.k.a.
data warehousing) and
Column Oriented DBMS
Modified and Extended the slides from Silberschatz et al and from New England Database
Decision Support Systems
business decisions, often based on
data collected by on-line tran
Data Organization - B-trees
Data organization and retrieval
File organization can improve data retrieval time
Slides by Joe Hellerstein, UCB, with some material from Jim Gray,
Microsoft Research. See also:
Why Parallel Access To
At 10 MB/s
1.2 days to scan
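The scan-time figure above can be checked with a short calculation. The excerpt does not state the data size, so the 1 TB used here is an assumption chosen because it reproduces roughly 1.2 days at 10 MB/s; the 1,000-disk case is likewise a hypothetical illustration of why parallel access helps.

```python
SECONDS_PER_DAY = 86_400
MB = 10**6
TB = 10**12  # assumed data size; not stated in the excerpt

def scan_seconds(size_bytes, bytes_per_second, disks=1):
    """Time to read the data once when it is split evenly across `disks`."""
    return size_bytes / (bytes_per_second * disks)

# One disk at 10 MB/s.
one_disk_days = scan_seconds(TB, 10 * MB) / SECONDS_PER_DAY
# Hypothetical 1,000 disks read in parallel, each at 10 MB/s.
parallel_seconds = scan_seconds(TB, 10 * MB, disks=1000)

print(f"1 disk: {one_disk_days:.1f} days")
print(f"1000 disks: {parallel_seconds:.0f} seconds")
```

Under these assumptions the scan time drops linearly with the number of disks, ignoring coordination overhead and skew.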
Concurrency Control
Ensures that interleaving of operations among
concurrent transactions results in serializable
transaction operations interleaved following a
Database System Co
Intermediate Topics in Database Systems
Prof. Tingjian Ge
with thanks to Prof. Stan Zdonik, Brown University
Prof. Sam Madden, MIT
Prof. Avi Silberschatz, Yale University
What is a Database System?
A very large collection of related
CS 505 Fall 2009 Homework #4
Due Monday Nov. 9, in Class
Chapter 15 and 16
CS 505 Fall 2009 Homework #2
Due Sep. 30, in Class
Note: For this problem, you need to read the pseudo-code in the textbook for B+ tree operations.
Problem 1 (14.11): Suppose that a B+-tree index on (branch-name, branch-city) is available on
relation branch. What would be the best way to handle the following selection?
Answer: Using t
Problem 1: Suppose we have tables:
T1 (c11, c12, c13, c14)
T2 (c21, c22)
T3 (c31, c32, c33)
Draw a logical query plan (tree) for query:
SELECT c14 FROM T1, T2 WHERE c12 = c21
SELECT c32 FROM T2, T3 WHERE c33 = c21 AND c31 = 22
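One way to write down a logical plan before drawing it is as a small tree of operators. The sketch below builds the unoptimized (canonical) plan for the first query: a product of the tables, a selection on the WHERE predicate, and a projection on the select list. The `Node` class and `canonical_plan` helper are hypothetical names used for illustration, and this is not necessarily the plan shape the homework expects.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """One operator (or base table) in a logical query plan."""
    op: str
    children: list = field(default_factory=list)

    def render(self, depth=0):
        """Return the tree as indented lines, root first."""
        lines = ["  " * depth + self.op]
        for child in self.children:
            lines.extend(child.render(depth + 1))
        return lines

def canonical_plan(select_list, tables, predicate):
    """project(select(product(tables))): the plan read directly off the SQL text."""
    plan = Node(tables[0])
    for name in tables[1:]:
        plan = Node("product", [plan, Node(name)])
    plan = Node(f"select[{predicate}]", [plan])
    return Node(f"project[{select_list}]", [plan])

q1 = canonical_plan("c14", ["T1", "T2"], "c12 = c21")
print("\n".join(q1.render()))
```

An optimizer would typically rewrite such a tree, for example by turning the selection over a product into a join or by pushing a single-table predicate such as `c31 = 22` below the product.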
17.9. Assume that immediate modification is used in a system. Show, by an example, how an
inconsistent database state could result if log records for a transaction are not output to
stable storage prior to data updated by the transaction being
CS 505 Fall 2009 Homework #5
Due Monday Nov. 30, in Class
Suppose there is a log-based recovery database that crashed during the execution. When it comes
back online, you find the log and it looks as follows.
C-Store: A Column-oriented DBMS
Mike Stonebraker, Daniel J. Abadi, Adam Batkin+, Xuedong Chen, Mitch Cherniack+, Miguel Ferreira, Edmond Lau, Amerson Lin, Sam Madden, Elizabeth O'Neil, Pat O'Neil, Alex Rasin, Nga Tran+, Stan Zdonik
MIT CSAIL Cambridge, MA
Storage and Disks
Now Something Different
CS 405G: Application Oriented
CS 505: Systems Oriented
What is Systems?
A: Not Programming
Not programming big things.
Systems = Efficient and safe use of limited resources (e.g., disks)
A Comparison of Approaches to Large-Scale Data Analysis
University of Wisconsin
[email protected] Daniel J. Abadi
Yale University Microsoft Inc.
[email protected] Samuel Madde
B+-tree is perfect, but...
to answer a selection query (ssn=10) needs to traverse a full path.
In practice, 3-4 block accesses (depending on the height of the tree,
Any better approach?
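The 3-4 block accesses quoted above come from the height of the tree: a point lookup reads one block per level. The sketch below estimates that height from the number of keys and the fanout. The fanout of 100 and the key counts are assumptions for illustration, not figures from the slides.

```python
def btree_height(num_keys, fanout):
    """Smallest number of levels h such that fanout**h >= num_keys."""
    height, capacity = 1, fanout
    while capacity < num_keys:
        capacity *= fanout
        height += 1
    return height

# Assumed fanout of 100 pointers per node.
small = btree_height(10**6, 100)  # one million keys
large = btree_height(10**8, 100)  # one hundred million keys
print(small, large)
```

Integer arithmetic is used instead of a floating-point logarithm so that exact powers of the fanout are not rounded up by mistake.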
File Structures
Query Processing and Optimization
Data Retrieval at the physical level:
Indices: data structures to help with some query evaluation:
CS505 Lab 2: Operators
Operators: Join, aggregate, filter, etc.
Add and delete tuples
Buffer pool eviction
SELECT * FROM table1, table2
WHERE table1.field1 =
AND table1.id > 5
Operators = Iterator
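The iterator idea can be sketched as operators that share an `open` / `next` / `close` interface and pull rows from their children one at a time. The class names, the `None` end-of-stream convention, and the sample rows below are assumptions for illustration and are not the lab's actual API. The filter mirrors the `table1.id > 5` condition in the query above.

```python
class SeqScan:
    """Leaf operator: returns the rows of an in-memory table one at a time."""
    def __init__(self, rows):
        self.rows = rows
        self.pos = 0

    def open(self):
        self.pos = 0

    def next(self):
        if self.pos >= len(self.rows):
            return None  # end of stream
        row = self.rows[self.pos]
        self.pos += 1
        return row

    def close(self):
        self.pos = len(self.rows)

class Filter:
    """Passes through only the child rows that satisfy `predicate`."""
    def __init__(self, child, predicate):
        self.child = child
        self.predicate = predicate

    def open(self):
        self.child.open()

    def next(self):
        while True:
            row = self.child.next()
            if row is None or self.predicate(row):
                return row

    def close(self):
        self.child.close()

def run(op):
    """Drain an operator tree and collect its output rows."""
    op.open()
    out = []
    while (row := op.next()) is not None:
        out.append(row)
    op.close()
    return out

# Hypothetical contents for table1.
table1 = [{"id": 3, "field1": "x"}, {"id": 7, "field1": "y"}, {"id": 9, "field1": "z"}]
result = run(Filter(SeqScan(table1), lambda r: r["id"] > 5))
print(result)
```

Because every operator exposes the same interface, a join or aggregate can be stacked on top of `Filter` without the caller changing how it pulls rows.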