db-intro - 1 Introduction To This Class INFS 614 Professor...


Introduction To This Class
INFS 614
Professor Smith
MITRE

Let’s Go Over the Syllabus ....

- Welcome!
- Your experience with databases?
- About me:
  - database scientist at MITRE since ‘93
  - current interests: bioscientific databases, security/privacy, data discovery and sharing
  - contact information:
    - email (best): kps@mitre.org
    - office phone: 703-983-6115 (endure the voicemail)
- Texts:
  - Database Management Systems, 3rd ed, Raghu Ramakrishnan and Johannes Gehrke, McGraw-Hill
  - Oracle 9i Programming, 4th edition, Rajshekhar Sunderraman, Addison-Wesley, ISBN 0-321-19498-5

Satisfaction of Prerequisites

- Prerequisites (strictly enforced)
  - INFS-501 (Discrete mathematics)
  - INFS-515 (Computer architecture/organization)
  - INFS-590 (Program design and data structures)
- Specifically:
  - Good background in discrete mathematics (e.g., set theory, mathematical logic, relations and functions);
  - Programming (good knowledge of either C, C++ or Java);
  - Data structures and algorithms, computer architecture, and operating systems.

Satisfaction of Prerequisites (Cont.)

- Consult your letter of acceptance. It specifies your status with respect to these foundation courses. For each course, it must be that either
  - You were waived from the course (the evidence should be either in the acceptance letter or in a subsequent official document).
  - You took the course and received a grade of B or better.

Submission and Grading

- Late submissions are not accepted.
  - On-time means: before lecture begins on the due date
  - Your homeworks must run properly under the Oracle system in the labs.
- Grading is based on your performance on:
  - homework assignments (20%)
  - midterm exam (35%), and
  - final exam (45%).
- Final grades consider:
  - a) class rank
  - b) absolute standards (e.g., did you learn the material?)
- The period of performance ends with the final.

Your GTA and Course Administration

- GTA
  - Ms. Sylvia Henshaw
  - Office hours: 5pm Mondays
  - Note: Sylvia also works in the ISE office, do not mix her jobs!
- The course will be administered via a website:
  - ise.gmu.edu/~kps/INFS614
  - please read it at least once / week!

Honor Code System

- GMU Honor Code
  - www.ise.gmu.edu/Honor.html
- For this class:
  - Homeworks & exams require individual work.
  - Study groups are encouraged, but homework solutions and write-ups must be individual.
  - Exams: in-class, individual effort, closed books
- Satisfaction of prerequisites:
  - Honor code issue

Cheating: 5 Good Reasons Not To ...

1) It won’t help you.
2) It will hurt you.
3) It could really hurt you.
4) Cheating is stealing from your classmates.
5) Cheating is wrong.

Semester Overview

Date      Topic (chapter/section)
Aug 29    Introduction (chapter 1)
Sept 5    ER model (chapter 2)
Sept 12   Relational Model (chapter 3)
Sept 19   Relational Algebra (sections 4.1-4.2)
Sept 26   Relational Calculus (section 4.3)
Oct 3     SQL Basics (sections 5.1-5.3)
Oct 10    Review
Oct 17    Midterm Exam
Oct 24    SQL: Nested Queries (section 5.5)
Oct 31    SQL: Aggregate Queries (section 5.6)
Nov 7     Topic (TBA)
Nov 14    Functional Dependencies (sections 19.1-19.3)
Nov 21    Thanksgiving Holiday!
Nov 28    Decomposition and Normal Forms (sections 19.4-19.6)
Dec 5     Review
Dec 12    Final Exam

HW assignment / HW due columns (row alignment not recoverable from this preview): 5 5 4, 1c 4 1a 1b 2 3 1 c 1a 1b 2 3

Database Management Systems: Lecture 1 - Introduction
INFS 614
Professor Smith

Why Are Databases Important?
- Databases are everywhere
  - Travelocity is just a layer around SABRE, an industry-wide airline reservation database
  - The “Deep Web” is > 40 times the size of the Internet
  - Your cell phone has a small database
  - increasingly: databases lurk deep inside bigger systems
- Lots of jobs require, or can be done much better with, a good understanding of databases
- Databases can be very interesting
  - “bottom up” look at many types of systems; if you understand the data architecture, you have an intimate understanding of the whole system (and vice versa)
  - you meet interesting people!
    - aircraft designers, people catching drug lords, neuroscientists ....
- Many interesting research areas

What is a Database? A DBMS?

- A database is a collection of data managed over a period of time
  - big: all US airline reservations, sales records at Walmart
  - small: a personal Christmas card list, a recipe file
  - specialized: medical images, terrorist info, a hybrid auto design
  - you are in a student database (or you need to be soon ...)
- A DBMS is a Database Management System
  - a specialized software tool making it much easier to manage databases
  - around 40 years of active research and engineering has helped develop the modern DBMS
  - this is a huge business (billions/year)
- Sometimes database is used to mean DBMS
  - “Oracle sells a popular database”
  - leads to misunderstanding if you ask “Do you have a database?”

Example Database: A University Database

- Information about university environment
  - Entities:
    - Students
    - Faculty
    - Courses
    - Classrooms
  - Relationships:
    - Students’ enrollment in courses
    - Faculty teaching courses
    - Use of classroom for course
    - Prerequisite courses

DBMS’s: Why Use A DBMS?

- What if you don’t?
- Let’s say you have 500 gig of corporate data on employees, departments, sales, etc., and you want to generate custom reports.
- A simple file system has many drawbacks:
  - report application can’t just use main memory, thus you must write code to shuttle data back and forth to disk as you use it
    - OS provides primitives for this, but it’s a lot of work
  - scaling problem: (e.g., 32 bits handles 4 gig), linear access slow
  - if you delete a department, how do you ensure its employees are deleted?
  - OS security is probably not good enough
  - you probably need to write a new program for each question you need to ask about the stored data
  - you may need to code up protections for concurrent users and against system crashes
- --> This is a lot of work!!

Standard Services A DBMS Offers

- Schematic data modeling
  - disciplined design
- A universal query language (SQL)
  - declarative access
  - interoperability
  - physical data independence
- Automated query optimization
- Efficient access to terabytes through indexing
- Automated data integrity enforcement
- Security (access controls, roles, views, audit, authentication ...)
- “Crash-proof” atomic transactions
- Data persistence between executions
- Concurrency control
- Transparent data distribution, parallelization
- APIs (e.g., JDBC) and a rich set of development tools
- Data export into XML
- Sophisticated administration and tuning tools

Where is the “Break-Even Point”?
Case Study: Small Preschool Survey

- Due to (free) open source DBMS’s, I can use a DBMS to compute averages for a survey with 79 records
- Could do it in Excel, too, but that seems harder
  - queries hard coded into spreadsheet pages

Structure of a DBMS

[Figure: DBMS layers, listed in slide order: Queries, Query Interface, Query Optimization & Execution, Relational Operators, File Access Methods, Buffer Management, Disk Management, Disk. Also labeled: Security, Transaction Mgmt.]

People Who Interact With Databases

[Figure: a DBMS surrounded by the people who interact with it: Application Programmers, Knowledge Creators, Web Users, DBAs, DBMS Designers & Implementors, Information System Architects, Database Researchers.]

Course Overview

[Figure: course roadmap with topics numbered 1 to 6. Labels: Entity Relationship Design, SQL DDL (relational design), SQL DML (queries), Relational Algebra, Relational Calculus, Normalization & Dependency Theory.]

Data Modeling

- Databases implement a model of pertinent aspects of the world
  1) Data Structures
  2) Operations
  3) Validity Constraints

[Figure: a “Real World” warehouse next to its Database warehouse representation. Real-world labels: “store”, “no room left”. Database labels: “insert”, “insert denied”.]

A Brief History of Databases

[Figure: timeline from 1960 to 2010 (“the present”), Time on one axis and Data Model on the other. Data models shown: hierarchical, network, relational.]

Relational Data Model: An Example

- Given 3 relations (tables) of data: Pilots, Flights, Aircraft
  - Pilots.name = Flights.pilot_name
  - Flights.aircraft_id = Aircraft.id
- Which pilots have flown prop-jets?
(In SQL)

    SELECT DISTINCT Pilots.name
    FROM   Pilots, Flights, Aircraft
    WHERE  Pilots.name = Flights.pilot_name
    AND    Flights.aircraft_id = Aircraft.id
    AND    Aircraft.type = 'prop-jets'

Initial Query Execution Plan

- Database: Pilots (50 tuples), Flights (10,000,000), Aircraft (2000)
- Plan, bottom to top, with the tuples handled at each step:
  - scan Pilots (50)
  - scan Flights (10,000,000)
  - join (10,000,000)
  - scan Aircraft (2000)
  - join (10,000,000)
  - select, only prop-jets - 0.1% (10,000)
  - project (10)
  - answer (the distinct pilot names)
- Total tuples processed: 30,012,060

Query Optimization: Improved Plan

- Database: Pilots (50 tuples), Flights (10,000,000), Aircraft (2000)
- Plan, bottom to top, with the tuples handled at each step:
  - select on Aircraft, only prop-jets - 0.1% (2)
  - indexed retrieval from Flights (10,000)
  - join (10,000)
  - scan Pilots (50)
  - join (10,000)
  - project (10)
  - answer (only distinct pilots’ names)
- Total tuples processed: 30,062

A Brief History of Databases (Continued)

[Figure: timeline from 1960 to 2010 (“the present”), Time on one axis and Data Model on the other. Data models shown: hierarchical, network, relational, object, object-relational.]

ORDBMS’s

- Key issue addressed:
  - “Impedance mismatch”
  - mismatch between data model in a database and in its application languages
  - relations vs. objects/classes (e.g., in C++ or Java)
- Added ORDBMS features:
  - More powerful type definition
    - e.g. “point”, “polyhedron”
  - Type-specific methods, predicates
    - “union”, “contains”
  - Inheritance in the schema
    - convex polyhedra
  - User definable infrastructure
    - indices, operators, optimization techniques

A Brief History of Databases (recent events ....)

[Figure: timeline from 1960 to 2010 (“the present”), Time on one axis and Data Model on the other. Data models shown: hierarchical, network, relational, object, object-relational, native xml.]

Important Events (Somewhat) Recently

- Open Source DBMSs
  - MySQL - 1995
  - PostgreSQL - 1996
- Semi-structured (e.g., XML) Databases
  - LORE - 1996
  - Apache Xindice - 2001

...
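Returning to the prop-jet query from the Relational Data Model example: the following is a minimal sketch of how that query could be tried out, using Python's built-in sqlite3 module rather than the Oracle system the course uses. The column definitions and all sample rows are hypothetical; the slides give only the three table names and the two join conditions. Note that the string literal is written with single quotes, since double quotes denote identifiers in standard SQL.

```python
import sqlite3

# In-memory database; schema details and rows are illustrative assumptions,
# not taken from the slides.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Pilots   (name TEXT PRIMARY KEY);
CREATE TABLE Aircraft (id INTEGER PRIMARY KEY, type TEXT);
CREATE TABLE Flights  (pilot_name  TEXT    REFERENCES Pilots(name),
                       aircraft_id INTEGER REFERENCES Aircraft(id));
INSERT INTO Pilots   VALUES ('Ada'), ('Ben'), ('Cy');
INSERT INTO Aircraft VALUES (1, 'prop-jets'), (2, 'jet');
INSERT INTO Flights  VALUES ('Ada', 1), ('Ada', 1), ('Ben', 2),
                            ('Cy', 1), ('Cy', 2);
""")

# The query from the slide, plus an ORDER BY so the result order is stable.
PROP_JET_PILOTS = """
SELECT DISTINCT Pilots.name
FROM   Pilots, Flights, Aircraft
WHERE  Pilots.name = Flights.pilot_name
AND    Flights.aircraft_id = Aircraft.id
AND    Aircraft.type = 'prop-jets'
ORDER  BY Pilots.name
"""

prop_jet_pilots = [row[0] for row in conn.execute(PROP_JET_PILOTS)]
print(prop_jet_pilots)  # ['Ada', 'Cy']
```

Ada appears once even though she has two prop-jet flights, which is the effect of DISTINCT; Ben is excluded because his only flight was on a jet. The query states what is wanted and leaves the choice of plan (scan order, use of an index) to the DBMS, which is the point of the two execution plans above.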

This note was uploaded on 10/18/2009 for the course INFS 614 taught by Professor Staff during the Fall '08 term at George Mason.
