Storage & Indexing
Asst. Prof. Lipyeow Lim
Information & Computer Science Department
University of Hawaii at Manoa
ICS 321 Fall 2009

10/20/2009 Lipyeow Lim -- University of Hawaii at Manoa

Application View of DBMS
•  DBMS holds data in the form of relations or tables
•  A table is a bag of tuples or records
•  SQL is used to manage and query the data
•  Data stored in a DBMS is persistent
[Figure labels: Application, SQL, DBMS]
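
The application view can be sketched in a few lines, assuming Python's built-in sqlite3 module as a stand-in DBMS. SQLite, the `emp` table, and its columns are illustrative choices and do not come from the slides. The sketch shows an application issuing SQL and a table behaving as a bag, since the duplicate row is kept.

```python
import sqlite3

def demo(db_path=":memory:"):
    # ":memory:" keeps the example self-contained; passing a file path
    # instead would make the data persist across program runs.
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS emp (name TEXT, age INTEGER, sal REAL)"
    )
    # Hypothetical rows; the first two are identical on purpose.
    conn.executemany(
        "INSERT INTO emp VALUES (?, ?, ?)",
        [("Ann", 30, 5000.0), ("Ann", 30, 5000.0), ("Bob", 66, 7000.0)],
    )
    conn.commit()
    rows = conn.execute(
        "SELECT name, age, sal FROM emp WHERE age = 30"
    ).fetchall()
    conn.close()
    return rows

rows = demo()
```

Both copies of the duplicated row come back from the query, which is the sense in which a table is a bag rather than a set.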

Data Storage
•  Main Memory
   –  Random access
   –  Volatile
•  Flash Memory
   –  Random access
   –  Random writes are expensive
•  Disk
   –  Random access
   –  Sequential access cheaper
•  Tapes
   –  Only sequential access
   –  Archiving
[Figure: storage hierarchy labeled CPU, Cache, Main Memory, Disk, Tertiary Storage (Tapes, Optical Disks)]

Relational Tables on Disk
•  Record – a tuple or row of a relational table
•  RIDs – record identifiers that uniquely identify a record across memory and disk
•  Page – a collection of records that is the unit of transfer between memory and disk
•  Bufferpool – a piece of memory used to cache data and index pages.
•  Buffer Manager – a component of a DBMS that manages the pages in memory
•  Disk Space Manager – a component of a DBMS that manages pages on disk
[Figure labels: Bufferpool, record, page, disk]
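
The relationship between these components can be sketched as follows. This is a toy model under stated assumptions, not how any particular DBMS implements them: the RID is assumed to be a (page number, slot number) pair, the bufferpool uses least-recently-used eviction, and the class and method names are hypothetical. Pinning and dirty-page write-back are left out.

```python
from collections import OrderedDict, namedtuple

# Assumption: a RID is a (page number, slot number) pair.
RID = namedtuple("RID", ["page_no", "slot_no"])

class DiskSpaceManager:
    """Toy disk: maps page numbers to pages (lists of records)."""
    def __init__(self):
        self.pages = {}
        self.reads = 0          # counts simulated disk reads

    def allocate_page(self, records):
        page_no = len(self.pages)
        self.pages[page_no] = list(records)
        return page_no

    def read_page(self, page_no):
        self.reads += 1
        return self.pages[page_no]

class BufferManager:
    """Toy bufferpool that caches whole pages, evicting the LRU page."""
    def __init__(self, disk, capacity):
        self.disk = disk
        self.capacity = capacity
        self.pool = OrderedDict()   # page_no -> page, least recent first

    def get_page(self, page_no):
        if page_no in self.pool:            # hit: no disk access
            self.pool.move_to_end(page_no)
            return self.pool[page_no]
        if len(self.pool) >= self.capacity: # full: evict the LRU page
            self.pool.popitem(last=False)
        page = self.disk.read_page(page_no) # miss: transfer a whole page
        self.pool[page_no] = page
        return page

    def get_record(self, rid):
        return self.get_page(rid.page_no)[rid.slot_no]

disk = DiskSpaceManager()
p0 = disk.allocate_page(["r0", "r1"])
p1 = disk.allocate_page(["r2"])
bm = BufferManager(disk, capacity=1)
first = bm.get_record(RID(p0, 1))    # miss, reads page p0 from disk
again = bm.get_record(RID(p0, 0))    # hit, served from the bufferpool
other = bm.get_record(RID(p1, 0))    # miss, evicts p0
```

Even when only one record is requested, the whole page moves between disk and memory, which is why the page is the unit of transfer.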

Magnetic Disks
•  A disk or platter contains multiple concentric rings called tracks.
•  Tracks of a fixed diameter of a spindle of disks form a cylinder.
•  Each track is divided into fixed-size sectors (i.e., “arcs”).
•  Data is stored in units of disk blocks (in multiples of sectors).
•  An array of disk heads moves as a single unit.
•  Seek time: time to move disk heads over the required track
•  Rotational delay: time for desired sector to rotate under the disk head.
•  Transfer time: time to actually read/write the data
[Figure: disk diagram labeled "arms move over tracks", "tracks", "sector", "spindle rotates"]

Accessing Data on Disk
•  Seek time: time to move disk heads over the required track
•  Rotational delay: time for desired sector to rotate under the disk head.
   –  Assume uniform distribution, on average time for half a rotation
•  Transfer time: time to actually read/write the data
[Figure: disk diagram labeled "arms move over tracks", "tracks", "sector", "spindle rotates"]

Example: Barracuda 1TB HDD (ST31000528AS)
•  What is the average time to read 2048 bytes of data?
   = Seek time + rotational latency + transfer time
   = 8.5 msec + 4.16 msec + ( 2048 / 512 ) / 63 * ( 60 000 msec / 7200 rpm )
   = 8.5 + 4.16 + 0.529
   ≈ 13.2 msec

Cylinders: 121601
Blocks/cylinder: 8029
Sectors/track: 63
Heads: 255
Spindle speed: 7200 rpm
Average latency: 4.16 msec
Random read seek time: < 8.5 msec
Random write seek time: < 9.5 msec
Bytes/cylinder: 16065 * 512
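
The calculation above can be checked with a short sketch. The function and parameter names are hypothetical; the default values are the drive figures quoted on the slide, and the rotational delay is taken as half a rotation, as assumed earlier for a uniform distribution.

```python
def avg_access_time_ms(nbytes, seek_ms=8.5, rpm=7200,
                       sectors_per_track=63, bytes_per_sector=512):
    """Average time in msec to read nbytes: seek + rotational delay + transfer."""
    rotation_ms = 60_000 / rpm                 # time for one full rotation
    rotational_delay_ms = rotation_ms / 2      # on average, half a rotation
    sectors = nbytes / bytes_per_sector        # sectors that must pass the head
    transfer_ms = sectors / sectors_per_track * rotation_ms
    return seek_ms + rotational_delay_ms + transfer_ms

t = avg_access_time_ms(2048)   # about 13.2 msec
```

Seek time and rotational delay dominate; the transfer term contributes well under one millisecond for a 2048-byte read.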

File Organizations
How do we organize records in a file?
•  Heap files: records not in any particular order
   –  Good for scans
•  Sorted files: records sorted by particular fields
   –  Good for scans in the sorted order or range scans in the sorted order
•  Indexes: Data structures to organize records via trees or hashing.
   –  Like sorted files, they speed up searches for a subset of records, based on values in certain (“search key”) fields
   –  Updates are much faster than in sorted files
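
As a rough illustration of the hashing case, the sketch below layers a hash index over a heap file. It is a toy in-memory model, not a DBMS implementation: the class name, the (age, sal) record layout, and the use of a Python dict as the hash structure are all assumptions made for the example.

```python
from collections import defaultdict

class HeapFileWithHashIndex:
    """Heap file (records in arrival order) plus a hash index on age."""
    def __init__(self):
        self.records = []                # the heap file itself
        self.index = defaultdict(list)   # search key value -> record positions

    def insert(self, age, sal):
        # Append to the heap file and add one index entry; no other
        # record moves, unlike an insert into a sorted file.
        self.records.append((age, sal))
        self.index[age].append(len(self.records) - 1)

    def lookup(self, age):
        # Follow the index directly to matching records, without a scan.
        return [self.records[pos] for pos in self.index.get(age, [])]

f = HeapFileWithHashIndex()
f.insert(30, 4000)
f.insert(45, 5200)
f.insert(30, 6100)
matches = f.lookup(30)
```

A hash index like this supports only equality lookups on the search key; range searches need a tree-based index or a sorted file.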


Comparing File Organizations
Consider an employee table with search key <age,sal>
•  Scans: fetch all records in the file
•  Point queries: find all employees who are 30 years old
•  Range queries: find all employees aged above 65.
•  Insert a record.
•  Delete a record given its RID.
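
To make the comparison concrete, here is a minimal sketch of the point query, range query, and insert on a heap file versus a file sorted on <age,sal>. Records are modeled as in-memory (age, sal) tuples with integer ages, the sample data is made up, and the function names are hypothetical; a real file would be spread across disk pages.

```python
from bisect import bisect_left, insort

# Hypothetical employee records as (age, sal) tuples, in arrival order.
heap_file = [(45, 5200), (30, 4000), (70, 3100), (30, 6100), (66, 2800)]
sorted_file = sorted(heap_file)      # sorted on the search key <age, sal>

def heap_point_query(heap, age):
    # No order to exploit: every record must be examined.
    return [rec for rec in heap if rec[0] == age]

def sorted_point_query(srt, age):
    # Binary search for the first record with this age, then take the run.
    # (age,) sorts before every (age, sal) tuple with the same age.
    lo = bisect_left(srt, (age,))
    hi = bisect_left(srt, (age + 1,))   # assumes integer ages
    return srt[lo:hi]

def heap_range_query(heap, min_age_exclusive):
    return [rec for rec in heap if rec[0] > min_age_exclusive]

def sorted_range_query(srt, min_age_exclusive):
    # Locate the start of the range, then read sequentially to the end.
    return srt[bisect_left(srt, (min_age_exclusive + 1,)):]

def heap_insert(heap, rec):
    heap.append(rec)                 # goes at the end of the file

def sorted_insert(srt, rec):
    insort(srt, rec)                 # later records shift to keep the order

thirty = sorted_point_query(sorted_file, 30)
over_65 = sorted_range_query(sorted_file, 65)
```

The heap file answers both queries by scanning everything but inserts cheaply, while the sorted file finds its matches by binary search at the price of shifting records on insert.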

Simple Evaluation Model
•  B : number of data pages
•  R : number of records per page
•  D : average time to read/write a disk page
   –  From previous calculations, if a page is 2K bytes, D is about 13 milliseconds
•  C : average time to process a record
   –  For the 1 GHz processors we have today, assuming it takes 100 cycles, C is about 100 nanoseconds
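
A sketch of how these four parameters combine into cost estimates follows. The formulas are the standard textbook ones for this model (heap-file scan B(D + RC), heap-file equality search on a unique key 0.5·B(D + RC), sorted-file equality search D·log2(B) + C·log2(R)); they are not shown in this part of the slides, so treat them as an assumption. The values of B and R are made up for illustration.

```python
from math import log2

D = 13e-3    # seconds to read/write a page (about 13 msec, from the slide)
C = 100e-9   # seconds to process a record (about 100 nsec, from the slide)

def heap_scan_cost(B, R, D=D, C=C):
    # Read every page and process every record on it.
    return B * (D + R * C)

def heap_equality_search_cost(B, R, D=D, C=C):
    # Assumes exactly one match: on average half the file is scanned.
    return 0.5 * B * (D + R * C)

def sorted_equality_search_cost(B, R, D=D, C=C):
    # Binary search over pages, then binary search within the page.
    return D * log2(B) + C * log2(R)

B, R = 1000, 100   # hypothetical file: 1000 pages of 100 records each
scan = heap_scan_cost(B, R)                      # about 13 seconds
heap_eq = heap_equality_search_cost(B, R)        # about 6.5 seconds
sorted_eq = sorted_equality_search_cost(B, R)    # about 0.13 seconds
```

Because D is about five orders of magnitude larger than C, the number of page I/Os dominates every estimate, which is why the model is usually read in terms of disk accesses.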
 ...

This note was uploaded on 11/15/2010 for the course ICS 321 taught by Professor Lim during the Fall '09 term at University of Hawaii, Manoa.
