Algorithm Analysis: Big O Notation

- Determine the running time of simple algorithms
  ▪ Best case
  ▪ Average case
  ▪ Worst case
- Profile algorithms
- Understand O notation's mathematical basis
- Use O notation to measure running time

- Algorithms can be described in terms of
  ▪ Time efficiency
  ▪ Space efficiency
- Choosing an appropriate algorithm can make a significant difference in the usability of a system
  ▪ Government and corporate databases with many millions of records, which are accessed frequently
  ▪ Online search engines
  ▪ Real-time systems where near-instantaneous response is required, from air traffic control systems to computer games

- There are often many ways to solve a problem
  ▪ Different algorithms that produce the same results, e.g. there are numerous sorting algorithms
- We are usually interested in how an algorithm performs when its input is large
  ▪ In practice, with today's hardware, most algorithms perform well with small input
  ▪ There are exceptions to this, such as the Traveling Salesman Problem

- It is possible to count the number of operations that an algorithm performs
  ▪ By a careful visual walkthrough of the algorithm, or
  ▪ By inserting code in the algorithm to count and print the number of times that each line executes (profiling)
- It is also possible to time algorithms
  ▪ Compare system time before and after running an algorithm; in Java: System.currentTimeMillis()
  ▪ More sophisticated timer classes exist

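As a rough sketch of the timing approach (the method name algorithmUnderTest is a placeholder, not something from the slides):

    long start = System.currentTimeMillis();
    algorithmUnderTest();   // hypothetical method being measured
    long elapsed = System.currentTimeMillis() - start;
    System.out.println("Elapsed time: " + elapsed + " ms");

System.nanoTime() offers finer resolution and is one basis for the more sophisticated timer classes mentioned above; repeated runs on a warmed-up JVM give more trustworthy numbers than a single measurement.
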
- It may be useful to time how long an algorithm takes to run
  ▪ In some cases it may be essential to know how long an algorithm takes on some system, e.g. air traffic control systems
- But is this a good general comparison method?
- Running time is affected by a number of factors other than algorithm efficiency

- CPU speed
- Amount of main memory
- Specialized hardware (e.g. graphics card)
- Operating system
- System configuration (e.g. virtual memory)
- Programming language
- Algorithm implementation
- Other programs
- System tasks (e.g. memory management)
- …

- Instead of timing an algorithm, count the number of instructions that it performs
- The number of instructions performed may vary based on
  ▪ The size of the input
  ▪ The organization of the input
- The number of instructions can be written as a cost function on the input size

Java

    public void printArray(int[] arr) {
        for (int i = 0; i < arr.length; ++i) {
            System.out.println(arr[i]);
        }
    }

Operations performed on an array of length 10:
- Declare and initialize i: 1 operation
- Perform the comparison, print an array element, and increment i: 10 times each
- Make the final comparison when i = 10: 1 operation

- Instead of choosing a particular input size, we will express a cost function for input of size n
- Assume that the running time, t, of an algorithm is proportional to the number of operations
- Express t as a function of n
  ▪ Where t is the time required to process the data using some algorithm A
  ▪ Denote a cost function as tA(n), i.e. the running time of algorithm A with input size n

Java

    public void printArray(int[] arr) {
        for (int i = 0; i < arr.length; ++i) {
            System.out.println(arr[i]);
        }
    }

Operations performed on an array of length n:
- Declare and initialize i: 1
- Perform the comparison, print an array element, and increment i: 3n in total
- Make the final comparison when i = n: 1

t = 3n + 2
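One way to check the t = 3n + 2 count is to instrument the loop with an operation counter (a sketch using the slide's accounting of one unit per declaration, comparison, print, and increment; not code from the slides):

    public static long countedPrintArray(int[] arr) {
        long ops = 0;
        int i = 0;                           // declare and initialize i
        ops++;
        while (true) {
            ops++;                           // the comparison i < arr.length
            if (i >= arr.length) break;
            System.out.println(arr[i]);      // print an array element
            ops++;
            ++i;                             // increment i
            ops++;
        }
        return ops;                          // 3n + 2 for an array of length n
    }

On an array of length 10 this returns 32, matching 3(10) + 2.
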

- The number of operations usually varies based on the size of the input
  ▪ Though not always: consider array lookup
- In addition, algorithm performance may vary based on the organization of the input
  ▪ For example, consider searching a large array: if the target is the first item in the array, the search will be very quick

- Algorithm efficiency is often calculated for three broad cases of input
  ▪ Best case
  ▪ Average (or "usual") case
  ▪ Worst case
- This analysis considers how performance varies for different inputs of the same size

- It can be difficult to determine the exact number of operations performed by an algorithm
  ▪ Though it is often still useful to do so
- An alternative to counting all instructions is to focus on an algorithm's barometer instruction
  ▪ The barometer instruction is the instruction that is executed the most often in an algorithm
  ▪ The number of times that the barometer instruction is executed is usually proportional to the algorithm's running time

- Let's analyze and compare some different algorithms
  ▪ Linear search
  ▪ Binary search
  ▪ Selection sort
  ▪ Insertion sort
  ▪ Quicksort

- It is often useful to find out whether or not a list contains a particular item
  ▪ Such a search can either return true or false, or the position of the item in the list
- If the array isn't sorted, use linear search
  ▪ Start with the first item, and go through the array comparing each item to the target
  ▪ If the target item is found, return true (or the index of the target element)

Java

    public int linSearch(int[] arr, int target) {
        for (int i = 0; i < arr.length; i++) {
            if (target == arr[i]) {
                return i;    // the function returns as soon as the target item is found
            }
        } //for
        return -1;           // return -1 to indicate that the item has not been found
    }
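For example (an illustrative driver, not part of the original slides):

    int[] data = {41, 7, 93, 12, 58};
    System.out.println(linSearch(data, 93));   // prints 2
    System.out.println(linSearch(data, 50));   // prints -1 (not found)
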
- Iterate through an array of n items searching for the target item
- The barometer instruction is equality checking (or comparisons for short)
  ▪ x.equals(arr[i]); // for an object, or
  ▪ x == arr[i]; // for a primitive type
  ▪ There are actually two other barometer instructions; what are they?
- How many comparisons does linear search do?

- Best case
  ▪ The target is the first element of the array
  ▪ Make 1 comparison
- Worst case
  ▪ The target is not in the array, or the target is at the last position in the array
  ▪ Make n comparisons in either case
- Average case
  ▪ Is it (best case + worst case) / 2, so (n + 1) / 2?

- There are two situations when the worst case arises
  ▪ When the target is the last item in the array
  ▪ When the target is not there at all
- To calculate the average cost we need to know how often these two situations arise
  ▪ We can make assumptions about this, though any of these assumptions may not hold for a particular use of linear search

- Assume that the target is not in the array ½ the time
  ▪ Therefore ½ the time the entire array has to be searched
- Assume that there is an equal probability of the target being at any array location, if it is in the array
  ▪ That is, there is a probability of 1/n that the target is at some location i

- Work done if the target is not in the array
  ▪ n comparisons
  ▪ This occurs with probability of 0.5

- Work done if the target is in the array:
  ▪ 1 comparison if the target is at the 1st location; occurs with probability 1/n (second assumption)
  ▪ 2 comparisons if the target is at the 2nd location; also occurs with probability 1/n
  ▪ i comparisons if the target is at the ith location
- Take the weighted average of the values to find the total expected number of comparisons (E)
  ▪ E = 1*1/n + 2*1/n + 3*1/n + … + n*1/n, or
  ▪ E = (1 + 2 + … + n) / n = (n(n + 1)/2) / n = (n + 1) / 2

- Target is not in the array: n comparisons
- Target is in the array: (n + 1) / 2 comparisons
- Take a weighted average of the two amounts:
  ▪ = (n * ½) + ((n + 1) / 2 * ½)
  ▪ = (n / 2) + ((n + 1) / 4)
  ▪ = (2n / 4) + ((n + 1) / 4)
  ▪ = (3n + 1) / 4
- Therefore, on average, we expect linear search to perform (3n + 1) / 4 comparisons*
  ▪ *recall the assumption we made about the target being absent ½ the time
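The (3n + 1) / 4 figure is easy to sanity-check by simulation under the same two assumptions (an illustrative sketch; the class name, the choice n = 1000, and the trial count are arbitrary):

    import java.util.Random;

    public class AvgCaseCheck {
        public static void main(String[] args) {
            Random rng = new Random();
            int n = 1000;
            int trials = 100000;
            long totalComparisons = 0;
            for (int t = 0; t < trials; t++) {
                // Half the time the target is absent (n comparisons);
                // otherwise it is equally likely to be at any index i,
                // costing i + 1 comparisons to reach.
                if (rng.nextBoolean()) {
                    totalComparisons += n;
                } else {
                    totalComparisons += rng.nextInt(n) + 1;
                }
            }
            // Should print something close to (3n + 1) / 4 = 750.25
            System.out.println((double) totalComparisons / trials);
        }
    }
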
- If we sort the target array first we can change the linear search average cost to around n / 2
  ▪ Once a value equal to or greater than the target is found, the search can end
  ▪ So, if a sequence contains 8 items, on average, linear search compares 4 of them; if a sequence contains 1,000,000 items, linear search compares 500,000 of them, etc.
- However, if the array is sorted, it is possible to do much better than this

Search for 32. The array is sorted, and contains 16 items indexed from 0 to 15:

    value: 07 11 15 21 29 32 44 45 57 61 64 73 79 81 86 92
    index:  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15

- Guess that the target item is in the middle, that is index = 15 / 2 = 7
  ▪ 45 is greater than 32, so the target must be in the lower half of the array
  ▪ Everything in the upper half of the array can be ignored, halving the search space
- Repeat the search, guessing the midpoint of the lower subarray (6 / 2 = 3)
  ▪ 21 is less than 32, so the target must be in the upper half of the subarray
- Repeat the search, guessing the midpoint of the new search space, 5
  ▪ The target is found, so the search can terminate
- The midpoint = (lower subarray index + upper index) / 2

- Binary search requires that the array is sorted
  ▪ In either ascending or descending order
  ▪ Make sure you know which!
- It is a divide and conquer algorithm
  ▪ Each iteration divides the problem space in half
  ▪ Ends when the target is found or the problem space consists of one element

Java

    public int binSearch(int[] arr, int target) {
        int lower = 0;
        int upper = arr.length - 1;   // index of the last element in the array
        int mid = 0;
        while (lower <= upper) {
            mid = (lower + upper) / 2;
            if (target == arr[mid]) {            // note the if, else if, else
                return mid;
            } else if (target > arr[mid]) {
                lower = mid + 1;
            } else { // target < arr[mid]
                upper = mid - 1;
            }
        } //while
        return -1;   // target not found
    }
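One implementation detail worth noting (not raised on the slides): for very large arrays the sum lower + upper can overflow an int, producing a negative mid. Writing mid = lower + (upper - lower) / 2 computes the same midpoint without that risk.
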
- The algorithm consists of three parts
  ▪ Initialization (setting lower and upper)
  ▪ The while loop, including a return statement on success
  ▪ The return statement which executes on failure
- Initialization and the return on failure require the same amount of work regardless of input size
- The number of times that the while loop iterates depends on the size of the input

- The while loop contains an if, else if, else statement
  ▪ The first if condition is met when the target is found, and is therefore performed at most once each time the algorithm is run
- The algorithm usually performs 5 operations for each iteration of the while loop
  ▪ Checking the while condition
  ▪ Assignment to mid
  ▪ Equality comparison with the target
  ▪ Inequality comparison
  ▪ One other operation (setting either lower or upper)

- In the best case the target is the midpoint element of the array
  ▪ Requiring one iteration of the while loop

- What is the worst case for binary search?
  ▪ Either the target is not in the array, or
  ▪ It is found when the search space consists of one element
- How many times does the while loop iterate in the worst case?

- Each iteration of the while loop halves the search space
  ▪ For simplicity assume that n is a power of 2, so n = 2^k (e.g. if n = 128, k = 7)
- The first iteration halves the search space to n/2
- After the second iteration the search space is n/4
- After the kth iteration the search space consists of just one element, since n/2^k = n/n = 1
  ▪ Because n = 2^k, k = log2(n)
- Therefore at most log2(n) iterations of the while loop are made in the worst case!
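The halving argument is easy to check by counting how many times n can be halved before reaching 1 (a small sketch, not from the slides):

    public static int halvings(int n) {
        int count = 0;
        while (n > 1) {
            n = n / 2;   // each while-loop iteration halves the search space
            count++;
        }
        return count;    // floor(log2 n) for the original n
    }
    // halvings(128) == 7; halvings(1000000) == 19, since log2 of 10^6 is about 19.9
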
- Is the average case more like the best case or the worst case?
- What is the chance that an array element is the target?
  ▪ 1/n the first time through the loop
  ▪ 1/(n/2) the second time through the loop
  ▪ … and so on …
- It is more likely that the target will be found as the search space becomes small
  ▪ That is, when the while loop nears its final iteration
- We can conclude that the average case is more like the worst case than the best case

    n           (3n+1)/4    log2(n)
    10          8           3
    100         76          7
    1,000       751         10
    10,000      7,501       13
    100,000     75,001      17
    1,000,000   750,001     20
    10,000,000  7,500,001   24

- As an example of algorithm analysis let's look at two simple sorting algorithms
  ▪ Selection sort, and
  ▪ Insertion sort
- Calculate an approximate cost function for these two sorting algorithms
  ▪ By analyzing how many operations are performed by each algorithm
  ▪ This will include an analysis of how many times the algorithms' loops iterate

- Selection sort is a simple sorting algorithm that repeatedly finds the smallest item in the unsorted part
  ▪ The array is divided into a sorted part and an unsorted part
- Repeatedly swap the first unsorted item with the smallest unsorted item
  ▪ Starting with the element with index 0, and
  ▪ Ending with the last but one element (index n - 2)

Tracing selection sort on the array 23 41 33 81 07 19 11 45:

    23 41 33 81 07 19 11 45   find smallest unsorted: 7 comparisons
    07 41 33 81 23 19 11 45   find smallest unsorted: 6 comparisons
    07 11 33 81 23 19 41 45   find smallest unsorted: 5 comparisons
    07 11 19 81 23 33 41 45   find smallest unsorted: 4 comparisons
    07 11 19 23 81 33 41 45   find smallest unsorted: 3 comparisons
    07 11 19 23 33 81 41 45   find smallest unsorted: 2 comparisons
    07 11 19 23 33 41 81 45   find smallest unsorted: 1 comparison
    07 11 19 23 33 41 45 81

In general:

    Unsorted elements             n     n-1   …   3   2   1
    Comparisons to find smallest  n-1   n-2   …   2   1   0

for a total of n(n-1)/2 comparisons.

Java

    public void selectionSort(int[] arr) {
        for (int i = 0; i < arr.length - 1; ++i) {   // outer loop: n-1 times
            int smallest = i;
            // Find the index of the smallest remaining element
            for (int j = i + 1; j < arr.length; ++j) {
                if (arr[j] < arr[smallest]) {        // inner loop body:
                    smallest = j;                    // n(n-1)/2 times
                }
            }
            // Swap the smallest with the current item
            int temp = arr[i];
            arr[i] = arr[smallest];
            arr[smallest] = temp;
        }
    }

- The outer loop is evaluated n-1 times
  ▪ 7 instructions (including the loop statements), so the cost is 7(n-1)
- The inner loop is evaluated n(n-1)/2 times
  ▪ There are 4 instructions, but one is only evaluated some of the time
  ▪ Worst case cost is 4(n(n-1)/2)
- Some constant amount (k) of work is also performed, e.g. initializing the outer loop
- Total cost: 7(n-1) + 4(n(n-1)/2) + k
  ▪ Assumption: all instructions have the same cost
- In broad terms, and ignoring the actual number of executable statements, selection sort
  ▪ Makes n(n-1)/2 comparisons, regardless of the original order of the input
  ▪ Performs n-1 swaps
- Neither of these operations is substantially affected by the organization of the input

- Insertion sort is another simple sorting algorithm
  ▪ It divides the array into sorted and unsorted parts
  ▪ The sorted part of the array is expanded one element at a time
- Find the correct place in the sorted part to place the 1st element of the unsorted part
  ▪ By searching through all of the sorted elements
- Move the elements after the insertion point up one position to make space

Tracing insertion sort on the array 23 41 33 81 07 19 11 45 (the first element is treated as the sorted part):

    23 41 33 81 07 19 11 45   locate position for 41: 1 comparison
    23 33 41 81 07 19 11 45   locate position for 33: 2 comparisons
    23 33 41 81 07 19 11 45   locate position for 81: 1 comparison
    07 23 33 41 81 19 11 45   locate position for 07: 4 comparisons
    07 19 23 33 41 81 11 45   locate position for 19: 5 comparisons
    07 11 19 23 33 41 81 45   locate position for 11: 6 comparisons
    07 11 19 23 33 41 45 81   locate position for 45: 2 comparisons

Java

    public void insertionSort(int[] arr) {
        for (int i = 1; i < arr.length; ++i) {        // outer loop: n-1 times
            int temp = arr[i];
            int pos = i;
            // Shuffle up all sorted items greater than arr[i].
            // How many times does the inner loop body run?
            // min: just the failed test once per outer iteration, about n in total
            // max: up to i shifts on iteration i, n(n-1)/2 in total
            while (pos > 0 && arr[pos - 1] > temp) {
                arr[pos] = arr[pos - 1];
                pos--;
            } //while
            // Insert the current item
            arr[pos] = temp;
        }
    }
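To see the gap between the cases concretely, the inner loop can be instrumented to count comparisons (an illustrative variant of the method above, not code from the slides):

    public static long countingInsertionSort(int[] arr) {
        long comparisons = 0;
        for (int i = 1; i < arr.length; ++i) {
            int temp = arr[i];
            int pos = i;
            while (pos > 0) {
                comparisons++;                  // one arr[pos - 1] > temp test
                if (arr[pos - 1] <= temp) break;
                arr[pos] = arr[pos - 1];
                pos--;
            }
            arr[pos] = temp;
        }
        return comparisons;
    }
    // Already sorted input of length n: n-1 comparisons (the slides round this to n).
    // Reverse-sorted input: n(n-1)/2 comparisons, the worst case.
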
    Sorted elements      0   1   2   …   n-1
    Worst-case search    0   1   2   …   n-1   total: n(n-1)/2
    Worst-case shuffle   0   1   2   …   n-1   total: n(n-1)/2

- The efficiency of insertion sort is affected by the state of the array to be sorted
- In the best case the array is already completely sorted!
  ▪ Requires n comparisons
  ▪ No movement of array elements is required

- In the worst case the array is in reverse order
  ▪ Every item has to be moved all the way to the front of the array
- The outer loop runs n-1 times
  ▪ In the first iteration, one comparison and move
  ▪ In the last iteration, n-1 comparisons and moves
  ▪ On average, n/2 comparisons and moves
- For a total of n * (n-1) / 2 comparisons and moves

- What is the average case cost?
  ▪ Is it closer to the best case? Or the worst case?
- If random data are sorted, insertion sort is usually closer to the worst case
  ▪ Around n * (n-1) / 4 comparisons
- What is average input for a sorting algorithm in any case?

- Quicksort is a more efficient sorting algorithm than either selection or insertion sort
  ▪ It sorts an array by repeatedly partitioning it
- We will go over the basic idea of quicksort and an example of it
  ▪ See the text for details

- Partitioning is the process of dividing an array into sections (partitions), based on some criteria
  ▪ "Big" and "small" values
  ▪ Negative and positive numbers
  ▪ Names that begin with a-m, names that begin with n-z
  ▪ Darker and lighter pixels
- Quicksort uses repeated partitioning to sort an array

As an example, partition this array into small and big values using a partitioning algorithm:

    31 12 07 23 93 02 11 18

- We will partition the array around the last value (18); we'll call this value the pivot
  ▪ Values smaller than 18 ("smalls") should end up on the left, values greater than 18 ("bigs") on the right
- Use two indices, one at each end of the array; call them low and high
  ▪ arr[low] (31) is greater than the pivot and should be on the right; we need to swap it with something
  ▪ arr[high] (11) is less than the pivot, so swap it with arr[low]:

    11 12 07 23 93 02 31 18

- Repeat this process, moving low and high towards each other (next, 23 and 02 are swapped):

    11 12 07 02 93 23 31 18

- Repeat until high and low are the same
- We'd like the pivot value to be in the centre of the array, so finally swap it with the first item greater than it (93):

    11 12 07 02 18 23 31 93
    (smalls)     pivot (bigs)

Use the same algorithm to partition this array into small and big values:

    00 08 07 01 06 02 05 09
    00 08 07 01 06 02 05 09   all smalls and the pivot (09); no bigs!

Or this one:

    09 08 07 06 05 04 02 01
    01 08 07 06 05 04 02 09   the pivot (01) ends at the front; no smalls, everything else is big

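The slides present this partitioning scheme pictorially; in Java it might look something like the following sketch (pivot taken as the last element, two converging indices; this is one plausible rendering, not the course's official code):

    // Partition arr[first..last] around the pivot arr[last].
    // Returns the pivot's final index.
    public static int partition(int[] arr, int first, int last) {
        int pivot = arr[last];
        int low = first;
        int high = last - 1;
        while (low <= high) {
            if (arr[low] <= pivot) {
                low++;                       // already a "small": leave it
            } else if (arr[high] > pivot) {
                high--;                      // already a "big": leave it
            } else {
                int temp = arr[low];         // a big on the left and a small
                arr[low] = arr[high];        // on the right: swap them
                arr[high] = temp;
                low++;
                high--;
            }
        }
        int temp = arr[low];                 // put the pivot between the
        arr[low] = arr[last];                // smalls and the bigs
        arr[last] = temp;
        return low;
    }

On the example array {31, 12, 07, 23, 93, 02, 11, 18} this produces 11 12 07 02 18 23 31 93, matching the walkthrough above.
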
- The quicksort algorithm works by repeatedly partitioning an array
- Each time a subarray is partitioned there is
  ▪ A sequence of small values,
  ▪ A sequence of big values, and
  ▪ A pivot value which is in the correct position
- Partition the small values, and the big values
  ▪ Repeat the process until each subarray being partitioned consists of just one element
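Combining the partition step with the recursive structure just described gives a compact sketch (again illustrative; partition is the hypothetical method sketched earlier):

    public static void quickSort(int[] arr, int first, int last) {
        if (first >= last) {
            return;                              // subarray of size 0 or 1: done
        }
        int pivotIndex = partition(arr, first, last);
        quickSort(arr, first, pivotIndex - 1);   // sort the smalls
        quickSort(arr, pivotIndex + 1, last);    // sort the bigs
    }
    // Usage: quickSort(arr, 0, arr.length - 1);
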
- How long does quicksort take to run?
  ▪ Let's consider the best and the worst case
  ▪ These differ because the partitioning algorithm may not always do a good job
- Let's look at the best case first
  ▪ Each time a subarray is partitioned, the pivot is the exact midpoint of the slice (or as close as it can get), so it is divided in half
  ▪ What is the running time?

For example, starting from 08 01 02 07 03 06 04 05:

    04 01 02 03 05 06 08 07   first partition: smalls, pivot (05), bigs
    02 01 03 04 05 06 07 08   second partition: both halves partitioned around their own pivots
    01 02 03 04 05 06 07 08   third partition: remaining subarrays done

- Each subarray is divided exactly in half in each set of partitions
  ▪ Each time a series of subarrays is partitioned, around n comparisons are made
  ▪ The process ends once all the subarrays left to be partitioned are of size 1
- How many times does n have to be divided in half before the result is 1?
  ▪ log2(n) times
- Quicksort performs around n * log2(n) operations in the best case

partition
 01 08 07 06 05 04 02 09 smalls pivot bigs John Edgar 74 01 08 07 06 05 04 02 09 Second
partition
 01 08 07 06 05 04 02 09 smalls bigs pivot John Edgar 75 01 08 07 06 05 04 02 09 Third
partition
 01 02 07 06 05 04 08 09 bigs pivot John Edgar 76 01 02 07 06 05 04 08 09 Fourth 
partition
 01 02 07 06 05 04 08 09 smalls pivot John Edgar 77 01 02 07 06 05 04 08 09 Fifth
partition
 01 02 04 06 05 07 08 09 pivot bigs John Edgar 78 01 02 04 06 05 07 08 09 Sixth
partition
 01 02 04 06 05 07 08 09 smalls pivot John Edgar 79 01 02 04 06 05 07 08 09 Seventh(!) 
partition
 01 02 04 05 06 07 08 09 pivot John Edgar 80 !  Every
partition
step
results
in
just
one
 partition
on
one
side
of
the
pivot
 !  The
array
has
to
be
partitioned
n
times,
not
log2 (n)
times
 !  So
in
the
worst
case
quicksort
performs
around
n2
 operations
 !  The
worst
case
usually
occurs
when
the
array
 is
nearly
sorted
(in
either
direction)
 John Edgar 81 !  With
a
large
array
we
would
have
to
be
very,
 very
unlucky
to
get
the
worst
case
 already
be
partially
sorted
 !  Unless
there
was
some
reason
for
the
array
to
 ▪  In
which
case
first
randomize
the
position
of
the
array
 elements!
 !  The
average
case
is
much
more
like
the
best
case
 than
the
worst
case
 John Edgar 82 !  !  !  !  !  Linear
search:
3(n
+
1)/4
–
average
case
 Binary
search:
log2n
–
worst
case
 Selection
sort:
n((n
–
1)
/
2)
–
all
cases
 Insertion
sort:
n((n
–
1)
/
2)
–
worst
case
 Quicksort:
n(log2(n))
–
best
case
 !  Average
case
is
similar
to
the
worst
case
 !  Average
case
is
similar
to
the
best
case
 !  Average
case
similar
to
the
worst
case
 !  Given
certain
assumptions
 John Edgar 84 !  Let's
compare
these
algorithms
for
some
 arbitrary
input
size
(say
n
=
1,000)
 !  In
order
of
the
number
of
comparisons
 ▪  Binary
search
 ▪  Linear
search
 ▪  Insertion
sort
best
case
 ▪  Quicksort
average
and
best
cases
 ▪  Selection
sort
all
cases,
Insertion
sort
average
and
worst
 cases,
Quicksort
worst
case
 John Edgar 85 !  What
do
we
want
to
know
when
comparing
 two
algorithms?
 !  The
most
important
thing
is
how
quickly
the
time
 requirements
increase
with
input
size
 !  e.g.
If
we
double
the
input
size
how
much
longer
 does
an
algorithm
take?
 !  Here
are
some
graphs
…
 John Edgar 86 Hard
to
see
what
is
happening
with
n
so
small
…
 450
 400
 350
 300
 250
 200
 150
 100
 50
 0
 10
 11
 12
 13
 14
 15
 n
 16
 17
 18
 19
 20
 log2n
 5(log2n)
 3(n+1)/4
 n
 n(log2n)
 n((n‐1)/2)
 n2
 Number
of
Operations
 John Edgar 87 n2
and
n(n‐1)/2
are
growing
much
faster
than
any
of
the
others
 12000
 10000
 Number
of
Operations
 8000
 log2n
 5(log2n)
 6000
 3(n+1)/4
 n
 4000
 n(log2n)
 n((n‐1)/2)
 n2
 2000
 0
 10
 20
 30
 40
 50
 n
 60
 70
 80
 90
 100
 John Edgar 88 Hmm!

Let's
try
a
logarithmic
scale
…
 1200000000000
 1000000000000
 Number
of
Operations
 800000000000
 log2n
 5(log2n)
 600000000000
 3(n+1)/4
 n
 400000000000
 n(log2n)
 n((n‐1)/2)
 n2
 200000000000
 0
 10
 50
 100
 500
 1000
 5000
 n
 10000
 50000
 100000
 500000
 1000000
 John Edgar 89 Notice
how
clusters
of
growth
rates
start
to
emerge
 1000000000000
 100000000000
 10000000000
 1000000000
 Number
of
Operations
 100000000
 10000000
 1000000
 100000
 10000
 1000
 100
 10
 1
 10
 50
 100
 500
 1000
 5000
 n
 10000
 50000
 100000
 500000
 1000000
 log2n
 5(log2n)
 3(n+1)/4
 n
 n(log2n)
 n((n‐1)/2)
 n2
 John Edgar 90 !  Exact
counting
of
operations
is
often
difficult
(and
 tedious),
even
for
simple
algorithms
 !  And
is
often
not
much
more
useful
than
estimates
due
to
 !  O
Notation
is
a
mathematical
language
for
 evaluating
the
running‐time

of
algorithms
 !  O‐notation
evaluates
the
growth
rate
of
an
algorithm
 the
relative
importance
of
other
factors
 John Edgar 91 !  !  Cost
Function:

tA(n)
=
n2
+
20n
+
100
 It
depends
on
the
size
of
n
 !  n
=
2,
tA(n)
=
4
+
40
+
100
 ▪  The
constant,
100,
is
the
dominating
term
 !  n
=
10,
tA(n)
=
100
+
200
+
100
 ▪  20n
is
the
dominating
term
 !  n
=
100,
tA(n)
=
10,000
+
2,000
+
100
 ▪  n2
is
the
dominating
term
 !  n
=
1000,
tA(n)
=
1,000,000
+
20,000
+
100
 ▪  n2
is
the
dominating
term
 John Edgar !  Which
term
in
the
funtion
is
most
important
(dominates)?
 92 !  O
notation
approximates
a
cost
function
that
allows
 us
to
estimate
growth
rate
 !  The
approximation
is
usually
good
enough
 !  Count
the
number
of
times
that
an
algorithm
 executes
its
barometer
instruction
 !  And
determine
how
the
count
increases
as
the
input
size
 ▪  Especially
when
considering
the
efficiency
of
an
 algorithm
as
n
gets
very
large
 increases
 John Edgar 93 !  An
algorithm
is
said
to
be
order
f(n)
 !  Denoted
as
O(f(n))
 !  The
function
f(n)
is
the
algorithm's
growth
 rate
function
 !  If
a
problem
of
size
n
requires
time
proportional
to
 n
then
the
problem
is
O(n)
 ▪  i.e.
If
the
input
size
is
doubled
then
the
running
time
is
 doubled
 John Edgar 94 !  An
algorithm
is
order
f(n)
if
there
are
positive
 constants
k
and
m
such
that

 !  tA(n)
≤
k*f(n)
for
all
n
≥
m
 !  If
so
we
would
say
that
tA(n)
is
O(f(n))
 !  The
requirement
n
>
m
expresses
that
the
time
 estimate
is
correct
if
n
is
sufficiently
large

 John Edgar 95 !  The
idea
is
that
a
cost
function
can
be
approximated
 by
another,
simpler,
function

 !  The
simpler
function
has
1variable,
the
data
size
n
 !  This
function
is
selected
such
that
it
represents
an
upper
 !  Saying
that
the
time
efficiency
of
algorithm
A
tA(n)
 is
O(f(n))
means
that
 !  A
cannot
take
more
than
O(f(n))
time
to
execute,
and
 !  The
cost
function
tA(n)
grows
at
most
as
fast
as
f(n)
 bound
on
the
value
of
tA(n)
 John Edgar 96 !  Consider
an
algorithm
with
a
cost
function
of
 3n
+
12
 !  Find
values
of
k
and
m
so
that
this
is
true
 !  k
=
4,
and
 !  m
=
12
then
 !  4n
≥
3n
+
12
for
all
n
≥
12
 John Edgar 97 !  If
we
can
find
constants
m
and
k
such
that:
 !  k
*
n
>
3n
+
12
for
all
n
≥
m
then
 !  The
algorithm
is
O(n)
 !  Consider
an
algorithm
with
a
cost
function
of
 2n2
+
10n
+
6
 !  Find
values
of
k
and
m
so
that
this
is
true
 !  k
=
3,
and
 !  m
=
11
then
 !  3n2
>
2n2
+
10n
+
6
for
all
n
≥
11
 John Edgar 98 !  If
we
can
find
constants
m
and
k
such
that:
 !  k
*
n2
>
2n2
+
10n
+
6
for
all
n
≥
m
then
 !  The
algorithm
is
O(n2)
 1400
 1200
 1000
 800
 2n2+10n+6
 600
 3n2
 400
 200
 0
 5
 6
 7
 8
 9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 John Edgar 99 !  !  !  When
using
Big‐O
notation
 Instead
of
giving
a
precise
formulation
of
the
cost
 function
for
a
particular
data
size
 Express
the
behaviour
of
the
algorithm
as
the
data
 size
n
grows
very
large
so
ignore
 !  lower
order
terms
and
 !  constants
 John Edgar 100 !  !  !  All
these
expressions
are
O(n):
 All
these
expressions
are
O(n2):
 All
these
expressions
are
O(n
log
n):
 !  n(log
n),
5n(log
99n),
18
+
(4n
–
2)(log
(5n
+
3)),
…
 !  n2,
9n2,
18n2
+
4n
–
53,
…
 !  n,
3n,
61n
+
5,
22n
–
5,
…
 John Edgar 101 !  !  !  O(k
*
f)
=
O(f)
if
k
is
a
constant
 O(f
+
g)
=
max[O(f),
O(g)]
 O(f
*
g)
=
O(f)
*
O(g)
 !  e.g.
O(23
*
O(log
n)),
simplifies
to
O(log
n)
 !  O(n
+
n2),
simplifies
to
O(n2)
 !  O(m
*
n),
equals

O(m)
*
O(n)
 !  Unless
there
is
some
known
relationship
between
m
and
n
 that
allows
us
to
simplify
it,
e.g.
m
<
n
 John Edgar 102 !  !  !  !  !  !  !  O(1)
–
constant
time
 O(log
n)
–
logarithmic
time
 !  The
time
is
independent
of
n,
e.g.
list
look‐up
 !  Usually
the
log
is
to
the
base
2,
e.g.
binary
search
 O(n)
–
linear
time,
e.g.
linear
search
 O(n*logn)
–
e.g.
quicksort,
mergesort
 O(n2)
–
quadratic
time,
e.g.
selection
sort
 O(nk)
–
polynomial
(where
k
is
some
constant)
 O(2n)
–
exponential
time,
very
slow!
 John Edgar 103 !  We
write
O(1)
to
indicate
something
that
takes
a
 constant
amount
of
time
 !  e.g.
finding
the
minimum
element
of
an
ordered
array
 takes
O(1)
time
 !  Important:
constants
can
be
huge
 ▪  The
min
is
either
at
the
first
or
the
last
element
of
the
array
 !  So
in
practice
O(1)
is
not
necessarily
efficient
 !  It
tells
us
is
that
the
algorithm
will
run
at
the
same
speed
 no
matter
the
size
of
the
input
we
give
it
 John Edgar 104 !  !  The
O‐notation
growth
rate
of
some
algorithms
 varies
depending
on
the
input
 Typically
we
consider
three
cases:
 !  Worst
case,
usually
(relatively)
easy
to
calculate
and
 therefore
commonly
used
 !  Average
case,
often
difficult
to
calculate
 !  Best
case,
usually
easy
to
calculate
but
less
important
 than
the
other
cases
 John Edgar 105 !  Linear
search
 !  Best
case:
O(1)
 !  Average
case:
O(n)
 !  Worst
case:
O(n)
 !  Binary
search
 !  Best
case:
O(1)
 !  Average
case:
O(log
n)
 !  Worst
case:
O(log
n)
 John Edgar 106 !  Quicksort
 !  Best
case:
O(n(log2n))
 !  Average
case:
O(n(log2n))
 !  Worst
case:
O(n2)
 !  Best
case:
O(n(log2n))
 !  Average
case:
O(n(log2n))
 !  Worst
case:
O(n(log2n))
 John Edgar 107 !  Mergesort
 !  Selection
sort
 !  Best
Case:
O(n2)
 !  Average
case:
O(n2)
 !  Worst
case:
O(n2)
 !  Best
case:
O(n)
 !  Average
case:
O(n2)
 !  Worst
case:
O(n2)
 John Edgar 108 !  Insertion
sort
 January 2010 Greg Mori 109 !  Analyzing
algorithm
running
time
 !  Record
actual
running
time
(e.g.
in
seconds)
 ▪  Sensitive
to
many
system
/
environment
conditions
 !  Count
instructions
 !  Summarize
coarse
behaviour
of
instruction
count
 ▪  O
Notation
 !  Note
that
all
are
parameterized
by
problem
size
(“n”)
 !  Analyze
best,
worst,
“average”
case
 John Edgar 110 !  Sorting
algorithms
 !  Insertion
sort
 !  Selection
sort
 !  Quicksort
 !  Running
times
of
sorting
algorithms
 John Edgar 111 !  Java
Ch.
10
 !  C++
Ch.
9
 John Edgar 112 ...
View Full Document

This note was uploaded on 04/17/2010 for the course CMPT 11151 taught by Professor Gregorymori during the Spring '10 term at Simon Fraser.

Ask a homework question - tutors are online