BIOC*2580 Lecture 7. Polypeptides and proteins: structural hierarchy and sequence
 
Synopsis: Protein structure is far more complex than that of any simple organic chemical, but by eliminating detail, a pattern or hierarchy of organization emerges:
 
Primary structure, the specific sequence of amino acids in the polypeptide chain;
Secondary structure, the occurrence of regular repetitive patterns over short regions of the polypeptide;
Tertiary structure, the overall folding of the complete polypeptide chain;
Quaternary structure, the linking of several protein molecules to form a larger complex with distinct properties.
 
These four levels of protein structure depend in various ways on the amino acid sequence, so the chemistry for determining amino acid sequence forms our starting point for an exploration of protein structure.
 
Lehninger p. 82-84, 92-100 (4th ed. p. 88-89, 96-101)
___________________________________________________________________________
 
A protein consists of a long linear chain of amino acids. Myoglobin, an oxygen-binding protein found in muscle tissue, has 153 amino acids in its polypeptide chain (see sequence at left). This is relatively small; some proteins contain hundreds or thousands of amino acids.

Describing the protein as a set of amino acids is one way to simplify the structure; however, it is not a complete or accurate view of the structure.

Protein structure is clearly very different from that of simple organic compounds.
 
 
 
 
We can't be unduly concerned with individual C-C or C-H bonds; attempting to view the myoglobin structure at the level of detail shown on the left makes it impossible to grasp the overall organization.

Instead, the structure of a protein is viewed through a series of simplifications. Computer software can be used to suppress detail and make visual interpretation easier. The structure on the right is the same myoglobin molecule seen from exactly the same viewpoint as on the left. It traces the path of the polypeptide backbone as a ribbon and eliminates the side chains. Colour can be used to distinguish the sequence: the N-terminus is shown in blue, and the colour progresses through the spectrum until we reach red at the C-terminal end. (See also the simplified structure in Lehninger Fig. 4-15 (a) compared with (d) (4th ed. Fig. 6-16 (a) and (e)).)

Regular structural organization now becomes clearer. Protein structure can be subdivided into a hierarchy of three or four levels:

 
 
 
 
Primary structure is the sequence of amino acids in the polypeptide chain. By convention, the N-terminal amino acid is considered the start, and amino acids are numbered counting from the N-terminal end.

The way that proteins function and act depends on spatial organization and on the side-by-side placement of particular amino acids, which might be far apart on the linear polypeptide. This means that each protein consists of a polypeptide that folds up in a highly specific manner, and the pattern of folding is as important to the structure as are the covalent bonds.


 
Secondary structure: the polypeptide backbone is represented as a single ribbon, N-terminus at lower left. The ribbon forms regular helical or spiral patterns in some parts, and irregular loops elsewhere. Regular repetitive folding patterns over short sections of the peptide chain (5-20 amino acids long), such as the helix sections appearing in myoglobin, are called secondary structure. (More details on secondary structure in Lecture 9.)

Tertiary structure is the overall folding of the whole polypeptide. For myoglobin, 8 helical secondary structure segments fold together to enclose a central cavity. (More details on tertiary structure in Lecture 10.)
 
Quaternary structure is the joining of several molecular units into a larger structure that has special properties. Hemoglobin, the O2-binding protein of blood, consists of four independent molecules of globin, each similar in size and structure to myoglobin. The globin units are linked by non-covalent bonds, but behave in a cooperative manner to make the O2-carrying function of hemoglobin more effective. (More details in BIOC*3560.) However, not all proteins have quaternary structure.
 
 
Investigation of structure

All higher-order structure (secondary, tertiary, etc.) derives from the primary structure, namely the amino acid sequence within the polypeptide. To find out how a polypeptide chain is made up, we need to find out what amino acids are contained in it, and in what order or sequence they occur.

To do this it is necessary to break the peptide bonds so that the amino acids can be identified.
 
Practical aspects of peptide hydrolysis

H2O itself hydrolyses peptide bonds extremely slowly, because neutral O: is a poor nucleophile. Although it has two lone pairs, electronegative O: is less willing to share them than N: or S:.
 
 
 
 
Hydrolysis of peptides and proteins is usually done with a catalyst:

Acid hydrolysis is done in 6 M HCl at 110 °C; it takes 24-72 hours to get complete breakdown of a peptide chain into single amino acids.

Base hydrolysis is done in 4 M NaOH at 110 °C and takes 16 hours for complete hydrolysis, but some amino acids are destroyed in strong base.

Hydrolysis may also be carried out by digestive enzymes called proteases. Enzymes are proteins that have a catalytic function, in this case to hydrolyse peptide bonds.

After hydrolysis, the amino acids present in a sample can be detected by chromatography, e.g. ion exchange or reversed phase.
 
 
Underlying basis of chemical reactions

The chemical basis of a peptide bond breaking reaction such as hydrolysis, outlined in Lecture 1, is a chemical process called nucleophilic displacement. Since many reactions that take place under biochemical conditions are initiated by nucleophilic attack, we need to understand what a nucleophile is and why it can lead to peptide bond breakage.

Chemical reactivity is a consequence of imbalances in the distribution of valence electrons of atoms in molecules. Parts of molecules that are primarily C-C and C-H bonded are well balanced, non-polar and chemically inert. However, where atoms seem to have valence electrons to spare, or are electron deficient, or draw electrons towards them, these create imbalances where a reaction may occur as the atoms seek a better arrangement.

A nucleophile is simply an atom with a lone pair of electrons which is available to share with another nucleus. By sharing the electron pair with another nucleus, a new bond is formed.
 
Atoms with lone pairs: (figure not reproduced)
 
 
 
 
A nucleophilic displacement is a reaction in which an incoming nucleophile X: attacks a target atom C to displace another attached group. The group Y that detaches is called the leaving group:

X:  +  C––Y   →   X––C  +  :Y
 






 
 
 
An atom with a lone pair can use it in different ways:

1) The atom acts as a hydrogen bond acceptor if it simply attracts an -OH or -NH dipole,
e.g. R-O: - - - H–N

2) The atom acts as a base if it uses the lone pair to capture H+,
e.g. H+ + :NH2-R → +NH3-R

3) The atom acts as a nucleophile when it shares the lone pair, i.e. bonds to another nucleus,
e.g. (reaction scheme not reproduced)


 
 
 
In example 3, O is acting as a nucleophile because it has shared one of its lone pairs to bond to C. The curly arrow notation is commonly used to indicate movement of a pair of electrons, in this case from a non-bonded or lone pair position to form a new bond.

Hydrolysis is then an attack by H2O, using O as a nucleophile, onto a susceptible bond such as a peptide bond.

The bond is susceptible because the C atom of the C=O is electron deficient, as electrons are drawn towards the electronegative O atom of C=O. Because the C is electron deficient, it can accommodate the incoming electron pair (the maximum number of valence electrons on C, N or O is 8).

This sequence then produces a transition state, a semi-stable halfway stage of the reaction. The transition state gives rise to stable end products by breaking the C–N bond. This happens because the N atom can serve as a good leaving group, since it can hold the electrons from the breaking bond.

See Lehninger mechanism Fig. 6-25, p. 215 (4th ed. p. 216).
 
 
 
Determining amino acid sequence

Fred Sanger at Cambridge University was the first person to devise a method to determine sequence. Over the period 1947-1953, he worked out methods to find the amino acid sequence of the protein hormone insulin, and eventually won the Nobel Prize for this work.
 
Sanger introduced two important techniques:

N-terminal tagging identifies the first amino acid position in the chain.
Limited hydrolysis breaks the chain into smaller, more manageable pieces.

 
 
N-terminal tagging works because the N-terminal amino group becomes a nucleophile under mildly basic conditions. The protonated form +NH3-, the normal state at pH 7, is not a nucleophile, but on increasing to pH 9 (pKa = 8 for the N-terminal amino group of a peptide chain), it becomes deprotonated to :NH2-. To avoid unwanted reaction at lysine, pKa 10.2, the pH should not be raised any further.
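
The choice of pH 9 can be checked with the Henderson-Hasselbalch relation. The following is a minimal sketch, assuming ideal behaviour and the pKa values quoted in the text (8 for the N-terminal amino group, 10.2 for lysine); the function name `fraction_deprotonated` is introduced here for illustration only.

```python
def fraction_deprotonated(pH: float, pKa: float) -> float:
    """Fraction of an amino group present as the neutral :NH2 form.

    Henderson-Hasselbalch: [base]/[acid] = 10**(pH - pKa), so the
    deprotonated fraction is 1 / (1 + 10**(pKa - pH)).
    """
    return 1.0 / (1.0 + 10.0 ** (pKa - pH))


# pKa values are the ones quoted in the lecture text (assumed, not measured here).
groups = [("N-terminal amino group", 8.0), ("lysine side chain", 10.2)]

for label, pKa in groups:
    for pH in (7.0, 9.0):
        fraction = fraction_deprotonated(pH, pKa)
        print(f"{label}, pH {pH}: {fraction:.1%} deprotonated")
```

Under these assumptions, roughly 9% of N-terminal amino groups are deprotonated at pH 7 and roughly 91% at pH 9, while only about 6% of lysine side chains are deprotonated at pH 9.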
 
The nucleophilic N-terminal :NH2- will then react by displacing HF from the reagent fluorodinitrobenzene. The bright yellow dinitrophenyl group becomes bonded to the N-terminal amino acid, tagging it for easy identification by chromatography.

 
Unfortunately, Sanger’s method requires complete hydrolysis of the peptide chain to recover the tagged amino acid, and this destroys the rest of the peptide chain so that amino acids #2, #3 etc. are not easily identified. Sanger proceeded by using limited hydrolysis: hydrolysis at lower temperature or for a shorter time, so that not all peptide bonds are broken. This creates a random mixture of dipeptides and tripeptides (short chains of 2-3 amino acids). By analyzing all the fragments, he was able to reconstruct the whole sequence, but it took 7 years to put together all the pieces of the puzzle. Luckily for Sanger, insulin is a very small protein with two chains of 21 and 30 amino acids respectively.
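
The idea of rebuilding a sequence from overlapping fragments can be illustrated with a toy example. This is only a sketch under strong assumptions: the five-residue peptide and its fragments are hypothetical, no residue occurs twice, and the N-terminal residue is taken as already known from tagging. The function name `assemble` is introduced here; it is a simple greedy procedure, not a description of how Sanger actually worked.

```python
def assemble(n_terminal, fragments):
    """Greedily extend a sequence from the N-terminus using overlapping fragments.

    Each fragment is used at most once. A fragment extends the sequence when
    its first k residues match the last k residues placed so far (k >= 1).
    """
    sequence = [n_terminal]
    unused = [list(frag) for frag in fragments]
    extended = True
    while extended:
        extended = False
        for frag in unused:
            # Try the longest possible overlap first.
            for k in range(len(frag) - 1, 0, -1):
                if sequence[-k:] == frag[:k]:
                    sequence = sequence + frag[k:]
                    unused.remove(frag)
                    extended = True
                    break
            if extended:
                break  # restart the scan with the longer sequence
    return sequence


# Hypothetical di- and tripeptide fragments from limited hydrolysis.
fragments = [
    ["Ile", "Val", "Glu"],
    ["Glu", "Gln"],
    ["Gly", "Ile"],
    ["Val", "Glu"],
]

print("-".join(assemble("Gly", fragments)))
```

With repeated residues (insulin contains many), a single overlapping residue no longer identifies a unique position, which is one reason the real reconstruction was so laborious.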
 
 
 
 
In Sweden in 1956, Per Edman solved the problem of having to hydrolyze the complete peptide to recover the tagged amino acid. He used the reagent phenylisothiocyanate to label the N-terminal end of the polypeptide sample. Lehninger Fig. 3-25, p. 94 (4th ed. p. 98).
 
Phenylisothiocyanate reacts with a deprotonated N-terminal amino group. Deprotonation exposes the lone pair of N, allowing it to react as a nucleophile, which can then attack an electron-deficient nucleus, the C atom of isothiocyanate. This requires mildly basic conditions, pH 9, which is achieved by carrying out the reaction in a weak base such as pyridine. The coupled product is called a phenylthiocarbamoyl peptide.
 
The phenylthiocarbamoyl peptide is transferred to weak anhydrous acid, e.g. CF3CO2H, which causes the C=S to attack the nearest peptide bond, i.e. the one linking the N-terminal amino acid to the rest of the chain. The result is a cyclization reaction that splits off the first amino acid, leaving the rest of the chain intact. Because the process is carried out in the absence of H2O, there is no hydrolysis (literally, attack by water).

The cyclized form of the first amino acid rearranges to the final product, an amino acid phenylthiohydantoin or PTH amino acid.
 
 
 
Each amino acid phenylthiohydantoin is then identified by chromatography or mass spectrometry. Because the rest of the chain is left intact, the cycle of reactions can be repeated many times, each cycle removing the currently exposed N-terminal amino acid, allowing each to be identified in turn:

PTH-Gly + Ile-Val-Glu-Gln-Cys-Cys-Ala-Ser-Val
PTH-Ile + Val-Glu-Gln-Cys-Cys-Ala-Ser-Val
PTH-Val + Glu-Gln-Cys-Cys-Ala-Ser-Val
etc.
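
The repeated cycle can be mimicked in a few lines of code. This is a bookkeeping sketch only, assuming every cycle goes to completion; the starting peptide is inferred from the example above, and `edman_cycles` is a name introduced here for illustration.

```python
def edman_cycles(peptide, n_cycles):
    """Yield (PTH derivative, remaining chain) for each simulated Edman cycle.

    `peptide` is a list of three-letter residue codes, N-terminus first.
    Each cycle removes only the currently exposed N-terminal residue.
    """
    remaining = list(peptide)
    for _ in range(min(n_cycles, len(remaining))):
        released, remaining = remaining[0], remaining[1:]
        yield f"PTH-{released}", "-".join(remaining)


# Starting peptide inferred from the lecture example.
peptide = "Gly-Ile-Val-Glu-Gln-Cys-Cys-Ala-Ser-Val".split("-")

for pth, rest in edman_cycles(peptide, 3):
    print(f"{pth} + {rest}")
```

Reading the released PTH derivatives in order gives the sequence from the N-terminus: Gly, Ile, Val, and so on.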
 
An important factor for success is that the two steps require contrasting conditions:

1. coupling with phenylisothiocyanate occurs in weak base
2. cyclization to phenylthiohydantoin occurs in anhydrous acid
 
Because there are two distinct phases to the reaction, the reaction cycle remains strictly in phase. The coupling reaction at step 1 can be allowed to go to completion without any risk that some molecules of glycine may cyclize early and expose Ile prematurely, because cyclization requires acid. Similarly at step 2, the cyclization of Gly can proceed to completion without risk of Ile coupling early, since the conditions are acidic, not basic.
 
Another advantage is that the cycle of reactions is very easy to automate, and the whole process can be carried out by machine, producing one PTH amino acid every hour.
 
Although the Edman reaction can be repeated many times, and yields are high, there are practical limits. It’s usual to read off sequences of 20-30 amino acids in one experiment. Sequences much over 50 or 60 amino acids are very hard to handle in a single run. Even if a reaction has 98% or 99% yield, there's a limit to the number of times you can repeat it.
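
The limit follows from compounding: if each cycle succeeds on a fraction y of the chains, only y^n of them are still correctly processed after n cycles. The sketch below is a simplification, assuming the same yield in every cycle and ignoring any other source of signal loss; `cumulative_yield` is a name introduced here.

```python
def cumulative_yield(step_yield: float, cycles: int) -> float:
    """Fraction of chains still correctly processed after `cycles` rounds."""
    return step_yield ** cycles


for step_yield in (0.98, 0.99):
    for cycles in (20, 30, 50, 60):
        fraction = cumulative_yield(step_yield, cycles)
        print(f"step yield {step_yield:.0%}, {cycles} cycles: {fraction:.1%} remaining")
```

With these assumptions, a 98% step yield leaves roughly 55% of the chains in phase after 30 cycles and about 30% after 60 cycles, which is consistent with the practical read lengths quoted above.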
 
To overcome this limitation, proteins are hydrolyzed into peptides that can then be sequenced. We will carry on with this discussion in the next lecture.
 
 