Chapter 7: Chromosome Structure
Genome size in different organisms ranges from a few thousand nucleotides in the smallest viruses to
many billions of nucleotides in higher plants and animals (Table 7.1). The units in which DNA size is
expressed are bp (base pairs), kb (kilobase pairs, thousands of bp), Mb (megabase pairs, millions of bp) and
Gb (gigabase pairs, billions of bp). The human genome is ~3 Gb in size. The largest known genome
is ~670 Gb in an amoeba. A European lily and the marbled lungfish both have genomes of ~130 Gb. The
largest genome in Table 7.1 is that of a salamander (90 Gb). It is worth noting that this extremely large
genome size (about 30-fold larger than the human genome) is not due to a great increase in chromosome
number. In fact, the haploid chromosome number of this salamander is only 14 (in humans, it is 23).
The amount of DNA per haploid genome is called the "C-value". A comparison of genome sizes across
the evolutionary ladder shows a remarkable anomaly. In some groups of closely related organisms
(insects, amphibians), genome sizes vary more than 100-fold, and in some others (Sarcodina), a
range of >10,000-fold is seen (Fig. 7.1). This is called the C-value paradox. Even though large
variations in the amount of DNA per genome are seen (the wheat genome is ~40-fold larger than the rice
genome), the number of genes does not vary that much between related organisms. Most of the increase
in genome size comes either from polyploidy (duplication of the entire genome) or from an increase in
non-protein-coding DNA.
The Packaging Problem. From the smallest viruses to the most complex eukaryotes, the DNA in the genome is
much longer than the space in which it must fit. Typically, at least 1000-fold compaction is needed. This
is achieved by 3 main methods.
(a) Looping of DNA by attachment to a central protein scaffold (Figs. 7.4, 7.5).
(b) Supercoiling of loops using topoisomerase enzymes (Fig. 7.2).
(c) Tying loops of DNA together using ring-shaped protein complexes (cohesins and condensins).
The energy of ATP hydrolysis is used by topoisomerase enzymes to introduce supercoils in DNA.
Topoisomerase II is like a combination of an endonuclease and a DNA ligase (Fig. 7.3). It cuts both
strands of a DNA duplex, passes another duplex through the gap while still holding the 2 cut ends, and
then seals the cut, restoring the original duplex.
Bacterial Chromosome. Upon breaking open a bacterial cell, the bacterial DNA unfolds
into a sticky mess. This is because tight compaction of DNA requires the neutralization of the large number
of negative charges on the surface of DNA molecules, which is done by complexing with positively
charged proteins. Upon cell lysis, these basic proteins become diluted and the bacterial DNA (called a
nucleoid, since it is not a true stable chromosome like that found in a eukaryotic nucleus) becomes
unfolded. Adding positively charged molecules during cell lysis allowed the isolation of intact
bacterial nucleoids (Fig. 7.4). Analysis of the factors keeping the nucleoid folded revealed that there were about
100 loops (~40,000 bp per loop in E. coli) attached to a central protein core. The DNA in each loop was
further compacted by supercoiling (Fig. 7.5) and by tying of loops together with ring-shaped protein
complexes (not shown in this diagram).
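As a rough check on the loop figures above, the following Python sketch multiplies out the numbers quoted for E. coli and converts the total to an extended length. The rise of 3.4 angstroms per base pair is the value used later in these notes; the function name is made up for illustration. The result, a bit over a millimetre of DNA, has to fit in a cell that is only a few micrometres long.

```python
# Figures quoted in the notes for the E. coli nucleoid.
LOOPS = 100                # loops attached to the central protein core
BP_PER_LOOP = 40_000       # ~40,000 bp per loop
ANGSTROM_PER_BP = 3.4      # rise per base pair (value used later in the notes)

def contour_length_um(bp):
    """Length of fully extended DNA in micrometres (1 um = 10^4 angstroms)."""
    return bp * ANGSTROM_PER_BP / 1e4

genome_bp = LOOPS * BP_PER_LOOP
length_um = contour_length_um(genome_bp)

print(f"DNA held in loops: {genome_bp:,} bp")
print(f"extended length:   {length_um:,.0f} micrometres")
```

The loop total (4 million bp) is a round-number estimate that follows directly from the two figures in the text.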
Eukaryotic Chromosome Structure. DNA in higher cells exists as a stable nucleoprotein filament
called chromatin. Upon gentle lysis of nuclei, all of the DNA is seen as a beaded string structure (Fig.
7.6). The beads are called nucleosomes. Each bead contains 8 molecules of highly basic (arginine- and
lysine-rich) proteins called histones: the nucleosome core is made of 2 copies each of histones
H2A, H2B, H3 and H4. Wrapped around the outside of the histone octamer are about two turns of DNA (about 145 bp).
Another ~55 bp of DNA is found in a linker connecting 2 nucleosomes (Figs. 7.7, 7.8, 7.9). To the
linker attaches 1 copy of histone H1. Each nucleosome is shaped like a flat disk, 110 Å wide and 55 Å
high. Wrapping of DNA achieves ~6-fold compaction (200 bp x 3.4 Å = 680 Å of DNA length folded into
a 110 Å nucleosome). Further compaction is achieved by twisting the nucleosome fiber into a tight spiral
that has 6 nucleosomes per turn, resulting in a 300 Å wide fiber (Fig. 7.9). This 300 Å wide fiber is
further compacted into tight balls (Fig. 7.10), which have to be unfolded to allow the information in the
genes to be copied during transcription and DNA replication. The 300 Å fiber is folded into loops (Fig.
7.11) that are attached to a central protein scaffold (Fig. 7.12 does not show the scaffold; there is a picture of
it in the lecture PowerPoint). The loops have to be further compacted by clamping
them together with ring-shaped protein complexes called cohesins and condensins to create the
metaphase chromosomes (which are about 14,000 Å wide; Fig. 7.11F). The compaction ratio in the
metaphase chromosome is about 5000-10,000 fold.
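The compaction arithmetic in this section can be written out as a short Python sketch. The first part reproduces the ~6-fold figure for the nucleosome from the numbers in the text. The second part is an illustration only: it assumes an "average" human chromosome (3 Gb divided by 23, using the genome size and haploid number given at the start of the chapter) and asks how long it would be at the quoted 5000- to 10,000-fold compaction. All function and variable names are made up for this example.

```python
ANGSTROM_PER_BP = 3.4          # rise per base pair, from the text
BP_PER_NUCLEOSOME = 200        # ~145 bp wrapped + ~55 bp linker, from the text
NUCLEOSOME_WIDTH = 110.0       # angstroms, from the text

def nucleosome_compaction():
    """Extended DNA length divided by the width of the nucleosome it is folded into."""
    extended = BP_PER_NUCLEOSOME * ANGSTROM_PER_BP   # 680 angstroms
    return extended / NUCLEOSOME_WIDTH

def compacted_length_um(bp, fold):
    """Length in micrometres of `bp` of DNA after `fold`-fold compaction."""
    extended_um = bp * ANGSTROM_PER_BP / 1e4         # 1 um = 10^4 angstroms
    return extended_um / fold

# ASSUMPTION for illustration: an average human chromosome, 3 Gb / 23.
AVERAGE_CHROMOSOME_BP = 3e9 / 23

print(f"nucleosome compaction: ~{nucleosome_compaction():.1f}-fold")
for fold in (5_000, 10_000):
    length = compacted_length_um(AVERAGE_CHROMOSOME_BP, fold)
    print(f"{fold:>6}-fold compaction: ~{length:.1f} micrometres")
```

Under these assumptions an average chromosome's DNA, several centimetres long when extended, comes out at a few micrometres in metaphase.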
Polytene Chromosomes. Usually chromosomes are quite small and featureless. In stained
chromosomes, one can see a narrowing at one point, the centromere. With special staining techniques
(called chromosome banding), more detail is visible (discussed in Chapter 8). Yet, in most cells in most
organisms, the resolution of the light microscope is not sufficient to see a deletion of a single gene or
even of a group of 10-20 genes. A special case is the salivary gland chromosomes of Drosophila larvae (and
similar insects). These are terminal tissues and are not passed on to the next developmental stage (the
pupa). In these specialized cells, the homologous chromosomes pair and undergo many rounds of
replication without strand separation or nuclear division. The final chromosomes are ~1000 DNA
molecules wide and they become very stretched out (~100-times longer than normal; Figs. 7.13, 7.14).
A great deal of detail becomes visible in these chromosomes. There are about 5000 bands visible in
Drosophila polytene chromosomes. Since the Drosophila genome is ~180 million base pairs, each band has
~36 kb of DNA. In contrast, G-banding of human chromosomes (see Fig. 8.2 in the next chapter) shows
~300 bands for 3 billion bp. Therefore, each G-band in a human chromosome represents ~10 million base
pairs worth of DNA (which would contain about 100 genes on the average). Effects of deletions,
inversions, translocations, etc. involving only a few genes are not visible in the human banded
chromosomes but these can be seen in the giant salivary gland chromosomes of Drosophila larvae. The
salivary gland chromosomes have proved very useful in localizing cloned genes by a technique called in
situ hybridization, where a labeled piece of DNA is hybridized to salivary gland chromosomes and the
labeled band is located under the light microscope.
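The band-resolution comparison above is a simple division, sketched here in Python with the figures from the text. The function name is hypothetical. The final ratio shows roughly how much finer the polytene banding is than human G-banding.

```python
def bp_per_band(genome_bp, bands):
    """Average amount of DNA represented by one visible band."""
    return genome_bp / bands

# Figures quoted in the notes.
fly_band = bp_per_band(180_000_000, 5_000)      # Drosophila polytene chromosomes
human_band = bp_per_band(3_000_000_000, 300)    # human G-banded chromosomes

print(f"Drosophila polytene band: {fly_band:,.0f} bp")
print(f"human G-band:             {human_band:,.0f} bp")
print(f"a human G-band covers ~{human_band / fly_band:.0f} times more DNA")
```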
DNA Sequence complexity in a eukaryotic genome (sections 7.6, 7.7). I will not cover these 2
sections in lecture in any detail and they will not be on the examination. The discussion below is given
for anyone who may be interested in these 2 sections.
When DNA is broken up into small pieces, heated to create single strands and allowed to slowly cool,
the strands become paired again. The rate of DNA renaturation depends upon three factors:
(a) time,
(b) DNA concentration and
(c) complexity of sequences in the DNA population (Figs. 7.15, 7.16 and 7.17).
Analysis of simple repeating sequences such as poly(A) and poly(U) showed that they became double
stranded very quickly (it was easy for each strand to find its complement). Viral DNA (such as phage T4
DNA) took longer, bacterial DNA even longer (Fig. 7.16). The ratio of time was related to the
complexity of the sequences. The phage T4 genome is ~200,000 bp and it takes 200,000 times longer than
poly(A):poly(U) to become double-stranded. The E. coli genome is about 20-fold larger than the phage T4
genome and it took 20 times longer still to become double-stranded (Fig. 7.16). All these DNAs
have a monophasic renaturation curve. In other words, all sequences in the genome became
double-stranded at about the same time.
A renaturation analysis of eukaryotic DNA (such as that of a human)
showed a different result (Fig. 7.17). The DNA was multiphasic in its renaturation. Some DNA (20% in the
example shown in Fig. 7.17) renatured 100,000 times faster than the slowest renaturing component, as if
it had 100,000 copies of these sequences in the haploid genome (highly repetitive sequences).
Another 30% renatured ~1000-fold faster than the slowest renaturing component (middle repetitive
sequences). Finally, about 50% took a long time before it became double-stranded (unique sequences).
A comparison of the rate of renaturation of this DNA relative to that of E. coli and other reference DNA
molecules suggested that it had a kinetic complexity of ~1.5 billion bp. In other words, out of the 3 billion
bp of DNA in a haploid human genome, ~50% or 1.5 billion bp are single-copy DNA. About 30%, or ~1
billion base pairs, are repeated ~1000-fold (1 million bp of DNA sequences present in ~1000 copies per
genome). Finally, about 20% of human DNA (~600 million base pairs) is ~6,000 nucleotides worth of
sequence that is repeated 100,000 times per haploid genome (see below for a problem and its solution
that explain this kind of reasoning in more detail).
Calculating the number of repeats of different DNA sequences per haploid genome using kinetic renaturation analysis data
Question: DNA of a eukaryotic species (haploid genome size = 4 x 10^9 bp) upon kinetic renaturation
analysis shows 4 components with Cot1/2 values of 0.0001 (10% of the genome), 0.002 (20% of the
genome), 0.06 (20% of the genome) and 20,000 (50% of the genome). Phage T4 DNA (genome size =
2 x 10^5 bp) has a Cot1/2 value of 2.0 and E. coli DNA (genome size = 4 x 10^6 bp) has a Cot1/2 value of 40.
Calculate the kinetic complexity and number of repeats per haploid genome of the 4 components of the
eukaryotic DNA sample.
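The question above can be worked through with a short Python sketch. It assumes that kinetic complexity is directly proportional to the observed Cot1/2, with the two reference DNAs (phage T4 and E. coli) fixing the proportionality constant, and that no correction is made for the fraction of the genome each component occupies. The function names are made up for this example; all numbers are copied from the question.

```python
REFERENCES = {               # name: (genome size in bp, Cot1/2)
    "phage T4": (2e5, 2.0),
    "E. coli": (4e6, 40.0),
}
GENOME_BP = 4e9              # haploid genome size of the eukaryotic sample
COMPONENTS = [               # (fraction of genome, observed Cot1/2)
    (0.10, 0.0001),
    (0.20, 0.002),
    (0.20, 0.06),
    (0.50, 20_000.0),
]

def bp_per_cot_unit(references):
    """Complexity (bp) that corresponds to a Cot1/2 of 1, averaged over the standards."""
    ratios = [size / cot for size, cot in references.values()]
    return sum(ratios) / len(ratios)

def analyse(genome_bp, components, scale):
    """Return (amount_bp, complexity_bp, repeats) for each component."""
    rows = []
    for fraction, cot_half in components:
        amount = fraction * genome_bp      # bp of the genome in this component
        complexity = cot_half * scale      # length of distinct sequence in it
        rows.append((amount, complexity, amount / complexity))
    return rows

SCALE = bp_per_cot_unit(REFERENCES)        # both standards give 10^5 bp per Cot unit
ROWS = analyse(GENOME_BP, COMPONENTS, SCALE)
for i, (amount, complexity, repeats) in enumerate(ROWS, start=1):
    print(f"component {i}: {amount:.3g} bp, complexity {complexity:.3g} bp, "
          f"~{repeats:.3g} copies per genome")
```

The number of repeats is simply the amount of DNA in a component divided by its complexity, so the slowest component, whose complexity equals its amount, comes out as single-copy DNA.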
DNA                      Amount of genome   Amount (bp)   Cot1/2    Complexity (bp)   Repeats per genome
Phage T4                 100%               2 x 10^5      2.0       2 x 10^5          1
E. coli                  100%               4 x 10^6      40        4 x 10^6          1
Eukaryotic DNA sample
  Component 1            10%                4 x 10^8      0.0001    10                4 x 10^7
  Component 2            20%                8 x 10^8      0.002     2 x 10^2          4 x 10^6
  Component 3            20%                8 x 10^8      0.06      6 x 10^3          1.33 x 10^5
  Component 4            50%                2 x 10^9      20,000    2 x 10^9          1

Highly Repeated Sequences (Satellite DNA). These consist of several different kinds of sequences,
ranging from dinucleotide repeats (ATATAT..., or GCGCGCGC..., etc.) to repeats of somewhat
longer sequences (up to 250 base pairs). None of these sequences is transcribed into mRNA, and they often
stay condensed even in interphase (heterochromatin). A large fraction of these highly repeated
sequences is found near the centromeres (centromeric heterochromatin) or near the ends of the
chromosomes (telomeric heterochromatin; Fig. 7.18). A substantial fraction of these highly repeated
sequences in the human genome are defective transposons or remnants of retrotransposons [such as one
Long Terminal Repeat (LTR) sequence left behind by the excision of a retrotransposon; see Fig. 14.13
on P. 522 for the structure of retrotransposons with LTR sequences].
Middle Repeated Sequences. These consist of some genes that are present in multiple copies (such as
histone, tRNA and rRNA genes) as well as many functional complete transposons with intact
transposase or reverse transcriptase genes.
Special sequences in the chromosomes: centromere and telomere. Most stained chromosomes show
a lightly staining area that looks like a constriction (Figs. 7.18, 7.20, 8.1, 8.2, 8.4). This is called the
centromere. It has been known for a long time that when a chromosome breaks, only the part containing
the centromere is partitioned properly at the cell division. The piece without a centromere (acentric
fragment) is usually lost within a few cell division cycles. Most organisms have such a localized region
in each chromosome. The centromere is where the fibers of the spindle attach (Fig. 7.19) and pull the 2
daughter chromosomes to the opposite spindle poles during the anaphase stage of the cell division
process. Rarely, the centromere activity is diffuse and many spindle fibers attach throughout
the length of the chromosome. Such chromosomes are called holocentric chromosomes. Upon
fragmentation, all pieces of a holocentric chromosome continue to be partitioned properly.
The great majority of eukaryotes have localized centromeres. The centromere sequence in Saccharomyces
cerevisiae (baker's yeast) is only ~220 bp (Fig. 7.19). Centromeres of most eukaryotes span several
hundred kb of DNA and have a lot of repeated sequences (satellite DNA) in them (Fig. 7.20).
Yeast (Saccharomyces) centromeres are unusual in that they are quite small (only ~220 bp) and have a
well-conserved structure. There are 4 conserved domains in yeast centromeres (Fig. 7.20): an
octanucleotide sequence, followed by ~80 bp of very AT-rich DNA, followed by a ~25 nucleotide
conserved sequence. These three centromere determining elements (CDE) are usually followed by a
100-135 nucleotide variable sequence. The ~220 bp centromere sequence can be interchanged between
different chromosomes. Also, attaching the centromere to any DNA sequence containing an origin of
replication results in the creation of a stable minichromosome (as long as the DNA is circular).
Formation of a linear minichromosome also requires telomere sequences at the 2 ends.
It has also been known for a long time that chromosomes, when broken by X-rays, tend to have sticky
ends that tend to fuse with other broken chromosome ends. In contrast, ends of normal chromosomes are
not sticky. Therefore, the ends of chromosomes are unique and they were named telomeres. Some
special sequence must be present at the ends that makes the chromosomes non-sticky. Telomeres of a
species have a characteristic sequence (Table 7.2). These are short sequences that are repeated many
times (Figs. 7.22, 7.23, 7.24). These simple sequences are rich in G residues and appear to form unusual
four-stranded structures at chromosome ends by making non-Watson-Crick base pairs (Fig. 7.24).
The Problem in Replicating the ends of linear DNA. Since no known DNA polymerase can start a
new chain, new DNA chains are begun with RNA primers. Internal RNA primers are easily removed
and replaced by DNA (Fig. 6.22). However, the 2 primers at the extreme 5'-ends of linear chromosomes
cannot be replaced by DNA (Fig. 7.22). If left unrepaired, the chromosomes would become
progressively shorter as the cells replicate repeatedly and would ultimately disappear altogether. Since
this does not happen, there must be a mechanism for fixing the 5'-ends. Telomere repeats are generated
not by copying of a template DNA strand by a DNA polymerase, but by a special enzyme, telomerase,
that has an internal RNA template (the Nobel Prize in Physiology or Medicine in 2009 was awarded to 3
persons for the discovery of this enzyme). The enzyme aligns at the chromosome end, the internal RNA template is
copied, and the enzyme translocates by one step (Fig. 7.23). When the chromosomes of one species are
transferred to the nucleus of a different species (with a different telomere sequence repeat), in time, the
telomeres of the transferred chromosome become identical to those of the resident chromosomes. Mutants
of telomerase result in shortening of the telomeres and ultimately in cell death due to the loss of essential
sequences at chromosome ends. The number of telomere repeats that a telomerase can add to the ends of
chromosomes is under complex regulation and involves the binding of several proteins to the telomere
repeat sequences.
Most adult human cells do not have active telomerase and can divide only a limited number of
times before the telomeres become too short, chromosome ends start to be chewed up, and the cell
dies. In contrast, stem cells (which have the potential to divide endlessly) have an active telomerase.
During fetal development also, telomerase is active but in most cells the telomerase is shut off in an
adult. One mechanism of aging is thought to involve shortening of telomeres. When a somatic cell goes
down the cancer pathway and starts to divide in an unregulated way, it cannot become a full-blown
cancer unless the telomerase is re-activated. Because of this, intense effort is directed towards finding
drugs that will target the telomerase enzyme, with the hope that some of them will prove to be effective
anti-cancer drugs. ...
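The telomerase cycle described above (align at the chromosome end, copy the internal RNA template, translocate by one step) can be illustrated with a toy Python model. This is a sketch under loud assumptions, not a description of the real enzyme: the template is simplified to exactly one repeat unit, alignment is ignored, the starting DNA end is a hypothetical sequence, and the repeat TTAGGG (the human telomere repeat) is used only as an example. The function names are made up.

```python
# Base-pairing rules for copying an RNA template into DNA.
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}

def copy_template(rna_template_3to5):
    """DNA (written 5'->3') made by copying an RNA template written 3'->5'."""
    return "".join(RNA_TO_DNA[base] for base in rna_template_3to5)

def extend_telomere(dna_end, rna_template_3to5, cycles):
    """Add one repeat per cycle: copy the template, then translocate one step."""
    repeat = copy_template(rna_template_3to5)
    for _ in range(cycles):
        dna_end += repeat
    return dna_end

# ASSUMPTIONS: a hypothetical chromosome end and a one-repeat template.
chromosome_end = "GCATTCTTAGGG"
template = "AAUCCC"            # complementary to the TTAGGG repeat

extended = extend_telomere(chromosome_end, template, cycles=3)
print(copy_template(template))
print(extended)
```

Each pass through the loop stands for one copy-and-translocate cycle, which is why the telomere grows by whole repeat units.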
This note was uploaded on 09/29/2011 for the course GENETICS 380 taught by Professor Glodowski during the Spring '08 term at Rutgers.