Chapter_06_Solutions



Chapter 6 How Cells Read the Genome: From DNA to Protein

In This Chapter
FROM DNA TO RNA
FROM RNA TO PROTEIN
THE RNA WORLD AND THE ORIGINS OF LIFE

FROM DNA TO RNA

DEFINITIONS

6–1 General transcription factor
6–2 snRNA (small nuclear RNA)
6–3 RNA splicing
6–4 Nuclear pore complex
6–5 Promoter
6–6 Exosome
6–7 rRNA (ribosomal RNA)
6–8 Terminator
6–9 Trans-splicing
6–10 Exon
6–11 mRNA (messenger RNA)

TRUE/FALSE

6–12 True. Errors in DNA replication have the potential to affect future generations of cells, while errors in transcription have no genetic consequence. Errors in transcription lead to mistakes in a small fraction of RNAs, whose functions are further monitored by downstream quality-control mechanisms. The essential feature is that errors in DNA replication change the gene and, thereby, affect all the copies of RNA (and protein) made in the original cell and all its progeny cells. By contrast, errors in transcription are limited to a small number of defective RNAs (and proteins), and are not passed on to progeny cells. These considerations are reflected in the intrinsic error rates for RNA and DNA polymerases: RNA polymerases typically make 1 mistake in copying 10^4 nucleotides, while DNA polymerases make about 1 error per 10^7 nucleotides. Such significant differences in error rates suggest that natural selection is stronger against errors in replication than against errors in transcription.

6–13 False. The σ subunit associates with the bacterial RNA polymerase core enzyme to form the RNA polymerase holoenzyme only during the initiation phase of RNA synthesis. The σ subunit helps the core enzyme bind to the promoter and stays associated with the core enzyme until RNA synthesis has been properly initiated, and then it dissociates.

6–14 True.
At its 3′ end each eucaryotic mRNA has a string of adenine nucleotides, the last of which has a terminal ribose with a free 3′-OH group. At its 5′ end each mRNA carries a 7-methylguanosine that is linked 5′ to 5′ with the first nucleotide in the mRNA. This linkage leaves a free 3′-OH group on the ribose of the capping nucleotide.

6–15 False. Although intron sequences are mostly dispensable, they must be removed precisely. An error of even one nucleotide during removal would shift the reading frame in the spliced mRNA molecule and produce an aberrant protein.

6–16 False. The 3′ ends of most pre-mRNA transcripts produced by RNA polymerase II are defined, not by the termination point of transcription, but by cleavage of the RNA chain 10–30 nucleotides downstream of the sequence AAUAAA and 30 or so nucleotides upstream of a GU- or U-rich sequence.

THOUGHT PROBLEMS

6–17 The answer is best given by Francis Crick himself, who in 1957 coined the terms ‘the sequence hypothesis,’ which proposes that genetic information is encoded in the sequence of the DNA bases, and ‘the central dogma,’ which states that DNA makes RNA makes protein.
“I called this idea the central dogma, for two reasons, I suspect. I had already used the obvious word hypothesis in the sequence hypothesis, and in addition I wanted to suggest that this new assumption was more central and more powerful. I did remark that their speculative nature was emphasized by their names. As it turned out, the use of the word dogma caused almost more trouble than it was worth. Many years later Jacques Monod pointed out to me that I did not appear to understand the correct use of the word dogma, which is a belief that cannot be doubted.
I did apprehend this in a vague sort of way but since I thought that all religious beliefs were without serious foundation, I used the word in the way I myself thought about it, not as most of the rest of the world does, and simply applied it to a grand hypothesis that, however plausible, had little direct experimental support.”
Reference: Crick F (1988) What Mad Pursuit: A Personal View of Scientific Discovery, p 109. New York: Basic Books, Inc.

6–18 Actually, the RNA polymerases are not moving at all because they have been fixed and coated with metal to prepare the sample for viewing in the electron microscope. Before they were fixed, they were moving from left to right, as indicated by the gradual lengthening of the RNA transcripts. The RNA transcripts are shorter than the DNA that encodes them because they begin to fold up (acquire a three-dimensional structure) as soon as they are synthesized, whereas the DNA is an extended double helix.

6–19 A with 7, B with 4, C with 2, D with 5, E with 8, F with 3, G with 1, and H with 6.

6–20 If the polymerase transcribes the sequence from left to right, it will use the bottom strand as a template to make the sequence 5′-GUAACGGAUG (the RNA sequence corresponding to the top strand of the DNA). If the polymerase moves from right to left, it will use the top strand as a template to make the sequence 5′-CAUCCGUUAC (the RNA sequence corresponding to the bottom strand of the DNA, written 5′ → 3′).

6–21 General transcription factors play several roles in promoting transcription by RNA polymerase II. They help position the RNA polymerase correctly at the promoter, they aid in pulling apart the two strands of DNA to allow transcription to begin, and they release RNA polymerase from the promoter once transcription has begun.
They are called ‘general’ because they assemble on all promoters used by RNA polymerase II; they are identified by names beginning with TFII (transcription factor for RNA polymerase II). Labeling them ‘general’ transcription factors also serves to distinguish them from more specialized gene regulatory proteins that enhance transcription at selected promoters in certain cell types.

6–22 The RNA polymerase must be moving from right to left in Figure 6–2. If the RNA polymerase does not rotate around the template as it moves, it will overwind the DNA ahead of it, causing positive supercoils, and underwind the DNA behind it, causing negative supercoils. If the RNA polymerase were free to rotate about the template as it moved along the DNA, it would not overwind or underwind the DNA, and no supercoils would be generated.
Reference: Liu LF & Wang JC (1987) Supercoiling of the DNA template during transcription. Proc. Natl Acad. Sci. U.S.A. 84, 7024–7027.

6–23 The bead would rotate clockwise from the perspective of the magnet, as shown in Figure 6–49A. As shown in Figure 6–49B, the motion of the helix relative to a fixed RNA polymerase causes the helix to rotate.
Reference: Harada Y, Ohara O, Takatsuki A, Itoh H, Shimamoto N & Kinosita K (2001) Direct observation of DNA rotation during transcription by Escherichia coli RNA polymerase. Nature 409, 113–115.

Figure 6–49 Rotation of duplex due to movement relative to RNA polymerase (Answer 6–23). (A) Direction of rotation of the magnetic bead. (B) Direction of rotation of the DNA duplex.

6–24 From electron micrographs such as that shown in Figure 6–1 it seems clear that RNA does not become wrapped around the DNA as it is spun out behind RNA polymerase. Thus, RNA polymerase doesn’t seem to revolve around DNA as it moves.
Consistent with this, other evidence suggests that RNA polymerase does induce positive supercoiling ahead of it and negative supercoiling behind it. But the level of supercoiling tension is not so high as you would expect if the continued movement of RNA polymerase generated ever-higher levels of coiling. These observations suggest that supercoiling tension is relieved by the action of topoisomerases in the cell, so that supercoiling around the RNA polymerase is maintained at optimal levels.

6–25 Phosphorylation of the CTD is the event that permits release of RNA polymerase from the other proteins present at the start point of transcription. Phosphorylation also allows association of a new set of proteins that are involved in processing the nascent RNA transcript. These proteins include components required for capping, splicing, and polyadenylation.

6–26 The tails arise because the ends of the mRNA are not complementary to the ends of the restriction fragment. Thus, one of the single-strand tails at each end is DNA from the restriction fragment. A single-strand tail at one end corresponds to the 5′ end of the mRNA, which must come from an upstream exon that is not present in the restriction fragment. A single-strand tail at the other end corresponds to the 3′ end of the mRNA, which may come from a downstream exon not present in the restriction fragment or simply be the poly-A tail itself. Without additional information you cannot identify which single strand comes from which source.
Reference: Berget SM, Berk AJ, Harrison T & Sharp PA (1977) Spliced segments at the 5′ termini of adenovirus-2 late mRNA: a role for heterogeneous nuclear RNA in mammalian cells. Cold Spring Harbor Symp. Quant. Biol. 42, 523–529.

6–27
A. A single nucleotide change in a gene could cause an internal deletion in the mRNA if it altered splicing so that an exon that was usually incorporated was skipped instead.
B. Removal of 173 nucleotides from the protein-coding portion of the mRNA would cause a shift in the reading frame for translation into amino acids. Because a codon is three nucleotides, a loss of 173 nucleotides does not correspond to an integral number of codons. Thus, the Smilin encoded by the deleted mRNA would be fine up to the missing exon, but would encode an unrelated sequence of amino acids thereafter until a stop codon was reached.
C. The simplest explanation is that the Smilin gene contains a 173-nucleotide-long exon (exon 2 in Figure 6–50A) that is lost during the processing of the mutant precursor mRNA. This could occur, for example, if the mutation changed the 3′ splice site in the preceding intron so that it was no longer recognized by the splicing machinery (a change in the conserved AG at the intron–exon boundary could do this). Use of the next available 3′ splice site, adjacent to exon 3, would cause loss of exon 2 from the mutant mRNA (Figure 6–50B). During protein synthesis, the absence of exon 2 (173 nucleotides) would throw the ribosomes out of the correct reading frame as they moved from exon 1 to exon 3.
At that junction the ribosomes would begin synthesizing a protein sequence unrelated to that normally encoded by exon 3.

Figure 6–50 Splicing of the Smilin transcript (Answer 6–27). (A) Normal transcript. (B) Mutant transcript.

6–28 Statement C is the only one that is necessarily true for exons 2 and 3. It is also true for exons 7 and 8. While statements A and B could be true, they don’t have to be. Because the protein sequence is the same in segments of the mRNA that correspond to exons 1 and 10, neither choice of alternative exons (2 versus 3, or 7 versus 8) can alter the reading frame. To maintain the normal reading frame (whatever that is), the alternative exons must have a number of nucleotides that when divided by 3 (the number of nucleotides in a codon) give the same remainder. Since the sequence of the α-tropomyosin gene is known, it is possible to check to see the actual state of affairs. Exons 2 and 3 both contain the same number of nucleotides, 126, which is divisible by 3 with no remainder. Exons 7 and 8 also contain the same number of nucleotides, 76, which is divisible by 3 with a remainder of 1.

6–29 Since introns evolve faster than exons, the introns of the different species will be more variable than the exons. It is difficult to scan these sequences by eye and decide, with confidence, which side is the more conserved. One way to quantify the differences is to pick one sequence, for example, the cow, and count up how often the other sequences differ at each position, as shown in Figure 6–51. Summing the differences on each side of the junction makes it clear that sequences on the left are much more similar to one another than are the sequences on the right. (Similar differences exist no matter which sequence is chosen for comparison.)

Data from Figure 6–51 (cow sequence, with the number of differences at each position below it):
EXON                   INTRON
GGTGGTGAGGCCCTGGGCAG | GTAGGTATCCCACTTACAAG
00211100211100011000 | 00311023345563332333
total 12             | total 53
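The tally described for Answer 6–29 can be checked with a short Python sketch. The per-position counts are copied from Figure 6–51; the function and variable names (`count_differences`, `exon_counts`, `intron_counts`) are hypothetical and chosen only for illustration. The helper shows how such counts would be derived from aligned sequences, but it is not run on real data here, because the sequences of the other six species are not reproduced in this text.

```python
def count_differences(reference, others):
    """For each position, count how many aligned sequences differ from the reference."""
    return [sum(seq[i] != base for seq in others) for i, base in enumerate(reference)]

# Per-position difference counts copied from Figure 6-51 (cow sequence as reference).
exon_counts = [int(d) for d in "00211100211100011000"]
intron_counts = [int(d) for d in "00311023345563332333"]

# Sum the differences on each side of the exon-intron junction.
exon_total = sum(exon_counts)
intron_total = sum(intron_counts)
print(exon_total, intron_total)  # 12 53
```

The two totals match the sums given at the bottom of the figure, which is the basis for calling the left-hand side the more conserved one.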
Thus, the more conserved sequences, which are on the left in Figure 6–6, correspond to exons, and the less conserved sequences, which are on the right, correspond to introns.

Figure 6–51 Sum of differences from the cow β-globin sequence (Answer 6–29). The β-globin sequence from the cow was compared nucleotide by nucleotide with the β-globin sequences from the other six species. The number of differences at each position is summed below each nucleotide of the β-globin sequence. The total number of differences on each side of the exon–intron boundary (dashed line) is shown at the bottom.

6–30
A. If the splicing machinery bound to one splice site and scanned across the intron to find its complementary splice site, it would use the first appropriate splice site it encounters. The products predicted from the intron-scanning model are shown in Figure 6–52. If the splicing machinery bound to a 5′ splice site and scanned toward a 3′ splice site, minigene 1 would generate one product (Figure 6–52A) and minigene 2 would generate two products (Figure 6–52B). By contrast, if the splicing machinery bound to a 3′ splice site and scanned toward a 5′ splice site, minigene 1 would generate two products (Figure 6–52A) and minigene 2 would generate one product (Figure 6–52B).
B. The results of this experiment do not match the expectations for either scanning model, suggesting that scanning models are incorrect. The ordering mechanism by which cells avoid exon skipping probably depends on two factors: first, that assembly of the spliceosome occurs as the pre-mRNA emerges from the RNA polymerase; and second, that exons may be defined as an independent step prior to the assembly of the spliceosome.
Reference: Kuhne T, Wieringa B, Reiser J & Weissmann C (1983) Evidence against a scanning model for RNA splicing. EMBO J. 2, 727–733.

Figure 6–52 Expected products in a test of intron scanning (Answer 6–30). (A) Expected products for 5′-to-3′ scanning and 3′-to-5′ scanning in minigene 1. (B) Expected products for 5′-to-3′ scanning and 3′-to-5′ scanning in minigene 2. Open boxes indicate complete exons; shaded boxes represent partial exons.

6–31 Group I excised introns are linear, and they carry the activated G in covalent linkage at their 5′ ends. Group II excised introns are lariats, with the activated A having reacted with the 5′-most nucleotide of the intron. The mechanism of pre-mRNA splicing catalyzed by the spliceosome is more similar to the mechanism used by the Group II self-splicing introns.

6–32 ‘Export ready’ means that an mRNA is bound by the appropriate set of proteins. Proteins such as the cap-binding complex, the exon-junction complex, and the poly-A-binding protein must be present, while proteins such as spliceosome components must be absent. RNA fragments from excised introns do not acquire the necessary set of proteins and are thus doomed to degradation.

6–33 The structure of the nucleolus depends on the rRNA genes, which are located in clusters at the tips of each copy of five different chromosomes in humans. During interphase the transcribed rRNA genes associate to form the visible nucleolus. At mitosis, the chromosomes disperse and the nucleolus breaks up. After mitosis the tips of the chromosomes again coalesce and the nucleolus reforms in a process that depends on transcription of the rRNA genes.

CALCULATIONS

6–34
A. Since RNA polymerase is blocked by pyrimidine dimers, the sensitivity of transcription to UV damage will depend on the distance between the promoter for a gene and the probe. It is a simple measure of the target size for UV damage. If the polymerase must travel twice as far to complete a transcript, the chances of its encountering a block to transcription are twice as great.
B.
Transcription through the Vsg gene is seven times more sensitive to UV irradiation than transcription through the ribosomal transcription unit at the site of rRNA probe 4, which is about 7 kb from its promoter (see Figure 6–9A). Thus, the beginning of the Vsg gene, which is where the probe was located, is about 50 kb (7 × 7 kb) away from its promoter. This calculation assumes that the DNA between the Vsg promoter and the Vsg gene has about the same sensitivity to UV light as the DNA in the ribosomal RNA transcription unit. It also assumes that multiple UV-induced pyrimidine dimers are not common enough to skew the linear relationship between UV dose and distance.
C. If the nearby gene is 20% less sensitive to UV irradiation than the Vsg gene, it is inactivated at 80% the rate of the Vsg gene. Therefore, its promoter is 40 kb away (0.80 × 50 kb). Given that the nearby gene is 10 kb in front of the Vsg gene, its promoter must map very near the promoter for the Vsg gene. Thus, it is likely that the two genes are transcribed from the same promoter.
Reference: Johnson PJ, Kooter JM & Borst P (1987) Inactivation of transcription by UV irradiation of T. brucei provides evidence for a multicistronic transcription unit including a VSG gene. Cell 51, 273–281.

6–35
A. At the plateau there are 7.2 × 10^9 transcripts per reaction. The reaction incorporates 2.4 pmol of CMP, and each transcript is 400 nucleotides long, of which 200 nucleotides are CMP.
\[
\frac{\text{transcripts}}{\text{reaction}} = \frac{2.4\ \text{pmol CMP}}{\text{reaction}} \times \frac{\text{mole}}{10^{12}\ \text{pmol}} \times \frac{6 \times 10^{23}\ \text{CMP}}{\text{mole}} \times \frac{\text{transcript}}{200\ \text{CMP}} = 7.2 \times 10^{9}\ \text{transcripts/reaction}
\]
B. Each reaction contains 1.0 × 10^11 templates. At 16 µg/mL a 25-µL reaction volume contains 0.4 µg of template (16 µg/mL × 0.025 mL = 0.4 µg). Each template is 3500 nucleotide pairs (np) long.
\[
\frac{\text{templates}}{\text{reaction}} = \frac{0.4\ \mu\text{g}}{\text{reaction}} \times \frac{6 \times 10^{17}\ \text{d}}{\mu\text{g}} \times \frac{\text{np}}{660\ \text{d}} \times \frac{\text{template}}{3500\ \text{np}} = 1.0 \times 10^{11}\ \text{templates/reaction}
\]
C.
There are about 0.07 transcripts/template, which is equivalent to about 1 RNA transcript per 14 template molecules.
\[
\frac{\text{transcripts}}{\text{template}} = \frac{7.2 \times 10^{9}\ \text{transcripts}}{\text{reaction}} \times \frac{\text{reaction}}{1.0 \times 10^{11}\ \text{templates}} = 0.07\ \text{transcripts/template}
\]
The poor efficiency of the reaction is typical of in vitro transcription. The ratio of 1 transcript per 14 templates does not necessarily mean that 1 out of 14 templates functions in RNA synthesis. It may be that a much smaller fraction of the templates makes a large number of transcripts. For example, 1 in 140 templates might synthesize 10 transcripts each.
Reference: Sawadogo M & Roeder RG (1985) Factors involved in specific transcription by human RNA polymerase II: analysis by a rapid and quantitative in vitro assay. Proc. Natl Acad. Sci. U.S.A. 82, 4394–4398.

DATA HANDLING

6–36
A. Starting with a complex in which a C had been incorporated at position +34, the RNA polymerase could be walked down to position +43 in two steps: (1) incubate with a mixture of ATP and UTP, and then wash away the nucleotides; and (2) incubate with a mixture of UTP and CTP, and then wash. This protocol will allow the polymerase to incorporate all the nucleotides up to position +43, but not the A at position +44.
B. Incorporation of the correct A nucleotide occurred about 130 times faster than the incorrect G nucleotide at position +44 (0.20/0.0015 = 133, see Table 6–1). Incorporation of the next nucleotide (a C) after the correct A occurred about 5 times faster than incorporation of C after the incorrect G (0.17/0.036 = 4.7). Thus, when the RNA polymerase makes a mistake, it incorporates the next nucleotide more slowly. This pause allows a window of opportunity for removal of the incorrect nucleotide before the next nucleotide is added.
Reference: Thomas MJ, Platas AA & Hawley DK (1998) Transcriptional fidelity and proofreading by RNA polymerase II. Cell 93, 627–637.

6–37 The consensus sequence for this set of promoters is shown in Figure 6–53.
In this set of 13 promoters there are clear examples of common nucleotides outside the –35 and –10 regions. Also, one of the accepted consensus nucleotides (the terminal A in the –35 sequence) doesn’t even show up as common. When 300 promoters recognized by σ70 are compared, the consensus sequence is TTGACA (–35) and TATAAT (–10). It’s always better to compare more sequences!

6–38
A. There are three locations where the linker-scanning mutations drastically decreased the amount of transcript. These locations are from about –15 to –30, –45 to –60, and –80 to –105. Linker-scanning through the promoter element at –15 to –30 seems to have the most dramatic effect, whereas scanning through the other two sensitive sites gives lesser effects that are about equal to one another. You may also have noticed the effect of a linker-scanning mutation that overlapped the start site of transcription. In the absence of the usual sequence at the start site, the transcript is initiated at a variety of positions.
B. The segment from –15 to –30 likely includes the TATA box, which is usually located about 25 nucleotides or so upstream of the transcription start site.
Reference: McKnight SL & Kingsbury R (1982) Transcriptional control signals of a eukaryotic protein-coding gene. Science 217, 316–324.
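For Answer 6–37, the –35 consensus can be rederived with a short Python sketch. The hexamers below were read from alignment columns 15–20 of the 13 promoter rows in Figure 6–53. The 50% cutoff is an assumption (it matches the lowest level on the figure's frequency scale), and the names `MINUS_35` and `consensus` are hypothetical.

```python
from collections import Counter

# -35 hexamers read from alignment columns 15-20 of the 13 rows in Figure 6-53.
MINUS_35 = [
    "TTTACA", "TTGTGC", "TTGTCT", "TTGACT", "TTGCGG", "TTGTCA", "TTGACT",
    "TTGACT", "TTGACA", "TTGACA", "TTGACT", "TTGACA", "TTGTAC",
]

def consensus(sites, threshold=0.5):
    """Return the most common base per column, or '-' if it falls below the threshold."""
    result = []
    for column in zip(*sites):
        base, count = Counter(column).most_common(1)[0]
        result.append(base if count / len(sites) >= threshold else "-")
    return "".join(result)

print(consensus(MINUS_35))  # TTGAC-
```

In the last column A and T each occur 5 times out of 13, so neither reaches the cutoff. This is the point made in the answer: the terminal A of the accepted TTGACA consensus does not emerge from so small a sample.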
Figure 6–53 Consensus sequence for promoters recognized by σ70 factor (Answer 6–37). Nucleotides that are perfectly conserved and the first nucleotide of the transcripts are shaded for reference. Below the sequences themselves are indicated the consensus nucleotides and, on an expanded scale, the frequencies of the conserved nucleotides around the –35 and –10 regions.

Promoters shown (aligned; dots mark gaps, and +1 marks the first nucleotide of the transcript in the original figure): tyrosine tRNA promoter, ribosomal RNA gene promoters, bacteriophage promoters.
TCTCAACGTAACACTTTACAGCGGCG..CGTCATTTGATATGATGC.GCCCCGCTTCCCGATAAGGG
GATCAAAAAAATACTTGTGCAAAAAA..TTGGGATCCCTATAATGCGCCTCCGTTGAGACGACAACG
ATGCATTTTTCCGCTTGTCTTCCTGA..GCCGACTCCCTATAATGCGCCTCCATCGACACGGCGGAT
CCTGAAATTCAGGGTTGACTCTGAAA..GAGGAAAGCGTAATATAC.GCCACCTCGCGACAGTGAGC
CTGCAATTTTTCTATTGCGGCCTGCG..GAGAACTCCCTATAATGCGCCTCCATCGACACGGCGGAT
TTTTAAATTTCCTCTTGTCAGGCCGG..AATAACTCCCTATAATGCGCCACCACTGACACGGAACAA
GCAAAAATAAATGCTTGACTCTGTAG..CGGGAAGGCGTATTATGC.ACACCCCGCGCCGCTGAGAA
TAACACCGTGCGTGTTGACTATTTTA.CCTCTGGCGGTGATAATGG..TTGCATGTACTAAGGAGGT
TATCTCTGGCGGTGTTGACATAAATA.CCACTGGCGGTGATACTGA..GCACATCAGCAGGACGCAC
GTGAAACAAAACGGTTGACAACATGA.AGTAAACACGGTACGATGT.ACCACATGAAACGACAGTGA
TATCAAAAAGAGTATTGACTTAAAGT.CTAACCTATAGGATACTTA.CAGCCATCGAGAGGGACACG
ACGAAAAACAGGTATTGACAACATGAAGTAACATGCAGTAAGATAC.AAATCGCTAGGTAACACTAG
GATACAAATCTCCGTTGTACTTTGTT..TCGCGCTTGGTATAATCG.CTGGGGGTCAAAGATGAGTG
Consensus: AA - - - - - - - T T G A C (–35) ... T A T AA T G C - - C - C C (–10)

6–39
A. The 400-nucleotide transcript is absent from lane 4, Figure 6–13B, because GTP was included in the reaction mixture. GTP allows transcription to proceed beyond the end of the C-minus sequence (the synthetic sequence lacking C nucleotides), thereby generating transcripts longer than 400 nucleotides. In the absence of GTP (see lane 2) transcription cannot proceed beyond the C-minus sequence.
In the presence of GTP and RNase T1 (see lane 6) the longer transcripts are cleaved at the first G to yield the 400-nucleotide transcript. In the presence of GTP, RNase T1, and 3′ O-methyl GTP (see lane 8) any long transcripts that escape termination by 3′ O-methyl GTP are cleaved by RNase T1 to yield the 400-nucleotide transcript.
B. One of the difficulties in assaying promoter function in vitro is the high background of nonspecific initiation of transcription. It is this background that is so evident in Figure 6–13B, lane 3. Its source is not altogether clear, but transcription may start at sequences in the rest of the plasmid that weakly resemble true RNA polymerase II promoters.
C. A transcript of about 400 nucleotides is present in Figure 6–13B, lane 5, because cleavage with RNase T1 liberates it from any randomly initiated transcript that has traversed the C-minus sequence. It is actually a few nucleotides longer than the specifically initiated transcript since its 5′ and 3′ ends are defined, respectively, by the G nucleotides that immediately precede and follow the C-minus sequence. The 400-nucleotide transcript is absent from Figure 6–13B, lane 7, because 3′ O-methyl GTP will terminate most transcripts that are initiated in front of the C-minus sequence. The combination of 3′ O-methyl GTP and RNase T1 eliminates virtually all the background synthesis from the control plasmid.
D. As shown in Figure 6–13B, lanes 7 and 8, specific transcription can be assayed in the presence of G nucleotides if 3′ O-methyl GTP and RNase T1 are included (to inhibit background transcription and to cleave any random transcripts to small pieces).
Reference: Sawadogo M & Roeder RG (1985) Factors involved in specific transcription by human RNA polymerase II: analysis by a rapid and quantitative in vitro assay. Proc. Natl Acad. Sci. U.S.A. 82, 4394–4398.
6–40 These experiments provided the most convincing early demonstration that there were three forms of RNA polymerase in eucaryotic cells: one (peak 1) that was insensitive to α-amanitin (RNA polymerase I), one (peak 2) that was inhibited by both 1 µg/mL and 10 µg/mL α-amanitin (RNA polymerase II), and one (peak 3) that was inhibited by 10 µg/mL α-amanitin, but not by 1 µg/mL (RNA polymerase III). It would be unlikely that different forms of the same polymerase could have such different sensitivities to the same molecule. These results also indicate how the RNA polymerases were named: by the order in which they were eluted from the column.
Reference: Roeder RG (1974) Multiple forms of deoxyribonucleic acid-dependent ribonucleic acid polymerase in Xenopus laevis. Isolation and partial characterization. J. Biol. Chem. 249, 241–248.

6–41
A. Since an equal amount of transcription from each template was observed when the preincubation was carried out with the individual templates or a mixture (see Figure 6–15C, lanes 1 to 3), Srb2 protein does not show a preference for either template.
B. Srb2 acts stoichiometrically. If Srb2 acted catalytically, it should have been able to modify the second template after the two were mixed. Catalytic activity would have produced transcripts from both templates regardless of which one was originally included in the preincubation with Srb2. When excess Srb2 was added at the beginning of the preincubation with one template, transcription was observed from both templates after mixing. This is consistent with a stoichiometric requirement for Srb2 and rules out the possibility that Srb2 was inactivated during the preincubation and was for that reason unable to act (catalytically) on the second template after mixing.
C.
The production of transcripts solely from the template that was preincubated with Srb2 indicates that Srb2 is part of the preinitiation complex. If Srb2 were able to act after transcription had begun, transcripts would have been produced from both templates regardless of which one was included in the preincubation.
D. During preincubation of the template with the extract and Srb2, a number of proteins including Srb2 bind at the promoter to form a preinitiation complex. Evidently, the preinitiation complex, once formed, is stable and does not readily exchange proteins with other templates that are added later.
E. Although the Srb2 gene was originally identified as a suppressor of the cold-sensitive phenotype of yeast carrying an RNA polymerase II gene with a short CTD, neither those genetic results nor the transcription assays described here provide evidence that Srb2 binds to the CTD. (Nor do they argue against direct interaction; they simply do not speak to the issue.) Additional experiments have shown that Srb2 is part of a complex of proteins known as the mediator. The mediator binds to the dephosphorylated CTD, entering and leaving initiation complexes at every round of transcription in a process that may be coupled to C-terminal domain phosphorylation and the release of RNA polymerase at initiation of transcription (Figure 6–54).
References:
Koleske AJ, Buratowski S, Nonet M & Young RA (1992) A novel transcription factor reveals a functional link between the RNA polymerase II CTD and TFIID. Cell 69, 883–894.
Thompson CM, Koleske AJ, Chao DM & Young RA (1993) A multisubunit complex associated with the RNA polymerase II CTD and TATA-binding protein in yeast. Cell 73, 1361–1375.
Svejstrup JQ, Li Yang, Fellows J, Gnatt A, Bjorklund S & Kornberg RD (1997) Evidence for a mediator cycle at the initiation of transcription. Proc. Natl Acad. Sci. U.S.A. 94, 6075–6078.

6–42 The pattern of reaction with DNA and protein is strikingly clear.
When the U analog was closer than 10 nucleotides to the 3′ end of the RNA, it reacted predominantly with its pairing partner in the template strand. By contrast, when it was 10 nucleotides or farther from the 3′ end, it no longer reacted with DNA at all, and reacted strongly with protein. These patterns of reactivity indicate that the newly synthesized RNA remains paired with the DNA template over a stretch of 8–9 nucleotides from the 3′ end, and then separates from the template strand. It also seems that the U analog must be closely associated with the RNA polymerase even when it is up to 24 nucleotides from the 3′ end.
Reference: Nudler E, Mustaev A, Lukhtanov E & Goldfarb A (1997) The RNA–DNA hybrid maintains the register of transcription by preventing backtracking of RNA polymerase. Cell 89, 33–41.

Figure 6–54 A summary of the proposed role of the mediator complex in initiation of transcription (Answer 6–41).

6–43 A schematic diagram of the structure of the mRNA–DNA hybrid and the intron–exon structure of the gene are shown in Figure 6–55.

Figure 6–55 Interpretation of the electron micrograph and the intron–exon structure of the chicken ovalbumin gene (Answer 6–43). The letters A to G identify introns and the black boxes identify exons.

References:
Dugaiczyk A, Woo SL, Lai EC, Mace Jr ML, McReynolds L & O’Malley BW (1978) The natural ovalbumin gene contains seven intervening sequences. Nature 274, 328–333.
Garapin AC, Cami B, Roskam W, Kourilsky P, Le Pennec JP, Perrin F, Gerlinger P, Cochet M & Chambon P (1978) Electron microscopy and restriction enzyme mapping reveal additional intervening sequences in the chicken ovalbumin split gene. Cell 14, 629–639.

6–44 The results of these experiments argue convincingly that base-pairing occurs between the pre-mRNA and the U1 RNA.
The base-pairing between the U1 snRNA and the pre-mRNA is not extensive, but it is better when splicing occurs successfully than when it does not. These experiments illustrate a classic approach for testing the reality of proposed base-pairing schemes. If a scheme is important, then a base change in one component should interfere with the process. A compensating change in the second component (to restore base-pairing) should then reestablish the process. That is just what was observed in these experiments. A G → A change in the pre-mRNA inhibited splicing, whereas a compensating C → U change in the U1 RNA, which should restore base-pairing (GC → AU), reestablished splicing. Moreover, none of the other mutations in the U1 RNA (which would not restore base-pairing) overcame the splicing defect.
Reference: Zhuang Y & Weiner AM (1986) A compensatory base change in U1 snRNA suppresses a 5′ splice site mutation. Cell 46, 827–835.

6–45
A. If the RNAs are joined by splicing, then there should be a 5′-splice-site sequence (GU) in the leader RNA and a 3′-splice-site sequence (AG) in the actin RNAs. When joined according to the usual splicing rules, these splice sites should generate the actin mRNA sequences. As indicated in Figure 6–56 for actin gene 1, splicing could join the leader RNA and the actin RNA to generate actin mRNA with the correct sequence. If the RNAs are joined by splicing, then the leader gene contributes 22 nucleotides to the actin mRNA.
B. The leader RNA sequences and the actin RNA sequences cannot be part of the same precursor RNA because actin genes 1, 2, and 3 are not all transcribed in the same direction. (The observation that the leader RNA genes seem to be transcribed into a discrete 100-nucleotide transcript is a weaker argument against a precursor. Since the leader RNA genes are present in about 100 copies, it would be difficult to rule out that one or a few of them initiated a longer transcript.)
C. The presence of proper splice signals that could generate the correct actin mRNA, taken together with the impossibility of a single large precursor, suggests that the leader RNA may be attached to actin RNA by splicing between two separate RNA molecules. It turns out that trans-splicing produces about 1% of a nematode’s mRNAs; it is even more common in trypanosomes where it is responsible for all of the mRNAs. In both organisms a common exon is spliced onto the 5′ ends of many different RNA transcripts.
Reference: Krause M & Hirsh D (1987) A trans-spliced leader sequence on actin mRNA in C. elegans. Cell 49, 753–761.

Figure 6–56 Splicing of leader RNA and actin gene 1 RNA to generate actin mRNA (Answer 6–45). The GU at the 5′ splice site associated with the leader RNA is underlined, as is the AG at the 3′ splice site associated with the actin RNA. Also underlined are the leader segment identified by sequence comparisons and the start site for translation.
leader RNA   5′ XGUUUAAUUACCCAAGUUUGAG GUAAA....
actin RNA    5′ ....UUCAG GUACAUUAAAAACUAAUCAAAAUG
actin mRNA      XGUUUAAUUACCCAAGUUUGAG GUACAUUAAAAACUAAUCAAA AUG

6–46
A. The 5′ ends of the RNA molecules were labeled. Only labeled fragments show up in the autoradiograph (see Figure 6–22). Thus, if the shortest fragments (those that were at the bottom of the gel) are from the 5′ end, the 5′ end must have been labeled.
B. The bands corresponding to the As in the AAUAAA signal sequence are missing from the ladder of bands in polyadenylated and cleaved RNA (see Figure 6–22, lanes 3 and 4) because modification of any one of those As interfered with cleavage and with polyadenylation.
Thus, RNA molecules that carry a single modification in the signal sequence are not recognized by the components of the extract and, as a result, do not show up in the population of molecules that carry poly-A tails (see lane 3) or in the population of molecules that are cleaved (see lane 4).

C. The band at the arrow in Figure 6–22 is absent from the polyadenylated RNA but present in the cleaved RNA because modification of this A does not prevent cleavage, but it does prevent polyadenylation. Thus, RNA molecules with this A modified are present in the cleaved molecules (see Figure 6–22, lane 4) but are not present in the polyadenylated molecules (see lane 3).

D. The analysis of the missing bands in parts B and C indicates that the AAUAAA signal sequence is important for the cleavage of precursor RNAs and that the AAUAAA sequence and the single A are required for polyadenylation.

E. If the other end (the 3′ end) of the RNA molecules were labeled, it would have been possible to determine whether any of the As or Gs on the 3′ side of the cleavage site were important for polyadenylation. These experiments have been done; they show that no single modification 3′ of the polyadenylation site prevents polyadenylation. The sequence requirements on the 3′ side of the cleavage site (GU- or U-rich) are not so specific as those on the 5′ side and would not be expected to be inactivated by single changes.

Reference: Conway L & Wickens M (1987) Analysis of mRNA 3′-end formation by modification interference: the only modifications which prevent processing lie in AAUAAA and the poly(A) site. EMBO J. 6, 4177–4184.

6–47 A. The idea behind the oligonucleotide experiment was to try to cleave the RNA component of the snRNP that was suspected of interacting with the conserved sequence at the 3′ end of the histone precursor.
If the snRNP interacted by hybridizing to the precursor RNA, then an oligonucleotide that matched the sequence in the precursor RNA should be able to hybridize to the snRNA. Formation of a DNA–RNA hybrid would render the snRNA sensitive to cleavage by added RNase H. Cleavage of the snRNA in this critical region of interaction should render the extract incapable of processing the precursor. This result was the one observed for the mouse and consensus oligonucleotides.

B. The inability of the human oligonucleotide to block processing was not anticipated, since a human extract was being used. Examination of the hybrids that can form between the various oligonucleotides and the U7 snRNA reveals that the mouse and consensus oligonucleotides can hybridize perfectly for a 10-nucleotide and a 9-nucleotide stretch, respectively (Figure 6–57). By contrast, hybridization to the human oligonucleotide is split by an unmatched nucleotide into two segments of 6 and 4 nucleotides. The stability of pairing of two separate segments is not so great as for a continuous pairing segment. Hence, the human oligonucleotide does not pair with sufficient stability to render the U7 snRNA sensitive to RNase H cleavage.

    human oligonucleotide        3′ GTGTCGATGAAACCA 5′
    human U7 snRNA               5′ m3G-NNGUGUUACAGCUCUUUUAGAAUUUGUCUAGU 3′
    mouse oligonucleotide        3′ TTGTCGAGAAAGGC 5′
    consensus oligonucleotide    3′ TGGTCGAGAAAGAAA 5′

Reference: Mowry KL & Steitz JA (1987) Identification of human U7 snRNP as one of several factors involved in the 3′-end maturation of histone premessenger RNAs. Science 238, 1682–1687.

6–48 These results indicate that box elements C and D are important for the accumulation of U85 RNA. The presence of E2 RNA in Figure 6–25B, lanes 6 and 12, shows that the transfections were successful. Thus the absence of U85 RNA from those lanes is meaningful.
It is unclear from these studies whether the altered box elements prevent processing from the intronic RNA or render the processed RNA unstable.

Reference: Jady B & Kiss T (2001) A small nucleolar guide RNA functions both in 2′-O-ribose methylation and pseudouridylation of the U5 spliceosomal RNA. EMBO J. 20, 541–551.

6–49 For nucleotides in U5 snRNA that are true targets for U85 snoRNA-dependent modification, the expectation is that the modification will be dependent on pairing between U5 snRNA and U85 snoRNA. Thus, a bona fide modification should be present in the transfection with U2–U5, which can be modified by the endogenous U85 snoRNA; however, it should be absent in the transfection with U2–U5m, which cannot pair stably with the endogenous U85 snoRNA. Most importantly, modification of U2–U5m should be restored in the presence of U85m snoRNA, which can pair with U2–U5m. Both pseudouridine Ψ46 and methylated C45 behave according to these expectations (see Table 6–2). Thus, pseudouridine Ψ46 and 2′-O-methylation at C45 are both dependent on U85 snoRNA.

Methylated U41 is present in all transfections, indicating that it is modified by some other component of the cell in a way that is not dependent on the nearby sequences that were modified in U2–U5m. Pseudouridine Ψ43 is not formed under any conditions in these experiments, suggesting that U43 cannot be modified as part of the fragment of U5 that was inserted into U2. Thus, these experiments do not rule it out as a potential target for modification by U85 snoRNA. Nevertheless, studies with many other box H/ACA snoRNAs indicate that the guide sequences universally position the U to be converted to Ψ in the location shown for Ψ46 in Figure 6–26A and B.

Reference: Jady B & Kiss T (2001) A small nucleolar guide RNA functions both in 2′-O-ribose methylation and pseudouridylation of the U5 spliceosomal RNA. EMBO J. 20, 541–551.
Figure 6–57 Pairing between the three oligonucleotides and the human U7 snRNA (Answer 6–47).

FROM RNA TO PROTEIN

DEFINITIONS

6–50 Proteasome
6–51 Genetic code
6–52 Initiator tRNA
6–53 Anticodon
6–54 Ribozyme
6–55 Nonsense-mediated mRNA decay
6–56 Reading frame
6–57 Aminoacyl-tRNA synthetase
6–58 Prion disease
6–59 Molecular chaperone

TRUE/FALSE

6–60 False. Wobble pairing occurs between the third position in the codon and the first position in the anticodon.

6–61 False. Although the mechanism for identifying a target for digestion is a significant difference, an equally important, perhaps more important, difference is the processivity of digestion. In contrast to a simple protease, which cleaves a substrate’s polypeptide chain just once before dissociating, the proteasome keeps the entire substrate bound until all of it has been converted to short peptides.

THOUGHT PROBLEMS

6–62 The amino acids encoded in each of the three reading frames are shown in Figure 6–58. If this segment of RNA encoded part of a larger protein, it would have to be translated in reading frame 1, which is the only one that does not contain a stop codon.

    FRAME 1    AGU CUA GGC ACU GA-3′     S L G T
    FRAME 2    A GUC UAG GCA CUG A-3′    V * A L
    FRAME 3    AG UCU AGG CAC UGA-3′     S R H *

6–63 The only codon assignments consistent with the observed changes, and with the assumption that single-nucleotide changes were involved, are GUG for valine, GCG for alanine, AUG for methionine, and ACG for threonine. It is unlikely that you would be able to isolate a valine-to-threonine mutant in one step because that would require two nucleotide changes. Typically, two changes would be expected to occur at a frequency equal to the product of the frequencies for each of the single changes; hence, the double mutant would be very rare.

6–64 A. UUUUUUUUUUUU… codes for a polymer of phenylalanine.

B. AUAUAUAUAUAU… codes for a polymer of alternating isoleucines and tyrosines.
Because the start point of the ribosome on the RNA is random, the ribosomes will generate a mixture of polymers, some of which start with isoleucine and some with tyrosine.

C. AUCAUCAUCAUC… codes for a mixture of three different polymers. Ribosomes start translation in each of the three reading frames: AUC–AUC–AUC–AUC–… codes for a polymer of isoleucine; UCA–UCA–UCA–UCA–… codes for a polymer of serine; and CAU–CAU–CAU–CAU–… codes for a polymer of histidine.

    3-frame translation of AGUCUAGGCACUGA-3′
    frame 1    S L G T
    frame 2    V * A L
    frame 3    S R H *

Figure 6–58 Amino acids encoded in the three reading frames of an RNA (Answer 6–62). The amino acids encoded in each reading frame are shown separately and all together, as they are usually represented. Asterisks identify stop codons.

6–65 A. The genetic code can be used to convert the two amino acid sequences into a set of potential mRNA sequences. At the sites where two or more nucleotides are shown, different mRNA sequences are consistent with the same sequence of encoded amino acids.

    wild type:
    N        M    N        G            K
    AA(U/C)  AUG  AA(U/C)  GG(U/C/A/G)  AA(A/G)

    mutant:
    N        M    I          W    Q        I          C        V            M    K        D
    AA(U/C)  AUG  AU(U/C/A)  UGG  CA(A/G)  AU(U/C/A)  UG(U/C)  GU(U/C/A/G)  AUG  AA(A/G)  GA(U/C)

Comparison of the potential mRNA sequences shows that the mutant carries an extra U at the eighth position, the first one that differs between the two potential mRNA sequences. If you compare only the codons for N (wild type) with I (mutant) there are three possibilities for the difference: insertion of a U, deletion of an A, or a nucleotide change of A → U. Comparison of the next codon in the two sequences rules out a nucleotide change, which would leave the adjacent codon unaffected and thus would not give rise to a frameshift mutation. It also allows one to distinguish between insertion and deletion of a nucleotide.
Note that the Gs in the first and second positions of the glycine codon (GG–) in the wild type have become the Gs in the second and third positions in the tryptophan codon (–GG) in the mutant. This shift can only be explained by an insertion. Thus, the frameshift mutation arose by insertion of an AT base pair at the eighth position in this DNA sequence.

B. Removing the extra U from the mutant sequence resolves some of the ambiguity in the wild-type sequence and extends the sequence for the wild type, which can be converted back into an amino acid sequence as shown below.

    codon        1        2    3    4    5    6          7        8            9    10
    mRNA         AA(U/C)  AUG  AAU  GGC  AAA  U(U/C/A)U  G(U/C)G  U(U/C/A/G)A  UGA  A(A/G)G
    amino acid   N        M    N    G    K    F/S/Y      V/A      L/S/*/*      *

The first seven codons in the extended mRNA clearly code for amino acids, although the identities of some are ambiguous because the mRNA sequence is not fully defined. The eighth codon could either code for an amino acid or be a stop codon. The ninth codon is definitely a stop codon. Thus, since the asparagine (N) at the beginning of the peptide sequence is amino acid 263, the intact polypeptide would be either 269 or 270 amino acids in length. Two stop codons in tandem are commonly found at the end of coding sequences.

6–66 Mutations of the type described in 2 and 4 are often the most harmful. In both cases, the reading frame would be changed. Because these frameshifts occur early in the coding sequence or in the middle of it, the encoded protein will contain a nonsensical and usually truncated sequence of amino acids. In contrast, a reading frameshift that occurs toward the end of the coding sequence, as described in 1, will result in a largely correct protein that may be functional. Deletion of three consecutive nucleotides, as in scenario 3, leads to the deletion of one amino acid, but does not alter the reading frame. The deleted amino acid may or may not be important for the folding or activity of the protein.
In many cases such mutations are silent; that is, they have insignificant consequences for the organism. Substitution of one nucleotide for another, as in scenario 5, is often completely harmless, because it does not change the encoded amino acid. In other cases it may change an amino acid, which may be deleterious or benign, depending on the location and functional significance of that particular amino acid. Often, the most deleterious kind of single-nucleotide change creates a new stop codon, which gives rise to a truncated protein.

6–67 A. A genetic code that used pairs of nucleotides would have 16 different codons (4 possible nucleotides in the first position × 4 possible nucleotides in the second position). Thus, it could specify a maximum of 16 different amino acids.

B. A triplet code that depended only on codon composition would have 20 different codons (4 codons composed all of one base; 12 codons with two bases the same and one different; and 4 codons with three different bases). Such a code could specify a maximum of 20 different amino acids.

C. It is relatively easy to see how a doublet code could be translated by a mechanism similar to that used with the standard genetic code. It is more difficult to see how the nucleotide composition of a stretch of three nucleotides could be translated without regard to the order of nucleotides, because base-pairing could no longer be used. An AUG, for example, would not base-pair with the same anticodon as a UGA.

6–68 In present-day cells, there is some wobble in the matching of codons to anticodons. In several cases, the same tRNA pairs with multiple codons that specify the same amino acid but differ in their nucleotide sequence, generally at the third position of the codon.
It seems likely that in the early biological world, without highly evolved ribosomes to help in the pairing process, the converse may also have been true: several different tRNAs, with similar anticodons, may have been able to bind to the same codon. This would have played havoc with the translation of the genetic message into protein, unless the amino acids carried by these tRNAs were chemically similar. In the absence of perfect specificity, natural selection may have operated to ensure that tRNAs with related anticodons carried chemically similar amino acids.

Alternatively, in the early world, before modern aminoacyl-tRNA synthetases had evolved, there was probably some ambiguity in the matching of tRNAs with appropriate amino acids. The same tRNA might have become coupled to any of a number of amino acids that were chemically similar. One can imagine the evolution of the genetic code by refinement of a matching process that was originally imprecise and gave only a blurred relationship between sets of roughly similar codons and sets of roughly similar amino acids.

6–69 The rules for wobble pairing between the anticodon and the codon, expressed both ways, are shown in Table 6–5. It is striking that A in the wobble position of the anticodon in eucaryotes does not have a pairing partner in codons. It turns out that A is not used in the wobble position in eucaryotic tRNAs. In many cases an A is encoded in the wobble position, but it is changed to inosine (I) after transcription, giving rise to a mature tRNA that recognizes U or C in the wobble position of the codon.

Reference: Lander ES et al (2001) Initial sequencing and analysis of the human genome. Nature 409, 860–921.

Table 6–5 Rules for wobble base-pairing between codon and anticodon (Answer 6–69).

    WOBBLE CODON BASE        POSSIBLE ANTICODON BASE
    Bacteria
      U                      A, G, or I
      C                      G or I
      A                      U or I
      G                      C or U
    Eucaryotes
      U                      G or I
      C                      G or I
      A                      U
      G                      C

    WOBBLE ANTICODON BASE    POSSIBLE CODON BASE
    Bacteria
      U                      A or G
      C                      G
      A                      U
      G                      U or C
      I                      U, C, or A
    Eucaryotes
      U                      A
      C                      G
      A                      (none)
      G                      U or C
      I                      U or C

6–70 In eucaryotes a minimum of 45 tRNAs would be required to recognize all 61 codons, given the rules for wobble base-pairing. Pairs of codons that end in U or C always encode the same amino acid (for example, CAU and CAC encode histidine, and GGU and GGC encode glycine). Thus a single tRNA with an I (or a G) in the wobble position of the anticodon would be required for each such pair of codons (see Table 6–5, Answer 6–69). The 16 pairs of such codons (32 codons) would require 16 tRNAs. Each of the remaining 29 codons, which end in either an A or a G, would require a specific tRNA with a corresponding U or C in the wobble position. Thus, the minimum number of tRNAs is 16 plus 29, or 45.

6–71 The single codon for tryptophan is 5′-UGG. The anticodon of the normal tryptophan tRNA is 5′-CCA, which pairs specifically with this codon. A mutation that changes the anticodon to 5′-UCA would allow the tRNA to recognize the 5′-UGA stop codon, which would lead to insertion of tryptophan at UGA and prevent termination of translation. Because of wobble (see Table 6–5), the mutant anticodon would also recognize the normal 5′-UGG codon, so that, in principle, its ability to insert tryptophan at the normal UGG codons would not be compromised.

Many genes use UGA codons as the natural stop sites for their encoded proteins. These stop codons would also be affected by the mutant tRNA. In reality there is a competition between the mutant tRNA and the termination factors. Whenever the tRNA wins the race, the affected proteins would be made with additional amino acids at their C-terminal ends. The additional lengths would depend on the number of codons before the ribosomes encounter another stop codon in the mRNA.
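The wobble rules of Table 6–5 can be checked mechanically. The following is a minimal sketch in Python, not part of the original answers; the names `WOBBLE`, `codons_read`, and `minimum_trnas_eucaryotes` are illustrative choices, and the code assumes the standard genetic code. It lists the codons read by the normal (5′-CCA) and mutant (5′-UCA) tryptophan anticodons of Answer 6–71 under the bacterial rules, and recounts the 45 tRNAs of Answer 6–70.

```python
from itertools import product

# Standard genetic code, codons ordered with each position cycling through U, C, A, G.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Table 6-5: wobble base of the anticodon -> codon third-position bases it can read.
WOBBLE = {
    "bacteria": {"U": "AG", "C": "G", "A": "U", "G": "UC", "I": "UCA"},
    "eucaryotes": {"U": "A", "C": "G", "A": "", "G": "UC", "I": "UC"},
}
WATSON_CRICK = {"A": "U", "U": "A", "G": "C", "C": "G"}


def codons_read(anticodon, kingdom="bacteria"):
    """Return the codons (5'->3') read by an anticodon written 5'->3'."""
    wobble, middle, last = anticodon
    first = WATSON_CRICK[last]      # codon position 1 pairs with anticodon position 3
    second = WATSON_CRICK[middle]   # codon position 2 pairs with anticodon position 2
    return {first + second + third for third in WOBBLE[kingdom][wobble]}


def minimum_trnas_eucaryotes():
    """Count tRNAs as in Answer 6-70: one per U/C-ending pair, one per A- or G-ending codon."""
    sense = [codon for codon, aa in CODON_TABLE.items() if aa != "*"]
    shared = {codon[:2] for codon in sense if codon[2] in "UC"}
    single = [codon for codon in sense if codon[2] in "AG"]
    return len(shared) + len(single)


print(sorted(codons_read("CCA")))   # normal tryptophan tRNA
print(sorted(codons_read("UCA")))   # mutant (suppressor) tRNA
print(minimum_trnas_eucaryotes())
```

The count treats every pair of codons ending in U or C as synonymous, which is the property of the standard code that Answer 6–70 relies on.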
The potential chaos that such mutations might cause is mitigated by two factors: the efficiency of translation of stop codons by such mutant tRNAs is usually low, and many bacterial genes are ‘protected’ by double stop codons at their ends. In reality, such suppressors have been invaluable for genetic studies in bacteria.

6–72 This experiment beautifully demonstrates that the ribosome does not check the amino acid that is attached to a tRNA. Once an amino acid has been coupled to a tRNA, the ribosome will ‘blindly’ incorporate that amino acid according to the match between the codon and anticodon. We can therefore conclude that a significant part of the correct reading of the genetic code (namely, the matching of a codon with the correct amino acid) is performed by the synthetase enzymes that attach amino acids to tRNAs.

Reference: Chapeville F, Lipmann F, von Ehrenstein G, Weisblum B, Ray WJ & Benzer S (1962) On the role of soluble ribonucleic acid in coding for amino acids. Proc. Natl Acad. Sci. U.S.A. 48, 1086–1092.

6–73 One effective way of driving a reaction to completion is to remove one of the products. The flow of substrates to products then increases to reestablish the equilibrium ratio, which is the principle of mass action. All three of the products of this reaction are removed. The concentration of AMP is constantly reduced by conversion to ADP and then to ATP by other reactions in the cell. Similarly, the aminoacyl-tRNAs are used in protein synthesis, constantly decreasing their concentrations. But by far the most dramatic influence is the removal of PPi by hydrolysis to two phosphates. That reaction yields as much free energy as the hydrolysis of ATP to ADP, which means that essentially all of the PPi will be converted to free phosphates. As a result, the linked reactions for charging a tRNA and hydrolyzing PPi (the reactions as they occur in cells) have a ΔG° of –6.9 kcal/mole.

6–74 A.
The ratio of N-terminal to total radioactivity will increase with increasing time of exposure. Because you isolate only complete protein for your analysis, the position that will be labeled at the shortest time point will be the C-terminus. At the shortest time point the radioactivity at the N-terminus will be nearly nonexistent, giving a very low ratio of N-terminal to total radioactivity. With increasing time more and more protein will carry label at the N-terminus, so the ratio is expected to rise. By the time the N-terminus becomes labeled, all the rest of the leucines in the protein will also be labeled; thus, at late times the ratio of N-terminal to total radioactivity will equal the ratio of N-terminal to total leucines, which is 1:5 or 0.2.

6–75 The proportion of a cell’s total energy devoted to protein synthesis is typically determined by measuring oxygen consumption in the presence and absence of inhibitors of protein synthesis. Because oxygen is used principally for generation of ATP via oxidative phosphorylation, and nearly all the cell’s energy is derived from oxidative phosphorylation, oxygen consumption is a fairly direct measure of energy usage. The fractional drop in oxygen consumption when protein synthesis is inhibited indicates the proportion of the cell’s energy normally devoted to protein synthesis.

6–76 A. 5′-GUAGCCUACCCAUAGG-3′

B. This short mRNA encodes three different peptides because there are three different reading frames. In the second reading frame, the first codon is the stop codon UAG; thus, it is unlikely that the subsequent codons will be translated.

    5′-GUAGCCUACCCAUAGG-3′
    Frame 1    V A Y P *
    Frame 2    * P T H R
    Frame 3    S L P I

The other possible mRNA from this DNA would read

    5′-CCUAUGGGUAGGCUAC-3′
    Frame 1    P M G R L
    Frame 2    L W V G Y
    Frame 3    Y G * A

Thus, the sequence of the peptides encoded by the complementary DNA strand would be completely different.
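The frame-by-frame translations listed for Answer 6–76 can be reproduced with a short script. This is a sketch under the assumption of the standard genetic code; the function names `reverse_complement` and `translate_frames` are illustrative and do not come from the problem. Stop codons are reported as asterisks rather than ending translation, to match the listing above.

```python
from itertools import product

# Standard genetic code, codons ordered with each position cycling through U, C, A, G.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(rna):
    """Return the complementary RNA strand, written 5'->3'."""
    pairs = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(rna))


def translate_frames(rna):
    """Translate all complete codons in frames 1, 2 and 3; '*' marks a stop codon."""
    peptides = []
    for offset in range(3):
        codons = [rna[i:i + 3] for i in range(offset, len(rna) - 2, 3)]
        peptides.append("".join(CODON_TABLE[codon] for codon in codons))
    return peptides


first_mrna = "GUAGCCUACCCAUAGG"
other_mrna = reverse_complement(first_mrna)

for label, rna in (("first mRNA", first_mrna), ("other mRNA", other_mrna)):
    print(label, "5'-" + rna + "-3'")
    for number, peptide in enumerate(translate_frames(rna), start=1):
        print("  Frame", number, " ".join(peptide))
```

Reversing the strand before complementing it is what keeps both sequences written 5′ to 3′.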
Be careful to keep the polarity of the strands correct; don’t fall into the trap of thinking that the complementary sequence of the first mRNA is 5′-CAUCGGAUGGGUAUCC-3′, which is incorrect because it has the wrong polarity.

C. If translation begins at the 5′ end of the RNA, the synthesized protein would be valine-alanine-tyrosine-proline (VAYP). Only after a peptide bond has been formed between alanine and tyrosine will tRNA^Ala leave the ribosome. Thus, the next tRNA that will bind to the ribosome after tRNA^Ala has left is tRNA^Pro. When the amino group of alanine forms a peptide bond, the ester bond between valine and tRNA^Val is broken, tRNA^Val moves from the P-site to the E-site (exit site), and tRNA^Ala moves from the A-site to the P-site.

6–77 When EF-Tu has positioned an aminoacyl-tRNA in the acceptor site on the ribosome, it hydrolyzes its bound GTP and exits, leaving the aminoacyl-tRNA behind. The first delay comes about because the rate of GTP hydrolysis is faster for a correct codon–anticodon pair than for an incorrect one. As a result, an incorrectly bound tRNA has more time to dissociate from the ribosome. The second delay occurs between dissociation of EF-Tu and full accommodation of the tRNA into the A-site. This time delay, which is also shorter for correct than incorrect tRNAs, allows a second opportunity for the incorrect tRNA to dissociate from the ribosome. An incorrect tRNA will, on average, dissociate more rapidly than a correct one because its codon–anticodon match contains fewer hydrogen bonds.

6–78 In eucaryotic cells protein synthesis is normally initiated by scanning from the 5′ end of the mRNA until the first AUG codon is found. (Sometimes the second or third AUG codon may be used instead, a phenomenon known as leaky scanning.) This mechanism of initiation ensures that ribosomes will all start translating near the 5′ end of the mRNA.
When the ribosomes complete synthesis of the protein, they fall off the mRNA and must reinitiate by scanning from the 5′ end. By contrast, in procaryotic cells protein synthesis is initiated by base-pairing between mRNA sequences adjacent to an initiation AUG codon and sequences in the 16S rRNA of the small ribosomal subunit. The procaryotic initiation strategy allows ribosomes to recognize several start sites in the same mRNA. This key difference in mechanism underlies their ability to make several proteins from a single polycistronic mRNA.

6–79 A broken mRNA when translated would produce a truncated protein that could be harmful to the cell. A protein fragment can retain some of the functions of the whole protein, allowing it, for example, to bind to a target protein but trap it in an unproductive complex. Alternatively, a protein fragment can display new, aberrant binding surfaces that allow it to bind to novel partners, interfering with their function.

6–80 A. Edeine specifically inhibits initiation of protein synthesis by preventing the joining of the 60S ribosomal subunit to the 40S subunit/mRNA/initiator tRNA complex. Since elongation is not blocked, ribosomes that have already begun synthesis complete their individual chains and fall off the mRNA, leaving attached only the small subunit and the initiator tRNA. Edeine is an antibiotic produced by certain strains of Bacillus brevis.

B. A lag occurs before protein synthesis shuts off because edeine inhibits initiation but has no effect on elongation. Thus, a ribosome that has just started making a new polypeptide is free to complete it. Incorporation of label continues for just the length of time it takes to complete the protein (in this case, the globin chains of hemoglobin), which takes about a minute.

C.
If cycloheximide (or any other elongation inhibitor) is added at the same time as an initiation inhibitor, the polyribosomes are ‘frozen.’ Polyribosome breakdown by initiation inhibitors requires ribosome movement, which is blocked by elongation inhibitors.

Reference: Safer B, Kemper W & Jagus R (1978) Identification of a 48S preinitiation complex in reticulocyte lysate. J. Biol. Chem. 253, 3384–3386.

6–81 D. The formation of one peptide bond, but no more, eliminates all choices except D. If farsomycin inhibited formation of the 80S initiation complex (choice A), inhibited binding of aminoacyl-tRNAs to the A-site (choice B), or inactivated peptidyl transferase (choice C), no peptide would have been formed. If it interfered with chain termination and release (choice E), the entire peptide would have been made.

6–82 In a well-folded protein, the majority of hydrophobic amino acids will be sequestered in the interior away from water. Exposed hydrophobic patches thus indicate that a protein is abnormal in some way. Some proteins initially fold with exposed hydrophobic patches that are used in binding to other proteins, ultimately burying those hydrophobic amino acids as well. As a result, hydrophobic amino acids are usually not exposed on the surface of a protein, and any significant patch is a good indicator that something has gone awry. The protein may have failed to fold properly after leaving the ribosome, it may have suffered an accident that partly unfolded it at a later time, or it may have failed to find its normal partner subunit in a larger protein complex.

6–83 Molecular chaperones fold like any other protein. Molecules in the act of synthesis on ribosomes are bound by Hsp70 chaperones, and incorrectly folded molecules are helped by Hsp60-like chaperones. That they function as chaperones when they have folded correctly makes no difference to the way they are treated before they reach their final, functional conformation.
Of course, properly folded Hsp60-like and Hsp70 chaperones must already be present to help fold the newly made chaperones. At cell division, each daughter cell inherits a starter set of such chaperones from the parental cell.

6–84 E1 is the ubiquitin-activating enzyme. It repetitively donates activated ubiquitin monomers to the E2 component of the E2-E3 ubiquitin ligase, which then transfers ubiquitin units repetitively to bound substrate molecules. Particular combinations of E2 and E3, of which there are many in the cell, provide binding specificity so that only certain species of proteins are targeted for ubiquitylation and destruction.

CALCULATIONS

6–85 A. As shown in Figure 6–59, the rate of synthesis is linear with time. The curvature so apparent in the autoradiograph in Figure 6–29 results from the nonlinear migration of proteins in SDS polyacrylamide gels.

B. The rate of protein synthesis can be determined from the slope of the line in Figure 6–59. This system is synthesizing roughly 52,000 daltons of protein per 10 minutes, or 5200 daltons per minute, which corresponds to about 47 amino acids per minute [(5200 daltons/minute)/(110 daltons/amino acid)]. This rate is less than the rate in E. coli, which is about 10 times faster. The rate is also about three times less than that of globin synthesis in the same reticulocyte lysate. As discussed in part C, part of the reason for the low rates may be that the mix of tRNAs in the reticulocyte lysate is not optimal for this plant virus protein.

C. The autoradiograph contains many bands, rather than just a few, because ribosomes keep loading onto the mRNA throughout the course of the experiment. You could obtain the theoretical result in Figure 6–30B by adding an inhibitor of initiation after 5 minutes or, alternatively, by adding unlabeled methionine in vast excess after 5 minutes.
The presence of discrete bands rather than a continuous background fuzz suggests that there are specific hang-up points along the mRNA, perhaps where ribosomes must wait for rare tRNAs. The tRNA population in the reticulocyte is specialized for making globin, not a protein from a plant virus!

6–86 A. Since an average protein contains about 455 amino acids [(50,000 d/protein) × (amino acid/110 d)], it will take a muscle cell about 3.8 minutes to make it [(455 amino acids) × (sec/2 amino acids) × (min/60 sec)]. Since titin is 60 times the size of an average protein, the muscle cell will require 3.8 hours to make it [(3,000,000 d/titin) × (amino acid/110 d) × (sec/2 amino acids) × (hr/3600 sec)].

B. It will take a muscle cell about 23 minutes to transcribe an average gene and 23 hours to transcribe titin. For the average protein 455 amino acids corresponds to 1365 nucleotides of RNA [(3 nt/codon) × (455 codons)]. Given that 5% of the initial transcript is converted to mRNA, the initial transcript is 2.7 × 10⁴ nucleotides (1365 nt × 20), which would require about 23 minutes to transcribe [(2.73 × 10⁴ nt) × (sec/20 nt) × (min/60 sec)]. Because titin is 60 times as big, the muscle cell will require about 23 hours to transcribe it.

6–87 The energy cost of translation and transcription will be equal when 30 protein molecules have been made from one mRNA. Protein synthesis requires four high-energy phosphate bonds per codon (per three nucleotides). Transcription consumes six high-energy phosphate bonds to make a codon, but also consumes 19 times more energy synthesizing RNA that will be discarded (95%). Thus, transcription consumes a grand total of 120 high-energy phosphate bonds per codon (6 + 114), compared to four per codon for translation. The ratio of energy costs per codon (120/4 = 30) defines the number of proteins that will have been made when the energy cost of translation matches that of transcription.
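The timing estimates in Answer 6–86 and the energy comparison that follows it are simple unit conversions, and can be rechecked with the sketch below. All constants are the values quoted in those answers; the function and constant names are illustrative. Unlike the text, the sketch does not round to 455 amino acids before continuing, so intermediate values differ slightly while the rounded results agree.

```python
DALTONS_PER_AMINO_ACID = 110   # average residue mass used in the answer
AA_PER_SECOND = 2              # translation rate used in the answer
NT_PER_SECOND = 20             # transcription rate used in the answer
TRANSCRIPT_TO_MRNA = 20        # only 5% of the primary transcript becomes mRNA


def synthesis_times(protein_daltons):
    """Return (translation, transcription) times in seconds for one protein."""
    amino_acids = protein_daltons / DALTONS_PER_AMINO_ACID
    translation_s = amino_acids / AA_PER_SECOND
    primary_transcript_nt = 3 * amino_acids * TRANSCRIPT_TO_MRNA
    transcription_s = primary_transcript_nt / NT_PER_SECOND
    return translation_s, transcription_s


def break_even_proteins(translation_bonds=4, transcription_bonds=6):
    """Proteins per mRNA at which translation has cost as much as transcription."""
    total_transcription_bonds = transcription_bonds * TRANSCRIPT_TO_MRNA  # 6 + 114 = 120
    return total_transcription_bonds / translation_bonds


average = synthesis_times(50_000)
titin = synthesis_times(3_000_000)
print(f"average protein: {average[0] / 60:.1f} min to translate, "
      f"{average[1] / 60:.0f} min to transcribe")
print(f"titin: {titin[0] / 3600:.1f} hr to translate, "
      f"{titin[1] / 3600:.0f} hr to transcribe")
print(f"break-even: {break_even_proteins():.0f} proteins per mRNA")
```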
Because most mRNAs are used to make hundreds to thousands of proteins, translation consumes a much higher fraction of the cell’s energy than does transcription. 5 10 15 20 25 time (minutes) Figure 6–59 Rate of synthesis of a TMV protein (Answer 6–85). A140 Chapter 6: How Cells Read the Genome: From DNA to Protein 6–88 A. Since the bacteria were labeled for one generation, which represents a doubling in mass, 4 mg of the 8 mg of flagellin isolated from the gel were synthesized in the presence of 35S-cysteine. The amount of radioactivity in the sample indicates that about 1 out of every 1670 flagellin (flgn) molecules contains a cysteine. Cys 300 cpm Cys mmol pmol Cys 4 ¥ 104 mg flgn ¥ 6 ¥ = ¥ flgn 4 mg flgn mmol flgn 10 pmol 5 ¥ 103 cpm –2 = 6 ¥ 10 pmol Cys 100 pmol flgn Cys = 6 ¥ 10–4 flgn which is equal to 1 cysteine per 1670 flagellin molecules [1/(6 ¥ 10–4)]. B. The normal codons for cysteine are UGU and UGC. Thus, the error in anticodon–codon interaction is a mistake at the first position of the codon (third position of the anticodon). The experiment described, as well as others, suggest that ribosomes tend to mistake U for C and C for U in the first two positions of the codon, and C and U for A in the first position. C. Assuming that all six arginine codons are equally frequent, there should be six sensitive (CGC and CGU) arginine codons [(2/6) ¥ 18] in a flagellin molecule. Therefore, the actual error frequency per codon-at-risk is 1 cysteine flagellin molecule ¥ 1670 flagellin molecules 6 sensitive codons error frequency = 10–4 error frequency = D. If the probability of making a mistake at each codon is 10–4, the probability of not making a mistake at each codon is (1 – 10–4). The probability of not making a mistake at n codons is then (1 – 10–4)n. Thus, the percentage of correctly synthesized molecules 100 amino acids in length is (1 – 10–4)100, or 99%. For a protein 1000 amino acids long, 90% are correct. 
For a protein 10,000 amino acids long, only 37% are correct. Given these sorts of estimates, it is perhaps not surprising that proteins more than 3000 amino acids long are rare. If you are curious, you might calculate the fraction of titin molecules that you would expect to be made correctly (see Problem 6–86).

Reference: Edelman P & Gallant J (1977) Mistranslation in E. coli. Cell 10, 131–137.

DATA HANDLING

6–89
A. A simple change of anticodon allows tRNA^Val to be charged by IleRS at about 60% normal efficiency. Thus, the anticodon is the most important part for charging. As the rest of the tRNA^Val molecule becomes more like tRNA^Ile, the efficiency steadily increases, suggesting that additional sequences that aid in the charging with isoleucine are dispersed throughout tRNA^Ile.
B. Results with the chimeric tRNAs show that a tRNA^Val carrying just the D-loop and anticodon from tRNA^Ile is very nearly as effective for valine editing as normal tRNA^Ile. In fact, a close examination of the sequences of the D-loops from the two tRNAs reveals just three nucleotide differences. These nucleotides are located at the elbow in the ‘L’ shaped three-dimensional structure of tRNAs (Figure 6–60).
C. Generally no. However, the D-loop is crucial to valine editing and it also improves the efficiency of tRNA^Ile charging by IleRS.

Reference: Hale SP, Auld DS, Schmidt E & Schimmel P (1997) Discrete determinants in transfer RNA for editing and aminoacylation. Science 276, 1250–1252.

Figure 6–60 Schematic structure of tRNA^Ile (Answer 6–89). The acceptor stem and anticodon are labeled. Arrows indicate the locations of the nucleotide differences between the D-loops in tRNA^Ile and tRNA^Val. Dotted lines indicate hydrogen bonds between nucleotides that help to establish the three-dimensional structure.
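As a numerical check on the error-frequency estimate in Answer 6–88D, the short Python sketch below evaluates (1 – 10⁻⁴)ⁿ for the three protein lengths discussed there. It is only an illustration: the function name `fraction_correct` and the constant `TITIN_LENGTH` are hypothetical, and the titin length is an assumption derived from the figures in Answer 6–86 (3,000,000 d at 110 d per amino acid).

```python
def fraction_correct(n_codons: int, error_per_codon: float = 1e-4) -> float:
    """Probability that every one of n_codons is translated without a mistake."""
    return (1.0 - error_per_codon) ** n_codons


# Assumed titin length, taken from the numbers used in Answer 6-86:
# 3,000,000 daltons divided by 110 daltons per amino acid.
TITIN_LENGTH = round(3_000_000 / 110)

if __name__ == "__main__":
    # Lengths discussed in Answer 6-88D, followed by the titin estimate.
    for n in (100, 1_000, 10_000, TITIN_LENGTH):
        print(f"{n:>6} amino acids: {fraction_correct(n):.1%} made correctly")
```

Under these assumptions, the same formula suggests that only a small percentage of titin molecules, somewhere between 6% and 7%, would be made without any error.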
6–90
A. The data in Figure 6–32 indicate that the N-terminus of the protein is synthesized first. The steadily decreasing level of radioactivity from the N-terminus to the C-terminus is exactly what you would expect if synthesis began at the N-terminus. As illustrated in Figure 6–61A, all the ribosomes carry a labeled lysine at position 8 in their nascent chains, but the ribosome at the 5′ end of the mRNA has not yet reached the lysine at position 16. Thus, when digested with trypsin, all of these nascent chains will yield a labeled N-terminal peptide, but a smaller fraction will yield the second peptide. Fewer still will contain the third peptide, and so on. Almost none of the ribosomes will carry a nascent chain with the labeled lysine nearest the C-terminus.
B. The lines for the α and β chains in Figure 6–32 are very similar, with nearly identical intercepts on both axes, which indicates that roughly equal numbers of each chain are being synthesized. However, there is not enough information to decide whether the numbers of α- and β-globin mRNA molecules are equal. You would need to know how many ribosomes there were on each mRNA (the average polyribosome size for α- and β-globin mRNAs) to deduce their relative abundance from these kinds of data. Actually, there is about twice as much α-globin mRNA as β-globin mRNA, but the α-globin mRNA is less efficiently translated; that is, fewer ribosomes initiate synthesis on α-globin mRNA per unit time than on β-globin mRNA. These factors cancel out to give a balanced production of the two chains.
C.
The graph in Figure 6–32 hits zero right at the end of the coding region, which indicates that chains are released from ribosomes as soon as they encounter the stop codon, or at least that they do so without a measurable pause on this time scale.
D. If there were a significant roadblock to ribosome movement, the data would resemble that in Figure 6–33A. A roadblock would result in more densely packed ribosomes in front of the block and less densely packed ribosomes beyond the block. The consequences of normal and inhibited ribosome movement are illustrated schematically in Figure 6–61B and C.

Figure 6–61 Relationship of ribosome position to peptide length and labeling pattern (Answer 6–90). (A) Lengths of peptides associated with ribosomes at various positions along β-globin mRNA. Numbers refer to positions of the first two lysines. (B) Pattern of peptide labeling for evenly spaced ribosomes. (C) Pattern of peptide labeling for ribosomes whose movement is inhibited at a point about midway down the mRNA. Peptides associated with each ribosome are shown on the graphs in B and C as lines. Small circles correspond to the C-termini of the polypeptides and are aligned immediately below the ribosome on which the peptides are synthesized. Dashed lines through the small circles show the expected patterns of peptide labeling.

6–91
A. The DNA sequence GGG TAT CTT TGA CTA CGA CGC C should not encode the protein sequence of RF2, since TGA (UGA in the mRNA) is a termination codon. It appears that this sequence must break the usual rules of the triplet code, with a leucyl-tRNA decoding the quadruplet CTTT, as shown below.

GGG TAT CTTT GAC TAC GAC GCC

In essence, the ribosome must shift its reading frame in the middle of the gene! Frameshift mutations were originally isolated by Seymour Benzer in his work on the rII genes of bacteriophage T4 and exploited by Francis Crick in a proof of the triplet nature of the genetic code.
Later, mutant tRNA molecules that could read four bases at a time were isolated by clever genetic selection and shown to suppress certain frameshift mutations. It came as a great surprise, however, to find natural examples of frameshift suppression. The first example was found in the bacteriophage T7 gene 10. Since then, several retroviruses and retroposons have been found to use frameshift suppression of termination codons as a way of making minor gene products. The mechanism of suppression in these cases is not clear.
B. The occurrence of an in-frame suppressible UGA codon (which is recognized uniquely by RF2) in the sequence of RF2 immediately suggests a novel form of gene control. Although the mechanism of frameshifting is undefined, frameshifting and termination at the UGA codon probably compete with one another. When the level of RF2 in the cell is high, termination should occur more frequently at the UGA codon than when the level of RF2 is low. Thus, new RF2 would be synthesized infrequently when its levels were already adequate, but if the levels fell, the chances of ribosomal frameshifting would increase and more RF2 would be made. This arrangement therefore appears to be an elegant form of autoregulation.

References: Craigen WJ, Cook RG, Tate WP & Caskey CT (1985) Bacterial peptide chain release factors: conserved primary structure and possible frameshift regulation of release factor 2. Proc. Natl Acad. Sci. U.S.A. 82, 3616–3620.
Jacks T & Varmus H (1985) Expression of the Rous sarcoma virus pol gene by ribosomal frameshifting. Science 230, 1237–1242.

6–92 SmpB evidently plays no role in the charging of tmRNA with alanine, since that reaction is unaffected by the presence or absence of SmpB. SmpB is critical for the association of tmRNA with ribosomes (see Figure 6–35A), which presumably explains why protein fragments are not tagged and degraded in SmpB-deficient cells (see Figure 6–35B).
Although these studies identify roughly where in the process SmpB acts, they do not define its precise function.

Reference: Karzai AW, Susskind MM & Sauer RT (1999) SmpB, a unique RNA-binding protein essential for the peptide-tagging activity of SsrA (tmRNA). EMBO J. 18, 3793–3799.

6–93
A. The sequence data for the Tetrahymena protein are unusual because they indicate that UAG and UAA, which are stop codons in other organisms, specify glutamine (Q) in Tetrahymena.
B. The minor protein above the full-length, 116-kd protein is produced from the pure TMV mRNA by readthrough of the normal stop codon. Although it is difficult to be sure exactly how such a rare event occurs, the amount of this protein is thought to represent the frequency with which the reticulocyte translation system mistakenly inserts an amino acid at the site of the stop codon instead of terminating properly. It is a little surprising that a second termination codon is not encountered for 506 codons (about 50 kd of additional protein).
C. Given that Tetrahymena uses UAG and UAA as codons for glutamine, the increase in the proportion of the readthrough TMV protein is most likely due to the presence of a tRNA^Gln species with an anticodon complementary to the normal TMV stop codon (which is UAG). The addition of Tetrahymena RNA causes a small shift in the proportions because it contains some charged tRNA^Gln. The cytoplasm causes a larger shift because it also contains the appropriate aminoacyl-tRNA synthetase. (The additional shift with the cytoplasm suggests that the tRNA synthetases in the reticulocyte lysate cannot recharge the special Tetrahymena tRNA.) These results suggest that at least two components from Tetrahymena (a special tRNA and its cognate aminoacyl-tRNA synthetase) must be added to a reticulocyte lysate to allow Tetrahymena mRNA to be translated efficiently.
These components compete effectively with the reticulocyte release factors, allowing the Tetrahymena mRNAs to be read. D. Although slight variations in the genetic code were originally discovered in mitochondrial genomes, they were not as surprising as the Tetrahymena changes. After all, mitochondrial genomes are small and encode relatively few proteins, so it is less difficult to imagine how changes might occur. By contrast, the Tetrahymena genome encodes thousands of proteins. It is much more surprising that it managed to survive the presumptive transition from the standard code to its present-day code. References: Horowitz S & Gorovsky MA (1985) An unusual genetic code in nuclear genes of Tetrahymena. Proc. Natl Acad. Sci. U.S.A. 82, 2452–2455. Andreasen PH, Dreisig H & Kristiansen K (1987) Unusual ciliate-specific codons in Tetrahymena mRNAs are translated correctly in a rabbit reticulocyte lysate supplemented with a subcellular fraction from Tetrahymena. Biochem. J. 244, 331–335. 6–94 A. The set of control experiments argues convincingly that the association between DnaK and the labeled proteins is meaningful; that is, it reflects some biological function. In the presence of SDS, which eliminates protein–protein interactions, antibodies precipitate only DnaK (see Figure 6–37A, lane 2), suggesting that protein–protein associations are required for precipitation of the labeled proteins. The absence of labeled proteins from DnaK-deletion cells (see lane 3) indicates that precipitation depends on DnaK and is not the result, for example, of nonspecific association with the antibodies. The lack of precipitation of labeled proteins from a mixture of labeled DnaK-deletion cells and unlabeled wild-type cells (see lane 4), argues that the associations of proteins with DnaK were established in cells and not during subsequent experimental procedures. B. 
ATP would be expected to interfere with precipitation of labeled proteins if Hsp70 used it in the normal way; that is, to power the cycling of Hsp70 on and off the protein. In the absence of ATP, DnaK will have hydrolyzed a bound molecule of ATP (acquired in the cell) to ADP, altering its own conformation and allowing it to latch onto a hydrophobic patch in a nascent protein. If ATP is present in the extract, it will displace the ADP, reversing the conformational change and releasing DnaK from the nascent protein. In the more dilute conditions in the extract, the presence of ATP greatly favors the off reaction, and as a result labeled proteins are not precipitated.
C. The pulse-chase experiment in Figure 6–37B indicates that DnaK binds the labeled proteins only for a few minutes after ³⁵S-methionine has been incorporated. A natural interpretation of this experiment is that some of the labeled proteins are incorrectly folded initially, and bind DnaK as a consequence. Antibodies against DnaK precipitate these bound proteins. With time and with help from DnaK, the proteins correctly fold and are no longer substrates for DnaK binding; hence, they disappear from the immunoprecipitates.
D. Nothing in these experiments shows directly that the proteins bound by DnaK are in the process of being translated on ribosomes. The very short duration of the labeling pulse (15 seconds) would be expected to label proteins in the process of being translated, but it would also label proteins that were completed during the pulse and, therefore, clear of the ribosome. The category of protein that DnaK binds is not clear from these experiments. In additional experiments the authors show convincingly that some of the proteins that are bound by DnaK were indeed attached to ribosomes.
References: Teter SA, Houry WA, Ang D, Tradler T, Rockabrand D, Fischer G, Blum P, Georgopoulos C & Hartl FU (1999) Polypeptide flux through bacterial Hsp70: DnaK cooperates with trigger factor in chaperoning nascent chains. Cell 97, 755–765.
Deuerling E, Schulze-Specking A, Tomoyasu T, Mogk A & Bukau B (1999) Trigger factor and DnaK cooperate in folding of newly synthesized proteins. Nature 400, 693–696.

6–95
A. When IPTG is present, the strains that do not express TF (ΔTig) or DnaK (I-DnaK) grow as well as the wild-type strain at all temperatures. When IPTG is absent, the ΔTig strain continues to grow, whereas the I-DnaK strain fails to grow at either 15°C or 42°C, although it grows well at the intermediate temperatures. Thus, DnaK appears to be the more critical chaperone under the stressful conditions of high and low temperature.
B. Unlike the single mutants, the double mutant does not grow at any temperature in the absence of both TF and DnaK (that is, when there is no IPTG in the plate). The lethality of the double mutant at 30°C and 37°C, where both of the single mutants grow perfectly well, is termed synthetic lethality; it indicates that the two gene products cooperate in some way, which need not be direct. Because both of these genes encode chaperones, it seems likely that they collaborate or cover for one another in some way in protein folding. DnaK can fully compensate for the loss of TF throughout the temperature ranges tested in these experiments, and TF can compensate for the loss of DnaK at intermediate, but not extreme, temperatures. When neither chaperone is functional, misfolding of proteins even at intermediate temperatures is lethal to the cells. Additional experiments demonstrated that in the absence of both chaperones cytosolic proteins undergo massive aggregation.

References: Deuerling E, Schulze-Specking A, Tomoyasu T, Mogk A & Bukau B (1999) Trigger factor and DnaK cooperate in folding of newly synthesized proteins.
Nature 400, 693–696.
Teter SA, Houry WA, Ang D, Tradler T, Rockabrand D, Fischer G, Blum P, Georgopoulos C & Hartl FU (1999) Polypeptide flux through bacterial Hsp70: DnaK cooperates with trigger factor in chaperoning nascent chains. Cell 97, 755–765.

6–96
A. The results suggest that tritium exchange occurs within one cycle. Although there is time for multiple cycles (and they presumably occur in the presence of ATP), the results with AMPPNP indicate that a single cycle is sufficient. Because AMPPNP cannot be hydrolyzed, the chaperone will not be able to eject the protein and repeat the cycle.
B. The accelerated exchange of tritium in the presence of the chaperone and ATP indicates that the protein is unfolded before it is refolded. The isolation-chamber model starts from the premise that aggregation limits the folding of a protein. If the cavity of GroEL facilitated proper folding by reducing inappropriate interactions, it would not seem essential that the protein first be unfolded. By contrast, if a stable but incorrectly folded domain blocked correct folding, the protein would, by necessity, have to be unfolded first. Thus, the results with the particular protein used in these experiments, which was the plant CO₂-fixation protein, RuBisCo, support an active unfolding model, which is powered not by ATP hydrolysis but by ATP binding. More recent experiments indicate that accurate refolding of RuBisCo occurs inside the GroEL chamber, which in some way helps the protein to avoid misfolded intermediates by encouraging it along the productive folding pathway.

References: Shtilerman M, Lorimer GH & Englander SW (1999) Chaperonin function: Folding by forced unfolding. Science 284, 822–825.
Brinker A, Pfeifer G, Kerner MJ, Naylor DJ, Hartl FU & Hayer-Hartl M (2001) Dual function of protein confinement in chaperonin-assisted protein folding. Cell 107, 223–233.

6–97
A.
In this experiment the plateau value for the radioactive proteins in the absence of proteasome inhibitors was about 30% lower than it was in the presence of inhibitors, suggesting that 30% of newly synthesized proteins are degraded in proteasomes in lymph node cells. Similar results in other cell types suggest that substantial degradation of newly synthesized proteins may be common in most cells.
B. The absence of differential effects on specific proteins is surprising. One possibility is that a constant fraction (about 30% of all proteins) misfolds and is degraded by proteasomes; however, given the variety of proteins it seems unlikely that they would all misfold to the same degree. Another possibility is that newly synthesized proteins are sampled randomly for degradation to serve some other biological function. (One such function is to provide an array of peptides for display on the cell surface to inform the immune system of the proteins the cell is currently making, which is thought to give it a head start in identifying infected cells.) Alternatively, it may be that ribosomes make a high fraction of mistakes (30%) in the form of peptide fragments and misincorporations that cannot fold properly and are normally removed. These errors would contribute to a background of radioactivity throughout the gel, accounting for an increased overall intensity when proteasomes are inhibited.

Reference: Schubert U, Antón LC, Gibbs J, Norbury CC, Yewdell JW & Bennink JR (2000) Rapid degradation of a large fraction of newly synthesized proteins by proteasomes. Nature 404, 770–774.

6–98 Overexpression of dm-N70 does not cause accumulation of any of the proteasome substrates. That is the expected result if the D-box is critical for its interaction with APC. Because it doesn’t bind to APC, it doesn’t influence ubiquitylation. Overexpression of K0-N70 gives the result that was initially anticipated: D-box proteins accumulate, but non-D-box proteins do not.
This result suggests that K0-N70 specifically interferes with destruction of D-box proteins, probably by competing for binding to APC. The most difficult result to understand is the original one: overexpression of N70 causes accumulation of both D-box and non-D-box proteins. Because removal of its lysines eliminates this effect, it seems likely that the effect is caused by ubiquitylation of N70. Overexpression of N70 and its ubiquitylation are thought to sequester a large fraction of the cellular supply of ubiquitin, thereby interfering with ubiquitylation of other proteins by decreasing the availability of ubiquitin.

Reference: Yamano H, Tsurumi C, Gannon J & Hunt T (1998) The role of the destruction box and its neighbouring lysine residues in cyclin B for anaphase ubiquitin-dependent proteolysis in fission yeast: defining the D-box receptor. EMBO J. 17, 5670–5678.

6–99
A. Although the first codon of β-galactosidase could have been changed by recombinant DNA techniques, it would no longer have served as a start site for translation. All proteins, bacterial and eucaryotic, are initially translated with methionine at their N-termini. In many cases methionine is removed (and occasionally additional amino acids as well), leaving a new N-terminus. The procedure described here was arrived at by chance! The investigators were originally interested in whether ubiquitin at the N-terminus would cause a protein to be degraded. This question led them to generate the fusion gene. In bacteria, which do not have a ubiquitin-dependent protease, the fusion protein was made as they anticipated; however, in yeast the same plasmid produced only β-galactosidase, suggesting that the ubiquitin was removed. To try to prevent this cleavage, they altered the codons at the junction. The ubiquitin was still removed, but now the resulting β-galactosidases differed remarkably in stability.
The focus of their study quickly changed, leading to insights into the role of the N-terminus in determining the stability of proteins.
B. The half-lives of the different β-galactosidases can be estimated from the graph in Figure 6–44B by finding the time at which half the β-galactosidase remains. The three β-galactosidases have very different half-lives: R-β-galactosidase has a half-life of about 2 minutes; I-β-galactosidase has a half-life of about 30 minutes; and M-β-galactosidase has a half-life that is too long to be measured in this experiment (it was estimated to be greater than 20 hours).

Reference: Bachmair A, Finley D & Varshavsky A (1986) In vivo half-life of a protein is a function of its amino-terminal residue. Science 234, 179–186.

6–100
A. The absence of radioactivity at the position of β-galactosidase in Figure 6–45 indicates that the labeled antibodies against ubiquitin do not react with the protein at that position. Thus, the band that is marked β-galactosidase in Figure 6–44A does not carry any attached ubiquitin, indicating that ubiquitin was removed from the N-terminus of the fusion protein.
B. The ubiquitin above the position of β-galactosidase in Figure 6–45 must be attached to β-galactosidase, since the enzyme was purified by binding to antibodies specific for β-galactosidase. Ubiquitin attached to the two unstable enzymes (but not to the stable enzyme) suggests that ubiquitin marks the protein for degradation. The ladderlike appearance of the bands suggests that a variable number of copies of ubiquitin are attached to each protein (at lysines) before the enzyme is degraded.

Reference: Bachmair A, Finley D & Varshavsky A (1986) In vivo half-life of a protein is a function of its amino-terminal residue. Science 234, 179–186.

THE RNA WORLD AND THE ORIGINS OF LIFE

DEFINITIONS

6–101 RNA world

TRUE/FALSE

6–102 False.
Although only a few types of reactions are represented among the ribozymes in present-day cells, ribozymes that have been selected in the laboratory can catalyze a wide variety of biochemical reactions, with reaction rates similar to those of proteins. In light of these results, it is unclear why ribozymes are so underrepresented in modern cells. It seems likely that the availability of 20 amino acids versus four bases affords proteins with a greater number of catalytic strategies than ribozymes, as well as endowing them with the ability to bind productively to a wider range of substrates (for example, hydrophobic substrates, which ribozymes have difficulty with).

THOUGHT PROBLEMS

6–103 RNA has the ability to store genetic information like DNA and the ability to catalyze chemical reactions like proteins. Having both of these essential features of ‘life’ in a single type of molecule makes it easier to understand how life might have arisen from nonliving matter. The use of RNA molecules as catalysts in several fundamental reactions in modern-day cells supports this idea. Nevertheless, it is not yet possible to specify a plausible pathway from the ‘primordial’ soup to an RNA world, and many have speculated that there may have been a precursor molecule to RNA, one that also had catalytic and informational properties.

6–104 Although RNA is thought to have played an important role in the evolution of life on Earth, possibly as a replicating catalyst, it is unclear that it was the first replicating catalyst. Other less efficient molecular systems that combined informational and catalytic properties may have preceded RNA. Regardless of its original role, it is clear that RNA now plays a larger role than that of mere messenger in information flow; RNA provides critical functions in replication, gene regulation, splicing, translation, peptide-bond formation, membrane transport of proteins, and telomere maintenance.
6–105 The RNA molecule will not be able to catalyze its own replication. As a single molecule with a single catalytic site, it cannot be both template and catalyst simultaneously. (To visualize the critical difficulty, try to imagine how the active site of the RNA could copy itself.) Once a second molecule (either template or catalyst) was generated, then replication could begin.

Reference: Bartel DP & Szostak JW (1993) Isolation of new ribozymes from a large pool of random sequences. Science 261, 1411–1418.

6–106 The complement of this hairpin RNA could also form a similar hairpin, as shown in Figure 6–62. The two structures would be identical in the double-stranded regions that involved standard GC and AU base pairs. They would differ in the sequence of the single-stranded regions. Because GU base pairs are stable in RNA, whereas CA base pairs are not, one hairpin would be predicted to contain an additional base pair, as shown.

6–107 Compartments are essential for evolution in the RNA world for two reasons. First, a set of mutually beneficial RNA molecules would have had to remain in proximity to have been of any use to one another. Second, selection of a set of RNA molecules according to the quality of the self-replicating systems they generated (the basis for natural selection and evolution) could not have occurred efficiently until some form of compartment evolved to contain the molecules and thereby make them available only to the RNA that had generated them.

Reference: Szostak JW, Bartel DP & Luisi L (2001) Synthesizing life. Nature 409, 387–390.

6–108 The deoxyribose sugar of DNA makes the molecule much less susceptible to breakage. The hydroxyl group on carbon 2 of the ribose sugar can catalyze cleavage of the adjacent 3′-5′ phosphodiester bond that links nucleotides together in RNA. Its absence from DNA eliminates that mechanism of chain breakage.
In addition, the double helical structure of DNA provides two complementary strands, which allows damage in one strand to be repaired accurately by reference to the sequence of the second strand. Finally, the use of T in DNA instead of U, as in RNA, builds in a protection against the effects of deamination, a common form of damage. Deamination of C produces U, which is an aberrant base in DNA but a normal base in RNA. The cell’s job of recognizing damaged bases is much easier when the damage produces an abnormal base.

Figure 6–62 Hairpins formed by an RNA strand and by its complement (Answer 6–106). An RNA and its complement are shown as double-stranded RNA in the middle (5′-GCACUCCGUCGGCAUGC-3′ paired with 3′-CGUGAGGCAGCCGUACG-5′). The structures formed by each strand are shown above and below the duplex. The nonstandard GU base pair in the lower hairpin is highlighted with a dashed box.

Figure 6–63 Similarity of polymerization and ligation (Answer 6–109). (A) RNA polymerization. (B) RNA ligation. Analogous parts of the reactions are labeled.

6–109
A. Ligation of the substrate oligonucleotide to the pool RNA is analogous to chain elongation during RNA polymerization (Figure 6–63). In both cases, the growing strand (primer) and the nucleoside triphosphate (NTP) or its analog base-pair to a template. In both cases, the 3′ hydroxyl of the growing strand attacks the α-phosphate of the 5′ triphosphate and displaces pyrophosphate (PPi) with concomitant formation of a 3′-5′ phosphodiester bond (Figure 6–63).
It is critical to the selection and amplification scheme that the catalytic RNA becomes attached to the tag.
The tag is used to fish out specific RNA molecules from the large pool of random molecules. If the tag were not attached to the ribozyme that catalyzed the linkage, no selection and amplification of the relevant ribozyme (the point of the whole scheme) would occur.
The random segment in the middle is the part of the molecule that guarantees that a very large number of different sequences (and hence conformations and catalytic activities) will be present in the starting pool. The hope is that one or a few such molecules can catalyze the intended reaction.
The constant regions at the ends of each pool RNA molecule serve different purposes. The constant region at the 5′ end of the pool RNAs serves as a binding site for the substrate oligonucleotide, so that the ends can be juxtaposed to create the substrate for ligation. This constant region also serves as one site required for regenerating a pool of RNA by T7 RNA polymerase transcription. This is an essential step if the cycle of selection and amplification is to be repeated. The constant region at the 3′ end of the pool RNAs serves as a site for attachment to the agarose bead for ease of manipulation, for specific amplification of linked substrate and catalytic RNAs, and for amplification to link the T7 promoter so that the DNA oligonucleotides can be reconverted to RNA oligonucleotides for subsequent cycles.
A catalytic RNA molecule is selected by passing the pool of RNA through an affinity column that carries oligonucleotides that are complementary to the substrate oligonucleotide. Only in those molecules that have undergone a ligation reaction will the catalytic RNA be attached to the substrate. The vast majority of noncatalytic RNAs will pass through such an affinity column. When the RNA is eluted from the column, it will contain a mixture of the sought-after catalytic RNA and contaminating noncatalytic RNA.
The catalytic RNA can be specifically amplified using PCR primers, one of which is specific for the substrate RNA and the other for the pool RNA. Such a pair of PCR primers will selectively amplify catalytic RNAs, because only the catalytic RNAs are attached to the substrate RNA and therefore carry both primer-binding sites.

E. Even assuming that one cycle of selection and amplification is sufficient to remove all contaminating noncatalytic RNA molecules, which is probably not the case, there is still a critical reason for carrying out multiple cycles of selection and amplification. In the starting pool of RNA molecules it is unlikely that any molecule will be represented more than once. Thus, at the end of the first time period for ligation, the very best catalyst in the population, many much weaker catalysts, and even some noncatalytic RNAs that are linked by an uncatalyzed mechanism will all be attached to the substrate RNA. They will all be represented equally in the amplified pool. Purification at this stage would yield an extensive mixture of RNA molecules with a very wide range of catalytic activities.

Subsequent rounds of selection and amplification allow the best catalysts to win out over the weaker ones. Consider, for example, the second cycle. In the window for ligation, most of the good catalysts will attach themselves to the substrate, while many fewer of the weaker catalysts and essentially none of the noncatalytic RNAs will do so. Thus, the amplification step in the second cycle will enrich considerably for the better catalysts. By decreasing the time for ligation in subsequent cycles, better and better catalysts can be selectively amplified.

Reference: Bartel DP & Szostak JW (1993) Isolation of new ribozymes from a large pool of random sequences. Science 261, 1411–1418.

CALCULATIONS

6–110 A. There are $6 \times 10^{15}$ molecules, 300 nucleotides (nt) in length, in 1 mg of RNA.
$$\text{number} = \frac{6 \times 10^{20}\ \text{d}}{1\ \text{mg}} \times \frac{1\ \text{nt}}{330\ \text{d}} \times \frac{1\ \text{RNA molecule}}{300\ \text{nt}} = 6 \times 10^{15}\ \text{RNA molecules per mg}$$

Here d stands for daltons, and an average nucleotide is taken to be 330 d.

B. If the 220-nucleotide segment were completely random, there would be four choices of nucleotide at each of 220 positions, which is $4^{220}$, or about $3 \times 10^{132}$, possible different RNA molecules. Thus, in a 1-mg sample, only $2 \times 10^{-117}$ [$(6 \times 10^{15})/(3 \times 10^{132})$] of all possible sequences will be represented, a trivial fraction of the whole. (A sample large enough to have one copy of each possible RNA would outweigh the known universe by more than 30 orders of magnitude.)

C. If a single 50-nucleotide RNA were required to catalyze the ligation, your chances of success would be close to nil. There are about $10^{18}$ different 50-nucleotide sequences represented in a 1-mg sample of RNA. Considering just the random 220 nucleotides, there would be about 170 different 50-mers in each of $6 \times 10^{15}$ molecules (imagine sliding a 50-nucleotide window across the 220 nucleotides one nucleotide at a time), for a total of $170 \times 6 \times 10^{15}$, or about $10^{18}$, different 50-mers. Since there are $4^{50}$, or about $10^{30}$ ($4^{5} \approx 10^{3}$), different 50-mers, your chances would be roughly 1 in a trillion ($10^{-12}$) of having the unique catalytic RNA in your sample. Are you feelin' lucky?

That so many ribozymes have been successfully isolated from such pools argues that a very large number of different sequences must be able to catalyze any given reaction, or that the catalytic RNAs must be very small. Since the identified ribozymes are not particularly small, it must be that many different sequences are capable of catalysis.

Reference: Bartel DP & Szostak JW (1993) Isolation of new ribozymes from a large pool of random sequences. Science 261, 1411–1418.

DATA HANDLING

6–111 A. Error-prone PCR was used to introduce mutations into the pool of RNA molecules in some rounds in order to try to generate ever more efficient catalysts of ligation.
Because all possible molecules cannot be present in the starting material (see Problem 6–110), this technique gives you a way to increase the diversity of molecules that are closely related to those with demonstrated catalytic activity. It is likely that better catalysts will be found in the 'sequence neighborhood' of existing catalysts. You waited until round 5 to apply error-prone PCR to give time for some moderately good catalysts to arise.

B. By making ligation more and more difficult (by lowering the concentration of $\mathrm{Mg^{2+}}$ and by decreasing the time available for ligation), you are selecting for better and better catalysts.

C. Your scheme for selection and amplification has improved the ligation rate about 3 million-fold, from 0.000003 ligations per hour for the starting RNA pool to 8.0 per hour after round 10. Thus, your final pool of ribozymes catalyzes ligation about $3 \times 10^{6}$-fold faster than the uncatalyzed reaction.

D. The diversity evident in your round-10 pool of RNA molecules indicates that many sequences can carry out efficient ligation. Since 11 of the 15 cloned and sequenced molecules are clearly similar, they form a single sequence family, presumably with a very similar overall conformation. The other molecules may represent additional catalytically active conformations. You will tell your audience that additional structural and enzymological studies will be needed to determine the catalytic mechanism(s) represented in your pool of ribozymes.

Reference: Bartel DP & Szostak JW (1993) Isolation of new ribozymes from a large pool of random sequences. Science 261, 1411–1418.
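The arithmetic in Answers 6–110 and 6–111 C is easy to check numerically. The following is a minimal sketch in Python and is not part of the original solutions: the variable names are illustrative, and the constants are simply the ones quoted in the answers ($6 \times 10^{20}$ d per mg, 330 d per nucleotide, 300-nt pool RNAs with a 220-nt random segment, and ligation rates of 0.000003 and 8.0 per hour). It works in base-10 logarithms so that numbers such as $4^{220}$ are only ever handled as exponents.

```python
from math import log10

# Constants quoted in Answers 6-110 and 6-111; the names are illustrative.
DALTONS_PER_MG = 6e20      # 6 x 10^23 d per gram, so 6 x 10^20 d per mg
DALTONS_PER_NT = 330       # average mass of one nucleotide, in daltons
POOL_LENGTH_NT = 300       # full length of each pool RNA
RANDOM_LENGTH_NT = 220     # length of the random segment
WINDOW_NT = 50             # length of the hypothetical catalytic RNA

# 6-110 A: number of 300-nt RNA molecules in 1 mg.
molecules = DALTONS_PER_MG / (DALTONS_PER_NT * POOL_LENGTH_NT)

# 6-110 B: log10 of the number of possible 220-nt sequences (4^220),
# and log10 of the fraction of them present in a 1-mg sample.
log10_sequences = RANDOM_LENGTH_NT * log10(4)
log10_fraction = log10(molecules) - log10_sequences

# 6-110 C: 50-nt windows per random segment (exactly 171; the answer
# rounds this to "about 170"), the 50-mers sampled in 1 mg, and log10 of
# the chance that one particular 50-mer is among them.
windows = RANDOM_LENGTH_NT - WINDOW_NT + 1
sampled_50mers = windows * molecules
log10_chance = log10(sampled_50mers) - WINDOW_NT * log10(4)

# 6-111 C: fold improvement in ligation rate, round 10 versus starting pool.
fold_improvement = 8.0 / 0.000003

print(f"molecules per mg:            {molecules:.2e}")
print(f"log10(possible sequences):   {log10_sequences:.1f}")
print(f"log10(fraction represented): {log10_fraction:.1f}")
print(f"50-mers sampled:             {sampled_50mers:.2e}")
print(f"log10(chance of success):    {log10_chance:.1f}")
print(f"fold improvement:            {fold_improvement:.2e}")
```

The printed values should agree with the rounded figures in the answers to within the rounding used there; the only visible difference is the window count, which is exactly 171 rather than 170.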

This note was uploaded on 01/07/2011 for the course BIOLOGY 7.012 taught by Professor Eric Lander during the Spring '04 term at MIT.
