Lecture 16(2)

How Are Genomes Sequenced?

The Human Genome Project was proposed in 1986 to determine the normal sequence of all human DNA. The publicly funded effort was aided and complemented by privately funded groups. The methods used were first developed to sequence prokaryotes and simple eukaryotes.

To sequence an entire genome, the DNA is first cut into fragments about 500 base pairs (bp) long. The haploid human genome has about 3.3 billion bp, resulting in more than 6 million fragments. The fragment sequences are put together using larger, overlapping fragments.

Example: a 10 bp fragment is cut three different ways, giving three sets of pieces:
•  TG, ATG, and CCTAC
•  AT, GCC, and TACTG
•  CTG, CTA, and ATGC
The correct order is ATGCCTACTG.

The field of bioinformatics was developed to analyze DNA sequences using complex mathematics and computer programs.

The shotgun sequencing method cuts DNA into smaller, overlapping fragments that are cloned and sequenced. Computers are used to search for overlapping markers. This approach is much faster and cheaper.

Figure 17.2 Sequencing DNA (Part 1)
Figure 17.2 Sequencing DNA (Part 2)

High-throughput sequencing methods are used to sequence large genomes. They involve amplification of DNA templates by PCR and physical binding of template DNA to a solid surface or to microbeads. Thousands or millions of sequencing reactions are run at once: massively parallel DNA sequencing.

Genome sequence information is used to identify:
•  Open reading frames, or coding regions
•  Amino acid sequences of proteins
•  Regulatory sequences
•  RNA genes
•  Other noncoding sequences

In comparative genomics, newly sequenced genomes are compared with sequences from other organisms. This can give information about the functions of sequences, and is used to trace evolutionary relationships.
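
The 10 bp example from the sequencing discussion above can be sketched in Python. This is only an illustration under a strong assumption: each set of pieces covers the whole fragment with no gaps, so the original sequence must be some ordering of every set, and the correct answer is the ordering that all three sets agree on. Real assemblers use overlap detection on millions of reads, not brute-force permutation; the function names here are hypothetical.

```python
from itertools import permutations

def candidate_sequences(pieces):
    """All sequences obtained by joining the pieces in every possible order."""
    return {"".join(order) for order in permutations(pieces)}

def assemble(digests):
    """Keep only the sequences that every set of pieces can spell out."""
    candidates = None
    for pieces in digests:
        seqs = candidate_sequences(pieces)
        candidates = seqs if candidates is None else candidates & seqs
    return candidates

# The three ways of cutting the 10 bp fragment given in the lecture example.
digests = [
    ("TG", "ATG", "CCTAC"),
    ("AT", "GCC", "TACTG"),
    ("CTG", "CTA", "ATGC"),
]

print(assemble(digests))  # {'ATGCCTACTG'}
```

Any single set of pieces is ambiguous (three pieces can be ordered six ways); it is the overlap between differently cut sets that leaves only one consistent order.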
What Have We Learned from Sequencing Prokaryotic Genomes?

The first life forms to be sequenced were the simplest viruses, with relatively small genomes. The first complete genomic sequence of a free-living cellular organism was for the bacterium Haemophilus influenzae, in 1995.

Functional genomics assigns functions to the products of genes. The H. influenzae chromosome has 1,738 open reading frames. When it was first sequenced, only 58 percent coded for proteins with known functions. Since then the roles of many other proteins have been identified.

Figure 17.5 Functional Organization of the Genome of H. influenzae

What Have We Learned from Sequencing Eukaryotic Genomes?

•  Much of eukaryotic DNA is noncoding, including introns, gene control sequences, and repeated sequences
•  Eukaryotes have multiple chromosomes; each must have an origin of replication (ori), a centromere, and a telomeric sequence at each end

Eukaryotes have closely related genes called gene families. These arose over evolutionary time when different copies of genes underwent separate mutations. For example, the genes encoding the globin proteins all arose from a single common ancestral gene.

Figure 17.11 The Globin Gene Family

During development, different members of the globin gene family are expressed at different times and in different tissues. Hemoglobin of the human fetus contains γ-globin, which binds O2 more tightly than adult hemoglobin.

Many gene families include nonfunctional pseudogenes (Ψ), resulting from mutations that cause a loss of function. A pseudogene may simply lack a promoter, and thus fail to be transcribed, or lack a recognition site needed for the removal of an intron.
Eukaryotic genomes have repetitive DNA sequences:
•  Highly repetitive sequences: short sequences (< 100 bp) repeated thousands of times in tandem; not transcribed
Short tandem repeats (STRs) of 1–5 bp can be used in DNA fingerprinting.

Figure 17.13 Sequences in the Eukaryotic Genome

What Are the Characteristics of the Human Genome?

The complete haploid human genome sequence was finished in 2005. Since then, the diploid genomes of several individuals have been sequenced and published.

Some interesting facts about the human genome:
•  Protein-coding regions make up less than 2 percent, about 24,000 genes. Each gene must code for several proteins, and posttranscriptional mechanisms (e.g., alternative splicing) must account for the observed number of proteins in humans
•  An average gene has 27,000 base pairs
•  All human genes have many introns
•  Over 50 percent of the genome is transposons and other repetitive sequences
•  97 percent of the genome is the same in all people
•  Genes are not evenly distributed over the genome. The Y chromosome has the fewest genes (231); chromosome 1 has the most (2,968)

Figure 17.14 Evolution of the Genome

What Do Proteomics and Metabolomics Reveal?

The proteome is the sum total of proteins produced by an organism; it is more complex than the genome. The aim of proteomics is to identify and characterize all of the expressed proteins.

Two techniques are used to analyze the proteome:
•  Two-dimensional gel electrophoresis separates proteins based on size and electric charge
•  Mass spectrometry identifies proteins by their atomic masses

Proteins have different functional regions, or domains. Proteins that are unique to a particular organism are often just unique combinations of domains that exist in other organisms.
This reshuffling of the genetic deck is a key to evolution.

The metabolome is the quantitative description of all of the small molecules in a cell or organism:
•  Primary metabolites are involved in normal processes, such as pathways like glycolysis. They also include hormones and other signaling molecules
•  Secondary metabolites are often unique to particular organisms or groups. Examples include antibiotics made by microbes, and chemicals made by plants for defense against pathogens and herbivores

Regulation

Negative regulation: the gene is normally transcribed, but binding of a repressor protein prevents transcription.
Positive regulation: the gene is not normally transcribed; an activator protein binds to stimulate transcription.

Figure 16.1 Positive and Negative Regulation (Part 1)
Figure 16.1 Positive and Negative Regulation (Part 2)

How Is Gene Expression Regulated in Prokaryotes?

The cell can:
•  Downregulate mRNA transcription
•  Hydrolyze mRNA, preventing translation
•  Prevent mRNA translation at the ribosome
•  Hydrolyze the protein after it is made
•  Inhibit the protein's function

Prokaryotes generally use the most efficient way: downregulating mRNA transcription. Less energy is wasted because this step comes early in protein synthesis.

An example: the lac operon
•  β-galactosidase: an enzyme that hydrolyzes lactose (lacZ)
•  Lactose permease: transports lactose across the membrane; a symporter (lacY)
•  β-galactoside transacetylase: transfers acetyl groups to certain β-galactosides (lacA)

If E. coli is grown with glucose but no lactose present, no enzymes for lactose conversion are produced. If lactose is predominant and glucose is low, E. coli synthesizes all three enzymes. If lactose is removed, synthesis stops.

A compound that induces protein synthesis is an inducer; the proteins are inducible proteins. Constitutive proteins are made at a constant rate.
Figure 16.8 An Inducer Stimulates the Expression of a Gene for an Enzyme

The rate of a metabolic pathway can be regulated in two ways:
•  Allosteric regulation of enzyme-catalyzed reactions allows rapid fine-tuning
•  Regulation of protein synthesis (regulation of the concentration of enzymes) is slower but conserves resources

Figure 16.9 Two Ways to Regulate a Metabolic Pathway

A gene cluster with a single promoter is an operon; the one that encodes the lactose enzymes is the lac operon. A typical operon consists of:
•  A promoter
•  Two or more structural genes
•  An operator: a short stretch of DNA between the promoter and the structural genes

Figure 16.10 The lac Operon of E. coli

Three ways to control operon transcription:
•  An inducible operon regulated by a repressor protein
•  A repressible operon regulated by a repressor protein
•  An operon regulated by an activator protein

Figure 16.11 The lac Operon: An Inducible System (Part 1)
Figure 16.11 The lac Operon: An Inducible System (Part 2)

Features of negative control:
•  If the inducer is absent, the operon is turned off
•  The repressor protein exerts control and turns the operon off
•  If the inducer is present, it binds to the repressor and changes its shape so it cannot bind to the operator
•  Without the repressor, the operon is turned on

Difference between the two types of operons:
•  In inducible systems, a metabolic substrate (inducer) interacts with a regulatory protein (repressor); the repressor does not bind, which allows transcription
•  In repressible systems, a metabolic product (co-repressor) binds to the regulatory protein, which then binds to the operator and blocks transcription

An activator protein can increase transcription through positive control. If lactose is high and glucose is low, CRP binding to the lac operon promoter activates the lac operon. CRP makes RNA polymerase-promoter binding more efficient, and increases structural gene transcription.
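
The two layers of lac operon control described in this section (negative control by the repressor, positive control by CRP) can be summarized as a small truth table. The sketch below is a deliberately coarse Boolean model, not a quantitative one: it assumes lactose presence stands in for the inducer, that "high glucose" simply means CRP is not bound, and it uses the hypothetical labels "off", "low", and "high" for transcription levels.

```python
def lac_operon_state(lactose_present: bool, glucose_high: bool) -> str:
    """Coarse Boolean model of lac operon transcription (illustrative only)."""
    # Negative control: without the inducer, the repressor sits on the operator.
    repressor_bound = not lactose_present
    # Positive control: CRP binds the promoter only when glucose is low.
    crp_bound = not glucose_high

    if repressor_bound:
        return "off"
    # Repressor released: CRP decides how efficiently RNA polymerase binds.
    return "high" if crp_bound else "low"

# Print the full truth table for the four lactose/glucose combinations.
for lactose in (False, True):
    for glucose in (False, True):
        state = lac_operon_state(lactose, glucose)
        print(f"lactose={lactose!s:5} glucose_high={glucose!s:5} -> {state}")
```

The model makes the combined logic explicit: the repressor acts as an on/off switch, while CRP only adjusts how strongly an already unrepressed operon is transcribed.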
Figure 16.12 Catabolite Repression Regulates the lac Operon (Part 1)
Figure 16.12 Catabolite Repression Regulates the lac Operon (Part 2)

If glucose is high, CRP does not bind to the lac operon promoter and the efficiency of transcription is reduced. This is an example of catabolite repression, a system of gene regulation in which the presence of a preferred energy source represses other catabolic pathways.

Summary: Bacterial systems are regulated positively, negatively, or by a combination of both.

How Is Eukaryotic Gene Transcription Regulated?

Eukaryotic gene expression must be regulated to ensure proper timing and location of protein production. Regulation can occur at multiple points in transcription and translation.

Figure 16.13 Potential Points for the Regulation of Gene Expression (Part 1)
Figure 16.13 Potential Points for the Regulation of Gene Expression (Part 2)
Figure 16.13 Potential Points for the Regulation of Gene Expression (Part 3)

Transcription factors act at eukaryotic promoters, the regions of DNA where RNA polymerase binds and initiates transcription. Two important sequences:
•  Recognition sequence: recognized by RNA polymerase
•  TATA box: where DNA begins to denature and expose the template strand

Transcription factors (regulatory proteins) must assemble on the chromosome before RNA polymerase can bind to the promoter. TFIID binds to the TATA box; then other transcription factors bind, forming a transcription complex.

Figure 16.14 The Initiation of Transcription in Eukaryotes (Part 1)
Figure 16.14 The Initiation of Transcription in Eukaryotes (Part 2)

Some sequences are common to the promoters of many genes and are recognized by transcription factors in all cells. Some sequences are specific to a few genes and are recognized by transcription factors found only in certain tissues. These play an important role in differentiation.

Besides the promoter, other sequences bind regulatory proteins that interact with RNA polymerase and regulate the rate of transcription.
Some are positive regulators (enhancers); others are negative (repressors). The combination of factors present determines the rate of transcription.

Figure 16.15 Transcription Factors, Repressors, and Activators (Part 1)
Figure 16.15 Transcription Factors, Repressors, and Activators (Part 2)

Protein domains that bind to DNA have four structural motifs:
•  Helix-turn-helix
•  Leucine zipper
•  Zinc finger
•  Helix-loop-helix

Figure 16.16 Protein–DNA Interactions (1)
Figure 16.16 Protein–DNA Interactions (2)
Figure 16.16 Protein–DNA Interactions (3)
Figure 16.16 Protein–DNA Interactions (4)

Three criteria for DNA recognition by a protein motif:
•  Fits into the major or minor groove
•  Has amino acids that can project into the interior of the double helix
•  Has amino acids that can bond with interior bases

Many repressor proteins have a helix-turn-helix configuration. Repressors inhibit transcription by preventing activators from binding, or by interacting with binding proteins to decrease the rate of transcription.

Genes to be regulated simultaneously may be far apart or on different chromosomes. Their expression is coordinated if they have the same regulatory sequences, which bind the same transcription factors. Example: plant genes that share a regulatory sequence called the stress response element (SRE) encode proteins needed to cope with drought.

Figure 16.17 Coordinating Gene Expression (1)
Figure 16.17 Coordinating Gene Expression (2)

Epigenetics refers to changes in the expression of a gene or set of genes without a change in the DNA sequence. The changes are sometimes heritable and stable, but are reversible. Epigenetics includes two processes: DNA methylation and chromosomal protein alterations.

How Is Eukaryotic Gene Expression Regulated After Transcription?

Eukaryotic gene expression can be regulated in the nucleus before mRNA export, or after mRNA leaves. Control mechanisms include alternative splicing of pre-mRNA, microRNAs, translation repressors, or regulation of protein breakdown.
Different mRNAs can be made from the same gene by alternative splicing. As introns and exons are spliced out, new proteins are made. This may be a deliberate mechanism for generating proteins with different functions from a single gene.

Figure 16.22 Alternative Splicing Results in Different Mature mRNAs and Proteins

MicroRNAs (miRNAs), small molecules of noncoding RNA, are important regulators of gene expression. In C. elegans, the gene lin-4 is involved in negative regulation of development; it encodes not a protein but an miRNA. The miRNA inhibits lin-14, a gene that advances development, by binding to its mRNA.

Each miRNA is about 22 bases long and has many targets. miRNAs are transcribed as longer precursors, then cleaved to double-stranded miRNAs. Proteins guide the miRNA to its target mRNA; translation is inhibited and the mRNA is degraded.

Figure 16.23 mRNA Inhibition by MicroRNAs

mRNA translation can be regulated. Protein and mRNA concentrations are not consistently related; they are governed by factors acting after the mRNA is made. Cells either block mRNA translation or alter how long new proteins persist in the cell.

Three ways to regulate mRNA translation:
•  miRNAs can inhibit translation
•  The GTP cap on the 5′ end of mRNA can be modified; if the cap is unmodified, the mRNA is not translated
•  Repressor proteins can block translation directly

Protein longevity is regulated: protein content is a function of synthesis and degradation. Ubiquitin attaches to a protein to be destroyed and attracts other ubiquitins. This complex binds to a proteasome, a large complex where the ubiquitin is removed and the protein is digested.

Figure 16.24 A Proteasome Breaks Down Proteins ...
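
The idea that alternative splicing yields several mature mRNAs from one gene, covered earlier in this section, can be illustrated with a short enumeration. The gene below is hypothetical: the exon names and the choice of which exons may be skipped are assumptions made for the example, and real splicing is controlled by regulatory signals that this sketch ignores.

```python
from itertools import combinations

def splice_isoforms(exons, optional):
    """List every mature mRNA (as a tuple of exon names) formed by keeping
    all constitutive exons and skipping any subset of the optional ones."""
    isoforms = []
    for k in range(len(optional) + 1):
        for skipped in combinations(optional, k):
            isoforms.append(tuple(e for e in exons if e not in skipped))
    return isoforms

# Hypothetical pre-mRNA with four exons; E2 and E3 are assumed to be skippable.
exons = ["E1", "E2", "E3", "E4"]
optional = ["E2", "E3"]

for isoform in splice_isoforms(exons, optional):
    print("-".join(isoform))
```

With two skippable exons the single gene yields four distinct mature mRNAs, which shows how a modest number of genes can account for a much larger number of proteins.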