_Dr. Wu's lecture_lecture_kao_030904

Gene Discovery and Disease Research
PMI 285 Cellular Basis of Disease
Winter Quarter 2004
Cheng-Yuan Kao

Information explosion of biological data
• 1988: National Center for Biotechnology Information (NCBI) created
• 1990: BLAST created for fast sequence similarity searching
• 1995: first bacterial genomes completely sequenced
• 1996: yeast genome completely sequenced
• 1998: worm (multicellular) genome completely sequenced
• 1999: fly genome completely sequenced
• 2000: completion of a "working draft" DNA sequence of the human genome
(Taken from NCBI)

Why search?
• Is my sequence novel?
• Are there any similar genes?
• How can I predict the function of my novel gene by comparing it to other sequences?

• Gene family: a group of genes descended from a common ancestral gene by duplication or speciation.
• Homology vs. similarity
    Similarity: a quantifiable measure of how closely one sequence matches another, based on their composition and order.
    Homology: similarity attributed to descent from a common ancestor.
• Orthologous vs. paralogous
    Orthologous: homologous sequences in different species that arose from a common ancestral gene during speciation; they may or may not be responsible for a similar function.
    Paralogous: homologous sequences within a single species that arose by gene duplication.
(Taken from NCBI)

Alignment
The process of lining up two or more sequences to achieve maximal levels of identity (and conservation, in the case of amino acid sequences) for the purpose of assessing the degree of similarity and the possibility of homology. (Taken from NCBI)

Align GGGAT with CGATT?

    GGGAT
     |  |
    CGATT
    2 matches, 0 indels, 0 gap extensions

or

    GGGAT-
     | ||
    CG-ATT
    3 matches, 2 indels, 0 gap extensions

or

    -GGGAT
       |||
    C--GAT
    3 matches, 2 indels, 1 gap extension (as printed, this alignment omits the final T of CGATT)
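
The three tallies for GGGAT and CGATT can be checked mechanically. The following is a minimal Python sketch, not part of the lecture. The function name `alignment_stats` is hypothetical, and it assumes that an "indel" means the opening of a run of gaps in one sequence and that every further gap position in the same run counts as a gap extension.

```python
def alignment_stats(top, bottom, gap="-"):
    """Return (matches, indels, gap_extensions) for two aligned rows.

    Assumption: an indel is the first gap of a run in one row; each
    additional gap in the same run is a gap extension.
    """
    if len(top) != len(bottom):
        raise ValueError("aligned rows must have the same length")
    matches = indels = extensions = 0
    prev_gap_row = None  # row that held a gap in the previous column
    for a, b in zip(top, bottom):
        if a == gap and b == gap:
            raise ValueError("a column cannot contain two gaps")
        gap_row = "top" if a == gap else "bottom" if b == gap else None
        if gap_row is None:
            if a == b:
                matches += 1
        elif gap_row == prev_gap_row:
            extensions += 1  # same run of gaps continues
        else:
            indels += 1      # a new run of gaps opens
        prev_gap_row = gap_row
    return matches, indels, extensions


if __name__ == "__main__":
    for top, bottom in [("GGGAT", "CGATT"),
                        ("GGGAT-", "CG-ATT"),
                        ("-GGGAT", "C--GAT")]:
        print(top, bottom, alignment_stats(top, bottom))
```

Separating gap openings from gap extensions matters because many scoring schemes penalize opening a gap more heavily than lengthening one.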

• Global alignment: the alignment of two nucleic acid or protein sequences over their entire length. Might miss sub-sequence matches.
• Local alignment: the alignment of some portion of two nucleic acid or protein sequences. Finds the best sub-sequence pairs.
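
The difference between the two modes shows up directly in the scores. The sketch below is an illustration under stated assumptions, not the lecture's own code: it uses the standard dynamic-programming recurrences (Needleman-Wunsch for global alignment, which the slides do not name, and Smith-Waterman for local alignment) with an assumed scoring scheme of +1 per match, -1 per mismatch and -1 per gap position. The function names are hypothetical.

```python
def global_score(a, b, match=1, mismatch=-1, gap=-1):
    """Best score for aligning a and b over their entire length."""
    prev = [j * gap for j in range(len(b) + 1)]  # row 0: b aligned to gaps
    for i in range(1, len(a) + 1):
        cur = [i * gap]                          # column 0: a aligned to gaps
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            cur.append(max(prev[j - 1] + sub,    # align a[i-1] with b[j-1]
                           prev[j] + gap,        # gap in b
                           cur[j - 1] + gap))    # gap in a
        prev = cur
    return prev[-1]


def local_score(a, b, match=1, mismatch=-1, gap=-1):
    """Best score over all pairs of sub-sequences of a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0]
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            cell = max(0,                        # restart the alignment here
                       prev[j - 1] + sub,
                       prev[j] + gap,
                       cur[j - 1] + gap)
            cur.append(cell)
            best = max(best, cell)
        prev = cur
    return best


if __name__ == "__main__":
    print(global_score("GGGAT", "CGATT"))  # whole-length score
    print(local_score("GGGAT", "CGATT"))   # score of the best shared stretch
```

With this scoring, the shared stretch GAT gives the local alignment a higher score than any whole-length alignment of the same two sequences, which is the sense in which a global alignment "might miss sub-sequence matches".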

Pairwise sequence comparison / single-sequence (DNA / protein) analysis: algorithms
• Exhaustive search
    Smith-Waterman search (Smith and Waterman, 1981): the most sensitive method available. Use it with care: it is VERY SLOW and uses a LOT OF COMPUTER POWER.
• Heuristic searches
    FASTA (Pearson and Lipman, 1988)
    BLAST (Altschul et al., 1990)
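
To give a feel for why heuristic searches are faster, here is a deliberately simplified Python sketch of the word-seeding idea that FASTA and BLAST build on: index short words of one sequence, look up the words of the other, and extend only around the hits instead of filling a full dynamic-programming matrix. It is not an implementation of either program. The function names, the word length `k=3` and the exact-match-only extension are assumptions made for illustration; the real tools use scored extensions and statistical filtering.

```python
from collections import defaultdict


def index_words(subject, k):
    """Map every length-k word of subject to its start positions."""
    index = defaultdict(list)
    for i in range(len(subject) - k + 1):
        index[subject[i:i + k]].append(i)
    return index


def seed_hits(query, subject, k=3):
    """Return (query_pos, subject_pos) pairs where a length-k word is shared."""
    index = index_words(subject, k)
    hits = []
    for q in range(len(query) - k + 1):
        for s in index.get(query[q:q + k], []):
            hits.append((q, s))
    return hits


def extend_ungapped(query, subject, q, s, k):
    """Grow a seed left and right while characters match exactly.

    Returns (query_start, subject_start, length) of the extended match.
    """
    left = 0
    while (q - left - 1 >= 0 and s - left - 1 >= 0
           and query[q - left - 1] == subject[s - left - 1]):
        left += 1
    right = k
    while (q + right < len(query) and s + right < len(subject)
           and query[q + right] == subject[s + right]):
        right += 1
    return q - left, s - left, left + right


if __name__ == "__main__":
    query, subject = "GGGAT", "CGATT"
    for q, s in seed_hits(query, subject, k=3):
        print(extend_ungapped(query, subject, q, s, 3))
```

The speed comes from skipping every region that shares no word with the query; the cost is that a true match containing no exact word of length k is never examined, which is one way a heuristic can lose sensitivity relative to an exhaustive search.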

• Speed: BLAST > FASTA > S-W
• Accuracy (sensitivity + specificity):
    Sensitivity: the ability to find true positive matches (might still have false positives): S-W > FASTA > BLAST
    Specificity: the ability to reject false positive matches (might miss some true positives)
Values shown on the slide (Shpaer et al., 1996), labels not recoverable: 100, 88.28, 87.03, 83.69
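
The two accuracy terms have standard statistical definitions that are easy to compute once search hits have been classified against a trusted reference. The sketch below assumes those textbook definitions (sensitivity = TP / (TP + FN), specificity = TN / (TN + FP)); the evaluation cited on the slide may have used a different measure, and the counts in the usage example are invented purely for illustration.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of real matches that the search reported."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of non-matches that the search correctly left out."""
    return true_neg / (true_neg + false_pos)


if __name__ == "__main__":
    # Hypothetical counts, not taken from the lecture or from any benchmark.
    tp, fn, tn, fp = 90, 10, 950, 50
    print("sensitivity:", sensitivity(tp, fn))
    print("specificity:", specificity(tn, fp))
```

A search tuned for high sensitivity reports more of the real matches but usually admits more false positives, which lowers specificity; tuning the other way has the opposite effect.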