An example of a multiple alignment

Let us use the cellulose-binding domain of cellobiohydrolase I (CBD-CBH1) as an example of what one may do with a multiple sequence alignment. This is a small (about 30-35 residues) disulfide-bonded domain of known 3D structure (PDB code 1CBH). Homologous domains can be found in a number of other cellulose-degrading enzymes.

It is believed that the function of the domain is either to bind with high affinity to the cellulose fiber, allowing the adjacent enzymatic domain to hydrolyse the cellulose, or to wedge itself in between cellulose chains, making it easier for the enzymatic domain to attack the fiber.

The multiple alignment of these sequences is taken from the Pfam database (the entry with identifier CBM_1 (formerly CBD_1), accession code PF00734). Shown below is the so-called seed alignment, containing the sequences the Pfam curators have used to define the family. This is just a part of the complete alignment file; some comments have been removed.

For each sequence, the SWISS-PROT identifier and the position in the parent protein are given on the left. The top line shows the position numbers using the 1CBH 3D structure numbering scheme. The bottom line shows the consensus, which we define here as the same amino-acid residue type in 14 or more sequences (out of 18). Please note that this is just one of many possible definitions of a consensus.
                          1            2         3
                    45678901...234567890123456789012
GUX1_TRIRE/481-509  HYGQCGGI...GYSGPTVCASGTTCQVLNPYY
GUN1_TRIRE/427-455  HWGQCGGI...GYSGCKTCTSGTTCQYSNDYY
GUX1_PHACH/484-512  QWGQCGGI...GYTGSTTCASPYTCHVLNPYY
GUN2_TRIRE/25-53    VWGQCGGI...GWSGPTNCAPGSACSTLNPYY
GUX2_TRIRE/30-58    VWGQCGGQ...NWSGPTCCASGSTCVYSNDYY
GUN5_TRIRE/209-237  LYGQCGGA...GWTGPTTCQAPGTCKVQNQWY
GUNF_FUSOX/21-49    IWGQCGGN...GWTGATTCASGLKCEKINDWY
GUX3_AGABI/24-52    VWGQCGGN...GWTGPTTCASGSTCVKQNDFY
GUX1_PENJA/505-533  DWAQCGGN...GWTGPTTCVSPYTCTKQNDWY
GUXC_FUSOX/482-510  QWGQCGGQ...NYSGPTTCKSPFTCKKINDFY
GUX1_HUMGR/493-521  RWQQCGGI...GFTGPTQCEEPYICTKLNDWY
GUX1_NEUCR/484-512  HWAQCGGI...GFSGPTTCPEPYTCAKDHDIY
PSBP_PORPU/26-54    LYEQCGGI...GFDGVTCCSEGLMCMKMGPYY
GUNB_FUSOX/29-57    VWAQCGGQ...NWSGTPCCTSGNKCVKLNDFY
PSBP_PORPU/69-97    PYGQCGGM...NYSGKTMCSPGFKCVELNEFF
GUNK_FUSOX/339-370...
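
The consensus rule described in the text (the same residue type in at least 14 sequences) can be sketched in a few lines of Python. This is only an illustration under stated assumptions. It uses the 15 complete rows visible in this excerpt rather than all 18 seed sequences, it treats `.` and `-` as gap characters, and the names `ALIGNMENT`, `consensus` and `min_count` are hypothetical choices made for this sketch, not part of Pfam or of the original material.

```python
from collections import Counter

# Assumption: only the 15 complete rows visible in this excerpt are used;
# the full seed alignment described in the text has 18 sequences.
ALIGNMENT = """
GUX1_TRIRE/481-509  HYGQCGGI...GYSGPTVCASGTTCQVLNPYY
GUN1_TRIRE/427-455  HWGQCGGI...GYSGCKTCTSGTTCQYSNDYY
GUX1_PHACH/484-512  QWGQCGGI...GYTGSTTCASPYTCHVLNPYY
GUN2_TRIRE/25-53    VWGQCGGI...GWSGPTNCAPGSACSTLNPYY
GUX2_TRIRE/30-58    VWGQCGGQ...NWSGPTCCASGSTCVYSNDYY
GUN5_TRIRE/209-237  LYGQCGGA...GWTGPTTCQAPGTCKVQNQWY
GUNF_FUSOX/21-49    IWGQCGGN...GWTGATTCASGLKCEKINDWY
GUX3_AGABI/24-52    VWGQCGGN...GWTGPTTCASGSTCVKQNDFY
GUX1_PENJA/505-533  DWAQCGGN...GWTGPTTCVSPYTCTKQNDWY
GUXC_FUSOX/482-510  QWGQCGGQ...NYSGPTTCKSPFTCKKINDFY
GUX1_HUMGR/493-521  RWQQCGGI...GFTGPTQCEEPYICTKLNDWY
GUX1_NEUCR/484-512  HWAQCGGI...GFSGPTTCPEPYTCAKDHDIY
PSBP_PORPU/26-54    LYEQCGGI...GFDGVTCCSEGLMCMKMGPYY
GUNB_FUSOX/29-57    VWAQCGGQ...NWSGTPCCTSGNKCVKLNDFY
PSBP_PORPU/69-97    PYGQCGGM...NYSGKTMCSPGFKCVELNEFF
"""

# Each non-empty line is "<identifier>/<start>-<end>  <aligned sequence>".
seqs = [line.split()[1] for line in ALIGNMENT.split("\n") if line.strip()]


def consensus(aligned, min_count, gap_chars=".-"):
    """Return one character per column: the residue seen in at least
    min_count sequences, or '.' if no residue reaches that count."""
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("all aligned sequences must have the same length")
    out = []
    for column in zip(*aligned):
        # Count residues only; gap characters never form a consensus.
        counts = Counter(c for c in column if c not in gap_chars)
        if counts:
            residue, n = counts.most_common(1)[0]
        else:
            residue, n = ".", 0
        out.append(residue if n >= min_count else ".")
    return "".join(out)


# Same threshold as in the text (14), applied to the visible rows only.
result = consensus(seqs, min_count=14)
print(result)
```

Because three of the 18 rows are missing here, any column reported as conserved by this sketch also satisfies the 14-of-18 rule, but a column that falls just short among the visible rows may still qualify in the full alignment. Lowering `min_count` is an easy way to see how strongly the result depends on the chosen definition of consensus.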
- Fall '10