Evolutionary Relatedness

  1. You will compare the sequences of the protein hemoglobin from bats, birds, and mammals. Decide whether you want to work with alpha-hemoglobin (α-hemoglobin) or beta-hemoglobin (β-hemoglobin). These are the two protein chains that carry oxygen in the circulatory systems of animals. Both proteins have been studied extensively in many species and should work equally well. Alternatively, you may want to collaborate with a partner and do companion searches, one of you searching with α-hemoglobin and the other with β-hemoglobin. At the end of the exercise, you can compare your results to determine whether the two proteins show the same evolutionary relationships among species of bats, birds, and mammals.
  2. Go to http://www.uniprot.org/uniprot.
  3. In the Protein Knowledgebase (UniProtKB), under Query, type “alpha hemoglobin” or “beta hemoglobin” and click Search. The results of this search will come up on your screen. How many protein sequences were returned by this query?
    6,810
  4. You may scroll down and look through this long list of α-hemoglobin/β-hemoglobin sequences for one from a bat species, but it may be faster to narrow your search. Go back to Query and type “bat alpha hemoglobin” or “bat beta hemoglobin” and click Search. When you get the results of this search, how many α-hemoglobin/β-hemoglobin sequences did you get for bat species? 98 bat alpha hemoglobin

Note: Check the species names and common names for each α-hemoglobin/β-hemoglobin entry in this results list to make sure that they are bat sequences. A search may not distinguish between “bat” and another word that contains it, such as “wombat”!
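The check described in this note can also be done programmatically. The following is a minimal sketch, assuming you have copied the organism text of each hit into a list; the function name `is_bat_entry` and the three example strings are illustrative only and are not part of the UniProt site.

```python
import re

def is_bat_entry(organism: str) -> bool:
    """Return True only when 'bat' or 'bats' appears as a whole word."""
    return re.search(r"\bbats?\b", organism, flags=re.IGNORECASE) is not None

# Example organism strings, written out by hand for illustration.
hits = [
    "Myotis velifer (Mouse-eared bat)",
    "Vombatus ursinus (Common wombat)",
    "Pteropus alecto (Black flying fox)",
]

for organism in hits:
    print(organism, "->", is_bat_entry(organism))
```

The word-boundary pattern rejects “wombat”, but it also rejects a bat whose common name does not contain the word “bat”, such as a flying fox, so checking the taxonomy by eye is still worthwhile.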

  1. Select one bat α-hemoglobin/β-hemoglobin sequence to save to a file by clicking on the color-highlighted Accession number for that protein sequence. An accession number is how protein sequences are identified and archived in databases. For α-hemoglobin sequences, the Entry name will start with the letters “HBA.” For β-hemoglobin sequences, the Entry name will start with “HBB.”
    Note: If you are working with a partner who is doing a companion study with β-hemoglobin, you will have to collaborate on your selection of which bat α-hemoglobin sequences to save.
  2. The page that opens will contain information about the sequence, such as the taxonomy of the organism from which it came. Midway down the page, you will see a “Sequence” section where you will find the protein sequence written with single-letter designations of the amino acids. Here is where you will find the FASTA hyperlink. FASTA is the best way to save sequence information to your file, because it is a plain-text format that most sequence analysis programs can read. Click the FASTA link to bring up a page containing the sequence.
  3. Copy the amino acid sequence to a file:
    a. Highlight the amino acid sequence (the entire text on your FASTA-formatted page, including the description line)
    b. Right click, and select Copy
    c. Open Notepad, right click, select Paste
    d. Save your file
  4. Return to the web page with the list of bat α-hemoglobin/β-hemoglobin sequences (clicking twice on the Back button in the web browser will get you there). Identify another bat α-hemoglobin/β-hemoglobin sequence and repeat the process of copying the FASTA-formatted amino acid sequence into your file. Save all your FASTA-formatted α-hemoglobin/β-hemoglobin sequences together in one file.
  5. When you have saved two α-hemoglobin/β-hemoglobin sequences from two bat species, repeat these steps to get 2 sequences from bird species and 2 sequences from mammalian species. It does not matter which species you choose, as long as 2 are from birds and 2 are from mammals. You might want to choose species that you think are related to bats. If you are collaborating with a partner searching for β-hemoglobin sequences, your partner should search for the same species that you have chosen.

Note: Be aware that if you limit your search for bird α-hemoglobin/β-hemoglobin sequences with the keyword “bird,” the search will only locate protein entries in which the word “bird” appears. If an entry was archived under another description, such as “hawk,” “eagle,” or “penguin,” you will not find it using the keyword “bird.”
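Before moving on to the alignment, it can help to confirm that the saved file really contains six separate records. The following is a minimal sketch of a FASTA reader, assuming the standard layout in which each record starts with a “>” description line followed by one or more sequence lines; the function name `parse_fasta` and the two records are made-up placeholders, not real UniProt entries.

```python
def parse_fasta(text: str) -> dict[str, str]:
    """Map each '>' description line to its sequence, joined across wrapped lines."""
    records: dict[str, str] = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            records[header] += line
    return records

# Made-up placeholder records; real entries come from the UniProt FASTA pages.
example = """>example_bat_1 placeholder header
MVLSPADKTN
VKAAWGKV
>example_bird_1 placeholder header
MVLSAADKNN
"""

records = parse_fasta(example)
for header, sequence in records.items():
    print(header, len(sequence))
```

For the exercise, the text of your own file would replace `example`, and `len(records)` should equal 6.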

  1. When you have saved six α-hemoglobin/β-hemoglobin sequences to your file (two from bats, two from birds, and two from mammals), go to http://www.genome.jp/tools/clustalw. ClustalW is a computer program that you can use to search for sequence similarities among many sequences at a time and display regions of alignment.
  2. Copy your entire file of sequences into the textbox. Note that the sequence descriptions preceded by “>” will be copied in with the protein sequences. This will not be a problem for your search. Without changing any of the default settings, click on the blue Execute Multiple Alignment bar.
  3. The next page will show the alignment of amino acid sequences for the 6 proteins that you retrieved from the UniProt database, using the single-letter designations for amino acids. An asterisk will appear along the bottom row of the alignment at each position where the same amino acid is found in all 6 proteins. These amino acids are said to be highly conserved because they have not changed since these species diverged from a common ancestor.

a. How many of the amino acids are the same in all 6 of the α-hemoglobin/β-hemoglobin sequences in your alignment?

b. What percentage of all the α-hemoglobin/β-hemoglobin amino acids are conserved in all 6 proteins?

c. Are there any specific regions of the α-hemoglobin/β-hemoglobin sequences that are especially conserved? Is one end of the molecule more conserved than the other? Describe your observations.

d. Are there any amino acids that appear more frequently in conserved regions of the protein than in the non-conserved regions? If so, which amino acids are they?

e. If you did find amino acids that were more frequently conserved in your alignment report, were they the ones with side groups that are nonpolar, polar, or charged?
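Questions a and b amount to counting alignment columns in which every sequence has the same residue. The following is a minimal sketch of that count, assuming the aligned sequences have been copied out as equal-length strings with “-” marking gaps; the function name `conserved_columns` and the three-sequence toy alignment are invented for illustration and are not real hemoglobin data.

```python
def conserved_columns(aligned: list[str]) -> list[int]:
    """Return the column indices where all sequences share one residue (gaps excluded)."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must all have the same length")
    return [
        i
        for i in range(length)
        if aligned[0][i] != "-" and all(seq[i] == aligned[0][i] for seq in aligned)
    ]

# Toy alignment, invented for illustration only.
toy = [
    "MVLSPADKTN",
    "MVLSAADKNN",
    "MVLS-EDKTN",
]

cols = conserved_columns(toy)
percent = 100 * len(cols) / len(toy[0])
print(len(cols), "conserved columns,", f"{percent:.1f}% of the alignment")
```

This sketch divides by the alignment length, gaps included. Dividing by the length of one ungapped sequence instead is an equally defensible choice, as long as you state which denominator you used.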

  1. At the top of your ClustalW report, you will find the exact percentages of amino acids in the sequence alignment that are identical when comparing only two sequences at a time. For example, if your report says Sequences (1:2) Aligned. Score: 87.2, this means that when the first two sequences saved to your file were aligned, 87.2% of the amino acids were identical in both sequences. Transfer these percentages into a table in which the species whose sequences you have aligned are the headers for both the columns and the rows. Your table should look like Table 1.

Table 1. Percent identity in amino acid alignment for α-hemoglobin
Species     Bat #1   Bat #2    Bird #1   Bird #2   Mammal #1   Mammal #2
Bat #1      100      95.0355   62.4113   69.5035   85.9155     85.2113
Bat #2      -        100       60.9929   68.7943   86.5248     85.1064
Bird #1     -        -         100       60.2837   60.9929     59.5745
Bird #2     -        -         -         100       70.922      72.3404
Mammal #1   -        -         -         -         100         86.6197
Mammal #2   -        -         -         -         -           100

Note: You do not need to fill out both halves of this table since the information is redundant. From this table, can you see whether the α-hemoglobin/β-hemoglobin sequences are more similar between bats and birds or between bats and mammals? What does this suggest about the evolutionary relatedness of these species? Which species diverged from each other most recently and have the most recent common ancestor? Which species have been diverging from each other the longest and have the most ancient common ancestor? From the information in this table, you should be able to predict whether bats are more closely related to birds or to mammals.
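The comparison asked for in this note can be worked through as a short calculation. The following is a minimal sketch, assuming the report lines follow the “Sequences (1:2) Aligned. Score: 87.2” pattern quoted above and that sequences 1 to 6 were saved in the order Bat #1, Bat #2, Bird #1, Bird #2, Mammal #1, Mammal #2; the helper names `parse_scores` and `group_mean` are hypothetical, and the score lines are rebuilt from the sample values in Table 1.

```python
import re
from statistics import mean

SCORE_LINE = re.compile(r"Sequences \((\d+):(\d+)\) Aligned\. Score:\s*([\d.]+)")

def parse_scores(report: str) -> dict[frozenset, float]:
    """Collect pairwise scores keyed by the unordered pair of sequence numbers."""
    return {
        frozenset((int(a), int(b))): float(score)
        for a, b, score in SCORE_LINE.findall(report)
    }

def group_mean(scores: dict[frozenset, float], group_a, group_b) -> float:
    """Average the scores of every pair with one member in each group."""
    return mean(scores[frozenset((a, b))] for a in group_a for b in group_b)

# Score lines rebuilt from the Table 1 sample values (assumed sequence order:
# 1-2 bats, 3-4 birds, 5-6 mammals).
report = """
Sequences (1:2) Aligned. Score: 95.0355
Sequences (1:3) Aligned. Score: 62.4113
Sequences (1:4) Aligned. Score: 69.5035
Sequences (1:5) Aligned. Score: 85.9155
Sequences (1:6) Aligned. Score: 85.2113
Sequences (2:3) Aligned. Score: 60.9929
Sequences (2:4) Aligned. Score: 68.7943
Sequences (2:5) Aligned. Score: 86.5248
Sequences (2:6) Aligned. Score: 85.1064
Sequences (3:4) Aligned. Score: 60.2837
Sequences (3:5) Aligned. Score: 60.9929
Sequences (3:6) Aligned. Score: 59.5745
Sequences (4:5) Aligned. Score: 70.922
Sequences (4:6) Aligned. Score: 72.3404
Sequences (5:6) Aligned. Score: 86.6197
"""

scores = parse_scores(report)
bats, birds, mammals = (1, 2), (3, 4), (5, 6)
bat_bird = group_mean(scores, bats, birds)
bat_mammal = group_mean(scores, bats, mammals)
print(f"bat vs bird:   {bat_bird:.2f}%")
print(f"bat vs mammal: {bat_mammal:.2f}%")
```

Keying each score by an unordered pair mirrors the note above: only one half of the table needs to be stored, because the score for (1:2) is the same as for (2:1).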

A phylogenetic tree can present the relatedness of species from sequence similarity data, such as the data in your Table 1. These trees link species that are more closely related in branches, and the length of each branch represents evolutionary distance. You can draw a phylogenetic tree from your amino acid alignment report by pairing the species that have the most sequence similarity to make short branches. Species that have less sequence similarity will branch from each other farther apart on the tree. The ClustalW page on which your report appears will automatically draw a phylogenetic tree for you. At the bottom of the page, open the pull-down menu and select Dendrogram or Rooted Dendrogram. Print out the tree that appears on the screen. Does the information on this tree agree with your analysis above of the Percent identity in amino acid alignment for α-hemoglobin/β-hemoglobin table? Explain.
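The pairing procedure described above can be sketched as average-linkage clustering (often called UPGMA). This is a simplified stand-in chosen for illustration and is not necessarily the method the ClustalW server uses to draw its dendrogram, so the branch order and lengths on the server's tree may differ. The sketch assumes that distance can be approximated as 100 minus percent identity; the function name `upgma_merge_order` is hypothetical, and the values are the sample values from Table 1.

```python
from itertools import combinations

def upgma_merge_order(names, identity):
    """Return the pair of clusters joined at each step, closest pair first.

    identity maps frozenset({name_a, name_b}) to percent identity;
    distance is taken as 100 - identity (an assumption of this sketch).
    """
    clusters = [(name,) for name in names]
    merges = []

    def distance(c1, c2):
        # Average linkage: mean distance over all cross-cluster pairs.
        pairs = [(a, b) for a in c1 for b in c2]
        return sum(100 - identity[frozenset(p)] for p in pairs) / len(pairs)

    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: distance(*pair))
        clusters.remove(c1)
        clusters.remove(c2)
        clusters.append(c1 + c2)
        merges.append((c1, c2))
    return merges

names = ["Bat #1", "Bat #2", "Bird #1", "Bird #2", "Mammal #1", "Mammal #2"]
# Upper triangle of Table 1, row by row, without the diagonal.
upper = [
    [95.0355, 62.4113, 69.5035, 85.9155, 85.2113],
    [60.9929, 68.7943, 86.5248, 85.1064],
    [60.2837, 60.9929, 59.5745],
    [70.922, 72.3404],
    [86.6197],
]
identity = {
    frozenset((names[i], names[i + 1 + j])): value
    for i, row in enumerate(upper)
    for j, value in enumerate(row)
}

merges = upgma_merge_order(names, identity)
for step, (left, right) in enumerate(merges, start=1):
    print(step, left, "+", right)
```

With the Table 1 sample values, the two bats are joined first, then the two mammals, and the bat pair joins the mammal pair before either bird sequence is added.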

  1. One way to evaluate the validity of the phylogenetic tree that you drew for bats, birds, and mammals is to compare it with trees constructed from sequences of other proteins. Compare your tree with the tree constructed by your partner, who searched for β-hemoglobin sequences. Does your tree derived from α-hemoglobin sequences agree with the one drawn from β-hemoglobin sequences? Are the relative lengths of the branches the same?

Sample Solution