Skill:
• Analysis of cladograms to deduce evolutionary relationships
Constructed cladograms all typically share certain key features:
- Root – The initial ancestor common to all organisms within the cladogram (incoming line shows it originates from a larger clade)
- Nodes – Each node corresponds to a hypothetical common ancestor that speciated to give rise to two (or more) daughter taxa
- Outgroup – The most distantly related species in the cladogram which functions as a point of comparison and reference group
- Clades – A common ancestor and all of its descendants (i.e. a node and all of its connected branches)
Key Features of a Cladogram
Constructing Cladograms
Cladograms can be constructed based on either a comparison of morphological (structural) features or molecular evidence
- Historically, structural features were used to construct cladograms, but molecular evidence is now more commonly used
1. Using Structural Evidence
Step 1: Organise selected organisms according to defined characteristics
- Use characteristics that are developmentally fixed (i.e. innate) and not influenced by environmental pressures
Step 2: Sequentially order organisms according to shared characteristics to construct a cladogram
- Grouping of organisms may be facilitated by constructing a Venn diagram prior to developing a cladogram
- Each characteristic will be represented by a node, with more common characteristics representing earlier nodes
- The species with the fewest characteristics in common with the others will represent the outgroup (establishes baseline properties)
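The ordering described in Step 2 can be sketched in a few lines of Python. This is only an illustration: the taxa and traits below are a common textbook-style example chosen for this sketch rather than data from these notes, and the function names are hypothetical.

```python
# Illustrative character sets (an assumption for this sketch, not data from the notes).
# Each taxon maps to the set of characteristics it possesses.
taxa = {
    "Lamprey": {"vertebrae"},
    "Shark": {"vertebrae", "jaws"},
    "Frog": {"vertebrae", "jaws", "four limbs"},
    "Lizard": {"vertebrae", "jaws", "four limbs", "amniotic egg"},
    "Mouse": {"vertebrae", "jaws", "four limbs", "amniotic egg", "hair"},
}


def trait_counts(taxa):
    """Count how many taxa possess each characteristic."""
    counts = {}
    for present in taxa.values():
        for trait in present:
            counts[trait] = counts.get(trait, 0) + 1
    return counts


def order_traits(taxa):
    """Most widely shared characteristics first (these mark the earliest nodes)."""
    counts = trait_counts(taxa)
    return sorted(counts, key=lambda trait: (-counts[trait], trait))


def order_taxa(taxa):
    """Fewest characteristics first (the first taxon is the outgroup candidate)."""
    return sorted(taxa, key=lambda name: (len(taxa[name]), name))


node_order = order_traits(taxa)
branch_order = order_taxa(taxa)
print("Node order:", node_order)
print("Outgroup candidate:", branch_order[0])
```

Reading the two lists together gives the branching order: each characteristic in the node order separates the next taxon in the branch order from the remaining group.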
2. Using Molecular Evidence
Step 1: Select a gene or protein common to a range of selected organisms
- Examples of molecules which are ubiquitously found in many animals include haemoglobin and cytochrome c
Step 2: Copy the molecular sequence (DNA or amino acid) for each of the selected organisms
- Use online databases such as GenBank or Ensembl to identify relevant DNA or amino acid sequences
- Sequences can be collated in a Word document and then saved as a document in plain text format (.txt)
- Before each sequence, designate a species name preceded by a greater-than sign (e.g. '>Human' or '>Chimpanzee'), as used in the FASTA format
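The plain text layout described in Step 2 can also be produced programmatically. The sketch below is a minimal example: the function names are hypothetical and the sequences are short placeholder strings, not real database entries.

```python
def to_fasta(records, width=60):
    """Format {name: sequence} as plain text: a '>' header line, then the sequence."""
    lines = []
    for name, seq in records.items():
        lines.append(f">{name}")
        # Wrap long sequences over several lines of at most `width` characters
        for start in range(0, len(seq), width):
            lines.append(seq[start:start + width])
    return "\n".join(lines) + "\n"


def parse_fasta(text):
    """Read the same layout back into a {name: sequence} dictionary."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:]
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


# Placeholder sequences for illustration only (NOT real haemoglobin data)
records = {"Human": "ACGTACGTAC", "Chimpanzee": "ACGTACGTAA"}
fasta_text = to_fasta(records)
print(fasta_text)
```

The resulting text can be saved with a .txt extension and uploaded to an alignment tool in place of a manually collated document.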
Step 3: Run a multiple alignment to compare molecular sequences (DNA or amino acid)
- Multiple alignment software compares DNA or protein sequences for similarities and differences
- Closely related species are expected to have a higher degree of similarity in their molecular sequence
- Clustal Omega is a free online tool that will align multiple DNA or amino acid sequences for comparison
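Once sequences are aligned, the comparison in Step 3 comes down to counting the positions at which two sequences differ. The following sketch assumes the sequences are already aligned to equal length, with '-' marking gaps. The species labels and sequences are placeholders, and skipping gapped positions is a simplifying choice made for this example.

```python
from itertools import combinations


def count_differences(seq_a, seq_b):
    """Count aligned positions that differ, ignoring positions with a gap ('-')."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b and "-" not in (a, b))


# Placeholder aligned sequences (illustrative only, not real data)
aligned = {
    "Species A": "ACGTACGTAC",
    "Species B": "ACGTACGTAA",
    "Species C": "ACGTTCGAAA",
}

# Number of differences for every pair of species
differences = {
    (x, y): count_differences(aligned[x], aligned[y])
    for x, y in combinations(aligned, 2)
}

for pair, n in differences.items():
    print(pair, n)

# The pair with the fewest differences is expected to be the most closely related
closest_pair = min(differences, key=differences.get)
print("Closest pair:", closest_pair)
```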
Step 4: Generate a phylogenetic tree (cladogram) from multiple alignment data
- Clustal Omega can generate branched phylograms after a sequence alignment is completed (select ‘Phylogenetic Tree’)
- Below is a plain text file that can be uploaded to compare amino acid sequences from different species:
- HBA – Haemoglobin alpha chain (amino acid sequence) from various species
Multiple Alignment of a Protein Sequence from Various Species
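To show how a branching pattern can be derived from pairwise differences, the sketch below repeatedly joins the two closest groups (a simple clustering approach known as UPGMA). It is chosen here only because it is easy to follow and is not claimed to be the method Clustal Omega uses. The taxa labels and the distance values are hypothetical.

```python
def upgma(names, dist):
    """Join the closest pair of clusters until one tree remains (nested tuples)."""
    # Each cluster id maps to (subtree, number of taxa in the subtree)
    clusters = {i: (name, 1) for i, name in enumerate(names)}
    d = {
        frozenset((i, j)): dist[i][j]
        for i in range(len(names))
        for j in range(i + 1, len(names))
    }
    next_id = len(names)
    while len(clusters) > 1:
        # Pick the closest pair (ties broken by cluster id for repeatability)
        pair = min(d, key=lambda p: (d[p], sorted(p)))
        i, j = sorted(pair)
        (tree_i, n_i), (tree_j, n_j) = clusters.pop(i), clusters.pop(j)
        # Distance from the merged cluster to each remaining cluster is the
        # size-weighted average of the two old distances
        for k in clusters:
            d[frozenset((next_id, k))] = (
                n_i * d[frozenset((i, k))] + n_j * d[frozenset((j, k))]
            ) / (n_i + n_j)
        # Drop distances that involve the two clusters just merged
        d = {p: v for p, v in d.items() if i not in p and j not in p}
        clusters[next_id] = ((tree_i, tree_j), n_i + n_j)
        next_id += 1
    tree, _ = next(iter(clusters.values()))
    return tree


def to_newick(tree):
    """Write the nested tuples in Newick bracket notation (without the final ';')."""
    if isinstance(tree, str):
        return tree
    return "(" + ",".join(to_newick(sub) for sub in tree) + ")"


# Hypothetical pairwise difference counts between four taxa (illustrative only)
names = ["A", "B", "C", "D"]
dist = [
    [0, 1, 3, 6],
    [1, 0, 2, 6],
    [3, 2, 0, 6],
    [6, 6, 6, 0],
]

newick = to_newick(upgma(names, dist)) + ";"
print(newick)
```

In this made-up example A and B are joined first, then C, and D is left as the most distantly related taxon, which corresponds to the outgroup.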
Understanding:
• Evidence for which species are part of a clade can be obtained from the base sequence of a gene or the corresponding amino acid sequence of a protein
All organisms use DNA and RNA as genetic material and the genetic code by which proteins are synthesised is (almost) universal
- This shared molecular heritage means that base and amino acid sequences can be compared to ascertain levels of relatedness
Over the course of millions of years, mutations will accumulate within any given segment of DNA
- The number of differences between comparable base sequences demonstrates the degree of evolutionary divergence
- A greater number of differences between comparable base sequences suggests more time has passed since two species diverged
- Hence, the more similar the base sequences of two species are, the more closely related the two species are expected to be
When comparing molecular sequences, scientists may use non-coding DNA, gene sequences or amino acid sequences
- Non-coding DNA provides the most sensitive means of comparison, as mutations accumulate more readily in sequences that are not constrained by function
- Gene sequences mutate at a slower rate, as changes to base sequence may potentially affect protein structure and function
- Amino acid sequences may also be used for comparison, but will have the slowest rate of change due to codon degeneracy
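The effect of codon degeneracy can be illustrated with a short example in which several base changes leave the amino acid sequence unchanged. The sketch below uses a small subset of the standard genetic code (coding-strand DNA codons). The three DNA sequences are invented for illustration.

```python
# Subset of the standard genetic code (coding-strand DNA codons)
CODONS = {
    "ATG": "Met",
    "GGT": "Gly", "GGC": "Gly", "GGA": "Gly", "GGG": "Gly",
    "GAA": "Glu", "GAG": "Glu",
    "CTG": "Leu", "CTT": "Leu",
    "GTG": "Val",
}


def translate(dna):
    """Translate a DNA sequence codon by codon into a list of amino acids."""
    return [CODONS[dna[i:i + 3]] for i in range(0, len(dna), 3)]


def differences(a, b):
    """Count the positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


# Invented sequences for illustration only
original = "ATGGGTGAGCTG"  # Met-Gly-Glu-Leu
silent = "ATGGGCGAACTT"    # three base changes, all at third codon positions
missense = "ATGGGTGTGCTG"  # one base change that alters an amino acid (Glu to Val)

for label, variant in [("silent", silent), ("missense", missense)]:
    base_changes = differences(original, variant)
    amino_changes = differences(translate(original), translate(variant))
    print(label, "base differences:", base_changes, "amino acid differences:", amino_changes)
```

The first variant differs at three bases yet encodes an identical protein, which is why amino acid sequences tend to show fewer differences than the underlying DNA.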
Amino acid sequences are typically used to compare distantly related species (i.e. different taxa), while DNA or RNA base sequences are often used to compare closely related organisms (e.g. different haplogroups – such as various human ethnic groups)
Comparison of the Haemoglobin Beta Chain in Different Species
Understanding:
• Sequence differences accumulate gradually so there is a positive correlation between the number of differences between two species and the time since they diverged from a common ancestor
Some gene or protein sequences may accumulate mutations at a relatively constant rate (e.g. 1 change per million years)
If this rate of change is reliable, scientists can calculate the time of divergence according to the number of differences
- E.g. If a gene accumulates differences at a rate of 1 bp per 100,000 years and differs by 6 bp between two species, divergence occurred approximately 600,000 years ago
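The calculation in the example above is a simple multiplication, sketched below. It assumes the stated rate describes how quickly differences accumulate between the two species being compared and that this rate stays constant. The function name is hypothetical.

```python
def divergence_time(differences, years_per_difference):
    """Estimate years since divergence, assuming a constant rate of change."""
    if differences < 0 or years_per_difference <= 0:
        raise ValueError("differences must be >= 0 and the rate must be positive")
    return differences * years_per_difference


# Worked example from the text: 6 bp different at 1 bp per 100,000 years
estimate = divergence_time(6, 100_000)
print(f"Estimated divergence: {estimate:,} years ago")
```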
This concept is called the molecular clock and is limited by a number of factors:
- Different genes or proteins may change at different rates (e.g. haemoglobin mutates more rapidly than cytochrome c)
- The rate of change for a particular gene may differ between different groups of organisms
- Over long periods, earlier changes may be reversed by later changes, potentially confounding the accuracy of predictions
Molecular Clocks
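The last limitation, where later changes reverse earlier ones, can be illustrated by tracking every change made to a sequence and comparing that total with the number of differences that remain visible. The sequence and the list of changes below are invented for this sketch.

```python
def apply_changes(seq, changes):
    """Apply (position, new_base) substitutions to a sequence in order."""
    bases = list(seq)
    for position, new_base in changes:
        bases[position] = new_base
    return "".join(bases)


def observed_differences(a, b):
    """Count the positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


# Invented example: the third change reverses the first one
ancestral = "ACGT"
changes = [(0, "G"), (2, "T"), (0, "A")]

descendant = apply_changes(ancestral, changes)
actual = len(changes)
observed = observed_differences(ancestral, descendant)

print("Descendant sequence:", descendant)
print("Actual changes:", actual)
print("Observed differences:", observed)
```

Three changes occurred but only one difference is detectable, so an estimate based on observed differences alone would understate the time since divergence.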