7. Gene trees and DNA data
This section emphasizes that our data are the outcome of a graphical process. Namely, the long-term process of individuals reproducing within a pedigree structure gives rise to trees that connect present-day genomes.
The type of data that we typically deal with will be DNA sequence genotypes.
The following text output shows diploid genotypes for five individuals.
The genome position of each site labels the remaining columns, and each genotype is written as allele 1/allele 2.
29 | 60 | 71 | 95 | individual |
---|---|---|---|---|
G/G | G/T | A/A | G/G | 1 |
G/G | T/G | A/A | G/G | 2 |
G/G | G/G | G/G | T/G | 3 |
G/G | G/G | G/A | T/G | 4 |
A/G | G/T | G/A | G/G | 5 |
From this variation table, you can read off whether each individual is a homozygote or a heterozygote at each site, and which alleles it carries.
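As a minimal sketch of that reading, the snippet below stores the table above as a Python dictionary and labels each genotype. The names `genotypes`, `zygosity`, and `heterozygous_sites` are made up for this illustration and are not part of the simulation that produced the table.

```python
# Genotypes copied from the variation table above.
# Outer keys are individuals; inner keys are genome positions.
genotypes = {
    1: {29: "G/G", 60: "G/T", 71: "A/A", 95: "G/G"},
    2: {29: "G/G", 60: "T/G", 71: "A/A", 95: "G/G"},
    3: {29: "G/G", 60: "G/G", 71: "G/G", 95: "T/G"},
    4: {29: "G/G", 60: "G/G", 71: "G/A", 95: "T/G"},
    5: {29: "A/G", 60: "G/T", 71: "G/A", 95: "G/G"},
}


def zygosity(genotype):
    """Return 'homozygote' if both alleles match, else 'heterozygote'."""
    allele_1, allele_2 = genotype.split("/")
    return "homozygote" if allele_1 == allele_2 else "heterozygote"


def heterozygous_sites(individual):
    """List the positions at which an individual carries two different alleles."""
    return [
        position
        for position, genotype in genotypes[individual].items()
        if zygosity(genotype) == "heterozygote"
    ]


for individual in genotypes:
    print(individual, heterozygous_sites(individual))
```

For example, individual 1 is heterozygous only at position 60, whereas individual 5 is heterozygous at positions 29, 60, and 71.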
The variation table shown here is the output of a computer simulation. The simulation created the gene tree with mutations shown in Fig. 7.1. This gene tree looks similar to those seen earlier when we discussed pedigrees. However, there are the following differences:
The time scale is quite a bit longer! Compare the y axis in Fig. 7.1 to Fig. 4.5.
There is a single common ancestor of all of our sample nodes. There is a single ancestor node because we have done something like simulating a pedigree forwards in time until all of our present-day individuals are the descendants of some (long dead) ancestor. What we have actually done is more clever: we simulated backwards in time using approaches that we will describe later.
We can get from Fig. 7.1 to our variation table using the logic described in Section 2 to trace from ancestral to derived states along a tree.
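The tracing logic can be sketched in a few lines of Python. Everything in the snippet is hypothetical: the toy tree, the site positions, and the alleles are invented for illustration and are not the tree in Fig. 7.1. The sketch also assumes at most one mutation per site.

```python
# Hypothetical toy tree (NOT the tree in Fig. 7.1), stored as child -> parent.
# Nodes "a", "b", and "c" are samples; "x" is an internal node.
parent = {"a": "x", "b": "x", "x": "root", "c": "root"}

# Hypothetical ancestral allele at each site.
ancestral = {10: "G", 25: "C"}

# Hypothetical mutations: site -> (node below the mutated branch, derived allele).
# Assumption: at most one mutation per site.
mutations = {10: ("x", "T"), 25: ("c", "A")}


def allele_at(node, site):
    """Walk from a node towards the root.

    If the walk passes through the node below the mutated branch, the node
    inherits the derived allele; otherwise it keeps the ancestral allele.
    """
    mutated_node, derived = mutations[site]
    while node is not None:
        if node == mutated_node:
            return derived
        node = parent.get(node)  # returns None once we step above the root
    return ancestral[site]


for sample in ["a", "b", "c"]:
    print(sample, {site: allele_at(sample, site) for site in ancestral})
```

In this toy example, samples `a` and `b` both descend from the branch above `x`, so they share the derived allele at site 10, while `c` keeps the ancestral allele there.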
Attention
Test yourself!
You should be able to do the following:
Variation tables like the one shown here leave a few things implicit:
The source of the data is unstated.
If these were DNA sequence data, how much DNA was sequenced?
If these were “SNP chip” data, what are the properties of the genotyping chips?
Further, for the gene trees:
We only show the sites where mutations occurred.
Therefore, you can assume that all other sites have the ancestral state for all sampled individuals/nodes.
Fig. 7.1 The gene tree that is the true evolutionary history of our variation table.
Fig. 7.2 Test yourself!