Friday, April 11, 2014

Phylogenetic diversity

Yesterday we discussed in our local journal club a recent paper arguing that the concept of phylogenetic diversity is flawed, or at a minimum not useful as a proxy for what the authors call "feature diversity".

Obviously to make sense of what I just wrote, a bit of background is needed. I will therefore use this post to explain what phylogenetic diversity is, and then discuss the actual problem (if there is one) the next time.

In trying to inform conservation policy or in comparing areas in general, we are often interested in assessing how much biological diversity exists in a habitat or region. The question might be which of two possible areas to protect if only one can be protected; in that case you want to pick the one that has greater conservation value, and that is understood to be the biologically more diverse of the two. More generally there might be research projects trying to understand patterns of diversity, e.g. why there is more diversity in some regions than in others. In any case, if we want to do more than just intuit we need metrics to quantify diversity.

The simplest approach is to count taxa, usually species. Area A contains 23 species, area B contains 15. A is richer and thus more biodiverse, end of story. This metric is often called richness.

However, this bean counting approach may make us uneasy if we imagine what those species might be. What if the 23 species in A are all grasses, but the 15 in B comprise ferns, grasses, daisies, oaks, lilies and suchlike? In that case we might want to make the argument that B has greater conservation value because it would preserve many branches of the tree of life whereas A only contains one of its twigs.

So we might say that there is a kind of diversity that is greater in B than in A and that richness does not capture. Intuitively, we all understand that, which is why we are excited about and try to protect evolutionarily isolated species such as the platypus or the Wollemi pine. ("Living fossils" and all that stuff.) But as scientists, we want to put numbers on things so that we can actually compare and quantify and analyse patterns.

In 1992, the conservation scientist Dan Faith published a metric that captures this kind of evolutionary "specialness": Phylogenetic Diversity (PD). The approach is as follows. First we need to infer a phylogenetic tree for all species of a group of interest. Then, to calculate the PD of an area, we look at the position on the tree of only the species found in the area and connect all of them back to the root of the tree.

Faith's original idea was to count the number of tree nodes in the connections, but these days the metric is generally computed from branch lengths, which might be numbers of character changes or times since divergence. That is, in effect the PD score of an area is the summed length of all branches in the union of the paths connecting the tree root to every terminal on the tree found in the area; a branch shared by several of those paths is counted only once. To make PD scores comparable between different study groups, they can be normalised by dividing by the total length of the complete tree including all terminals.
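
For those who prefer symbols, the definition can be sketched as follows. The notation is my own assumption for illustration and is not taken from Faith's paper: T is the set of all branches of the complete tree, \ell(b) is the length of branch b, S is the set of species found in the area, and P(s) is the set of branches on the path from the root to species s.

```latex
\mathrm{PD}(S) = \sum_{b \,\in\, \bigcup_{s \in S} P(s)} \ell(b),
\qquad
\mathrm{PD}_{\mathrm{norm}}(S) = \frac{\mathrm{PD}(S)}{\sum_{b \in T} \ell(b)}
```

Because the sum runs over a union of paths, closely related species that share most of their route back to the root add little to the score, while a distantly related species adds a long stretch of otherwise uncounted branches.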

Here is an example. In the above tree, every internode has a length of exactly one, making for an easy computation. If area A contains four species but they all form a single clade, then the branch lengths in the area (red) add up to a PD of 7. Area B also contains four species, but they are dispersed evenly across the phylogeny and thus represent more divergent evolutionary lineages. In this case, the branch lengths add up to 10, so here we have a higher PD score for the same number of species.
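
The same arithmetic can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tree is a hypothetical, perfectly balanced one with eight terminals and unit branch lengths, chosen because it yields the scores of 7 and 10 mentioned above, and it is not necessarily the tree of the example. The node names, the child-to-parent dictionary and both function names are invented for this sketch.

```python
# Hypothetical tree (an assumption for illustration): each entry maps a node
# to (parent, length of the branch leading to that node). The tree is
# balanced, has eight terminals t1..t8, and every branch has length 1.
TREE = {
    "n1": ("root", 1.0), "n2": ("root", 1.0),
    "n3": ("n1", 1.0), "n4": ("n1", 1.0),
    "n5": ("n2", 1.0), "n6": ("n2", 1.0),
    "t1": ("n3", 1.0), "t2": ("n3", 1.0),
    "t3": ("n4", 1.0), "t4": ("n4", 1.0),
    "t5": ("n5", 1.0), "t6": ("n5", 1.0),
    "t7": ("n6", 1.0), "t8": ("n6", 1.0),
}


def phylogenetic_diversity(tree, species):
    """Sum the lengths of all branches on the root-to-terminal paths."""
    visited = set()  # each branch is identified by the node it leads to
    for node in species:
        # Walk towards the root; stop early once we hit a branch that is
        # already counted, because everything above it is counted as well.
        while node in tree and node not in visited:
            visited.add(node)
            node = tree[node][0]
    return sum(tree[node][1] for node in visited)


def normalised_pd(tree, species):
    """PD divided by the total branch length of the complete tree."""
    total_length = sum(length for _parent, length in tree.values())
    return phylogenetic_diversity(tree, species) / total_length


area_a = ["t1", "t2", "t3", "t4"]  # four species forming a single clade
area_b = ["t1", "t3", "t5", "t7"]  # four species spread across the tree

print(phylogenetic_diversity(TREE, area_a))  # 7.0
print(phylogenetic_diversity(TREE, area_b))  # 10.0
print(normalised_pd(TREE, area_a))           # 0.5
```

Storing the tree as child-to-parent links keeps the walk from each terminal back to the root trivial, and the set of visited branches is what makes shared branches count only once. On this hypothetical tree the complete tree has a length of 14, so the clustered area covers half of it.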

Next time: A potential problem and what to make of it.


Faith D, 1992. Conservation evaluation and phylogenetic diversity. Biological Conservation 61: 1-10.
