Wednesday, August 20, 2014

Diversity metrics

I want to blog about two recently published papers, one on keys and one on a method for spatial analyses of biodiversity, but for the latter some groundwork is necessary. This post will provide that groundwork so that I can then cunningly link back to it.

The last 25 years or so have seen the rise of spatial studies of patterns of biodiversity. They have been made possible by the increased availability of large databases of specimen occurrence records, such as Australia's Virtual Herbarium. Where a generation ago most information on the occurrence of species came from distribution maps drawn by specialists on the various groups of organisms, we can now enter a species name into a database search and are rewarded with a large list of geocoded specimens ready for use in our analyses.

Over the same time, several new diversity metrics have been developed to allow ever more sophisticated analyses. What is a diversity metric? It is a numerical value that tells us how diverse the organisms of our study group are in a particular part of our study area.

The study area as a whole is divided into cells. Ideally these are equal-area cells of, for example, 100 km x 100 km; alternatively they are biogeographical or political units. We can then look at our diversity metric and say, aha, in this cell there is particularly high diversity, and that might influence our decisions about what areas to prioritise for conservation. Okay, now what metrics are there?

The simplest diversity metric is really banal. Richness is the number of taxa at some rank - usually species - found in an area. So we could say, area A has got twenty species but area B has got only ten, so obviously area A should have higher conservation priority.
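To make that concrete, here is a minimal sketch of Richness in Python. The `occurrences` dictionary and the species names in it are made up for illustration; in a real analysis the species lists per cell would come from geocoded specimen records.

```python
# Hypothetical occurrence data: cell name -> set of species recorded there.
# Area A has twenty species, area B has ten, as in the example above.
occurrences = {
    "A": {"sp%d" % i for i in range(1, 21)},
    "B": {"sp%d" % i for i in range(21, 31)},
}


def richness(cell_species):
    """Richness is simply the number of distinct taxa recorded in a cell."""
    return len(set(cell_species))


richness_by_cell = {cell: richness(spp) for cell, spp in occurrences.items()}
print(richness_by_cell)
```

On Richness alone, area A wins by twenty to ten, which is exactly the conclusion the following paragraphs call into question.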

What, however, if the twenty species occurring in A are all common as mud while the ten in B are very rare, and indeed cell B constitutes the majority of their natural habitat? So we might want to take rarity into account, and we might want a diversity metric for that.

Rarity across the landscape is also known as endemism; a species is commonly called "endemic to area B" if it only occurs in area B and nowhere else. So we could count the number of endemics in A and the number of endemics in B and compare them. However, we will quickly realise that endemism is a very relative concept and highly scale dependent. With such a simplistic approach we would, for example, not be able to differentiate between a cell on our map containing a species occurring in two cells total and a different cell containing a species occurring in thirty cells total. Because neither species is endemic to a single cell, both cells would end up with a score of zero, although obviously the first cell constitutes a striking 50% of the entire distribution of a species.
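The problem with simply counting strict endemics can be sketched in a few lines of Python. Everything here is a toy assumption: a made-up grid of 32 cells, one species called "narrow" found in two cells and one called "wide" found in thirty.

```python
def cells_occupied(occurrences):
    """Map each species to the set of cells it has been recorded in."""
    ranges = {}
    for cell, species in occurrences.items():
        for sp in species:
            ranges.setdefault(sp, set()).add(cell)
    return ranges


def strict_endemics(cell, occurrences):
    """Species found in this cell and in no other cell."""
    ranges = cells_occupied(occurrences)
    return {sp for sp in occurrences[cell] if ranges[sp] == {cell}}


# Hypothetical grid: "narrow" occurs in c1 and c2, "wide" in c3 to c32.
occurrences = {"c1": {"narrow"}, "c2": {"narrow"}}
occurrences.update({"c%d" % i: {"wide"} for i in range(3, 33)})

ranges = cells_occupied(occurrences)
endemics_c1 = strict_endemics("c1", occurrences)  # empty set
endemics_c3 = strict_endemics("c3", occurrences)  # empty set

# Share of each species' total range that a single cell represents.
share_c1 = 1 / len(ranges["narrow"])  # half of the whole distribution
share_c3 = 1 / len(ranges["wide"])    # one thirtieth
print(len(endemics_c1), len(endemics_c3), share_c1, share_c3)
```

Both cells score zero endemics, yet the shares of range they hold differ by a factor of fifteen. That difference is the information a graded endemism score needs to capture.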

These considerations led Crisp et al. (2001) to develop two endemism scores that can deal with different grades of endemism. The first is Weighted Endemism (WE), which is calculated as the sum of the inverses of the range sizes, counted in cells, of all species in the cell. In other words, if a cell contains only three species which are found in two, three and ten cells, respectively, its score would be WE = 1/2 + 1/3 + 1/10 = 0.933.
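The WE arithmetic is easy to reproduce. This is only a sketch of the formula as described here, with range size taken to be the number of cells a species occupies; the function name is mine and not taken from any published software.

```python
def weighted_endemism(range_sizes):
    """Sum of 1 / (number of cells occupied) over all species in a cell.

    range_sizes: one entry per species present in the cell, giving the
    total number of cells in which that species occurs (assumed >= 1).
    """
    return sum(1.0 / n for n in range_sizes)


# The worked example: three species found in 2, 3 and 10 cells.
we = weighted_endemism([2, 3, 10])
print(round(we, 3))  # 0.933
```

A species restricted to the cell contributes a full 1 to the score, while one spread over a hundred cells contributes only 0.01.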

However, they also immediately realised that WE is strongly correlated with richness: the more species you have in a cell, the higher WE is going to be even if the species are not particularly rare. To correct for that effect, Crisp et al. also introduced Corrected Weighted Endemism (CWE), defined as WE divided by Richness. In other words, CWE provides the average endemism score per species in the cell. In our example, it would be CWE = 0.933 / 3 = 0.311. The metric varies from one (all species in the cell are endemic to it) to close to zero (all species in the cell are found everywhere) and is thus readily comparable across studies.
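Continuing the same toy example, a sketch of CWE under the definition just given (WE divided by Richness). Again the function names are mine, and range sizes are assumed to be counts of occupied cells.

```python
def weighted_endemism(range_sizes):
    """Sum of inverse range sizes over all species in a cell."""
    return sum(1.0 / n for n in range_sizes)


def corrected_weighted_endemism(range_sizes):
    """WE divided by Richness, i.e. the mean endemism score per species."""
    if not range_sizes:
        raise ValueError("CWE is undefined for a cell without species")
    return weighted_endemism(range_sizes) / len(range_sizes)


cwe = corrected_weighted_endemism([2, 3, 10])
print(round(cwe, 3))  # 0.311

# The two ends of the scale on a hypothetical map of 1000 cells:
cwe_all_endemic = corrected_weighted_endemism([1, 1, 1])          # exactly 1
cwe_all_widespread = corrected_weighted_endemism([1000] * 3)      # 0.001
```

Because the score is an average, adding more widespread species to a cell pulls CWE down rather than up, which is the opposite of what happens with WE.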

But there is also another reason we might be concerned about relying on simple counting of taxa to infer how diverse an area is. Imagine you have, again, an area A with twenty species and an area B with ten species. Now imagine that all twenty species in A are grasses, but the species in B are one grass, one fern, one daisy, one orchid... you get the picture. There is some kind of diversity in area B that is different from both raw species number and from rarity across the landscape.

This kind of diversity is what we call Phylogenetic Diversity (PD), a concept first formalised by Faith (1992). In the way we currently use it in our analyses, the PD score of a cell on the map is calculated as the summed length of the union of the branches of the phylogenetic tree connecting all the species found in the cell to the root of the tree. Branches shared by several of the species are counted only once. So if you have three species in a cell that are in very distant parts of the phylogenetic tree, you will arrive at a higher score than for another cell with three species that are very close together on the phylogeny.
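Here is a small sketch of that calculation. The tree is invented for illustration and stored in a deliberately simple way, as a dictionary mapping each node to its parent and to the length of the branch above it; real analyses would read a phylogeny from a tree file instead.

```python
# Hypothetical toy tree: node -> (parent, length of the branch above node).
tree = {
    "grass1": ("grasses", 1.0),
    "grass2": ("grasses", 1.0),
    "grass3": ("grasses", 1.0),
    "grasses": ("monocots", 2.0),
    "orchid": ("monocots", 3.0),
    "monocots": ("root", 4.0),
    "fern": ("root", 7.0),
}


def branches_to_root(tip, tree):
    """All branches on the path from a tip to the root, named by child node."""
    branches = set()
    node = tip
    while node in tree:  # the root itself has no entry, so the walk stops
        branches.add(node)
        node = tree[node][0]
    return branches


def phylogenetic_diversity(species, tree):
    """Summed length of the union of root paths; shared branches count once."""
    spanned = set()
    for sp in species:
        spanned |= branches_to_root(sp, tree)
    return sum(tree[node][1] for node in spanned)


pd_close = phylogenetic_diversity({"grass1", "grass2", "grass3"}, tree)
pd_distant = phylogenetic_diversity({"grass1", "orchid", "fern"}, tree)
print(pd_close, pd_distant)  # 9.0 17.0
```

Both cells have a Richness of three, but the cell with a grass, an orchid and a fern spans nearly twice as much of this toy tree as the cell with three grasses.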

In this way, PD provides a quantitative metric for the evolutionary distinctness of the species occurring in an area, so that for example having the Wollemi Pine in a grid cell should boost its PD score substantially. Subsequent studies have demonstrated that although PD is correlated with Richness - unsurprisingly, because all else being equal, adding species will add more parts of the phylogeny to the score - there are indeed cases where PD is high although Richness isn't, which may have implications for conservation or for understanding biogeographic history.

One of the most recent additions is a logical combination of endemism and PD. Phylogenetic Endemism (PE) provides a metric for the geographic rarity of all parts of the phylogenetic tree found in a cell (Rosauer et al., 2009). In practice, it is the same as PD except that for each cell the length of each individual branch of the phylogeny is first divided by the size of the union of the areas of distribution of the members of that clade found in the cell. A different way of putting it is that PE is PD after range-weighting all the branches of the phylogeny: widely distributed groups at all levels make the value smaller, narrowly distributed groups keep it high.
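A rough sketch of PE along those lines follows. It rests on two assumptions of mine: the range of a branch is measured as the number of cells in which at least one species descending from that branch occurs, and the tree and occurrence data are invented toy examples in the same parent-dictionary layout as a simple PD calculation would use.

```python
# Hypothetical toy tree: node -> (parent, length of the branch above node).
tree = {
    "grass1": ("grasses", 1.0),
    "grass2": ("grasses", 1.0),
    "grasses": ("monocots", 2.0),
    "orchid": ("monocots", 3.0),
    "monocots": ("root", 4.0),
    "fern": ("root", 7.0),
}

# Hypothetical occurrences: cell -> species recorded there.
occurrences = {
    "c1": {"grass1", "fern"},
    "c2": {"grass1", "grass2"},
    "c3": {"grass2", "orchid"},
    "c4": {"grass2"},
}


def branches_to_root(tip, tree):
    """All branches on the path from a tip to the root, named by child node."""
    branches = set()
    node = tip
    while node in tree:
        branches.add(node)
        node = tree[node][0]
    return branches


def branches_in_cell(species, tree):
    """Union of the root paths of all species in one cell."""
    spanned = set()
    for sp in species:
        spanned |= branches_to_root(sp, tree)
    return spanned


def phylogenetic_endemism(occurrences, tree):
    """PE per cell: each branch length divided by the branch's range in cells."""
    spanned = {c: branches_in_cell(spp, tree) for c, spp in occurrences.items()}
    branch_range = {}
    for cell, branches in spanned.items():
        for b in branches:
            branch_range.setdefault(b, set()).add(cell)
    return {
        cell: sum(tree[b][1] / len(branch_range[b]) for b in branches)
        for cell, branches in spanned.items()
    }


pe = phylogenetic_endemism(occurrences, tree)
print({cell: round(score, 3) for cell, score in sorted(pe.items())})
```

In this toy data the cell with the fern scores highest, because the long fern branch is found nowhere else, while the deep branch shared by all monocots is spread evenly over the four cells. Under this way of counting ranges, the PE scores of all cells add up to the total branch length of the tree represented on the map.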

That was the state of the art before the publication of the paper I want to discuss in one of my next posts.


Crisp MD, Laffan S, Linder HP, Monro A, 2001. Endemism in the Australian flora. Journal of Biogeography 28: 183-198.
Faith DP, 1992. Conservation evaluation and phylogenetic diversity. Biological Conservation 61: 1-10.
Rosauer D, Laffan SW, Crisp MD, Donnellan SC, Cook LG, 2009. Phylogenetic endemism: a new approach for identifying geographical concentrations of evolutionary history. Molecular Ecology 18: 4061-4072.
