So, after laying the groundwork by looking into biodiversity metrics such as (Species) Richness, Corrected Weighted Endemism (CWE), Phylogenetic Diversity (PD) and Phylogenetic Endemism (PE), we can come to the first of two recent papers that I wanted to discuss. A few weeks ago several friends and colleagues, some of them from my own institution, published a new, quantitative method for locating hotspots of endemism called Categorical Analysis of Neo- and Paleo-Endemism (CANAPE; Mishler et al., 2014).
First, in case this is unclear, what is neo-, and what is paleo-endemism? As mentioned before, endemism means that a taxon is restricted to a certain area or, more generally speaking, that it has a (perhaps unusually) restricted distribution. There are different reasons why, say, a species could be restricted to a very small area. It could for example have a narrow ecological niche and be unable to survive anywhere else. But from a historical perspective, and all else being equal, we distinguish between two kinds of endemics: taxa that have a restricted distribution because they are very young and have just not had the time to spread elsewhere (neo-endemics), and ancient taxa that were formerly more widespread but now have a restricted distribution because they have died out elsewhere (paleo-endemics).
Of course, many species will start out as isolated populations diverging from a more widespread sister species and, if they are lucky and don't die out too soon, only spread out afterwards. Conversely, there are many species that are now very narrowly distributed although they formerly occurred elsewhere. A good example is the ginkgos, a strange group of gymnosperms that was once found across the entire northern hemisphere but is now restricted to a small area in China (and human cultivation).
Identifying areas that are hotspots of neo- or paleo-endemics has obvious biogeographic and conservation implications. Areas rich in neo-endemics would be those that are hotspots of recent speciation, while areas rich in paleo-endemics must presumably be environmentally stable over long periods to allow such ancient lineages to survive the changes that killed them off elsewhere.
Now how to identify those areas? Again, a neo-endemic is a young taxon with a restricted distribution, and a paleo-endemic is an old taxon with a restricted distribution. At the same time we are thinking phylogenetically, so we are not only interested in the tips of the tree but in taxa (clades) at all levels of the tree of life. So what we are searching for are signs that an area on the map has got rare branches of the phylogeny and either significantly more short branches of the phylogeny (neo) or significantly more long branches of the phylogeny (paleo) than a 'normal' area.
However, some of these patterns might cancel each other out when forced into a single value in our analysis. For example, if we are looking at Phylogenetic Endemism, low values can come from the clades being widespread or from the branches on the phylogeny being short - and in the latter case that interesting signal of young taxa might cancel out the signal of geographic rarity that we are also interested in.
This is why the new CANAPE method works in two steps. First, cells are identified that show either significantly high geographic rarity at all phylogenetic levels or significantly high Phylogenetic Endemism (or both). These are hotspots of some form of rarity. To find out which form, they are then classified into three groups based on whether they show significantly high or low values of a derived metric called Relative Phylogenetic Endemism (RPE). This is the final, crucial step, so we have to take a closer look. RPE is defined as follows:
RPE = (Phylogenetic Endemism of the cell / geographic rarity of all phylogenetic levels in the cell) * (number of internodes in the phylogeny / total length of the phylogenetic tree)
The second factor is a constant across each study and only serves to standardise values to make them comparable across studies of different groups of organisms. The important part is the first factor. Again, Phylogenetic Endemism is Phylogenetic Diversity after range-scaling all branches of the phylogeny, a metric that provides information on both the evolutionary distinctness and the rarity of the contents of a cell. The denominator of the first factor is basically the same metric calculated for a hypothetical phylogeny with the same topology but all branches of equal length.
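To make the formula a bit more tangible, here is a minimal Python sketch of how RPE could be computed for a single cell. It is only an illustration of the definition given above, not the code used in the paper. The `Branch` class, the function names and the toy numbers are all my own assumptions; in particular I assume that each branch of the phylogeny comes with its length, the number of cells its clade occurs in, and a flag saying whether it occurs in the cell we are looking at.

```python
from dataclasses import dataclass


@dataclass
class Branch:
    """One branch (internode) of the phylogeny; a hypothetical helper type."""
    length: float    # branch length on the phylogeny
    range_size: int  # number of map cells in which the branch's clade occurs
    in_cell: bool    # does that clade occur in the focal cell?


def phylogenetic_endemism(branches):
    # PE: sum of range-scaled branch lengths for the branches found in the cell.
    return sum(b.length / b.range_size for b in branches if b.in_cell)


def geographic_rarity(branches):
    # The same sum, but every branch counts as length 1 (equal branch lengths).
    return sum(1.0 / b.range_size for b in branches if b.in_cell)


def relative_phylogenetic_endemism(branches):
    rarity = geographic_rarity(branches)
    if rarity == 0:
        raise ValueError("no branch of the phylogeny occurs in this cell")
    # Second factor: number of internodes / total length of the tree.
    scaling = len(branches) / sum(b.length for b in branches)
    return (phylogenetic_endemism(branches) / rarity) * scaling


# Toy phylogeny with four branches (made-up numbers). The first cell holds
# the long, rare branch; the second holds the short, rare branch instead.
old_cell = [
    Branch(6.0, 1, True),
    Branch(1.0, 1, False),
    Branch(1.0, 4, True),
    Branch(2.0, 2, False),
]
young_cell = [
    Branch(6.0, 1, False),
    Branch(1.0, 1, True),
    Branch(1.0, 4, True),
    Branch(2.0, 2, False),
]

print(relative_phylogenetic_endemism(old_cell))
print(relative_phylogenetic_endemism(young_cell))
```

With these toy numbers the cell holding the long rare branch comes out above 1 and the cell holding the short rare branch below 1. If all branches of the tree had the same length, every cell would come out at exactly 1, which is the point of the standardisation by the second factor.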
The idea is that the fraction of the two provides information on branch length distribution, that is whether they are longer (high values) or shorter (low values) than expected. And that brings us full circle: long branches are indicative of old taxa, short ones of young taxa. So based on RPE values, the cells containing significant amounts of some form of rarity are grouped into either hotspots of neo-endemism if the RPE is significantly low, hotspots of paleo-endemism if it is significantly high, or mixed (neither particularly neo nor paleo) endemism if it is somewhere in the middle.
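The two-step logic can be sketched in a few lines of Python. Again this is only an illustration under my own assumptions: I assume that significance is judged by comparing each observed value against a set of values from randomised data, that "significantly high" in the first step means the top 5%, and that the RPE test in the second step is two-tailed with 2.5% at each end. The function names and thresholds are hypothetical and are not taken from the paper's software.

```python
def rank_in_null(observed, null_values):
    """Proportion of null (randomised) values that the observed value exceeds."""
    if not null_values:
        raise ValueError("need at least one null value")
    return sum(1 for v in null_values if v < observed) / len(null_values)


def classify_cell(pe_rank, rarity_rank, rpe_rank, alpha=0.05):
    """Classify one cell from the ranks of its PE, rarity and RPE values.

    Each rank is a number between 0 and 1, as returned by rank_in_null.
    """
    # Step 1: is there significantly high rarity of some form in the cell?
    high_pe = pe_rank > 1 - alpha
    high_rarity = rarity_rank > 1 - alpha
    if not (high_pe or high_rarity):
        return "not significant"

    # Step 2: two-tailed test on RPE to decide which form of endemism it is.
    if rpe_rank > 1 - alpha / 2:
        return "paleo-endemism"  # branches longer than expected
    if rpe_rank < alpha / 2:
        return "neo-endemism"    # branches shorter than expected
    return "mixed endemism"


# Usage with made-up ranks for three cells.
print(classify_cell(pe_rank=0.99, rarity_rank=0.50, rpe_rank=0.99))
print(classify_cell(pe_rank=0.50, rarity_rank=0.99, rpe_rank=0.01))
print(classify_cell(pe_rank=0.50, rarity_rank=0.50, rpe_rank=0.99))
```

Note that the third cell is dropped in the first step despite its very high RPE rank, because it shows no significant rarity to begin with. That is the reason for doing the analysis in two steps rather than looking at RPE alone.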
This seems very complicated at first sight, but the problem is really that an attempt to analyse both geographic rarity and branch length distributions at all phylogenetic levels in the same test necessarily has to be quite intricate. The important point is that this is the first quantitative test for hotspots of neo- and paleo-endemism where previously one would have simply pointed a finger at the map and said, "I kind of intuitively think there are a lot of young rare taxa there".
Mishler BD, Knerr N, Gonzalez-Orozco CE, Thornhill AH, Laffan SW, Miller JT, 2014. Phylogenetic measures of biodiversity and neo- and paleo-endemism in Australian Acacia. Nature Communications 5: 4473.