Tuesday, July 1, 2014

In biological classification, rank is arbitrary

Today one of my alerts for scientific literature, well, alerted me to a newly published paper in the journal Paleobiology: Hendricks et al., The Generification of the Fossil Record.

The starting point of the article is something that I had also noticed before: Studies of changes in diversity across geological time scales often use supraspecific taxa as their operational taxonomic units (OTUs). Hendricks et al. stress the use of genera, but there are cases of even higher-level groups being counted, for example when the number of animal phyla before or after an extinction event is compared.

The problem is that while we can have a productive discussion about whether species are comparable with each other under this or that species concept, everybody but the most blinkered systematist will agree that higher ranks in the classification aren't comparable at all. If you compare their numbers, apples and oranges have nothing on it.

There are several reasons why that is the case.

Let's use genera as an example, to stay with the theme of Hendricks et al. In case it isn't clear, genera are something like Homo (of Homo sapiens), Drosophila (with Drosophila melanogaster as one of the species) or Eucalyptus. They are the main rank between species and family although intermediate categories such as subgenus are sometimes used. The important thing is that all the problems subsequently enumerated apply with at least equal force to all other supraspecific ranks in the classification, and they only get worse the higher up the classification you go.

First, and most obviously, genera can have very different numbers of species. Homo, like all too many genera of large mammals, has only one, whereas the gum tree genus Eucalyptus has hundreds. However, that would not necessarily matter if genera had some other trait that made them comparable, for example if all genera were of the same age.

But second, of course they are not of the same age, or if they are then only by accident. Homo diverged from Pan perhaps five to seven million years ago whereas Australian Acacia, for example, is at least 17 million years old. Other genera are an order of magnitude older still.

The underlying issue is that while there are species concepts that one could use, even if there is disagreement about which is the best, there simply is no 'genus concept'. Above the species level all we have is a phylogenetic tree of life with clades nested in larger clades, and the decision of what subclade to call a genus (or family, or order, or division...) is arbitrary. Often the decision is based partly on tradition; often it has a lot to do with how important something is to us. That is why genera of large mammals tend to be small while genera of beetles tend to be large.
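To make the arbitrariness concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: the toy tree, the species names "a" to "h", and the helper functions `leaves` and `cut` are my own assumptions, not anything from Hendricks et al. It takes one and the same phylogeny and draws the "genus" line at two different depths. Every resulting group is monophyletic in both cases, yet the genus count doubles while the species count stays the same.

```python
# Toy phylogeny as nested tuples; leaves are hypothetical species names.
TREE = ((("a", "b"), ("c", ("d", "e"))), ("f", ("g", "h")))


def leaves(node):
    """Return all species (leaf names) descended from a node."""
    if isinstance(node, str):
        return [node]
    return [sp for child in node for sp in leaves(child)]


def cut(node, depth):
    """Split the tree into clades by cutting `depth` levels below the root.

    Each returned group is a complete clade (all descendants of one node),
    so every group is monophyletic no matter where we cut.
    """
    if depth == 0 or isinstance(node, str):
        return [leaves(node)]
    return [group for child in node for group in cut(child, depth - 1)]


# A "lumper" names genera high up in the tree, a "splitter" further down.
lumper = cut(TREE, 1)
splitter = cut(TREE, 2)

print("lumper:  ", len(lumper), "genera", lumper)
print("splitter:", len(splitter), "genera", splitter)
print("species in both cases:", len(leaves(TREE)))
```

Both classifications are equally defensible on grounds of monophyly, so a "number of genera" read off either one says as much about the taxonomist as about the organisms.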

The only criterion is monophyly, and that is a necessary criterion, not a sufficient one. Even monophyly cannot be taken for granted, however. As Hendricks et al. point out, many genera are still non-monophyletic, often because they simply haven't yet been tested for monophyly. And that means that they are utterly useless for comparison; counting them is akin to asking the nonsense question "are Catholics taller than Australian citizens?"

So whenever I read how there were so-and-so-many animal families in this geological epoch and so-and-so-many in that, I flinch, because such a sentence is just meaningless. It was good to see the Hendricks et al. article, and I hope that people will at least become more careful in their interpretations.

1 comment:

  1. Speaking of delimiting families, did you get/respond to the APGIII survey via EvolDir today?
    It took me a fair few questions before I realized that it wasn't actually an exercise in gauging researchers' general views on sink/split, or how much/what data is appropriate to make such decisions, but was a survey on *particular* instances of taxonomic debate (placement of family x in relation to y). This realization cheesed me off as it seems a far less interesting use of a survey. I guess it's meant to allow for the opinions of people other than those already privileged to publish on each particular case, but the data and background provided for each example were laughably inadequate (including typos) for forming any sort of opinion unless one was already familiar with the group, which rather defeated the point of the exercise, I would have thought.
    Really all it did was highlight the fact that these decisions are effectively subjective, within vague bounds (especially where data is equivocal or poor), and for groups where I have no personal investment I found myself hitting the "I have no opinion" button more often than not.