Tuesday, November 25, 2014

Yay! A special issue all about paraphyly!

(The following is the first part of a series of posts on an Annals of the Missouri Botanical Garden special issue on "Evolutionary Systematics and Paraphyly". All posts in this series are tagged with "that special issue".)

In 2011, the International Botanical Congress, the largest meeting of plant scientists on the planet, and indeed so large a meeting that it is only held every six years or so, took place in Melbourne. Among the symposia organised at the IBC in that year there was one with the title “Evolutionary Systematics and Paraphyly”, chaired by some of the few botanists who still insist that paraphyletic supraspecific taxa should be accepted.

Sadly, I missed that symposium because I went to a more important parallel session. Recently, however, a special issue based on the symposium appeared in the Annals of the Missouri Botanical Garden. Time to blog about phylogenetic systematics again, it seems. My plan is to go through the articles one by one (with the exception of the last one, which for some unfathomable reason appears to be about pandas and has nothing to do with classification; indeed one wonders whether they mixed up the journal in which that paper was supposed to appear). But before I start, I want to clear my throat, so to speak.

There are several points that will come up again and again in the papers making up this special issue. One is cladists being called dogmatic or some variant thereof for insisting on relatedness as the one and only criterion for classification. Cladists want to base their phylogenetic system entirely on descent, whereas paraphylists* argue that their classification is superior because it takes two different criteria, descent and similarity or dissimilarity (“modification”), into account, an approach that is supposedly less dogmatic and more flexible.
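For readers who have not met the terminology before, here is a minimal sketch in Python of what "based entirely on descent" means in practice. Everything in it is made up for illustration: the tip names A to E, the shape of the toy tree, and the function names are my own assumptions and do not come from any of the papers. A group counts as monophyletic if it consists of an ancestor and all of its descendants; a group that leaves some of those descendants out is not.

```python
# Toy phylogeny written as nested tuples. The tips A-E and the topology
# are hypothetical placeholders, not real taxa.
TREE = ((("A", "B"), "C"), ("D", "E"))


def tips(node):
    """Return the set of tip labels found at or below a node."""
    if isinstance(node, str):
        return {node}
    result = set()
    for child in node:
        result |= tips(child)
    return result


def clades(node):
    """Yield the tip set of every clade: each node with all its descendants."""
    yield frozenset(tips(node))
    if not isinstance(node, str):
        for child in node:
            yield from clades(child)


def is_monophyletic(group, tree):
    """A group is monophyletic if it matches one clade of the tree exactly."""
    return frozenset(group) in set(clades(tree))


def left_out(group, tree):
    """Tips descended from the group's common ancestor but excluded from it."""
    group = frozenset(group)
    smallest = min((c for c in clades(tree) if group <= c), key=len)
    return smallest - group
```

On this toy tree, `is_monophyletic({"A", "B", "C"}, TREE)` holds, whereas the group `{"B", "C"}` fails the check, and `left_out({"B", "C"}, TREE)` reports that `A` has been excluded although it descends from the same common ancestor. Note that the check consults nothing but the tree: how similar or dissimilar A is to B and C never enters into it.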

Faced with this rather odd accusation, some cladists point out that they are exactly as dogmatic as any scientist insisting on the scientific method, or as any mathematician who insists that calculations have to add up. I would like to make a different point: Can you think of any other classification that uses two completely different criteria at the same time? And if so, is it actually a useful classification?

We hardly ever think about it, but we constantly classify things, both in science and in everyday life. Books in a library by topic, types of vehicles by use, school children by year, and so on. I have used the example before, but consider the books. If you want to find something in a library, it is extremely useful that the institution organises its books by topic. There is fiction and non-fiction; within fiction, you will find novels and poetry collections; and in the novel section, you will find subsections for Classics, for Science Fiction, for Fantasy, for Romance, etc. It works precisely because the classification is using one and only one criterion.

Now imagine a paraphylist approaching the library: this classification is too dogmatic, we should classify books by topic and by name of the author. Okay, you say, that is perhaps not a bad idea. At the lowest level, when we find that we have forty books under non-fiction / travel literature / China and run out of ideas for how to break up that category, then we can perhaps sort books within it alphabetically by the authors' last names. Ah, but we are talking past each other, replies the paraphylist, what I obviously mean is that we should classify by both criteria at the same time, and at every level, including the highest one.

So imagine: You have fiction and non-fiction, but all non-fiction books written by authors whose names start with A, F, G, J, S, U and Z go into the fiction section. Why? It sure won't help the user locate any book. Indeed it makes the classification utterly useless.

That is precisely the problem of so-called Evolutionary Systematics, and it is the problem of all classifications using more than one criterion at all levels, which is why we usually do not do something like that. The classification of living organisms seems to be special in that regard, at least as far as paraphylists are concerned.

Just to consider what is at stake in the discussion, I would also like to point out that there are many different classifications of living organisms, all of them with their special uses, and that no cladist would dream of dismantling any of them. We classify in many different ways by ecology, for example into herbivore, carnivore, omnivore, parasite, detritivore, etc., or plants into annuals, perennials, shrubs and trees. We may also classify plants by their pollination syndromes or by their dispersal syndromes.

Again we observe that each of these classifications uses precisely one criterion, but I want to draw attention to the fact that all these are special purpose phenetic, that is similarity-based, classifications. And no cladist has any problem with that. What the cladist says is merely that in addition to all these useful phenetic classifications, each using a single criterion, we should also have one phylogenetic system, also using a single criterion (because, again, only that makes a classification useful in the first place).

Really it is important to realise that that is what is going on here. It is not that, as sometimes claimed, the cladists dogmatically want to destroy everything that is not purely phylogenetic. It is instead the case that the paraphylists insist that there not be a phylogenetic system in existence. They want the only system that we have been developing to be predictive of relatedness, shared ancestry and, consequently, undiscovered shared inherited characters to be rendered useless. Regardless of whether their motivation is seen as noble or merely as misguided traditionalism, this is what is at stake.

Finally, when I write about the papers in this special issue several other points will also come up again and again. And indeed they have come up again and again since the 1970s. It is typical of the discussion about paraphyletic taxa that it goes in circles. It is likely that paraphylists would say the same of cladists, but my feeling is that at least some of them do not actually listen to counter-arguments.

For example, they may argue that a long branch on a phylogeny provides the justification to treat the group descended from that branch as separate from the paraphyletic residue around that long branch. It is pointed out to them that the long branch is an artefact of extinction and / or an incomplete knowledge of the fossil record, and that newly discovered extant or fossil species could easily break the long branch, throwing their classification into confusion. And the very same paraphylists then write a rebuttal in which they argue ... that a long branch on a phylogeny provides the justification to treat the group descended from that branch as separate from the paraphyletic residue around that long branch.

Which of course wouldn't be so much of a problem if they had bothered to address the counter-argument, but no, they merely reiterated their previous statement. This is a very polite way of covering one's ears and saying lalala-I-can't-hear-you, but it doesn't help move the discussion along.

A related problem that also leads to the discussion going in circles is that many paraphylists do not actually understand the cladist approach, the school of systematics that they reject. Among the most striking indications that one is dealing with somebody who does not know what they are talking about are any mention of “paraphyletic species” (unless it is specifically stated that we are talking about asexually reproducing organisms) and the argument that we need to accept paraphyletic taxa because of reticulate evolution.

The problem is in both cases the same: phylogenetic terminology and phylogenetic systematics can only apply to phylogenetic, that is tree-like, relationships. In a net-like relationship such as would be found within a sexually reproducing species or in any group undergoing severely reticulate evolution, words like paraphyletic or monophyletic do not apply; there is no phyly to be para in. The paraphylist making these arguments makes as much sense as if they were to lambaste somebody for not using meters above sea level on a planet that is utterly devoid of water.

After all this it should have become clear what it will take to convince me that the opponents of phylogenetic systematics are right. Merely arguing that mixing two incompatible criteria in one classification is better won't do because that is obviously false, see above. For several specific purposes we need a natural system that classifies organisms by their relatedness alone, next to all the other classifications that classify by pollinator alone, by ecology alone, by life cycle alone, and so on. Arguments involving paraphyletic species are also out, as are any that rest on a misrepresentation of how phylogenetic systematics works.

However, what would convince me is an argument that demonstrates phylogenetic systematics to be impossible, infeasible or misguided.

I will see how it goes.


*) Their autonym is often Evolutionary Systematists, but that is quite a mouthful, and I want to use something short and snappy like cladists. Also, I do not accept the implication that their approach is any more evolutionary than the cladist one. Quite the opposite, actually.


  1. I'm not a scientist, so I found your library analogy confusing. It sounds to me like a better analogy would be (just as a for instance) by topic and language rather than by surname. A library has already been sorted by language. Would you put all the books in English together, and then separate out Anglo-Saxon, Middle English, Elizabethan and so on first, or stick by topic? But then what do you do about books in French with topics that have clearly been influenced by specific books in English? Just trying to understand...!

  2. I am not sure I understand how you would sort our library.

    The point here is that it has to be clear what your categories are based on, because only then will somebody be able to use the system. A phylogenetic system uses only one criterion (down to the lowest level where it cannot apply any more, just as there comes a level where it might make sense to sort books by other criteria); but 'evolutionary' classification uses two criteria throughout.

    That means if I say, for example, "the genus Veronica" ("the science section"), and you know that we are dealing with a purely phylogenetic system, you can assume that all species in Veronica are each other's closest relatives (there are only science books in the section, and no science books elsewhere). Likewise, in a purely phenetic system, you would know that the group "stem succulents" contains all plants with a certain stem morphology, and only those (you will find all books authored by people called "Smith" in the same place).

    If, however, you know that you are dealing with an 'evolutionary' classification, you cannot make an assumption like that. "Veronica" might contain only species that are each other's closest relatives, if the paraphylist thought relatedness more important in this case. Or the closest relatives of some of them might be in the separate genus Hebe, if the paraphylist thought herbaceous vs. shrubby habit was more important.

    You don't know. It is as if you can't find astronomy books in the science section and then have to wonder whether the library just doesn't have any or whether they put them next to the SF novels or next to a poetry collection extolling the beauty of the night sky. (Because stars!) But usually you will wrongly assume that the library just doesn't have them, and usually the end user of a classification will wrongly assume that the paraphyletic Veronicas are a meaningful evolutionary unit...

  3. Thanks, Alex. Actually, my local library is sorted in multiple ways. First off, it always depends how the book classification decision has been made (many subjects cross - is it more economics, sociology or psychology if it dips into all 3?). Scifi horror romance? Over here, here and here! Then some end up on the children's shelves. And some are in the oversize, some in the stack, etc. And many are treated as part of the same catalogue but distributed around other library shelves throughout the county, even including being noted as in the administrator's office! A keyword search with wildcards may work better than looking on the shelves, but not if the title's too clever for its own good and nobody's really understood the book.
    I'm not saying your argument's wrong, just not convinced by the analogy.
    I suppose in classification you get what you're looking for, and if you're looking to apply a single criterion (at a particular level) then anything else will look wrong.
    Possibly for botany your way just makes more sense.
    What's good, though, is it's made me think about cultural evolution: http://books.google.co.uk/books?id=BcuwuTEgUHIC&pg=PA105&lpg=PA105&dq=paraphyly+cladistics&source=bl&ots=NU9HhxgiW_&sig=HLSC7h5tfzzccmoeG8rP0qJjwNE&hl=en&sa=X&ei=s8Z9VJtN5rLtBsnfgMAN&ved=0CDMQ6AEwAzgK#v=onepage&q=paraphyly%20cladistics&f=false
    Thanks, Pablo

  4. The problem with analogies is that they are rarely perfect. Few things are organised in such a clear tree-like and hierarchical structure as that produced by common descent.

    In the present case it occurs to me that the real counterpart to the biological classification is not the library anyway but the Dewey Decimal System (or similar) that it uses to classify its books. While the library may have no choice but to use a different shelf for oversize books, they would still have the expected call number in that system.

  5. Yes, I thought you were thinking of Dewey Decimal (although to confuse things further, our library's running 2 systems in parallel during a changeover from one classification system to another - forgot to mention that above!)

    Looking at Wikipedia, "Cladistics in disciplines beyond biology" and "Morphological phylogenetics": this basically illustrates that the further you get from the molecular level, the more likely you will get competing abstractions and fuzzy or arbitrary classification. There need to be strong emergent properties before you can use the same methods at a higher level.

    So for novels to be analysed using "memes" in a scientific (or serious literary) way, would need memes to exist in a definable, quantifiable form. And a big computer.

    Thanks again, really interesting.

    I've tweeted a link to our discussion here: