Monday, December 21, 2015

Aubert's analysis of phylogenetic terminology, part 1: abstract

As always the following is my personal opinion and does not necessarily represent the opinion of any of my friends and relations, of my employer or colleagues, or of anybody whosoever except me.

I have recently set an alert with the key word evolutionary systematics (or similar), and today it delivered the first noteworthy catch: Damien Aubert, A formal analysis of phylogenetic terminology: Towards a reconsideration of the current paradigm in systematics. Published in Phytoneuron, an online journal that has peer review "if deemed appropriate or necessary by the editor, or if requested by the author", this piece of criticism of phylogenetic systematics runs over an astonishing 54 pages. It will certainly have to be tackled in homoeopathic doses.

So for the moment, let's just have a look at the abstract and table of contents.

BACKGROUND: For too many years the practice of systematics has been impeded by profound disagreements about the very foundations of this discipline, that is to say the type of information that should or should not be incorporated into a proper classification of life.
Note here that Aubert writes "information ... incorporated", not "information content". There is no doubt that phylogenetic and 'evolutionary' systematists build their classifications taking different pieces of information into account. The problem is that this is then often transformed into the claim that we can get more information out of 'evolutionary' classifications than out of phylogenetic ones (see especially various publications by Elvira Hörandl).

I, on the other hand, consider 'evolutionary' classifications to be useless because the end user would have to check for each individual taxon with what kind of rationale it was circumscribed. (Similarity or relatedness? And if the former, similarity in what traits?) In contrast, a system incorporating only one type of information is truly informative: in a phylogenetic classification, all the members of one group are each other's closest relatives, and that is that. I am looking forward to seeing how that issue is handled in the present contribution.
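To make "each other's closest relatives" concrete, here is a minimal sketch of a monophyly check in Hennig's sense: a group qualifies only if it contains every terminal descendant of its most recent common ancestor. The toy tree, the taxon labels, and all function names are my own illustrative assumptions; none of this is taken from Aubert's paper, and the check is only meant for supraspecific groups on a strictly branching tree.

```python
# Toy rooted tree as a child -> parent map. The topology is a simplified,
# illustrative assumption and is not taken from Aubert's paper.
PARENT = {
    "mammals": "amniotes",
    "sauropsids": "amniotes",
    "lizards": "sauropsids",
    "archosaurs": "sauropsids",
    "crocodiles": "archosaurs",
    "birds": "archosaurs",
}


def ancestors(node):
    """Return the path from node up to the root, node included."""
    path = [node]
    while node in PARENT:
        node = PARENT[node]
        path.append(node)
    return path


def mrca(group):
    """Most recent common ancestor of all members of group."""
    members = list(group)
    common = set(ancestors(members[0]))
    for member in members[1:]:
        common &= set(ancestors(member))
    # Walking upwards from any member, the first shared node is the most recent one.
    for node in ancestors(members[0]):
        if node in common:
            return node


def leaves_under(node):
    """All terminal taxa descended from node (node itself if it is a leaf)."""
    children = [child for child, parent in PARENT.items() if parent == node]
    if not children:
        return {node}
    result = set()
    for child in children:
        result |= leaves_under(child)
    return result


def is_monophyletic(group):
    """True if the group contains every terminal descendant of its MRCA."""
    return set(group) == leaves_under(mrca(group))


print(is_monophyletic({"crocodiles", "birds"}))    # complete clade
print(is_monophyletic({"lizards", "crocodiles"}))  # 'reptiles' without birds
```

On this toy tree the first group passes and the second does not, because leaving out birds excludes a descendant of the group's common ancestor. The point is that the answer follows from a single criterion, so nobody has to ask what rationale was used to circumscribe each group.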
Two main schools of systematics, both recognizing evolution, oppose each other: cladism states that the classification should only reflect the branching order of the lineages on the tree of life whereas evolutionism states that the length of the branches, that is the degree of modification, should also be taken into account so as to reflect macroevolutionary leaps. The first one forbids the exclusion of any descendant from a group that contain its ancestors, while the second one explicitly requires that the descendants too much different [sic] from their ancestors must be classified separately.
This is the first time I have ever seen 'evolutionism' used in this sense; it sounds more like an insult made up by creationists. I will continue to scare quote this and related terms because the classification that best reflects evolution is the phylogenetic (= cladist) one.

I am also looking forward to seeing how the present paper handles the problem that the morphological gaps between an ancestor and a descendant are merely the result of our ignorance of the intermediates that must have existed (and might still turn up as newly found species or fossils), given that evolution is the gradual change of allele frequencies in populations. The use of the term "leap" here in the paper abstract might imply that the author is a saltationist, i.e. perhaps he believes that evolution is not gradual but proceeds in rare, big jumps on the lines of a cow suddenly giving birth to a whale. If that were the case, then at least one of the many problems for 'evolutionary' classifications would be solved, but unfortunately saltationism appears to be discredited.
Moreover, both schools often use the same words, such as "monophyly," to designate different ideas. This prevents proper communication between the proponents of either side.
This is the first part of the paper that starts to annoy me. As far as I can tell, the situation is this: Thousands of taxonomists, systematists and phylogeneticists working on all groups of organisms across the planet use terminology in the sense of Hennig in papers, in conference presentations, in textbooks, and in lectures. About, I guess, five to ten people in botany, perhaps a few more in zoology, who promote the acceptance of non-monophyletic supraspecific taxa and thus have a vested interest in blurring the distinction between monophyly and non-monophyly, publish the odd paper here or there in which they use terminology in the sense of Haeckel or Ashlock, or sometimes they invent their own definitions Humpty Dumpty style. And then the same five to ten people turn around, as in this case, and say that everything is now so confusing and complicated, and the other thousands of systematists had better accept their fringe view to make things clear again.

So it seems a bit like that story of the boy who murdered his own parents and then pleaded for leniency on the basis of being an orphan: it is an entirely self-created problem.
Consequently, the research in phylogenetics is globally erratic and the taxonomic classification is highly unstable.
Classifications will remain unstable as long as new data come in, or they are not scientific. That is how science works: we get new insights, we change the conclusions.
RESULTS: I rigorously define the terms which designate the phyletic relationships and explore their properties through use of graph theory. I criticize a similar work (Kwok 2011) that was unable to properly catch these notions.
When I read words like these, I always wonder: Does the author in question really think they alone understand what is going on in a way that nobody else did when these matters were discussed over and over again in the 1960s, in the 1970s, in the 1980s, in the 1990s, in the 2000s, and just again in the last few years?

Well, we shall see what the rest of the paper delivers, but it seems rather... odd ... to believe that all the world's experts would have missed some critical detail that will suddenly bring phylogenetic systematics crashing down, especially one that can be discovered by drawing a bunch of diamonds connected with arrows. (I peeked ahead, that is what the figures in this paper look like.)
This leads me to provide three independent arguments -- one historical, one utilitarian, and one morphosemantic -- in order to retain the original Haeckelian meaning of the term "monophyly" rather than the redefined Hennigian one.
No surprise. Here is a utilitarian and historical counter-argument: >99.9% of all systematists use the words in Hennig's sense, and changing it now would cause precisely the confusion that currently only exists in the imagination of a handful of die-hard opponents of phylogenetic systematics.
I identify some polysemy regarding the term "clade," and that is why I define two new words, "holoclady" and "heteroclady," to contrast respectively with "holophyly" and "heterophyly."
Podani and later Vanderlaan et al. have in recent years undertaken similar efforts especially with regard to the distinction between asynchronous and synchronous classifications. We shall see if this is really new or the same. At any rate, I have my doubts that such distinctions are really necessary in practice.
I also show that a strictly holocladic or holophyletic classification advocated by cladists is formally impossible.
This is another aspect that I am looking forward to. In the past, 'evolutionary' systematists have taken three main approaches to demonstrate that empirical reality forces us to accept paraphyletic taxa: (1) claiming that there is too much hybridisation or introgression, so that we don't really have monophyletic groups anyway, (2) claiming that taxa are just paraphyletic, so there, and (3) Brummitt's observation that there is a tension between phylogenetic systematics and Linnaean ranks when classifying ancestors. The first is self-defeating because if there is no phylogenetic structure then there is no paraphyly either, the second argument is clearly circular, and the third ... depends.

Brummitt's argument is correct if we have positively identified ancestral species, if we want to treat them as ancestral as opposed to side branches, if we want to build one classification of all life that has ever existed as opposed to one per time-slice, and if we insist on using Linnaean ranks, all at the same time. That is a lot of "ifs", and it could be added that the paraphylist ('evolutionary') approach causes even more problems when classifying ancestral species, because the ancestors break the morphological divergence between the extant species down into smooth gradients. Just think of all the feathered dinosaurs, and what that does to the idea that birds should be classified as separate from dinosaurs.
I therefore review and criticize the philosophical postulates subtending such an illogical paradigm. I show that cladism is part of a more general philosophical movement named structuralism, which is mainly characterized by anti-realism and a metaphysical way of thinking.
The charge of cladism being equivalent to structuralism has previously been advanced by Richard Zander. It is fascinating how these colleagues claim to know better than cladists what cladists believe. Why, I am a cladist and I have no idea what structuralism even is! (At least to me, Zander's explanation was about as clear as Minoan Linear A script, but maybe that is a failing on my part.)

And to the degree that the above quotation allows any inferences to be made, I would suspect that I am not a structuralist. In fact I am wondering whether this might be a case of projection; in my experience it is usually the pro-paraphyly people who have extremely formulaic views of how things should be while rarely grounding their thoughts in a clear mental model of how alleles evolve in meta-populations, how sexually reproducing individuals are related to other such individuals, how species are related to other species, and how evolution actually happens in real life.

Not all of them of course, but the ones who do try to visualise what is going on then have an unfortunate tendency to conflate gene trees with species trees, and tokogeny with phylogeny.
I identify the biologically unrealistic assumptions on which cladism is based and argue that they have been empirically falsified.
This has also been tried in the past, but unfortunately it always relied on a complete misunderstanding of the assumptions of cladism. If this has anything to do with "paraphyletic species" I am going to be very disappointed, because that is like telling a mathematician that they are doing circles wrong when they are drawing them without corners. Circles are defined as not having corners, and similarly all the words ending in -phyletic and -phyly are defined in a way that they cannot possibly apply to a group of sexually reproducing individuals forming a single meta-population species.
I therefore defend the use of paraphyletic groups in the scientific classification of life and review the main arguments that have been opposed to this solution. Some of them, such as anthropocentrism or the lack of an objective manner to determine paraphyletic groups, are grossly outdated, while others simply rest upon the difficulty in conceptualizing emergent phenomena.
This is then what interests me most about this paper, because it sounds as if it will rebut some arguments that I published myself, and indeed the reference list features three of my own papers. At the moment I don't get the relevance of emergence - it is generally the pro-paraphylists who have great difficulty understanding that different approaches to classification emerge above the species-lineage level than below, but again, let's see.
CONCLUSION: Since clades are still useful for methodological reasons,
Clades are "still useful"? What is next, the insight that the concept of the population is still useful to population genetics?
I offer a compromise that should make possible the coexistence of the two main opposing schools of systematics by eliminating competition between clades and taxa for the same names. I propose therefore that in a future revision, the BioCode should approve a dual system by recognizing both a "phyletic arrangement" made of clades and a "phylogenetic classification" made of taxa.
I haven't read through this paper yet, but somehow I doubt that it will convince the worldwide community of taxonomists, phylogeneticists, and systematists to suddenly drop the monophyly requirement after, and that cannot be stressed enough, this discussion has already been had thoroughly in the early 1970s.

Also, note that the term "phylogenetic classification" is here used for one that contains non-monophyletic taxa, in other words for a non-phylogenetic classification. Can we perhaps go easy on redefining words that already have a clear and widely accepted meaning? That would be nice.
"We often discussed his notions on objective reality. I recall that during one walk Einstein suddenly stopped, turned to me and asked whether I really believed that the moon exists only when I look at it." (Pais 1979)
Yes, that is really between abstract and table of contents. Here, have a random quote that mentions Einstein. The background is presumably the aforementioned bizarre idea that cladists are "anti-realists".

Finally, from the table of contents:
10.2. Paraphyletic Species .... 34

Well, I know one item I will be reading on and off over the Christmas and New Year break.


  1. “Can we perhaps go easy on redefining words that already have a clear and widely accepted meaning? That would be nice.”

    You mean like redefining “fish” to include tetrapods or redefining “dinosaurs” to include birds? Sure, that would be great. Skipping ahead to the conclusion, Aubert seems to be proposing parallel systems: a paraphyletic Osteichthyes* alongside clade euteleostomi. Speaking as a non-specialist, I have no problem with morphologically heterogeneous clades but I would appreciate having familiar words refer to relatively stable and easily recognized groups of species.

    Of course, we have parallel systems now: the informal “bryophytes” alongside the formal Embryophyta. However, there’s a lot of confusion about whether it’s acceptable to talk about the primitive features of basal angiosperms; and while one can get used to “non-avian dinosaurs,” a term like “non-tetrapodan euteleostomes” would be too cumbersome to use.

    Under all the misrepresentation and semantic confusion, Brummitt had a straightforward argument about the problems of communication that come about when trying to combine Linnean ranks with cladistic thinking. If I hypothesize that the aquatic group of white-flowered Ranunculus had a yellow-flowered terrestrial ancestor, it is easy to express that ancestor-descendant relationship by talking about an unranked aquatic clade A within Ranunculus. It is also easy to express this idea with a paraphyletic terrestrial Ranunculus and a monophyletic aquatic genus A, although this comes at the price of reduced clarity in discussing the diversification of the terrestrial Ranunculus. But unless the surviving terrestrial species form a clade, it becomes cumbersome to express this idea using Linnean ranks without excessive splitting at the subgenus level.

    1. Are you seriously implying that systematists should refrain from doing the job they are paid for? Never change the circumscription of taxa because then they are not the way people traditionally understood them? Do you also extend that imperative to other areas of science, e.g. physicists should not have updated our understanding of gravity beyond "stuff falls towards earth because solid things want to be together with other solid things", because relativity is "unfamiliar"?

      And yes, we have parallel systems alright. I dare you to show me one cladist who will tell people off for using a polyphyletic group such as "herbivores". Cladists are perfectly happy with having parallel systems, whereas most 'evolutionary' systematists want to destroy the one system that is developing towards being phylogenetic.

      The job of a systematist has always been to figure out what characters are really informative of relatedness, ironically even before common descent was discovered. Cladism is just doing that consistently where its opponents go "but not in this case, because that is not how I learned this group to be circumscribed when I was in school".

      Brummitt's argument was not so much about communication but about logical incompatibility. At a minimum, it is not obvious that we should continue to use the Linnean system, which was invented in ignorance of common descent, if it turns out to be a poor fit to reality as now understood.

      Finally, yes, you can write out a description of the relationships between non-monophyletic sections in Ranunculus. But the point of a system is that you shouldn't have to. Somebody should be able to look at "section Whateveres" and be able to extract some information from the fact that the species in question are grouped into that section. Otherwise, why have a classification? What is its purpose?

      In the case of a phylogenetic classification, that information is that they are each other's closest relatives. In the case of an ecological classification, "rheophytes" tells us something about their preferred habitat, for example. In the case of a mixed classification with mono- and paraphyletic taxa, well, we don't know. "Section Whateveres" will then be uninformative if not misleading, if some end user assumes that they are dealing with a natural classification.

    2. Sorry, I was chuckling at the seeming irony of the debate and gave entirely the wrong impression.

      I do not mean to imply that traditional usage should take priority over new findings by systematists. However, if there’s a way to communicate relatedness without changing the circumscription of an easily recognized but paraphyletic taxon, it’s worth considering.

      As I understood it, Aubert’s proposal does that with case and punctuation conventions. Ranked taxa are capitalized. Paraphyletic taxa are allowed but must be marked with an asterisk so readers can recognize them. Unranked clades are designated in lowercase, and cannot be synonymous with a ranked taxon. That way, the reader could distinguish the paraphyletic Osteichthyes* from the more inclusive euteleostomi.

      I actually think an unranked system is a better way to deal with the incompatibility between Linnean and phylogenetic hierarchies. However, if people insist on retaining Linnean ranks and supposing that they imply anything at all about the degree of morphological disparity (at least when comparing extant mammal families to extant mammal families and extant angiosperm genera to extant angiosperm genera), a system that can handle paraphyletic taxa has some merit.

      First of all, whenever you have a situation like an aquatic clade of buttercups nested within an extant terrestrial clade, accepting paraphyletic taxa makes it possible to rank the derived clade without splitting out less distinctive plesiomorphic clades at the same rank. Permitting the plesiomorphic clades to be unassigned at that rank would accomplish the same thing. To the very limited extent that there ever was a “genus concept,” that concept and the practicality of the classification system in general are undercut by mandatory, monophyletic ranks.

      Second, describing an extinct species in a terrestrial genus giving rise to an aquatic genus is a more intuitive way to think about the most likely evolutionary scenario than describing an unranked terrestrial ancestor diversifying into a nearly identical terrestrial genus and a distinctive aquatic genus. Not all paleontologists assign fossils to families and orders, but all paleontologists do assign fossils to genera. Some of these genera are exclusive and monophyletic only in the legalistic sense of “we can’t be sure that it was a member of the same population as the ancestor of another genus.” Brummitt seemed to be more concerned with being able to talk sensibly about the evolution of genera and families than he was with actually giving a Linnean binomial to the undiscovered species at the nodes of a cladogram, although I can’t say the same of Zander or Aubert.

    3. Sorry then, I misinterpreted your comment.

  2. Hi Alexander. I can't understand the point of criticizing... an abstract! However, I have a few superficial remarks to make. 1) You may be a structuralist without knowing it, there is no paradox here (many people are made of atoms without knowing what an atom is). Moreover, this is not particularly a charge against cladism, simply a fact. 2) Don't confuse the adjectives anti-realist (=nominalist) and unrealistic; however, it is not surprising to me that an anti-realist ideology easily leads to unrealistic theories. 3) Daniel Haug was right in his comments, he got the spirit of the dual system of classification I advocate, I think. 4) You may be disappointed in some of your expectations because this paper is only a framework for my future works (new algorithms). 5) Even if I don't cite him (my mistake), consider that I subscribe to the rebuttal of your paper against paraphyletic taxa written by Holynski. 6) It is not "Ashcroft" but Ashlock. 7) Your claim that 99.99% of systematists are cladists is an exaggeration. Many systematists just don't care. I was myself a cladist since I was only taught cladism (ad nauseam) at school and at university. Excuse my poor English, but if you have questions about my paper I am open to help you to make a more serious rebuttal by avoiding straw man arguments.

    Damien Aubert

    1. 0) Unfortunately, I find your paper extremely hard to read through in one go, both because of its length and, I am sorry to say, its style of writing, so it works better for me to do it bit by bit. You will also find that the above is mostly a series of "this is Aubert's next claim, let's see what the paper provides", with few direct criticisms as of yet.

      1)-2) Yeah, and perhaps 'evolutionary' systematists are all merely motivated by an irrational attachment to the circumscription of taxa they learned in school, right?

      3)-4) Great. I have no idea how "algorithms" enter into the question whether non-monophyletic taxa make sense though.

      5) Ye gods, really?

      6) Oops, thanks for pointing that out. Will correct it later.

      7) I claim that the vast, vast majority of us use *terminology* in Hennig's sense. I am fully aware that many don't care, and many others have been trained too superficially to fully understand the principles of phylogenetic systematics, either leaving them open to nonsensical arguments like "but most species are paraphyletic, look at this ITS tree" or else leading them to say nonsense like "this species needs to be recircumscribed because it is paraphyletic on the ITS tree". That was true of myself until c. 2011, when I started looking into the issue more deeply than at the level of "stuff needs to be monophyletic", so it is not as if I don't understand.

    2. Perhaps many evolutionary systematists are attached to taxa they learned in school... However you will have to prove this claim. And I am myself a counter-example since I learned only clades at school.

      Your belief that paraphyly or polyphyly don't apply to the species level or below is nonsensical to me, although I read all your posts about this. It is just a terminology trick to avoid the debate I would say...

      Sorry for the length and style, but I don't think I repeat myself, and I see no point in dividing a paper into 10 pieces. I would appreciate it if you could clearly explain what is wrong with my style so that I can correct it for my future papers.

      Kind regards,

      Damien Aubert

    3. No, I don't have to prove it because I don't actually believe that, except perhaps in some cases. Apparently my attempt at demonstrating how annoying it is to be told one believes something that one openly rejects was too subtle.

      Wait a second. You write a paper where you spend 30 pages (re)defining everything in exactly the way you need it to be to conclude "I win", at a completely abstract mathematical level, without any consideration of biological reality or actual phylogenetic studies, but you think that *I* am using a terminology trick?

      The whole point of Hennig's work was to argue that different relationships hold between branches of a tree than between knots in a web, and all the words ending in -phyly are defined in ways that make them inapplicable to the latter.

      Of course you are free to redefine them at your leisure, but that does not mean that your definitions are helpful.

    4. Sorry, I just can't catch sarcasm. By the way, it can easily be claimed that Hennig redefined these words in order to say "I win" at the end. However, I argue in my paper that what happened is more subtle than that. It is not productive for either side to think that the other one is not honest or is just stupid and doesn't understand what we explain.

    5. Yes, that is kind of what I meant. However, Hennig did not just define a few terms. He also provided actual arguments for why it does not make sense to give formal recognition to incomplete clades, why such a classification does not reflect evolution, is uninformative and misleading, and makes it impossible to come up with a testable and universal criterion for group delimitation.

      That is what the discussion is about, that is what led to him "winning", and Elvira Hörandl for example is addressing these issues head-on in her opinion pieces (although as can be seen above I come to the opposite conclusion on information content).

      On the other hand, inventing new words or discussing the properties of beautifully abstract networks while conveniently ignoring the biological differences between groups of metapopulation lineages on the one hand and groups of individual sexually reproducing organisms on the other can probably not be expected to convince many people.