Saturday, January 2, 2016

Aubert's analysis of phylogenetic terminology, part 3: Different kinds of relationships

Continuing the discussion of this paper from here and here, and working through the main claims of the paper as I see them:
  • The various definitions provided in the paper are in some way better than the ones that are currently accepted.
  • There is no relevant difference between the systematics-relevant relationships and structures existing at any level of the diversity of life. (E.g. mother > daughter is completely equivalent to bony fish > land animals - they can all be drawn as diamonds and arrows, right?)
  • A strictly phylogenetic classification is formally impossible.
  • Cladism is part of structuralism and therefore characterised by "anti-realism and a metaphysical way of thinking".
  • Cladism is built on biologically unrealistic assumptions that have been empirically falsified.
  • There exists an objective approach to delimiting paraphyletic groups.
  • It would be preferable to have two parallel classifications, one of clades and one that includes taxa that are allowed to be non-monophyletic.
One of the claims made repeatedly in Damien Aubert's paper (and in a recent comment here) is that the same definitions and approaches apply to all levels of biological diversity. In this post, I would like to take the opportunity to discuss in detail why I disagree. I do believe that the relationships between individuals of the same biological species are not equivalent to the relationships between species or groups of species. Importantly, the same stance underlies phylogenetic systematics, AKA cladism.

I am not going to discuss in any depth gene trees, alleles, and coalescent theory here, because at least in theory we are all agreed that systematists don't and shouldn't classify genes but organisms. In practice, many paraphylists unfortunately forget that and point towards incomplete lineage sorting to claim that "many species are paraphyletic". What really happens quite often, especially if effective population sizes are large and time between lineage splits is short, is that, and the phrasing is really crucial here, the gene copies of a certain gene found within a species are paraphyletic to the gene copies found in another species.

In the lineage split figure further down, for example, blue diamonds indicate individuals with two copies of the allele that ultimately becomes fixed and thus an autapomorphy of the right hand daughter lineage, while purple diamonds indicate heterozygous individuals. Accordingly, as late as the fifth generation after the split the gene copies in the right hand species are still paraphyletic to the gene copies in the left hand species, and in realistically sized populations it could be many, many more generations later.

Although we may use the shorthand "paraphyletic species" in everyday conversations or even the odd ill-conceived paper title that we are now embarrassed by, this is not really a good way of putting it. If we wanted to claim that a species is paraphyletic we would have to talk about the individuals of that species forming a paraphyletic assemblage, not gene copies.

I would argue that by the very definition of the terms ending in -phyletic they do not apply in a net-structure but only in a tree-structure (such as a gene tree - incomplete lineage sorting again), and thus the individuals in a sexually reproducing species cannot form a paraphyletic assemblage, see below. If several populations of one species are really genetically so cleanly separated that they do form a grade leading up to another species then they should each be recognised as separate biological entities, according to all but the most pre-evolutionary species concepts.

Even if one were to twist the definition and force it to apply to relationships the word was not created for, it should still in practice be pretty much impossible to have such a situation. Precisely because the individuals of a biological species interbreed pretty freely they will not form a grade.

With this out of the way, let us look at three levels of biological diversity that do matter to systematics and classification: individuals, species, and groups of species.

Relationships between individuals

Individuals in sexually reproducing species generally need a partner to produce offspring, thus everybody has two parents:

If we wanted to characterise these relationships, we would have to say something like this:
  • The items we are concerned with are individual organisms.
  • Because of sexual reproduction, each individual contributes 50% of the genetic information found in each of their immediate descendants, and ever less to descendants further downstream.
  • Also because of sexual reproduction, genes are constantly being mixed between individuals, from generation to generation. This has the consequence that we cannot infer common descent from individual traits; I, for example, could theoretically have inherited 0% of the genes of my maternal grandfather, although that is statistically unlikely. Over a few generations, however, it becomes pretty much unavoidable that one of your descendants will not have inherited any genes from you.
  • Also because of sexual reproduction, there is no tree-structure between individuals of the same species, instead there is a net-structure. Again, two direct ancestors to each descendant.
  • Individuals are, seen in evolutionary time-scales, extremely transient. They make new individuals with recombined genes, then they die.
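The halving described in the list above can be made concrete with a little arithmetic. What follows is only a minimal sketch, assuming idealised Mendelian inheritance in a diploid, outcrossing species without inbreeding; the function names are made up for illustration.

```python
from fractions import Fraction


def expected_genetic_share(generations: int) -> Fraction:
    """Expected fraction of a descendant's genome that traces back to one
    particular ancestor `generations` steps up the pedigree.

    Assumes each parent contributes exactly half of the genome and that the
    ancestor occupies only one position in the pedigree (no inbreeding).
    """
    if generations < 0:
        raise ValueError("generations must be non-negative")
    return Fraction(1, 2) ** generations


def pedigree_slots(generations: int) -> int:
    """Number of ancestor positions in the pedigree at that depth."""
    if generations < 0:
        raise ValueError("generations must be non-negative")
    return 2 ** generations


if __name__ == "__main__":
    # Print depth, expected share, and number of ancestor positions.
    for depth in (1, 2, 5, 10):
        print(depth, expected_genetic_share(depth), pedigree_slots(depth))
```

These are expectations only. Because chromosomes are passed on in chunks rather than as an even blend, the realised share scatters around the expectation, which is why the contribution of any one distant ancestor can drop to nothing.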
There are exceptions, as so often in biology. In some bizarre cases individuals do not necessarily contribute 50% of the genes to the next generation, as in the pentaploid dog-roses where the female partner contributes 80%, although one could say they do if we average across male and female roles. Males of many species of hymenopterans do not have a father, although they do have a grandfather. Some plants can self-pollinate, in which case the same plant may be mother and father, although in the long run these are probably evolutionary dead ends. None of this fundamentally changes what is going on.

Aubert himself draws attention to all the asexually reproducing lineages out there, but that is hardly relevant for the promotion of paraphyletic taxa because it merely means that in those cases we can and should apply phylogenetic systematics down to the level of the individual instead of stopping at the species level. In fact most commonly used species concepts do not even apply to asexually reproducing lineages in the first place.

Relationships between species

At the level of species, our diamonds and arrows graph should look a bit like the following. In fact most publications dealing with these issues have a nearly identical graph, so in theory there should be widespread agreement about what is going on. In practice, however, the interpretations differ:

To characterise relationships at this level, we would have to say something like this:
  • The items we are concerned with are species, which are groups of freely interbreeding individuals and as such an emergent, higher order structure in nature than the individuals themselves.
  • Because new species mostly arise from lineage splits, each ancestral species usually contributes 100% of the genetic information found in the descendant species, and likewise 100% of the genetic information in every one of its descendants down the line, no matter how deep into the future.
  • Also because of lineage splitting, genes are not being mixed to any significant degree between species. This is why we can use synapomorphies to infer common descent: Once an ancestral species has fixed a trait across its populations, all descendant species should inherit this trait, at least at first and excepting secondary losses later, and only its descendants should inherit it, excepting very simple traits that can easily be gained in parallel (e.g. red flower colour).
  • Also because of lineage splitting, there is a tree-structure between species, instead of a net-structure; one immediate ancestor to every descendant.
  • Species are potentially immortal. They do not make new species then die; instead, one lineage splits into two lineages, one species turns into two species. The same goes for the process of anagenesis that paraphylists are unaccountably obsessed with: A species does not make a new species and then die of old age; instead, a species gradually turns into another, and is in a very real sense still the same species-lineage.
Unless somebody wants to doubt any of the above, it should be clear that the biological reality on the ground, as it were, is extremely different from that seen between parents and children.
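The structural side of that contrast, net versus tree, can be written down as a simple property of a "who descends from whom" map. This is just a sketch with two hypothetical toy data sets; the names and the helper function are illustrative assumptions, not anything taken from the paper under discussion.

```python
from typing import Dict, List


def is_tree_like(parents: Dict[str, List[str]]) -> bool:
    """Return True if no item has more than one immediate ancestor."""
    return all(len(ancestors) <= 1 for ancestors in parents.values())


# Hypothetical pedigree: every individual has two parents, so lines of
# descent merge again and again (a net).
pedigree = {
    "child": ["mother", "father"],
    "mother": ["maternal_grandmother", "maternal_grandfather"],
    "father": ["paternal_grandmother", "paternal_grandfather"],
}

# Hypothetical lineage split: every species has a single immediate
# ancestor, so lines of descent only ever diverge (a tree).
phylogeny = {
    "descendant_species_1": ["ancestral_species"],
    "descendant_species_2": ["ancestral_species"],
    "ancestral_species": [],
}

if __name__ == "__main__":
    print("pedigree is tree-like:", is_tree_like(pedigree))
    print("phylogeny is tree-like:", is_tree_like(phylogeny))
```

A hybrid species would show up in the second map as an entry with two ancestors, which is exactly the kind of exception discussed next.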

Again there are exceptions to the above, at least as far as the lack of reticulations is concerned. When paraphylists attempt to muddle the distinction between the tree of life and the network-structure that exists within sexually reproducing species, they tend to speak vaguely of "hybridisation". Technically, a hybrid is a cross between two species, and usually it is a dead end, because it is often sterile (e.g. mules or peppermint) or otherwise unfit. Sometimes, however, it can back-cross and thus lead to introgression between the two parental species. This is where, and why, there is a grey zone between distinct species and not yet quite distinct species that can sometimes still exchange genes.

Many people think a bit black-and-white and find it difficult to believe that there can be two categories even if there is a bit of a fuzzy area between them. In this case, Aubert concludes that he can simply treat net-relationships within species and tree-relationships between species in the same abstract, diamond-and-arrow-graph fashion and apply the same terms and group shapes because there is a gradient between lots of reticulation within species and none between distantly related species:
But it is clear that in reality there exists a continuum between these two situations, there [sic] are not of different nature.
At least in my eyes, this is equivalent to claiming that there is no relevant difference between toddlers and adults, between orange and blue, or between a cloud of interstellar gas and a star. In each of those cases there is also a continuum, but the two categories are still different in ways that matter. What is more, at least in the case of biological species and stars the fuzzy area is rather small compared to the 'pure' stages. Yes, you can cross some species in Primula section Primula, but you will find it hard to cross them with other sections, let alone with elephants.

I have met at least one cladist who fell into the same trap and has likewise concluded that it doesn't make a difference whether we are talking individuals making up a species or species making up a clade. Of course, he then tries to apply monophyly to the infraspecific level, whereas most paraphylists who make this mistake believe they have found an argument to defeat cladism at the supraspecific level. The point is, black-and-white-thinking is popular across different schools of thought.

Interestingly, Aubert himself warns against conflating rare gene flow with systematics-relevant lines of descent (p. 29), so perhaps we can put the issue of introgression aside anyway. But in addition to back-crossing, hybrids may sometimes be fertile enough to form a new, hybridogenic species, or they may become allopolyploid, an event that restores their fertility.

I suspect that this is what paraphylists usually mean when they use the word "hybridisation", and at first sight it looks like a greater challenge to phylogenetic systematics than introgression. Still, hybrid speciation likewise only happens between closely related species, so once we 'zoom out' a bit we will once more see a purely tree-shaped pattern of descent, perhaps with groups of very closely related species forming the tree branches.

In this context we should also remember that the number of hybridisation events and all that follows depends strongly on what approach we take to species circumscription. An extreme splitter will see lots of hybridisation events. Conversely, an extreme lumper will see hardly any, because they would just argue that everything that introgresses should be considered the same species anyway. Again, that would just be a more 'zoomed out' view of the tree of life.

Relationships between groups of species

Coming now to the relationships between entire groups of species, between higher level taxa, we first have to note that from a cladist perspective the situation is the same as for relationships between species: Valid groups (clades) are descended from a single ancestral species, which was itself the product of a lineage split, and they include all the descendant species of that ancestral species. Consequently, there is a tree-structure between these groups, and they do not generally mix genes.

Note, however, that whereas species are ancestral to species, clades are not ancestral to clades. The only entity that is ancestral at species level or higher is a species. A clade cannot be ancestral because it always contains all the descendants from its point of origin. (Perhaps it would be even clearer to say that the clade is all those species.)
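To make the "all the descendants from its point of origin" criterion explicit, here is a minimal sketch of a clade test on a toy phylogeny. The tree, the lineage names and both helper functions are hypothetical, and the sketch only looks at the tips of the tree, not at internal ancestors.

```python
from typing import Dict, List, Set

# Hypothetical toy phylogeny, written as ancestor -> immediate descendants.
TREE: Dict[str, List[str]] = {
    "root": ["lineage_a", "node_1"],
    "node_1": ["lineage_b", "node_2"],
    "node_2": ["lineage_c", "lineage_d"],
}


def tips_below(tree: Dict[str, List[str]], node: str) -> Set[str]:
    """All terminal lineages descended from `node` (or `node` itself if it is a tip)."""
    children = tree.get(node, [])
    if not children:
        return {node}
    tips: Set[str] = set()
    for child in children:
        tips |= tips_below(tree, child)
    return tips


def is_clade(tree: Dict[str, List[str]], root: str, group: Set[str]) -> bool:
    """True if `group` is exactly the set of tips below some node of the tree."""
    stack = [root]
    while stack:
        node = stack.pop()
        if tips_below(tree, node) == group:
            return True
        stack.extend(tree.get(node, []))
    return False


if __name__ == "__main__":
    print(is_clade(TREE, "root", {"lineage_c", "lineage_d"}))  # complete
    print(is_clade(TREE, "root", {"lineage_a", "lineage_b"}))  # a grade
```

In this toy tree the group of lineage_a and lineage_b fails the test because their common point of origin also gave rise to lineage_c and lineage_d.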

Paraphylists have a different perspective: In their eyes, an incomplete clade can be a valid group, and as a group it can be ancestral to another such group. In other words, paraphylists want to recognise groups of several species as collectively ancestral to other groups.

Does that make sense? For example, paraphylists would claim that the whole paraphyletic 'group' of bony fish is ancestral to the tetrapod land animals. But are, say, New Caledonian Thorny Seahorses actually ancestral to us? Of course not; that species only originated hundreds of millions of years after the tetrapods originated, so how can it be an ancestor of tetrapods? Put like that, it should immediately become clear that the whole idea is absurd, just like the creationist canard "you believe we are descended from chimpanzees".

Of course, one might attempt to argue that I am not being consistent here. In speciation, I accept a species and thus a group of individuals as collectively ancestral to another species although only some of its individuals really are the immediate ancestors of any given descendant species. So why not accept a paraphyletic group as ancestral although only some of its species are immediately ancestral to a 'descendant' group?

There are two somewhat related reasons. The obvious one is that I do not consider paraphyletic supraspecific taxa to be meaningful groups, so the question wouldn't even arise in the first place. Now this might be seen as begging the question. It isn't, because there are independent reasons to consider only clades as meaningful groups, but I will now lean over backwards and argue that even if we leave open the question whether paraphyletic 'groups' are acceptable, ancestral species are still a different story.

Because the individuals in a species are all happily interbreeding, it turns out that a large proportion of the individuals that have existed over time in the ancestral species are actually ancestral to that descendant species. In fact it would be the vast majority of those that managed to reproduce to the second generation or so; once your lineage survives for a bit, sexual reproduction has a good chance of snowballing your ancestry across the entire population. That holds given enough time, which is available because speciation events are rare, and again noting that few of your genes may be left in each of your individual descendants 10,000 years later.

Just look at my figure with the lineage split above, and consider how many of the red diamonds in the lower half are ancestral to the blue lineage. Because there are no forever-separated side branches but a constantly reticulating net, the descendant species really does have a gazillion ancestors in the ancestral species even at the level of individual organisms.

In fact because of the reticulation and constant genetic admixture within the ancestral species, it would be quite impossible to separate out who is and who isn't ancestral except in the last few generations immediately before the split; thus the 'immediate' ancestors in my earlier sentence. Admittedly my figure has only seven individuals per species, but then again that is evened out as it also has got unrealistically few generations before the split.
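The snowballing of ancestry can be explored with a small simulation. This is a rough sketch under strong simplifying assumptions that are mine and not part of the figure: a constant population size, non-overlapping generations, no separate sexes, and every individual drawing two distinct parents at random from the previous generation.

```python
import random
from typing import List, Set


def ancestral_fraction(pop_size: int, generations: int, seed: int = 1) -> float:
    """Fraction of the founding generation that is ancestral to at least one
    individual in the final generation."""
    if pop_size < 2:
        raise ValueError("need at least two individuals to pick two parents")
    rng = random.Random(seed)
    # ancestors[i] holds the founders that individual i descends from;
    # founders start out as their own (trivial) ancestors.
    ancestors: List[Set[int]] = [{i} for i in range(pop_size)]
    for _ in range(generations):
        next_generation: List[Set[int]] = []
        for _ in range(pop_size):
            # Two distinct parents, drawn at random from the previous generation.
            mother, father = rng.sample(range(pop_size), 2)
            next_generation.append(ancestors[mother] | ancestors[father])
        ancestors = next_generation
    reached: Set[int] = set().union(*ancestors)
    return len(reached) / pop_size


if __name__ == "__main__":
    for gens in (1, 5, 20):
        print(gens, ancestral_fraction(pop_size=100, generations=gens))
```

The exact numbers depend on the seed and the parameters, so none are quoted here. The thing to watch is whether the founders that leave any descendants at all end up being ancestral to the population at large.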

This constant reticulation is quite definitely not the case for paraphyletic 'groups'. Because they have a tree-structure internally, only a minuscule part of any reasonably large paraphyletic 'group' will ever be ancestral to a nested clade: the one directly ancestral species and all its ancestral species down to the common ancestor of the paraphyletic group. In the smallest case, a grade whose side lineages have not diversified at all, only half of its tree internodes will be ancestral to the descendant group in question. And that is the maximum, it goes downhill from there as the side lineages diversify themselves.
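The "only half" figure, and how it shrinks, can be checked with a quick count. The sketch below rests on several assumptions of my own: the grade is a simple comb of side lineages along one backbone, the branch leading to the nested clade is counted with that clade rather than with the grade, and every side lineage is fully bifurcating with the same number of tips. The function name is made up.

```python
from fractions import Fraction


def ancestral_internode_share(side_lineages: int, tips_per_side: int = 1) -> Fraction:
    """Share of a grade's internodes that lie on the line of descent
    leading to the nested clade."""
    if side_lineages < 1 or tips_per_side < 1:
        raise ValueError("need at least one side lineage with at least one tip")
    # Backbone: the stem of the grade plus one internode between each pair
    # of successive splits, i.e. one backbone internode per side lineage.
    backbone = side_lineages
    # A fully bifurcating side lineage with t tips has 2t - 1 branches,
    # counting the branch that attaches it to the backbone.
    side = side_lineages * (2 * tips_per_side - 1)
    return Fraction(backbone, backbone + side)


if __name__ == "__main__":
    for tips in (1, 2, 5, 50):
        print(tips, ancestral_internode_share(side_lineages=4, tips_per_side=tips))
```

Under these assumptions the share is one half when no side lineage has diversified and falls to 1/(2t) once every side lineage has t tips.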

The side branches of a grade just aren't ancestral to the nested clade, and there is no reasonable argument to be made that they are.

I hope this has at least made clear why I believe that there are massive, significant biological differences between the relationships between individuals and the relationships between species, and that a very different approach to classification is needed at those two levels. Differences that get lost if we lump them into the same terminology or abstract everything into the same kind of diamond in a graph.


  1. I did not have to twist the definitions of monophyly etc. to force them to apply to every level. I have just shown that these properties are consistently inherited from one level to another one. So there is no reason to restrict their use to only some levels. I subscribe to your anti black-and-white-thinking (= metaphysical thinking) discussion; however, you fall in the trap of thinking that the properties of the relationships inside different levels are necessarily different. As a continuum, the properties of one level arise from the properties of another level (without being necessarily either different or the same).

    We do agree that we shouldn't classify genes but organisms. In your graph "descendant species 1" is indistinguishable from "ancestral species", so it is biologically the same in any meaningful sense. Hennig's internodal species concept is completely inconsistent (both biologically and mathematically), as has been repeatedly demonstrated in many papers from both schools. See for example:
    Michael S. Y. Lee (1995) Species concepts and the recognition of
    Velasco J. (2007) The internodal species concept.

    As a thought experiment, let us imagine a lizard species that became separated by a new river. The two populations A and B remain the same without any interbreeding because of stabilizing selection and the Hardy-Weinberg principle. Then a couple of individuals from pop B are transported from the continent to an island, thus creating pop C. On that island conditions are very different from the continent, so there is directional selection plus genetic drift, and population C quickly becomes another biological species. Let us consider, in your cladist framework of thinking, that pop B became B' when it split from C. Then no matter how much you struggle, pop B' is still biologically identical to A, thus (A+B') is still a biospecies. The species (A+B') is therefore paraphyletic to species C since no individual from pop A is an ancestor of pop C.

    Some cladists are well aware of this kind of dilemma and thus many have come to the conclusion, as the one you mention, that cladism is not actually classifying species, but populations (at best).

    It is the same problem with your discussion about ancestral groups. You think that cohesiveness of a group can only come from interbreeding. So you accept that a species can be descended from another species because nearly all individuals of the latter are ancestors of the former, right? First of all, it only applies in ideal cases. Secondly, evolutionists argue that higher taxa can have cohesiveness for other reasons besides common ancestry, for example being in the same ecological niche and thus being subjected to the same kind of selection. If a descendant escapes this niche it quickly becomes very different in adapting to its new niche. Thus the group of the former niche, as a whole, can be considered ancestral to the group of the latter niche. The fact that only one lineage in the former niche participated in the colonization of the latter one is no more a problem than the fact that only one population of a species gave birth to another species. Furthermore, the cohesiveness principle of evolutionists can apply to asexual species. I believe that you really think more black-and-white than you thought.

    1. you fall in the trap of thinking that the properties of the relationships inside different levels are necessarily different

      There is no "necessarily" going on. I have described how they are, indeed, fundamentally different. I also submit that you do not understand emergence if you think that "the properties are consistently inherited from one level to another one". Will you also write a paper arguing that we should describe political history in terms of organic chemistry? One level of complexity builds on the other!

      "descendant species 1" is indistinguishable from "ancestral species", so it is biologically the same in any meaningful sense.

      Those are two of the fundamental problems right there: First, you just do not accept the idea of anything but phenetic similarity being a criterion, ever. Second, you ignore the process of evolution. Both descendant species together are the ancestral species.

      Your scenario does not appear very realistic. Either A and B have split for good, in which case they should be separate species, problem solved. (I know Richard Zander does not appear to accept morphologically cryptic species, but most people do.) If they get together again and start reticulating, then they are one species, problem solved.

      Note again that I was leaning over backwards to point out that even if we leave open the question whether paraphyletic taxa make sense, the situation with species as a collective is still totally different from a paraphyletic group of species as a collective. But I don't need that argument because I don't consider paraphyletic assemblages to be groups in the first place, for several unrelated reasons.

    2. "One level of complexity builds on the other!" Then we agree. That is what I studied in my paper.

      No, it is not a matter of phenetic criteria. Your "descendant species 1" is indistinguishable from "ancestral species" by any criterion. Let's take a specimen: how do you diagnose whether it belongs to "descendant species 1" or "ancestral species"? You will have to know when (before or after the split) the specimen was sampled, which means that you classify according to information about your sampling and not according to biological information about individuals or populations. Thus, your classification is artificial, not natural.

      "Both descendant species together are the ancestral species". They aren't. The verb "to be" is very problematic here, because evolution is all about changing, not remaining the same. In this case, you can't say that RED equals RED + BLUE. One does not equal two. It is you who ignores the budding process and only sees the splitting pattern.

      Imagine again that you don't know species BLUE, but you know species RED before and after the split happened (a speciation can realistically happen during a human lifetime). You won't notice any change in species RED and will consider it still the same species you have known in your youth. So you would change your classification of current species RED according to information about a completely independent event (did this couple successfully reproduce on this far island or did they die out?).

      This is what is called nominalism (anti-realism) in science: you treat and classify your data, not the real things that your data represent (realism = anti-nominalism). In itself, it may not be an argument against cladism. Many scientists are very proud to be nominalists; it is an important epistemological school.

      My scenario happens more frequently than you assume, I would argue, but even if it were a very rare case, it shows the inconsistency of your conceptual framework. In both of your solutions you are simply waiting for the problem to disappear with time; that is not what I would call solving the problem. You cannot classify the future, and we need a classification scheme that always works.

      If your point is to save species from paraphyly, then it is only a half-success. I have shown that X-phyletic can be consistently extended to any level without any twist, and you haven't shown any self-contradiction. If you advocate not applying the holophyletic principle to the species level, then we agree; this would make a nonsensical species concept. This exception is then an anomaly in the cladist framework (why species and not also genera? families? etc.). To my knowledge, only pattern cladists are really consistent in this regard, since they fully embrace their nominalism by saying that species don't exist and that we should only classify holophyletically synchronous populations. Yes, this works, but it is a misrepresentation of reality, I would argue. On the contrary, if species are special and really exist, then why wouldn't genera and families also be real? Evolutionists argue that they are indeed real cohesive entities and that this cohesiveness should not be disrupted by the blind application of the holophyletic principle.

    3. Yes, evolution is about changing. Ancestral red has changed into descendant red and blue. You assume that ancestral red has not changed, that it has somehow made blue and then walked off, like a parent bearing a child. I get it. We are going in circles. It really doesn't make much sense to continue this.

      I have explained in the post above why and how individuals inside species and species inside the tree of life are very different in their behaviour and relationships. You don't accept that because you can draw them all as diamonds and arrows and then draw lines around groups of diamonds. Again, I get it. But I am fairly sure it is not me who is ignoring biological reality here.

  2. I don't ignore that individuals aren't species and that species aren't supraspecific taxa, if that is what you mean by "biological reality", but it doesn't rebut that the same kind of cohesiveness could arise in these different levels, so that the same kind of relationships could be drawn between these cohesive groups. From my point of view, it seems that you raise illegitimate barriers between these levels.

    Perhaps we will have to agree that we disagree about this point, at least for now.

  3. There is a lower limit beyond which it is impossible to use apomorphies to diagnose taxa or determine the phylogenetic relationships between them. There are at least two "cladist" species concepts that recognize this (Wheeler and Platnick's phylogenetic species concept and Kornet and McAllister's composite species concept) so I don't see how Hennig's internodal species concept is anything but a distraction from the solid case for applying monophyly criteria only at the supraspecific level where it makes sense.

    Alex: Hennig's concept makes sense in terms of ancestor and descendant lineages, but unless an internodon has a unique combination of characters, it's not a distinct species in any sense that is meaningful to taxonomy or cladistic analysis.

    Damien: Your lizard example ignores the fact that there are neutral traits, both molecular and morphological, that would be unaffected by stabilizing selection. So long as sufficient time has elapsed and there is sufficient reproductive isolation to allow some of these traits to become fixed in a population, it would be possible to distinguish A from B', and indeed that's the only way we could ever recover your hypothetical phylogeny.

    If the island C diverged before new traits had become fixed in the mainland A or B, we would likely get a polytomy from species tree methods. We will also get a polytomy if breeding resumes between A and B, unless additional traits become fixed across both populations that allow us to distinguish (A+B) from the ancestral species.

    If by cohesiveness, you mean "possessing traits of adaptive or ecological significance" it's worth mentioning that under PhyloCode, the following classifications would be permissible.

    Important Clade {species A, species B, (Important Clade C)}
    Important Clade {species (A+B), Important Clade C}

    1. I agree that e.g. the composite species concept works for phylogenetic systematics, no problem, and as such you are completely right that this is not making any difference for the main issue.

      Nonetheless I still personally think that the clade is what the ancestral species turns into, that conversely the ancestor isn't just one of its descendants but all of them, and that relatedness instead of a combination of superficially impressive looking morphological characters should be meaningful to taxonomy.

      Composite species are one approach, and if we were to see a speciation event happening now it would be impractical to give the morphologically unchanged remainder of what was one species a new Linnean binominal. But I can't help but find that conceptually the remainder is not identical to the ancestor, but the remainder plus the other new species are.