PhyloBotanist: Ranunculus, part 7: Information content of classifications

This is the final part of a series on a paper by Hörandl & Emadzade (hereafter H&E) suggesting an "evolutionary", i.e. pre-cladistic, classification of Ranunculus. See the previous installments here: part 1, part 2, part 3, part 4, part 5, part 6.

The internal logic of the paper so far suggested that the major arguments for paraphyletic taxa would be (1) the existence of polyploidy, reticulate evolution and anagenesis, as mentioned in the introduction, (2) the lack of five morphological synapomorphies to circumscribe monophyletic taxa, as mentioned in the materials and methods section (because only four clearly aren't enough!), and (3) the paraphyly of phenetic clusters, as mentioned in an earlier part of the discussion section. As seen previously, the first justification is pretty fuzzy to start with: all issues it raises except reticulate evolution are entirely irrelevant, and rampant reticulate evolution would result in the absence of any phylogenetic structure, so that there could be no paraphyletic taxa either. The second is arbitrary and unscientific, and the third is an example of circular reasoning.

However, in the last paragraphs of the paper's discussion, under the heading "conclusion for classification", the authors advance two more arguments for paraphyletic taxa that are in part clarifications of what they had only hinted at earlier in the paper. Instead of going through the text sentence by sentence, I will try to synthesize the arguments very concisely and then point out what is problematic about them.

The first argument is that insistence on monophyletic taxa often makes a nice subdivision of groups into equally large subgroups impossible. To quote directly: "The paraphyletic ladder at the base of the genus Ranunculus... makes a big subdivision of the genus into well-defined holophyletic subgenera difficult." In other words, H&E see it as a problem if you have a classification like this:

Genus Planta
    Subgenus Plantella, 1 species
    Subgenus Planteopsis, 3 species
    Subgenus Planta
        Section Plantastrum, 1 species
        Section Planta, 356 species

Okay. I understand the argument on a purely semantic level, but I seriously cannot in any way understand its point. What is actually the problem with a classification like that? If this is the structure of biological diversity, if, for example, section Plantastrum is sister to and thus as old a lineage as all the rest of its subgenus, then the biological classification should bloody well describe that reality accurately. Accurately describing reality is kinda the job of science. You don't get to say that it is inconvenient that praying does not cure AIDS or that CO2 is a greenhouse gas, and that scientists should therefore change their conclusions to be more convenient. The same obviously goes for systematic botany. Although luckily an emotionally pleasing but scientifically inaccurate classification will only mislead downstream studies in biogeography, evolutionary biology, phytochemistry, etc., instead of threatening the health and livelihood of numerous people, the principle is the same.

The second and much more eloquently presented argument is that evolutionary classifications contain more information about "evolutionary processes" than phylogenetic classifications. The last part of the discussion comprises only about 800 words, and in that short space the terms information, information content and loss of information are mentioned, by my count, no less than ten times. Surely this is then a major concern for the authors. If you have read through this series of posts, you will already know one severe problem with this argument, but because it features so prominently in the part of the paper we are looking at now it seems advisable to examine it once again, and perhaps in greater detail.

So, in what way do classifications with paraphyletic taxa supposedly contain more information? The discussion is in this regard perhaps a bit confusing, but basically it boils down to this: the authors of the present paper would be unhappy with morphologically heterogeneous taxa, they would like to recognize anagenetic change, and they would be unhappy with two taxa that have no clear morphological differences. And really, the first two of these are one and the same; they boil down to "but these plants look so different, how can they be in the same taxon?" In the present case, most species of Ranunculus are terrestrial and yellow-flowered, but one of the otherwise terrestrial and yellow groups contains an aquatic and white-flowered subgroup.

This is a concern that often comes up, especially when talking to non-systematists. As I have already discussed it at length, I will only quickly mention again the most obvious problem with that reasoning: the great difference seen by a layperson between, for example, the birds and the dinosaurs or between the white aquatic and the yellow terrestrial buttercups is an illusion brought about by extinction and an incomplete knowledge of the fossil record, because in reality there was imperceptibly gradual change from one morphology to the other instead of a sudden jump. Now, however, I will take that as a given because I want to focus on the question of the information content of classifications.

The evolutionary systematists' position is that a classification should show the big morphological difference between two groups that are not reciprocally monophyletic; that would mean a phenetic classification. On the other hand, they also argue that a classification should show phylogenetic structure; that would mean a phylogenetic classification. They then argue that only their classifications combine those two pieces of information, and thus they contain more information than phylogenetic ones. But how can we accommodate both these principles in the same classification? Well, here is the "evolutionary" classification of Ranunculus from the present paper, so that you can judge for yourself:

Genus Ranunculus
    Subgenus Auricomus
        Section Thora
        Section Aconitifolii
        Section Ranuncella
        Section Epirotes
        Section Leucoranunculus
        Section Pseudadonis
        Section Batrachium
        Section Hecatonia
        Section Auricomus
        Section Flammula
    Subgenus Ranunculus
        Section Ranunculus
        Section Polyanthemos
        Section Echinella
        Section Trisecti
        Section Oreophili
        Section Euromontani
        Section Ranunculastrum
        Section Pterocarpa

Now to extract some information. Please tell me, what are the phylogenetic relationships between those sections? If this were a phylogenetic classification, we would know merely from this list that all taxa at the same level are reciprocally monophyletic and that lower level taxa are nested in the next higher level ones. Because this is actually not a phylogenetic classification, we know that they aren't all reciprocally monophyletic, but we have no way of knowing which taxon is in reality nested in what other taxon, if any, and that means that no phylogenetic information whatsoever can be gleaned from the classification. Zero. Zilch. Nada. We are left guessing.
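To make the point about a phylogenetic classification concrete, here is a minimal sketch in Python. It is not from the paper; the function names (to_newick, path_to, smallest_shared_taxon) are made up for illustration, and the data is the hypothetical Genus Planta example from earlier in this post. The whole thing rests on one loudly stated assumption: every taxon in the list is monophyletic. Under that assumption, the nested list by itself can be read as a tree (with unresolved branching among taxa at the same level), and simple questions about relationships can be answered from names alone.

```python
# Hypothetical nested classification (the Genus Planta example from this post).
# A dict holds subtaxa; an integer is the number of species in a terminal taxon.
planta = {
    "Genus Planta": {
        "Subgenus Plantella": 1,
        "Subgenus Planteopsis": 3,
        "Subgenus Planta": {
            "Section Plantastrum": 1,
            "Section Planta": 356,
        },
    }
}


def to_newick(name, content):
    """Render a taxon as a Newick clade.

    ASSUMPTION: every taxon is monophyletic, so each nested box is a clade.
    Species counts are ignored; sibling taxa become an unresolved polytomy.
    """
    label = name.replace(" ", "_")
    if isinstance(content, dict):
        inner = ",".join(to_newick(n, c) for n, c in content.items())
        return f"({inner}){label}"
    return label


def path_to(tree, target, path=()):
    """Return the chain of taxa from the root down to `target`, or None."""
    for name, content in tree.items():
        here = path + (name,)
        if name == target:
            return here
        if isinstance(content, dict):
            found = path_to(content, target, here)
            if found is not None:
                return found
    return None


def smallest_shared_taxon(tree, a, b):
    """Smallest taxon containing both `a` and `b`.

    Only under the monophyly assumption is this also the smallest named
    clade that contains both of them.
    """
    shared = None
    for x, y in zip(path_to(tree, a), path_to(tree, b)):
        if x != y:
            break
        shared = x
    return shared


genus, content = next(iter(planta.items()))
newick = to_newick(genus, content) + ";"
print(newick)
print(smallest_shared_taxon(planta, "Section Plantastrum", "Section Planta"))
print(smallest_shared_taxon(planta, "Subgenus Plantella", "Section Planta"))
```

The same functions would run without complaint on the "evolutionary" classification of Ranunculus listed above, but the output would be meaningless, because the assumption written into the docstrings does not hold there and the list does not say where it fails.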

The same is obviously true for phenetic information. Because the sections are partially based on phylogenetic structure, we already know that they aren't all equally dissimilar (if that were the case, the first division would likely have to be between the white aquatic ones and the rest). The classification is utterly useless for both purposes, and its information content is nil.

Of course, some reader might say, "wait a second, look at appendix A of the paper. For each section of Ranunculus, the authors have a text that mentions whether it is monophyletic or whether it is paraphyletic to another." Quite true. But allowing evolutionary systematists to count the description of the taxa as the information content of their system is seriously unfair. Their entire argument that phylogenetic classifications "lose" or "do not contain" information on morphological or ecological divergence or evolutionary processes is, after all, based on the pretense that a phylogenetic classification likewise consists only of taxon names. If we allow descriptions to be part of the classification that is to be assessed for its information content, we find that phylogenetic classifications obviously also contain that supposedly lost information. Look, for example, into the pages of Kenrick & Crane (1997), and you will find a list of synapomorphies in the description of each taxon. That is the information on its morphological or ecological divergence from the paraphyletic residue that would be left if you removed it from the containing higher level taxon.

In other words, if we look at the descriptions of the groups, both types of classifications contain all the information that H&E claim to be lost in the phylogenetic ones. If, however, we do not have those descriptions and consider literally the information content of the classification as a list of nested taxa, the evolutionary classifications do not contain any reliable information whatsoever because we cannot be sure what criterion was used to circumscribe any given taxon and how the taxa are related to each other. And note that when using names, people generally assume that the respective taxa are evolutionarily meaningful units without bothering to look up the descriptions in the original publication!

As if that were not enough, the issue of heterogeneous taxa can also be approached from two other angles. The first is to ask whether the concept of "morphological or ecological divergence of a nested clade from the ancestral paraphyletic group" (paraphrasing) advanced by evolutionary systematists is actually defensible. Think back to the previous post, and the problem becomes clear - the idea is based on circular reasoning. First, the status of the paraphyletic residue as a valid taxon is assumed, and then a divergence between "it" and the nested clade is observed. As an example, an evolutionary systematist might argue that the tetrapods show strong evolutionary and ecological divergence from bony fish. But of course there is no "it"; there are no "bony fish minus the tetrapods" to show a divergence from the tetrapods. Those are simply non-natural groups, they are non-groups, they are artificially and arbitrarily delineated residues.

To put it in the clearest terms, saying that the tetrapods diverged from the bony fish is simply not an accurate description of evolutionary history, and because it is factually incorrect nobody should wish to include that "information" into a classification. The factually correct way of describing evolutionary history as currently understood would be to say that the bony fish, originally purely aquatic, diversified into aquatic and land-living lineages. And that is of course what the phylogenetic classification with its tetrapod clade nested inside the bony fish clade shows, and therefore it is this classification that contains accurate information on evolutionary processes while an "evolutionary" one contains disinformation, plain and simple.

Which brings us to the second angle from which we can look at the issue of being uneasy with morphologically heterogeneous taxa. Many people dislike the idea that birds are dinosaurs, or, as in my above example, that we humans are a subgroup of bony fish, for this very reason. How can the fish taxon be so heterogeneous, and how can you put something as different as a trout and a rat into the same taxon?

Now here is the thing. Remember that taxon called vertebrates? Yup, that also contains the trout and the rat, and us. Curious how people get uncomfortable with the nested boxes approach of phylogenetic systematics making the bigger boxes more heterogeneous but don't bat an eye at other heterogeneous nested boxes as long as they learned about them when they were younger. There is really no difference between the two, and to pretend otherwise is merely to dress up one's emotional discomfort at having to learn a new, improved classification in slightly less unreasonable-sounding language.

Finally, there is the separate issue of a phylogenetic classification (rarely) making it necessary to recognize two morphologically indistinguishable taxa because uniting them would produce a paraphyletic taxon. That is indeed a valid concern from the perspective of the practical utility of the taxa in field guides, floras, and so on. From a purely scientific perspective, however, any synapomorphy should be okay, even if it is merely molecular or biochemical, and obviously there must be one or we would never have inferred the existence of those two clades. Moreover, I think that it is rarely an issue even from a morphological perspective. A good example is the phylogenetic classification of the speedwell genus Veronica, about which I will perhaps blog another time. The choice was between sinking several genera into a large Veronica that is easily recognized or splitting the traditionally circumscribed, paraphyletic Veronica into five new genera that could not really be distinguished from each other. Phrased like that, the solution was clear.

References

Kenrick & Crane, 1997. The origin and early diversification of land plants – a cladistic study. Smithsonian Institution Press, Washington USA.

Monday, February 11, 2013
