Saturday, October 19, 2019

Arguments for paraphyletic taxa, part 543,997 or so

Although having largely moved on from blogging, I found myself writing another post on the most frequent topic of this blog, arguments for the acceptance of paraphyletic taxa and whether they make sense. A paper has recently appeared that describes a new species of flowering plant (Carnicero et al 2019, Bot J Linn Soc: boz052). The first paragraph of its introduction argues for paraphyletic taxa as follows:
"From a cladistic perspective, monophyly of taxa is desirable, but important evolutionary processes such as hybridization, anagenetic and anacladogenetic speciation (budding sensu Mayr & Bock, 2002) unavoidably result in non-dichotomous branching patterns (Hörandl, 2006; Hörandl & Stuessy, 2010)."
I am afraid I already find this first bit confused in several details. First, from a cladistic perspective, monophyly is not merely desirable but required. That is the entire point of cladism.

Second, non-dichotomous branching patterns are polytomies, meaning the branch splits into more than two sub-branches. Polytomies are no problem for making supraspecific taxa monophyletic, so on the face of it, it is not clear what the argument is. But none of the mentioned processes necessarily produce polytomies anyway, and some of them do not even produce any branching at all.

Hybridisation - presumably the authors mean hybridogenic speciation, e.g. by allopolyploidy, and not actually hybridisation per se, which is usually a dead end - is not branching, it is the opposite. The problem for the argument here is that reticulation does not just mean there is no monophyly, it also means there is no paraphyly either, as there is no phyletic (tree-like) structure. It makes no sense to argue for paraphyly in a situation where there is no paraphyly. (More on that below.)

'Budding' speciation is dichotomous, just like any other lineage split, unless an ancestral species fractures into three or more descendant species at the exact same moment, just as could happen with a non-'budding' lineage split. It is no problem whatsoever for making supraspecific taxa monophyletic.

Third, anagenetic means that something happens along a lineage without a lineage split, so it is again odd to speak of a "non-dichotomous" branching pattern. If anagenesis is happening there is by definition no branching pattern, dichotomous or otherwise. Nor is there any problem for making supraspecific taxa monophyletic. So yes, the observation that there is no dichotomy is correct, but merely in the same trivial sense as the observation that a book isn't a car. You can go around saying that, but book authors or publishers will simply say, "we know, so what?" Cladists likewise when told that anagenesis happens.
"Anacladogenesis is a case of peripatric speciation, in which a population or a group of populations from a species diverge, resulting in a derivative monophyletic species (Stuessy, Crawford & Marticorena, 1990). Unlike in cladogenetic processes, the ancestral species remains essentially unchanged and often becomes paraphyletic (Mayr & Bock, 2002; Crawford, 2010)."
With this the two closely related misconceptions at the heart of the paper's argumentation become clear. The first is that the cladist approach requires making species monophyletic. It doesn't. The second is that it makes sense to call species monophyletic or paraphyletic in the first place. It doesn't. (Although this is a very, very common and widespread misconception.)

As already indicated above, the concepts ending in -phyly apply in tree-like structures, such as the tree of life. The individuals of sexually reproducing species, however, do not form a tree-like but instead a net-like structure. Consequently, -phyly does not apply inside sexually reproducing species. Another attempt at an analogy: I can be asleep, but the molecules I consist of do not sleep. The concept "asleep or awake" does not apply to individual molecules, just as monophyly does not apply to individuals of the same sexually reproducing species. Fallacy of division is the keyword here.

This is not a new idea that cladists came up with only as a rearguard action, as frequently claimed by paraphyletists. We can go back all the way to the inventor of cladism, Willi Hennig. The central and best known figure in his book illustrates the different relationships that species, individuals, and life stages have to each other. Phylogenetic systematics ('cladism') is the approach to take when classifying species into supraspecific taxa, but not when classifying individuals into species. The claim that a species is monophyletic or paraphyletic is a category error.
"Over time, the ancestral species may converge to monophyly through gene flow and lineage sorting (Baum & Shaw, 1995)."
Same as above, but in addition it is unclear what is meant by 'gene flow', as on the face of it such flow would work against lineage sorting. It is possible that the authors meant to say 'restriction of gene flow'.

This sentence also makes clear where the conceptual error is located that leads a surprising number of people to the idea that species can be something or other-phyletic. Lineage sorting happens to alleles, and yes, the alleles of a gene occurring inside a sexually reproducing species can be paraphyletic to the alleles occurring inside a different sexually reproducing species. But taxonomists do not classify alleles into species, they classify individuals into species, so this would be another category error.
"Far from an exception, anacladogenetic speciation has been considered to be of main importance in plant evolution (Rieseberg & Brouillet, 1994; Anacker & Strauss, 2014). As integrative taxonomy advocates that taxa should reflect evolutionary processes (Stuessy, 2009; Schlick-Steiner et al., 2010), it may be necessary to recognize certain paraphyletic entities."
The argument that Integrative Taxonomy requires paraphyly was not familiar to me. My understanding has always been that Integrative Taxonomy is about combining diverse kinds of evidence to support taxonomic decisions in species delimitation, e.g. a combination of ecological niche, population genetics, and morphology. The seminal Schlick-Steiner paper, for example, was clearly about alpha taxonomy, i.e. species delimitation. Searching it for the snippet "paraph" brings up only one entry in its reference list. (Stuessy is a different story, as he is one of the two or three most vociferous botanists still arguing for paraphyletic taxa; but then again he is not to my understanding a founding figure of Integrative Taxonomy.)

Again the central problem is, however, not what Schlick-Steiner et al may have thought about paraphyletic taxa, but that Integrative Taxonomy is about species delimitation, where paraphyly applies just as much as decibels apply to colours, and not about supraspecific taxa, where the concept properly applies.

The paragraph ends with something like an argumentum ad populum.
"Indeed, examples of recognized paraphyletic taxa exist at various taxonomic levels (e.g. class Reptilia: Mayr & Bock, 2002; Pozoa coriacea Lag.: López et al., 2012; Helichrysum Mill.: Galbany-Casals et al., 2014; Plethodon wehrlei Fowler & Dunn: Kuchta, Brown & Highton, 2018; Columnea strigosa Benth.: Smith, Ooi & Clark, 2018)."
The individual species used as examples are irrelevant for the reasons outlined above, because unless they are reproducing clonally, in which case they should have been circumscribed to be monophyletic, they are not paraphyletic but instead tokogenetic (net-like), and cladism does not apply inside tokogenetic structures. That leaves two supraspecific taxa that the taxonomic community has long recognised as ill-circumscribed due to their paraphyly: Reptilia and Helichrysum.

One might point out that Mayr, for example, remained opposed to phylogenetic classification even as he saw it being adopted by the scientific community around him, and that recognition of Reptilia as a paraphyletic taxon is not state of the art in zoology today. The vast majority of animal systematists today classify animals consistently by relatedness.
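
To make the difference between a monophyletic and a non-monophyletic supraspecific taxon concrete, here is a minimal Python sketch. The tree is a deliberately simplified amniote topology assumed for illustration only (nested tuples, with tip names as strings), and the function names are hypothetical rather than taken from any phylogenetics library. A group counts as monophyletic if the smallest clade containing all of its members contains nothing else. Note that a node may have more than two children, so polytomies pose no problem for the check.

```python
def tips(node):
    """Return the set of tip labels below a node (a tip is a plain string)."""
    if isinstance(node, str):
        return {node}
    return set().union(*(tips(child) for child in node))


def smallest_clade(node, group):
    """Tip set of the smallest clade that contains every member of group."""
    children = [] if isinstance(node, str) else node
    for child in children:
        if group <= tips(child):
            # The whole group sits inside this child, so descend further.
            return smallest_clade(child, group)
    return tips(node)


def is_monophyletic(tree, group):
    """True if the group is exactly one complete clade of the tree."""
    return smallest_clade(tree, set(group)) == set(group)


def excluded_descendants(tree, group):
    """Tips that would have to be added to make the group monophyletic."""
    return smallest_clade(tree, set(group)) - set(group)


# Simplified, assumed topology; not a statement about amniote phylogeny in detail.
AMNIOTES = ("mammals", (("lizards", "snakes"), ("turtles", ("crocodiles", "birds"))))

traditional_reptiles = {"lizards", "snakes", "turtles", "crocodiles"}
print(is_monophyletic(AMNIOTES, traditional_reptiles))               # False
print(sorted(excluded_descendants(AMNIOTES, traditional_reptiles)))  # ['birds']
print(is_monophyletic(AMNIOTES, traditional_reptiles | {"birds"}))   # True
```

The sketch only makes sense because its input is a tree. Inside a sexually reproducing species the relationships between individuals form a net, so there is no smallest clade to look up in the first place.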

But more importantly, there is no way to base the acceptance of paraphyletic Reptilia or Helichrysum on the argumentation presented in this paper, which argues entirely from the existence of hybridogenic and 'budding' speciation. This illustrates an extremely common pattern in papers arguing for paraphyletic taxa: an argument is made that applies inside a species (although even that only if we misconstrue the conceptual basis and actual practice of phylogenetic systematics), and then the entirely unwarranted jump is made to the conclusion that paraphyly should be accepted at a much higher level of classification, where the argument would not apply even if it were correct.

Thursday, September 26, 2019

Incongruence Length Difference test in TNT

Because I am fed up with figuring it out anew every time I need to use the Incongruence Length Difference (ILD) test (Farris et al., 1994) in TNT, I will post it once and for all here:

Download TNT and the script "ildtnt.run" from PhyloWiki. In the script, you may have to replace all instances of "numreps" with "num_reps" to make it functional. I at least get the error "numreps is a reserved expression", suggesting that the programmer should not have used that as a variable name.

Open TNT, increase memory, set the data type to DNA, and treat gaps as missing data. Then load your data matrix, which should of course be in TNT format:

mxram 200 ;
nstates DNA ;
nstates NOGAPS ;
proc (your_alignment_file_name) ;

Look up how many characters your first partition has, then run the test with:

run ildtnt.run (length_of_first_partition) (replicates) ;
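
For readers who want to see the logic of the test rather than just the commands, the following Python sketch outlines the permutation scheme as I understand it. It is not a transcription of ildtnt.run, whose internals I have not checked. The callable shortest_length is a hypothetical stand-in for a parsimony search (the part TNT actually does), and the p-value is an ordinary permutation p-value: the share of random partitions whose summed tree length is no larger than that of the original partition.

```python
import random


def ild_p_value(n_chars, first_len, shortest_length, replicates=99, seed=0):
    """Permutation p-value for incongruence between two character partitions.

    n_chars         -- total number of characters in the combined matrix
    first_len       -- number of characters in the first partition
    shortest_length -- hypothetical callable: takes a list of character
                       indices and returns the length of the shortest tree
                       for those characters (e.g. from a parsimony search)
    """
    rng = random.Random(seed)
    chars = list(range(n_chars))

    # Summed length of the shortest trees for the two original partitions.
    observed = shortest_length(chars[:first_len]) + shortest_length(chars[first_len:])

    # Random partitions of the same sizes as the original ones.
    as_short_or_shorter = 0
    for _ in range(replicates):
        rng.shuffle(chars)
        total = shortest_length(chars[:first_len]) + shortest_length(chars[first_len:])
        if total <= observed:
            as_short_or_shorter += 1

    # The original partition is counted as one of the replicates + 1 cases.
    return (as_short_or_shorter + 1) / (replicates + 1)
```

If the two partitions conflict, each fits its own tree well, so the original summed length is shorter than that of nearly all random partitions and the p-value is small. If they agree, random repartitioning makes little difference and the p-value is large.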

There is an alternative script for doing the test called Ild.run, but I have so far failed to set the number of user variables high enough to accommodate my datasets. They seem to be limited to 1,000?

Perhaps this guide will also be useful to somebody besides me.

Reference

Farris JS, Källersjö M, Kluge AG, Bult C, 1994. Testing significance of incongruence. Cladistics 10: 315-319.

Friday, April 5, 2019

Still not convinced by Vicariance Biogeography

When reading recent methodological papers, review articles, or publications on my study group I sometimes add to the mix the odd paper that is not directly relevant for my work and maybe not even very recent but which is relevant to my broader interests. In this case I decided to take a look at Heads 2009, Inferring biogeographic history from molecular phylogenies, Biol J Linn Soc 98: 757-774.

Michael Heads is perhaps the most published proponent of Vicariance Biogeography, the school of biogeography that rejects speciation following long-distance dispersal (LDD) because... and that is where it gets interesting, because I still find that rejection puzzling. To the best of my understanding at least some vicariance biogeographers consider the conclusion of LDD to be unscientific because they believe it can explain any possible contemporary range, on the lines of 'if your hypothesis can explain every observation it explains nothing'. This does not make sense to me, because LDD would still be more or less plausible depending on the dating of cladogenesis events relative to tectonic events or island ages, prevailing wind and water currents, dispersal ecology, and many other factors. It also seems rather more unscientific to reject a possible explanation a priori, regardless of any evidence in its favour. But to get a better understanding of the arguments of vicariance biogeographers is precisely my reason for picking up this paper. So, on with it.

In a section titled "critique of founder dispersal in population genetic studies", Heads first describes the concept as "the founder individual has been isolated from its parent population by dispersing over a barrier (an apparent contradiction)". Right out of the gate this seems odd. I may be missing something, but it appears as if Heads would accept only extremes: either there is a barrier, meaning zero dispersal, or there is none, meaning panmixis. I have previously observed similar arguments in other papers from the vicariance school.

Assume I have a garden with a fence around it, and then one day a cat jumps over it. Does this mean I have no barrier around the garden? Of course not, it may still have kept various stray dogs and neighbours' children out. On the other hand, it was never a barrier to birds or insects. The same in biogeography. No barrier on this planet is absolute, and each barrier has a different force for different groups of organisms. A channel that is near-insurmountable to a monkey may be crossed by insects if blown over by a strong enough storm, and it may be no barrier at all to fern spores. Perhaps even more importantly, dispersal is a stochastic process. The Atlantic Ocean did not keep all cacti from crossing (Rhipsalis made it over to Africa), but it kept the seeds of >99.9% of them away, so it is still a barrier even if not an absolute one.
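
The stochastic point can be put into numbers. If each dispersal attempt succeeds independently with probability p, the chance of at least one success in n attempts is 1 - (1 - p)^n. The Python sketch below evaluates that formula; the per-attempt probability and the number of attempts are made-up values for illustration, not estimates for any real organism or barrier.

```python
import math


def prob_at_least_one_crossing(p_single, attempts):
    """Chance of at least one success in `attempts` independent tries.

    Computes 1 - (1 - p_single) ** attempts via log1p/expm1 so that the
    result stays accurate when p_single is tiny and attempts is huge.
    """
    return -math.expm1(attempts * math.log1p(-p_single))


# Hypothetical numbers: a one-in-a-billion chance per propagule,
# and a thousand propagules reaching open water per year.
p_single = 1e-9
per_year = 1_000

for years in (1_000, 1_000_000, 10_000_000):
    chance = prob_at_least_one_crossing(p_single, per_year * years)
    print(f"{years:>10} years: {chance:.4f}")
```

With these assumed values the barrier stops essentially every individual propagule, yet over millions of years at least one successful crossing becomes likely. A barrier can be extremely effective per attempt and still be crossed on geological time scales.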

Beyond that the argument of the section relies on citing five papers that "failed to corroborate predictions of founder effect speciation", of which one is missing from the reference list. I checked three of the remaining four papers, and in all cases they are experiments on fruit flies limited to time frames on the order of ten years and designed to test the very narrow question whether severe population bottlenecks will cause pre-mating isolation. Now I may completely have misunderstood the claim made by mainstream biogeographers regarding founder speciation, but I believe it was not "ten years after an organism has dispersed to an island it will have achieved biological pre-mating isolation". The way I understand it the claim is more on the lines of the large distance from the parental population producing geographic pre-mating isolation, which enables speciation to take place subsequently. The point is not the speed with which the new population evolves (although that is an exciting research question in itself) but rather that it has become geographically isolated.

The argument consequently seems to miss the point. If there is a problem for founder speciation then it would be whether a single pregnant female or a single seed can establish a viable population. Potential problems are inbreeding and, in plants that have such features, self-incompatibility systems that cause failure to set seed. But if a population establishes, helped perhaps by herbivore release and lack of competition, subsequent speciation is not an extraordinary claim. It really does not matter if isolation has been achieved by vicariance or by LDD, the subsequent process of divergence is the same except the latter will also cause a genetic bottleneck.

The section "critique of founder dispersal in biogeographic studies" points out that there is good evidence for similar vicariance patterns in many taxa. I am unaware of anybody who denies that vicariance is an important process - but it does not logically follow that LDD is therefore implausible. I can agree that a lot of white swans exist without therefore having to believe that black ones cannot possibly exist.

This is followed by "founder dispersal and new ideas on rift tectonics", where the idea seems to be that seemingly young oceanic islands do not require LDD to be colonised because they kind of have always been there. It is not entirely clear to me if the claim is that the individual islands are all much older than the oldest still observable lava flows or if, as implied by the reference to "seamounts", the local species would have constantly hopped from one short-lived and now submerged island to the next. If the first, it seems rather ad-hoc; if the second, one wonders why species that can so easily jump ten times from one disappearing island to the next island in the chain cannot simply jump a single time from continent to island. What is the more parsimonious conclusion here?

Next, molecular clocks and time calibration of phylogenies are rejected. All inferences, whether from fossils or, in particular, from geological events such as the formation of the isthmus of Panama, are dismissed as unreliable, but apparently present distributions are reliable evidence of ancestral distributions. Unfortunately I remain anti-convinced.

To quote the following paragraph in full:

"In Ronquist's (1997) method of dispersal-vicariance analysis, inferences of dispersal events are minimized as they attract a 'cost'. Extinction also attracts a cost but vicariance does not. It was not explained why this approach was taken and it appears to be based on a confusion of the two different concepts of 'dispersal'. Ecological dispersal in the sense of ordinary movement should not attract any cost in any model; founder dispersal would attract no cost in a traditional dispersalist model, but, in a vicariance model of speciation or evolution, it is rejected a priori."

What Heads does here is reject a formal parsimony-based inference of ancestral ranges in favour of, to judge from the second half of the paper, an informal, intuitive, pencil-on-a-map deduction process. What does he not like about Dispersal-Vicariance Analysis (DIVA)? Apparently primarily that dispersal events have a parsimony cost. It may be that he did not contemplate how such an analysis would work or if it could even work at all, if the only process having a cost would be extinction - of course it would mean that dispersal would be much too 'cheap', and every single ancestral species would always be inferred to have occupied the union of the ranges of its two descendants.

The great irony here is that even with a dispersal cost DIVA is well known for mercilessly (and implausibly) favouring vicariance as a process. I ran that analysis on two or three data sets a few years ago, and unless one restricts the maximum range size of ancestral species to something biologically plausible one pretty much always ends up with the vicariance biogeographers' preferred conclusion: the ancestor of the study group was already everywhere where any of its descendants occur today.

The second part of the paper is taken up by a large number of case studies, taxa which have sometimes been suggested to have undergone LDD but for which Heads presents a vicariance explanation instead. Some of these I find more plausible than others, but I do not want to go into each of them in detail. Instead, it seems more efficient to discuss what I see as three problems running through the entire argumentation:

First, there seems to be a lot of ad-hoccery going on. Where necessary to arrive at the conclusion of vicariance, for example to explain the overlapping distributions of African Arctotideae, 'normal ecological' range expansion is invoked as common and easy. But where necessary to arrive at the conclusion of vicariance, for example when distantly related subclades of a taxon occur right next to each other in Tasmania or New Zealand (suggesting relatively recent LDD from elsewhere), they are assumed to have been sitting in these narrow localities for tens of millions of years, apparently unable to move at all, so that a very ancient vicariance event can have taken place between their present ranges. Is that not rather convenient?

Which brings me to the second point. The text presenting the case studies certainly uses words like "may" and "might" a lot. To be honest, I sometimes found myself reminded of Erich von Däniken, whose style was to the effect of "the traditional explanation is that the pyramids were built by the ancient Egyptians - but could it not have been extra-terrestrials?" Yes, in each of these cases vicariance (or extra-terrestrials) could be the explanation. But mere possibility is a low hurdle to clear; the real question is, is that the most plausible explanation?

Third, as always with vicariance- or panbiogeography the problem is that dispersal is still required. Somehow this taxon here must have reached this volcanic island, somehow that taxon there must have spread all over the world. How does the vicariance biogeographer arrive at contemporary ranges without invoking jumps across oceans? Partly by hiding the dispersal away before the start of the analysis. To quote the present paper, "assuming a worldwide ancestor..." Well, if we can just assume that at our leisure it becomes easy to conclude few dispersal events, long distance or otherwise.

Now quite apart from the question whether a single species occurring worldwide is biologically realistic for all groups of organisms (I'd say it isn't), the problem remains that we have a lot of nested groups that would all have to have been ancestrally cosmopolitan, requiring several global range expansions in between. The daisy family is an excellent example. With reference to them, Heads writes that "through the history of the family as a whole, only a small number of widespread ancestors may have existed (groups such as Senecioneae and Astereae each require their own global ancestor)." I think that is a wee bit of an underestimate.

To walk through just one example in order of containing taxon to subordinate taxon: The Asteraceae family is cosmopolitan. The Asteroideae subfamily is cosmopolitan. The Astereae tribe is cosmopolitan. And the genus Conyza is cosmopolitan. If vicariance is the explanation for all speciation events we still need at least four consecutive cases of spreading across all continents. The same applies to a large number of the other tribes in the family: yes, that includes the aforementioned Senecioneae, but also Gnaphalieae, Anthemideae, Heliantheae, Cichorieae, Cardueae, Inuleae, and Vernonieae. And several of these include genera occurring across several continents or even (as with Senecio) all of them except Antarctica.

There is certainly a lot of dispersal required to explain that even in a vicariance approach, and unless we assume that most speciation in these groups took place before the breakup of Pangaea 175 million years ago (meaning the early dinosaurs would have known many of the same daisies as we do now, tens of millions of years before the oldest estimates for the origin of the daisy family) we will have to assume that some of that dispersal was long-distance.

Why not simply accept that organisms can sometimes, rarely but often enough to matter, cross an ocean and establish on the other side, followed by speciation? What is so extraordinary about that conclusion, really? What is so different about it compared to being separated by vicariance, followed by speciation? I am still puzzled.

Wednesday, February 20, 2019

Review of the Aachen Memorandum

I picked this book up at a book fair after having read that it was a satire on bureaucracy and 'political correctness'. Although I am not the kind of person who believes that not being able to use sexist and racist insults is the end of the world, and am thus unlikely to agree with the author politically, I nonetheless thought I might find this kind of book interesting. I can, for example, read the original Conan novels through to the end without believing myself, as their author did, that all civilisation is corrupt and deserves to be destroyed.

Unfortunately, Robert E. Howard was a master of wit and subtlety compared to Andrew Roberts, and I only made it halfway through the Aachen Memorandum before giving up. Roberts took everything he dislikes - immigration, high taxes on the rich, animal protection, weed, speed limits, feminism, anti-racism, grade inflation, concern for healthy nutrition, and so much more, stuffed it all into one pot and then scrawled 'Europe' onto it.

The results are, unfortunately, not even intellectually coherent. The book has all European nations dissolved into a Euro-superstate, but somehow France is still able to buy the Channel Islands off England. The dominant culture is depicted as a caricature of feminist prudery, while the protagonist is constantly lecherous and voyeuristic, but he also complains that advertisements are all using sex to sell products. Europe is a total dictatorship with complete surveillance of communications, no free press, and continental armies stationed in England to forcefully squash nationalist protests, but (what follows is the only minor spoiler here) somehow the entire edifice collapses the moment somebody finds evidence that a referendum a generation ago was manipulated. The ruling ideology is clearly supposed to be left-wing and cosmopolitan, but at the same time Adolf Hitler is venerated in the schools.

How does any of that even start to make sense? It seems as if the author believed that everybody who is not part of his own political sect is interchangeable and in cahoots with each other.

Underneath the visceral hatred of everybody outside of Britain oozing from the pages it is just about possible to see the outline of a potentially amusing thriller, but the problem is that I cannot maintain willing suspension of disbelief. Yes, the reader will soon understand that the author despises the European Union in general and Germany and Polish taxi drivers in particular, so well done communicating that, but novels also need an at least somewhat plausible and logically coherent setting, otherwise they don't work. And that is before even mentioning how blatant a wish-fulfillment self-insert the protagonist is.

I assume there was, and still is, a very particular audience for this book in one particular country, but at least in my eyes everybody else would be better served by doing something more entertaining than reading it, such as watching paint dry or counting how many grains there are in one kg of sugar.