PhyloBotanist: taxonomy

Saturday, October 19, 2019

Arguments for paraphyletic taxa, part 543,997 or so

Although I have largely moved on from blogging, I found myself writing another post on the most frequent topic of this blog, arguments for the acceptance of paraphyletic taxa and whether they make sense. A paper has recently appeared that describes a new species of flowering plant (Carnicero et al 2019, Bot J Linn Soc: boz052). The first paragraph of its introduction argues for paraphyletic taxa as follows:

From a cladistic perspective, monophyly of taxa is desirable, but important evolutionary processes such as hybridization, anagenetic and anacladogenetic speciation (budding sensu Mayr & Bock, 2002) unavoidably result in non-dichotomous branching patterns (Hörandl, 2006; Hörandl & Stuessy, 2010).

I am afraid I already find this first bit confused in several details. First, from a cladistic perspective, monophyly is not merely desirable but required. That is the entire point of cladism.

Second, non-dichotomous branching patterns are polytomies, meaning the branch splits into more than two sub-branches. Polytomies are no problem for making supraspecific taxa monophyletic, so on the face of it, it is not clear what the argument is. But none of the mentioned processes necessarily produce polytomies anyway, and some of them do not even produce any branching at all.

Hybridisation - presumably the authors mean hybridogenic speciation, e.g. by allopolyploidy, and not actually hybridisation per se, which is usually a dead end - is not branching, it is the opposite. The problem for the argument here is that reticulation does not just mean there is no monophyly, it also means there is no paraphyly either, as there is no phyletic (tree-like) structure. It makes no sense to argue for paraphyly in a situation where there is no paraphyly. (More on that below.)

'Budding' speciation is dichotomous, just like any other lineage split, unless an ancestral species fractures into three or more descendant species at the exact same moment, just like could happen with a non-'budding' lineage split. It is no problem whatsoever for making supraspecific taxa monophyletic.

Third, anagenetic means that something happens along a lineage without a lineage split, so it is again odd to speak of a "non-dichotomous" branching pattern. If anagenesis is happening there is by definition no branching pattern, dichotomous or otherwise. Nor is there any problem for making supraspecific taxa monophyletic. So yes, the observation that there is no dichotomy is correct, but merely in the same trivial sense as the observation that a book isn't a car. You can go around saying that, but book authors or publishers will simply say, "we know, so what?" Cladists likewise when told that anagenesis happens.

Anacladogenesis is a case of peripatric speciation, in which a population or a group of populations from a species diverge, resulting in a derivative monophyletic species (Stuessy, Crawford & Marticorena, 1990). Unlike in cladogenetic processes, the ancestral species remains essentially unchanged and often becomes paraphyletic (Mayr & Bock, 2002; Crawford, 2010).

With this the two closely related misconceptions at the heart of the paper's argumentation become clear. The first is that the cladist approach requires making species monophyletic. It doesn't. The second is that it makes sense to call species monophyletic or paraphyletic in the first place. It doesn't. (Although this is a very, very common and widespread misconception.)

As already indicated above, the concepts ending in -phyly apply in tree-like structures, such as the tree of life. The individuals of sexually reproducing species, however, do not form a tree-like but instead a net-like structure. Consequently, -phyly does not apply inside sexually reproducing species. Another attempt at an analogy: I can be asleep, but the molecules I consist of do not sleep. The concept "asleep or awake" does not apply to individual molecules, just as monophyly does not apply to individuals of the same sexually reproducing species. Fallacy of division is the keyword here.

This is not a new idea that cladists came up with only as a rearguard action, as frequently claimed by paraphyletists. We can go back all the way to the inventor of cladism, Willi Hennig. The central and best known figure in his book illustrates the different relationships that species, individuals, and life stages have to each other. Phylogenetic systematics ('cladism') is the approach to take when classifying species into supraspecific taxa, but not when classifying individuals into species. The claim that a species is monophyletic or paraphyletic is a category error.

Over time, the ancestral species may converge to monophyly through gene flow and lineage sorting (Baum & Shaw, 1995).

Same as above, but in addition it is unclear what is meant by 'gene flow', as on the face of it such flow would work against lineage sorting. It is possible that the authors meant to say 'restriction of gene flow'.

This sentence also makes clear where the conceptual error is located that leads a surprising number of people to the idea that species can be something or other-phyletic. Lineage sorting happens to alleles, and yes, the alleles of a gene occurring inside a sexually reproducing species can be paraphyletic to the alleles occurring inside a different sexually reproducing species. But taxonomists do not classify alleles into species, they classify individuals into species, so this would be another category error.

Far from an exception, anacladogenetic speciation has been considered to be of main importance in plant evolution (Rieseberg & Brouillet, 1994; Anacker & Strauss, 2014). As integrative taxonomy advocates that taxa should reflect evolutionary processes (Stuessy, 2009; Schlick-Steiner et al., 2010), it may be necessary to recognize certain paraphyletic entities.

The argument that Integrative Taxonomy requires paraphyly was not familiar to me. My understanding has always been that Integrative Taxonomy is about combining diverse kinds of evidence to support taxonomic decisions in species delimitation, e.g. a combination of ecological niche, population genetics, and morphology. The seminal Schlick-Steiner paper, for example, was clearly about alpha taxonomy, i.e. species delimitation. Searching it for the snippet "paraph" brings up only one entry in its reference list. (Stuessy is a different story, as he is one of the two or three most vociferous botanists still arguing for paraphyletic taxa; but then again he is not to my understanding a founding figure of Integrative Taxonomy.)

Again the central problem is, however, not what Schlick-Steiner et al may have thought about paraphyletic taxa, but that Integrative Taxonomy is about species delimitation, where paraphyly applies just as much as decibels apply to colours, and not about supraspecific taxa, where the concept properly applies.

The paragraph ends with something like an argumentum ad populum.

Indeed, examples of recognized paraphyletic taxa exist at various taxonomic levels (e.g. class Reptilia: Mayr & Bock, 2002; Pozoa coriacea Lag.: López et al., 2012; Helichrysum Mill.: Galbany-Casals et al., 2014; Plethodon wehrlei Fowler & Dunn: Kuchta, Brown & Highton, 2018; Columnea strigosa Benth.: Smith, Ooi & Clark, 2018).

The individual species used as examples are irrelevant for the reasons outlined above, because unless they are reproducing clonally, in which case they should have been circumscribed to be monophyletic, they are not paraphyletic but instead tokogenetic (net-like), and cladism does not apply inside tokogenetic structures. That leaves two supraspecific taxa that the taxonomic community has long recognised as ill-circumscribed due to their paraphyly: reptilia and Helichrysum.

One might point out that Mayr, for example, remained opposed to phylogenetic classification even as he saw it being adopted by the scientific community around him, and that recognition of reptilia as a paraphyletic taxon is not state of the art in zoology today. The vast majority of animal systematists today classify animals consistently by relatedness.

But more importantly, there is no way to base the acceptance of paraphyletic reptilia or Helichrysum on the argumentation presented in this paper, which argues entirely from the existence of hybridogenic and 'budding' speciation. This illustrates an extremely common pattern in papers arguing for paraphyletic taxa: an argument is made that applies inside a species (although even that only if we misconstrue the conceptual basis and actual practice of phylogenetic systematics), and then the entirely unwarranted jump is made to the conclusion that paraphyly should be accepted at a much higher level of classification, where the argument would not apply even if it were correct.

Saturday, May 12, 2018

What are monotypic genera good for?

There are a lot of monotypic genera around. In the group I am currently working on the most, the daisy family Asteraceae in Australia, there are an awful lot of monotypic genera indeed. Why do we need so many of them?

I would argue that there are two different scenarios to be considered. First, however, we need to keep in mind that:

We should classify organisms by their degree of relatedness, meaning that supraspecific taxa (including genera) should be monophyletic, and
while this previous rule tells us how we should group, it does not tell us how we should rank. There is no genusness to be discovered in nature. Whether it is here in the phylogeny where we call a clade a genus or four nodes deeper down the tree is ultimately an arbitrary human decision.

This may at first suggest that there is no good argument to be had against monotypic genera either. If ranking is arbitrary then a classification consisting entirely of monotypic genera - each species in the tree of life gets its own genus - is just as valid as the current one, so why not?

It is true that this is one of many possible ranking solutions compatible with phylogenetic systematics, but to decide between those many possible ranking solutions we can bring other criteria to bear. And here I would argue that it would be useful to minimise the number of monotypic genera as far as possible. Why? Because I would consider the genus level 'wasted' in many of those cases.

The entire point of a classification is that each taxon provides a piece of information. That information is: The members of this taxon are more closely related to each other than they are to non-members of this taxon. If we have a species, the species-taxon provides this information for all the members of that species. If we now have that species classified in a monotypic genus, the genus-taxon provides... the exact same information over again. It doesn't add anything. It is wasted.

Consequently, I believe that the proper use of monotypic genera is for when they are actually required for phylogenetic classification, but that there is a good argument for sinking them into larger genera whenever things could be made monophyletic without them. Two examples may illustrate the argument.

The above presents a case where the monotypic genus in red is actually needed. There are two genera marked in blue and green, and so obviously the phylogenetically isolated lineage in red cannot be lumped into either of them without making them paraphyletic. It is 'left over' and needs its own genus.

A perfect example of this is the ginkgo tree, Ginkgo biloba, which is a phylogenetically isolated living fossil. It is here photographed as an alley tree in front of our apartment block in Zürich, back when I was a postdoc there.

In the above phylogeny, however, the monotypic genus in red is sister to another genus in blue, and that latter genus isn't very large either. Now I can understand why it might perhaps be desirable to recognise the two as different genera if their divergence happened many tens of millions of years ago and they are morphologically quite distinct. Unfortunately, however, the world is full of monotypic genera that are very young and look exactly like the slightly larger sister genus, but differ from it in a single morphological character.

In those cases, do we really need that kind of taxonomic inflation? What then is the use of the genus rank?

The species that occasioned these ruminations in me is the above Tasmanian daisy tree Centropappus brunonis, which is clearly just a Bedfordia without hairs on the leaves; otherwise the two genera are pretty much indistinguishable. And Bedfordia itself has a mere three species, so it is not as if it would get unmanageably large if they were united.

There are many, many similar cases.
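
The two scenarios discussed in this post can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: the trees are made-up nested tuples, the tip names (r, b1, b2, g1, g2) are hypothetical stand-ins for the red, blue and green genera, and the helper names are mine rather than anything from a phylogenetics library. A set of species counts as monophyletic here if it matches the complete tip set of some clade in the tree.

```python
def tips(node):
    """Return the set of tip names below a node; a tip is a plain string."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(tips(child) for child in node))


def clades(node):
    """Yield the tip set of every clade in the tree, single tips included."""
    yield tips(node)
    if not isinstance(node, str):
        for child in node:
            yield from clades(child)


def is_monophyletic(tree, group):
    """True if the group is exactly the tip set of one clade of the tree."""
    return frozenset(group) in set(clades(tree))


# Hypothetical genera: 'r' is the monotypic genus, b* and g* are two others.
blue, green, red = {"b1", "b2"}, {"g1", "g2"}, {"r"}

# Scenario 1: the red lineage is sister to blue plus green.
isolated = ("r", (("b1", "b2"), ("g1", "g2")))
# Scenario 2: the red lineage is sister to the blue genus only.
sister = (("r", ("b1", "b2")), ("g1", "g2"))

for name, tree in [("isolated", isolated), ("sister", sister)]:
    print(name,
          "red+blue monophyletic:", is_monophyletic(tree, red | blue),
          "red+green monophyletic:", is_monophyletic(tree, red | green))
```

In the first toy tree neither merger yields a clade, so the red lineage needs its own genus if blue and green are kept. In the second, red plus blue is a clade, so the monotypic genus could be sunk without breaking monophyly.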

Saturday, September 2, 2017

Having fun with biodiversity databases

If you have ever professionally used a biodiversity database you will soon have noticed that we still have a long way to go before they are as reliable as we would like them to be.

Today I looked into the Atlas of Living Australia records for Senecio australis (Asteraceae). Except for a rather odd specimen from South Africa the distribution records look like this:

What do we have here? First, the four Australian mainland records all appear to be misapplications of the name. The Flora of New South Wales, for example, does not even mention the species, so I think we can safely assume it does not occur in Australia at all.

Second, the record in the middle of the top of the map is right in the ocean, no matter how closely we zoom in. If we look into its details, we see that it was collected on Norfolk Island, which is the cluster of red dots to its right, so somebody must have got the coordinates rather wrong.

Third, there is a cluster around Auckland, on New Zealand's North Island. I am not sure if Norfolk Island and North Island is a plausible area of distribution for this species, but it may well be. Zooming in closer to Norfolk Island, however, ...

... it looks as if somebody had played darts after having had a few too many beers. ALA informs us dryly under the section data quality tests, "habitat incorrect for species". No kidding. Or as my wife joked, unwilling to believe that the coordinates would be so badly off for such a large percentage of the specimens, "is there a fish that is also called Senecio australis?"

These are the problems that we are dealing with, more generally.

Whenever we do a study using data from biodiversity databases, as we increasingly do, we have to be very careful about cleaning the data. The main issues are outdated taxonomy, misidentifications, spatial data entry errors (which are particularly easy to recognise if an outlier record is exactly ten degrees away from a known occurrence), and imprecise spatial data. Just think of what it would do to species distribution modeling if we uncritically accepted all the records for Senecio australis.
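
The ten-degree check mentioned above is easy to sketch. The following is a minimal illustration, not a data-cleaning pipeline: the function name, the tolerance and all coordinates are invented for the example, and a real workflow would also have to deal with taxonomy, misidentifications and imprecise localities. A record is flagged if shifting exactly one of its coordinates by ten degrees would land it on a known occurrence.

```python
def ten_degree_slip(record, known, tol=0.05):
    """Return the known point that the record matches after a 10-degree shift
    in exactly one coordinate, or None if there is no such point."""
    lat, lon = record
    for klat, klon in known:
        dlat, dlon = abs(lat - klat), abs(lon - klon)
        lat_shifted = abs(dlat - 10.0) <= tol and dlon <= tol
        lon_shifted = abs(dlon - 10.0) <= tol and dlat <= tol
        if lat_shifted or lon_shifted:
            return (klat, klon)
    return None


# Invented coordinates (latitude, longitude) in decimal degrees.
known_occurrences = [(-29.03, 167.95), (-36.85, 174.76)]
new_records = [
    (-29.03, 157.95),   # longitude is off by ten degrees from the first point
    (-36.90, 174.70),   # close to the second point but not a ten-degree slip
    (-19.03, 167.95),   # latitude is off by ten degrees from the first point
]

for rec in new_records:
    match = ten_degree_slip(rec, known_occurrences)
    status = f"suspect, compare with {match}" if match else "no ten-degree pattern"
    print(rec, "->", status)
```

A flag like this only says that a record deserves a second look; whether it really is a typing error still has to be checked against the specimen.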

While we can identify obvious mistakes while using a database, the data are "ground-truthed" in the actual specimens in some herbarium or museum, and the policy is usually (and quite sensibly) that the database won't update until a correction is made to that specimen in its home institution and then filters through from there. But many institutions do not have the resources to update data just because somebody sent them an email pointing out that their specimen is misidentified or that they made a data entry error; many herbaria on the planet are so understaffed that even the word understaffed is a euphemism. What is more, even if a database allows a registered user to annotate a record with corrections, the information may not necessarily flow back to the institution holding it, depending on whether somebody thought to set up such procedures or not.

Overall, Australia actually has excellent data quality, the Atlas of Living Australia actually allows annotations to be made, and several important Australian herbaria actually have the staffing to update their data. What I am saying is that this is as good as you can have it at the moment. It is much more difficult in many other parts of the world, and of course it would be good if we could have the same or better data quality for those areas.

Also, perhaps it is best not to think too much about records that are not specimen-based but "human observations" or photos submitted by random people. There are, obviously, non-taxonomists whose knowledge of the flora is extensive, and citizen science can be awesome, but I have also seen several cases along the lines of "aaargh ... this is so misidentified it is not even the right tribe of the family, and now the database is using it as the profile picture of this nationally significant weed species!"

Sunday, April 2, 2017

The taxonomic impediment as illustrated by journals' criteria for the acceptance of manuscripts

About two weeks ago I learned from a co-author, who in that case is the corresponding author, that a certain systematic botany journal would consider our manuscript unacceptable no matter how much we improved it simply because it was out of scope. You see, our work was only "revisionary", as in dealing with species delimitation, and it would have to be a phylogenetic study to be acceptable. A few thoughts:

I do understand why higher-profile systematics journals do not accept descriptions of taxonomic novelties that take a qualitative approach like "hey, that looks different to that other species", or papers that merely validate taxonomic changes based on evidence presented elsewhere. But I completely fail to understand what the problem is with papers that, as in our case, use integrative, quantitative analyses of morphological, genetic and environmental data to resolve difficult species complexes. I would love to understand how a phylogenetic study is more serious than that. The conservation impact is, for example, much higher in studies finding a previously unrecognised, rare species than in those that only change the circumscription of a genus.

The journal in question is TAXON. Think about it: a journal literally called "taxon" has decided to accept no more taxonomic studies going forward. No word on when Evolution will stop accepting studies dealing with evolutionary biology, or when Heredity will reject all manuscripts dealing with genetics.

Note also that TAXON is still the go-to journal for nomenclatural suggestions in botany. In the latest issue as of writing, for example, we find Brownsey & Perrie, "Proposal to conserve the name Asplenium richardii with a conserved type" and Dorr & Gulledge, "Request for a binding decision on whether Briquetastrum Robyns & Lebrun (Lamiaceae) and Briquetiastrum Bovini (Malvaceae) are sufficiently alike to be confused". Those papers are important and need a forum, and it is good that TAXON is that forum. But the same is true for revisionary studies, and I cannot help but feel that in terms of editorial policy accepting nomenclatural suggestions like these but not evidence-based revisionary studies is the equivalent of saying, "we don't serve alcohol to minors, but we make an exception if you are under six months old."

The general problem is that there are quite a few systematics journals that have made the same decision over the last few years. I have thought about what journals there are in my field, and I cannot at the moment think of one with an impact factor of more than approximately one that would still accept revisionary studies. Most of the options are local journals published by university or state herbaria, usually named after a 19th century taxonomist or a plant genus, that either do not have an IF or have one that is around 0.3-0.7. As valuable as those outlets are for publishing new species or smaller taxonomic revisions, they just do not seem to be the right venue or to have the right audience for a two-year study using complex analyses of genomic data. Surely if we have molecular phylogenetics journals with IFs of 2 to 5 it should be possible to have journals in that range that publish what might be called molecular taxonomy? If not, why not?

If we do not have journals like that, if the only option for a researcher doing species delimitation with cutting edge, expensive methods is to publish in journals that a job or promotion committee might consider to be a liability to publish in, then it is no wonder that fewer and fewer people will be willing to figure out how many and what species there are on our planet, and that those who are willing to do it will find it hard to get a job in academia. That is known as the taxonomic impediment: There are still many species to be discovered before we are even in a position to know what we need to conserve, but the number of people, institutions and resources assigned to that task is dwindling.

Which brings me to the final point. A year and a half ago I wrote about a study published in Systematic Biology that claimed to have disproved (!) the citation impediment to taxonomy. The authors actually mentioned the non-acceptance of taxonomic papers by high impact journals as one of the arguments underlying the citation impediment, but then argued the latter does not exist. As I wrote at the time, my interpretation of their paper is that they reached their conclusion based on defining phylogenetic studies that happen to include a taxonomic act as taxonomic papers, and then comparing them against phylogenetic studies that do not include a taxonomic act. For example, they had the Botanical Journal of the Linnean Society in their data, which at that point had officially not been accepting taxonomic papers for several years. In other words, the study's approach seems to have been the equivalent of examining discrimination against women by comparing men who grow a beard with men who do not grow a beard.

In the light of my recent experience, that paper now seems even more upsetting.

Friday, January 6, 2017

Ochlospecies

Happy new year, everybody! Although many people are variously fed up with 2016 for the supposedly high number of celebrity deaths, Brexit and the US election, for me personally the past year was very enjoyable and successful, so I cannot really complain. Let's hope that 2017 will at least be better than so many of us expect.

Anyway, quite some time ago I wrote a series of posts on species concepts. Because a reviewer mentioned it, I now had reason to look into a concept that I was not familiar with, that of ochlospecies.

It was apparently developed by a tropical tree taxonomist called White in the 1960s, but the most useful source that is available online appears to be a 1998 review by Q.C.B. Cronk.

The idea is that there are different kinds of species that can be distinguished based on their patterns of variation. There are well-behaved, very distinctive species. There are species that are variable but in a way that is easily understood, for example because variation is nicely hierarchical, because several characters show correlation, or because there is a clear geographic pattern. And then there are the bad apples, species that show variation but without any clear structure. Several crucial characters appear to vary independently, no geographic pattern, just a mess. Those are then the ochlospecies.

What do I take from this?

Well, first to me this is a rather unexpected way of going about species concepts. The whole point, as far as I am concerned, is that before you embark on a taxonomic study you should set up clear criteria of how you will evaluate the evidence that you are going to collect.

That makes it science. So if you are dealing with the circumscription of genera, instead of making it up as you go you say in advance: I will know that a genus is acceptable if it turns out to be monophyletic. And if you are dealing with the circumscription of species, again, instead of making it up as you go you say in advance: In my study I will follow the Genotypic Cluster Species Concept, meaning I will circumscribe species based on gaps in morphological and/or genetic variation. Or whatever other species concept works for the group and the data that you can actually collect.

When I look at the ochlospecies, however, it looks to me as if it is not a criterion but a conclusion. Reading the Cronk review, it appears as if the taxonomist circumscribes species somehow (magical asterisk) and then labels some of the resulting species as ochlospecies after the fact. At a minimum that means that the concept has a different utility, if any, from the species concepts I have considered previously: it cannot serve as a potential guideline.

So at the moment I am a bit at a loss as to what to do with the concept regarding my paper. As the concept is not directly useful methodologically, my options seem to reduce to name-checking it either in the introduction or in the discussion.

Saturday, June 18, 2016

Taxonomic terminology again

I have often complained about idiotically unusable identification keys. Unclear descriptions of species are a related issue.

Yesterday I read the description of a daisy and encountered the term 'peracute'. Acute means pointy and is usually understood to mean that the apex of an organ shows an angle of less than 90 degrees (as opposed to obtuse for more than 90 degrees). The most frequently encountered problem is somebody writing 'subacute' to indicate... well, that is a good question. Pointy but not very pointy? And then of course we have issues with some taxonomists using related terms for pointy apices that show some kind of curvature, like apiculate or attenuate, in inconsistent ways.

But I have never before come across peracute. My first reaction was to try and translate it. Again, acute means pointy. Per means through, so through-acute? That does not make any sense whatsoever. A more modern meaning is 'for each', as in twenty kilometres per hour. For each acute? Yeah, not very sensible either.

But wait, I have that old glossary I wrote about in March. But no, sadly it doesn't contain that term either. So really, what is the point if a taxonomist uses a word that simply doesn't exist? Who will understand the description under those circumstances? Wouldn't it be nice if we could define four to six clear and universally agreed-on terms for, in this case, apex shapes, and use them, and only them, consistently?

Ah well. At least I had the opportunity to peruse the glossary again. And of course peracute is nothing. This book is the mother lode of bad ideas.

Pampinus - n. Tendril.
Then why not write tendril? Ye gods.

Panduriform - a. Fiddle-shaped.
Then why not... oh, we had that already.

Parastomon - n. An abortive stamen, a staminodium.
You know, we already have a word for that. It is 'staminodium' or 'staminode'.

Phoranthium - n. The receptacle of the capitulum of Compositae.
Ah, I have heard of that structure, only everybody calls it a ... receptacle.

Argh. Argh. Argh. Argh. Headdesk. Are some people actually deliberately trying to make their taxonomic publications pointless, useless and maximally infuriating?

Thursday, April 21, 2016

The joys of single-character taxonomy

Time for a little rant. Two days ago I tried to identify an Australian native Asteraceae. I already knew that it had to belong to one of two genera, and had always wondered why those two genera were recognised as distinct in the first place. If you put a randomly chosen species from the first next to one from the second you will be hard pressed to see any difference beyond hair cover or suchlike.

I assumed there would be some fruit character, for example feathery versus smooth pappus bristles. That would be bad enough because it would probably still mean that one genus is phylogenetically nested within the other, as is usually the case when there is only a difference in one trait. This is because then one genus is defined by having the trait and the other merely by lacking it; in systematics, we call that an 'apomorphic segregate'. But okay, such a fruit character, even if evolutionarily irrelevant and phylogenetically uninformative, is at least user-friendly. You can look at the pappus (or beak, or whatever) and quickly conclude: ah yes, it must be this genus.

What was the difference in the present case? "Florets homogamous" or "florets heterogamous". Before we consider the trait itself, hands up everybody who knows what that means! Yes, that's what I thought. The identification key in question was apparently written for an end user group of about half a dozen fellow taxonomists in Australia or so, but certainly not for conservation managers, community ecologists, or plant-enthusiastic non-scientists.

Now the trait itself. It means whether the flowers in the daisy flower-head are all of the same type or if there are different types present; and here we are not talking about the presence or absence of petal-like ray florets or anything easy to see like that. We are talking about one of the two genera sometimes having a few female flowers at the edge of the flower-head in addition to the normal, bisexual flowers. In other words, get out the anatomy grade tweezers and a dissecting microscope!

And as expected we are dealing with a single character difference. It is extremely unlikely that the two genera are reciprocally monophyletic, so they probably don't make sense in modern systematics. But even from a so-called 'evolutionary' taxonomy perspective this is weird. Again, if you place species from the two genera next to each other, you will not see any significant difference; and surely having or not having a few female flowers in the head is not going to put a species into a different 'adaptive zone' or something. So what is the idea?

What is weirder is that this criterion is not even applied consistently. Another closely related genus has got several homogamous and one heterogamous species.

Of course this is not the first time I have seen a situation like that. The genus I did my honours on was Suessenguthia (Acanthaceae), a group of (now) six species with four fertile stamens and little hooks on the anthers. Some of its species are pretty similar to those of the larger genus Sanchezia except that the latter has only two fertile stamens. In addition, there is a monotypic genus Trichosanchezia that looks exactly like certain hairy, northern Peruvian Sanchezias but has four fertile stamens without the little hooks. Even better, there was once a likewise monotypic genus called Steirosanchezia characterised by two fertile stamens without hooks; that one, however, has already been put out of its misery and sunk back into Sanchezia.

So once there were four genera based merely on minutiae of the androecium, for species that are so similar that they constantly get misidentified to each other's genera, and obviously all forming one tight natural group. How is that helpful? How did that ever make sense even before Phylogenetic Systematics, even before the Theory of Evolution?

Saturday, March 26, 2016

Discordant paraphylies executed

Strangely, editors of systematics journals do not appear to tire of opinion pieces rehashing the discussion about paraphyletic taxa that had already been laid to rest in the 1970s. The newest example of such a publication showing up in my alerts is Seifert et al. 2016 in the journal Insectes Sociaux, who take issue with suggested taxonomic changes in ants.

As usual I would like to tease apart the argumentation, examine it for its merits, and consider if there is anything new in it that constitutes a good reason to accept paraphyletic taxa. But first, the title of the piece, which is (sorry to say) one of the most unwieldy titles I have ever seen on a scientific publication:

Banning paraphylies and executing Linnaean taxonomy is discordant and reduces the evolutionary and semantic information content of biological nomenclature

If I may attempt to rephrase a bit, I think the authors mean the following:

Banning paraphyletic taxa is incompatible with Linnaean taxonomy and reduces the evolutionary and semantic information content of biological nomenclature

...although that is still too long, and it is not clear what is meant by evolutionary and semantic in this context. At any rate, this already suggests what the two main arguments in favour of paraphyly might be. Neither of them is exactly new, and both have repeatedly been dealt with at length, but of course the hope is that the present paper acknowledges the cladist responses and provides additional counter-arguments instead of ignoring them.

Unnecessarily arcane botanical terminology

Sometimes the personal libraries of retired colleagues who want to downsize, or other left-over old books, are laid out in the corridor of our herbarium so that staff can pick what they still find useful. I have in this way acquired a number of valuable books, especially old Australian or New Zealand floras.

Recently I have in the same way picked up a book with great (unintentional) entertainment value. It is Taxonomic Terminology of the Higher Plants by H. I. Featherly in a 1973 reprint of the 1959 edition, published by Hafner Press. In case it isn't immediately clear, it is mostly a glossary of botanical terminology. To illustrate what is so funny about it, let's open the book randomly on pages 62 and 63 and look at a few selected entries just from this double page. Blue text is from the book; black text is my commentary.

Patrocladistics 2: What if we include ancestors?

In the last post we looked at what patrocladistics is and how it works. The example case was contrived rather than a real group of organisms, but it was perhaps typical in that the dataset only included extant species.

In this post I want to explore what happens to the results of a patrocladistic analysis if we add all the ancestors. There are two reasons why this is of interest to me:

First, I believe that, like many other ideas for the objective delimitation of paraphyletic taxa, patrocladistics relies on not having intermediate ancestors in the dataset. Not all, perhaps, but many such approaches identify a long branch or gap in variation. The problem is that such a long branch or gap is merely an illusion based on the patchiness of the fossil record. In reality, evolution is gradual. And if somebody claims to have a good approach to classification, it could be argued that it should be able to deal with the discovery of intermediate fossils.

Second, proponents of paraphyletic taxa often criticise phylogenetic systematists for supposedly ignoring ancestors, or for supposedly defining them out of existence. If patrocladistics does, as I suspected, rely on the absence of ancestors, that might at least be seen as a bit ironic.

So back to our artificial phylogeny. It contains two outgroup species and five ingroup species, two of the latter on a very long branch:

Patrocladistics 1: How does it work? And a contrived example

As the approach is often mentioned in pro-paraphyly publications as an objective method of delimiting paraphyletic taxa, I thought I should look into patrocladistics again and examine it in a blog post or three. In the following I will approach patrocladistics from three different angles:

1. What is patrocladistics and how does it work? This is very straightforward.

2. How does the patrocladistic approach perform when ancestors are added?

It is often easy enough to explain how something works in the abstract, but it is perhaps more enlightening to throw different problems at a method and see under what conditions it is more or less useful or may be misled. For example, explaining how BEAST does its phylogenetic inferences does not necessarily by itself tell us how it will perform when faced with, say, 25% missing data. I often criticise the pro-paraphyly movement for what I see as their reliance on the fortuitous absence of intermediate fossils to separate out paraphyletic groups. Conversely, members of that movement have a tendency to criticise cladists for supposedly ignoring ancestors. So in the case of patrocladistics, I wanted to see what happens if the method is provided not only with extant taxa but also with ancestors.

3. What is the rationale behind patrocladistics?

In other words, if somebody who is agnostic about the whole phylogenetic versus 'evolutionary' systematics issue were to ask why they should do a patrocladistic analysis, or what the biological or philosophical justification for such an analysis is, what would the answer be?

This post will cover the first point.

Do half the natural history specimens in the world really have the wrong name?

On Friday a colleague drew everybody's attention to a recently published study from Edinburgh and Oxford, Goodwin et al. 2015, Widespread mistaken identity in tropical plant collections. It is presented in press releases and media as demonstrating that, to quote the University of Oxford's news page, "half the world's natural history specimens may have the wrong name".

As museum and herbarium specimens are used to extract DNA for phylogenetic and evolutionary studies, to draw distribution maps, to inform conservation decisions, and to examine spatial patterns of diversity, this sounds pretty dire. Luckily, in my eyes at least, this interpretation is totally over-hyped.

(Full disclosure: I know one of the authors of the relevant study, as he was involved in selecting my Diplom thesis topic and allowed me to join a field trip he had organised.)

My first reaction to reading the sensationalist headline was that perhaps they are talking about insects, which account for the vast majority of natural history specimens. That would have sounded at least remotely plausible given how difficult insect identification is and how few entomologists there are. It turned out, however, that the study was dealing with plant (herbarium) specimens.

My next thought was simply, no way; just looking at the herbarium I am working at, the situation is nowhere close to that bad. 5% perhaps, okay. So how do they arrive at these seemingly shocking numbers?

The problems I have with the way these results are discussed fall into three categories: selection of example cases, hasty over-generalisation, and equivocation on the term "wrong".

Andere Laender, andere Sitten (other countries, other customs)

One thing that gave me a bit of a culture shock after coming to Australia was the way in which plant taxonomists work here compared to how I had been trained by European and North and South American taxonomists.

A new classification of all living organisms

Of course, the day after I write that there is a near-unanimous consensus that taxa should be monophyletic, I get an alert that a new classification of all of life has been suggested, and it turns out to be proudly non-phylogenetic (Ruggiero et al. 2015, A higher level classification of all living organisms, PLOS One 10: e0119248).

Interestingly, the authors describe their classification as "neither phylogenetic nor evolutionary". There are two ways to read this. Either they don't know what they are talking about, because 'evolutionary' classification is what the proponents of paraphyletic taxa call theirs, and Ruggiero et al's classification has paraphyletic taxa and is consequently 'evolutionary' in that twisted sense; or they mean to indicate that their classification is pragmatic and theory-free.

The latter interpretation would fit with the repeated mention of "serving ... database providers" and "its immediate use as a management tool". In other words, this is about archiving, not science, which is fine as far as it goes. Weirdly, however, the abstract also claims that the new classification would be "immediately valuable as a reference for taxonomic and biodiversity research, as a tool for societal communication, and as a classificatory 'backbone' for biodiversity databases, museum collections, libraries, and textbooks". But that is precisely what a non-phylogenetic classification is not useful for. Naming incomplete, non-natural groups is downright misleading to subsequent taxonomic and biodiversity research, it misinforms the public, and it would misinform students if used in textbooks.

So, how much non-monophyly is there in this system? Lots. They recognise the prokaryotes, although it seems fairly clear now that the archaea are at least more closely related and more similar to the eukaryotes than to the bacteria if not paraphyletic to the eukaryotes. They recognise various groups of algae that are paraphyletic to the land plants; the crustaceans, which are paraphyletic to the insects; the paraphyletic reptiles. And that is just scratching the surface. If somebody were to interpret this as a summary of our knowledge of the tree of life they would be seriously misled.
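
To make concrete what "paraphyletic to" means here, the following is a minimal Python sketch, not anything taken from Ruggiero et al. It treats a rooted tree as nested tuples and calls a group monophyletic only if its members are exactly the tips of some subtree. The tree and the tip labels (outgroup, crustacean_A, crustacean_B, insects) are purely hypothetical placeholders chosen to mirror the crustacean case, not a real phylogeny.

```python
def leaves(node):
    """Return the set of tip labels below a node (a tip is a plain string)."""
    if isinstance(node, str):
        return {node}
    result = set()
    for child in node:
        result |= leaves(child)
    return result


def clades(node):
    """Yield the tip set of every clade (subtree) in the tree, root included."""
    yield frozenset(leaves(node))
    if not isinstance(node, str):
        for child in node:
            yield from clades(child)


def is_monophyletic(tree, group):
    """A group is monophyletic if its members are exactly the tips of one clade."""
    return frozenset(group) in set(clades(tree))


# Hypothetical toy tree: the insects are nested inside the 'crustaceans'.
TOY_TREE = ("outgroup", ("crustacean_A", ("crustacean_B", "insects")))

crustaceans_only = {"crustacean_A", "crustacean_B"}
crustaceans_plus_insects = crustaceans_only | {"insects"}

print(is_monophyletic(TOY_TREE, crustaceans_only))          # False
print(is_monophyletic(TOY_TREE, crustaceans_plus_insects))  # True
```

In this toy case the 'crustaceans' only become a complete natural group once the insects nested within them are included, which is the sense in which a paraphyletic taxon is incomplete.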

Giving taxonomists recognition through citations

Today we discussed the recent Zootaxa paper Fried spicy Linnaeus (Kottelat, 2015) in our journal club. It is open access and a fun read, well worth checking out.

Kottelat discusses the pros and cons and potential variants of mentioning the taxonomic authority after Latin names of plants and animals. Well, mostly animals, but the same applies in all of biology. In botany, the usual format is as follows. Anemocarpa podolepidium (F.Muell.) Paul G.Wilson means that the species was originally described by the 19th century botanist Ferdinand Mueller in a different genus, as Helichrysum podolepidium F.Muell., and then later transferred into the genus Anemocarpa by Paul Wilson.
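
The structure of such a name can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: it handles a plain species name with an optional parenthetical author, and ignores infraspecific ranks, hybrids, "ex" and "in" authorships and the like. The function name parse_name and the field names are my own, not part of any nomenclatural standard or library.

```python
import re

# Genus, epithet, optional author of the original name in parentheses,
# then the author of the name as it stands now.
NAME_PATTERN = re.compile(
    r"^(?P<genus>[A-Z][a-z]+)\s+"
    r"(?P<epithet>[a-z-]+)\s+"
    r"(?:\((?P<basionym_author>[^)]+)\)\s*)?"
    r"(?P<author>.+)$"
)


def parse_name(name):
    """Split a simple botanical name with authority into its parts."""
    match = NAME_PATTERN.match(name.strip())
    if match is None:
        raise ValueError(f"cannot parse name: {name!r}")
    return match.groupdict()


print(parse_name("Anemocarpa podolepidium (F.Muell.) Paul G.Wilson"))
print(parse_name("Helichrysum podolepidium F.Muell."))
```

For the first name the parenthetical author (F.Muell.) and the transferring author (Paul G.Wilson) come out as separate fields; for the original name in Helichrysum the parenthetical field is simply empty.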

Botanical journals usually require the author of a paper to give this full taxonomic authority the first time any species is mentioned but to leave it out afterwards, which can lead to rather odd looking sentences in which some species have the authority and others don't. In general, Kottelat argues convincingly, these authorities make the text clunky and do not usually add any information relevant to the reader.

On the other hand, many taxonomists are concerned that taxonomists should be given more, not less, recognition for the work they do. In botany at least the authorities as part of plant names do not count as bibliographic references and thus do not increase a taxonomist's number of citations. Because we are unfortunately evaluated based on how often people cite our work (as opposed to, say, how often they use it without citing it), many colleagues would like to see them turned into proper bibliographic references and would certainly not like them to disappear altogether.

Kottelat ultimately comes down in favour of turning them into full bibliographic references in taxonomic research papers and doing away with them in all other publications, a compromise that does not entirely convince me.

However, I really want to make a different point. I believe that the taxonomists who push for a rule requiring a bibliographic reference every time a species name is used vastly overestimate the utility of such a change.

Taxonomy is more important to hunter-gatherers than to farmers

Over the past few days we had a big strategy meeting across the Australian national biological collections: land plants, tree seeds, algae, insects, land vertebrates, and fish. One thing that struck me was the following:

All collections do taxonomic research as part of their core business, that is describing new species, writing identification keys, publishing field guides, compiling lists of accepted names, etc. But whereas in the other five this research is mostly done for the purposes of informing conservation management, biosecurity and weed management, other scientific research, and the general public (e.g. native plant enthusiasts or bird watchers), fish taxonomy is the only one that really has deep-pocketed primary industry interest behind it. The fishing industry is actually asking and paying for basic taxonomic research.

Why is that the case? Are ichthyologists simply better than botanists or herpetologists at engaging commercial partners? But if you think about it, there could be a much simpler answer: Fishing is the only food sector in which our civilisation is still pretty much at the hunter-gatherer stage.

Yes, there are some fish farms these days, but what mostly happens is that somebody casts their net into the ocean and catches a chaotic mixture of organisms. And then of course they have to know: which of these are edible? Which of these are worth the bother to process them? How do we have to process them? What are their names, so that we can sell them without frustrating the consumer? All that is really taxonomic knowledge.

In other food sectors the situation is much simpler: A farmer will not really be under any doubt that they harvest wheat after sowing wheat, or that the animal going bah on their paddock is a sheep.

If, however, we still obtained our vegetables and fruits the way we obtain our fish, we would have somebody come to the market with a big bag of stuff that they collected in the bush; they would tip it out, and then they would wonder: Is this berry edible or is it poisonous? What is the name of this weird bulb I have never seen before, and what can it be used for? Are these two tubers the same species? What should I name this type of nut so that the buyer knows what they get?

If that were the case, we would also see more direct "industry" interest in plant taxonomy. Although of course a society operating like that relies on painful trial-and-error instead of formal scientific studies and on personal instruction instead of a four volume print flora, it still needs knowledge of plant and animal taxonomy to be much higher and more widespread across the population than in our specialised farming culture.

Thursday, February 13, 2014

Identification keys: taxonomist-friendly versus user-friendly

Identification keys are an important part of my work. Although multi-entry online keys are increasingly becoming available, the standard form they take is still after the fashion of a choose-your-own-adventure book. The end-user, generally somebody who has a living plant, a dried specimen or, if they are unlucky, merely a photograph in front of them, is presented with a series of questions (couplets) with two possible answers each (leads). Every answer leads to the next question or ultimately to a group of organisms.

A simple example to demonstrate the principle:

1a. Vehicle has two wheels ... 2
1b. Vehicle has more than two wheels ... 3
2a. Engine present ... motorcycle
2b. Engine absent ... bicycle
3a. Vehicle has four wheels ... 4
3b. Vehicle has more than four wheels ... truck
4a. Vehicle carries few passengers ... automobile
4b. Vehicle carries dozens of passengers ... bus

Of course such a key is only as good as the taxonomist who writes it. In the above case I have, for example, glossed over the existence of small trucks with only four wheels. In reality, a plant taxonomist may have written a key that is similarly missing one species you can find in nature simply because it was unknown to them at the time.
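
For readers who think more easily in code, the vehicle key can be sketched as a small Python data structure: each couplet holds two leads, and each lead is a test plus either the number of the next couplet or a final identification. This is only an illustration of the principle. The dictionary fields (wheels, engine, passengers) and the cut-off of twelve passengers between "few" and "dozens" are arbitrary assumptions of mine.

```python
# Couplet number -> two leads (a, b); each lead is (test, target), where the
# target is either the next couplet (int) or an identification (str).
KEY = {
    1: [(lambda v: v["wheels"] == 2, 2),
        (lambda v: v["wheels"] > 2, 3)],
    2: [(lambda v: v["engine"], "motorcycle"),
        (lambda v: not v["engine"], "bicycle")],
    3: [(lambda v: v["wheels"] == 4, 4),
        (lambda v: v["wheels"] > 4, "truck")],
    # Assumed threshold: fewer than 12 passengers counts as "few".
    4: [(lambda v: v["passengers"] < 12, "automobile"),
        (lambda v: v["passengers"] >= 12, "bus")],
}


def identify(specimen, key=KEY, start=1):
    """Walk the key; return the identification and the leads followed."""
    couplet = start
    path = []
    while True:
        for label, (test, target) in zip("ab", key[couplet]):
            if test(specimen):
                path.append(f"{couplet}{label}")
                if isinstance(target, str):
                    return target, path
                couplet = target
                break
        else:
            # Neither lead fits, e.g. a vehicle with three wheels.
            raise ValueError(f"specimen does not key out at couplet {couplet}")


print(identify({"wheels": 2, "engine": False, "passengers": 1}))
# A small truck with four wheels follows leads 1b, 3a, 4a:
print(identify({"wheels": 4, "engine": True, "passengers": 2}))
```

The second call shows the gap mentioned above: a four-wheeled small truck keys out confidently as an automobile, because the key has no lead that could catch it.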

However, that is not what I want to make the point of this post because we can rarely be sure that we have discovered all species and are aware of all variability out there in nature. What I want to write about today is a very specific way in which some of my colleagues in taxonomy fail to make their keys user-friendly.