Thursday, February 13, 2014

Identification keys: taxonomist-friendly versus user-friendly

Identification keys are an important part of my work. Although multi-entry online keys are increasingly becoming available, the standard form of a key is still after the fashion of a choose-your-own-adventure book. The end-user, generally somebody who has a living plant, a dried specimen or, if they are unlucky, merely a photograph in front of them, is presented with a series of questions (couplets) with two possible answers each (leads). Every answer leads to the next question or ultimately to a group of organisms.

A simple example to demonstrate the principle:
1a. Vehicle has two wheels ... 2
1b. Vehicle has more than two wheels ... 3
2a. Engine present ... motorcycle
2b. Engine absent ... bicycle
3a. Vehicle has four wheels ... 4
3b. Vehicle has more than four wheels ... truck
4a. Vehicle carries few passengers ... automobile
4b. Vehicle carries dozens of passengers ... bus
Of course such a key is only as good as the taxonomist who writes it. In the above case I have, for example, glossed over the existence of small trucks with only four wheels. In reality, a plant taxonomist may have written a key that is similarly missing one species you can find in nature simply because it was unknown to them at the time.
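
For readers who think in code, the vehicle key above can be sketched as a small lookup table that is followed from couplet to couplet. This is only an illustration of the principle; the names KEY and run_key are my own inventions and do not come from any existing keying software.

```python
# Hypothetical encoding of the vehicle key: couplet number -> [lead a, lead b].
# Each lead is (statement, target); the target is either the number of the
# next couplet (int) or the final identification (str).
KEY = {
    1: [("Vehicle has two wheels", 2),
        ("Vehicle has more than two wheels", 3)],
    2: [("Engine present", "motorcycle"),
        ("Engine absent", "bicycle")],
    3: [("Vehicle has four wheels", 4),
        ("Vehicle has more than four wheels", "truck")],
    4: [("Vehicle carries few passengers", "automobile"),
        ("Vehicle carries dozens of passengers", "bus")],
}


def run_key(key, choices, start=1):
    """Follow a sequence of lead choices ('a' or 'b') and return the result."""
    target = start
    for choice in choices:
        if not isinstance(target, int):
            raise ValueError("more choices given than couplets on this path")
        # 'a' selects the first lead of the couplet, 'b' the second.
        _statement, target = key[target]["ab".index(choice)]
    if isinstance(target, int):
        raise ValueError(f"choices ended at couplet {target} without a result")
    return target


print(run_key(KEY, "bab"))  # 1b -> 3a -> 4b
```

Note that the small four-wheeled truck would still come out as an automobile or a bus here: the code can only be as good as the key it encodes.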

However, that is not the point of this post, because we can rarely be sure that we have discovered all species and are aware of all the variability out there in nature. What I want to write about today is a very specific way in which some of my colleagues in taxonomy fail to make their keys user-friendly.

There are many possible mistakes one can make in setting up an identification key, some more obvious than others, some more severe than others. One would be the failure to make the two leads of a couplet real alternatives. For example, if the first option is "fruit hairy" and the second is "fruit hairy or glabrous" then the user will be at a loss to know what to do with a hairy fruit. Another would be to have imprecise information, e.g. "leaves large" instead of "leaves longer than 4 cm". But these mistakes are not those that an experienced, professional taxonomist would make.
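
The first of these mistakes is easy to check mechanically. As a minimal sketch, assuming each lead is written down as the set of character states it accepts (the function name leads_are_exclusive is hypothetical), a couplet offers real alternatives only if the two sets share no state:

```python
def leads_are_exclusive(lead_a, lead_b):
    """Return True if no character state satisfies both leads of a couplet."""
    return not (set(lead_a) & set(lead_b))


# The faulty couplet from the text: a hairy fruit fits both leads.
print(leads_are_exclusive({"hairy"}, {"hairy", "glabrous"}))

# The repaired couplet: every fruit fits exactly one lead.
print(leads_are_exclusive({"hairy"}, {"glabrous"}))
```

The second mistake, imprecise wording such as "leaves large", cannot be caught this way and still needs a human reader.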

No, what the professionals do is sometimes even more annoying. Of course I don't want to be too specific or name any names but today I came across one of the many keys in my group of organisms that were clearly written not with utility in mind but instead purely from a perspective of taxonomic neatness. In other words, the taxonomist is writing the key for themselves, not for the end-user.

What I mean is this: After potentially years of study, the taxonomist (hopefully) understands their group well enough to figure out what species are related to each other. Consequently, they think in terms of neat little boxes of related species, and they intuitively write the key so that it separates species by these relationships. If a genus can be divided into a handful of sections, they will write the first questions of the key to lead the user to these neat little sections, and then there will be subkeys for each section.

The problem is, the characters that best reflect relationships are not necessarily the best ones to use for a key. Indeed it will often be the case that very showy and obvious characters, those that are best seen and understood by the end-user of a key, are NOT indicative of the true relationships because they are under strong ecological constraints.

Think of flower colours, shapes and sizes (adaptation to pollinator preference), growth height or leaf size (adaptation to environment) or hairs on leaves and stem (potentially adaptation to radiation exposure, moisture, or herbivory). Because they are under strong selection, these characters change frequently and do not reflect evolutionary relationships well, but at the same time they are easy to see and easy to explain.

What we get instead from some colleagues is the exact opposite: obscure characters that are perhaps informative of relationships but either hard to see without a stereomicroscope or utterly opaque to any non-specialist. There are lots of keys to certain genera of Australian daisies, for example, where the first few questions are all about the ornamentation of the fruits. Quite apart from the specialist terminology involved, what do you do if your specimen was collected before maturity? If the key started with ray floret colour or leaf shape one could at least narrow things down a bit!

The key I ran into today started with a question about the veins in the bracts of the daisy flowerhead, because that is what defines two natural groups in the present genus. Seriously, that was the first question, the one that a park ranger, horticulturalist or field ecologist would have to pass first before they have any chance of narrowing down which species they could be dealing with. And that in a genus in which about half the species have yellow flowerheads and half of them have white ones; a genus further in which about half the species have unbranched stems with solitary flowerheads and half have branched stems with several flowerheads. I quickly got fed up and have now written my own key.

So if I could make a suggestion to all other taxonomists out there:

When writing a key, don't think systematic relationships. Ask yourself: If I had little knowledge of specialist terminology and no microscope or strong hand lens available, what would be the questions most useful to quickly narrow down the number of possible species? Those need to come first. Microscopic stuff like anther or style appendages, no matter how phylogenetically informative, should be moved to the end, when there are only two or three taxa to be decided between, and best drop those characters entirely unless there are no better ones.
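
To make the suggestion concrete, here is a rough sketch of how one might order the characters for a key. Everything in it is assumed for illustration: the six taxa and their character states are invented, the names TAXA, FIELD_VISIBLE and rank_characters are hypothetical, and the scoring rule (field-visible characters first, then whichever leaves the smallest largest group) is just one plausible heuristic, not an established method.

```python
from collections import Counter

# Invented toy data: taxon -> character states. None of these are real species.
TAXA = {
    "sp1": {"flowerhead colour": "yellow", "stem": "branched", "bract veins": "one"},
    "sp2": {"flowerhead colour": "yellow", "stem": "branched", "bract veins": "one"},
    "sp3": {"flowerhead colour": "yellow", "stem": "unbranched", "bract veins": "several"},
    "sp4": {"flowerhead colour": "white", "stem": "branched", "bract veins": "one"},
    "sp5": {"flowerhead colour": "white", "stem": "branched", "bract veins": "several"},
    "sp6": {"flowerhead colour": "white", "stem": "unbranched", "bract veins": "several"},
}

# Assumption: characters an end-user can score without a microscope.
FIELD_VISIBLE = {"flowerhead colour", "stem"}


def largest_remaining_group(taxa, character):
    """Size of the biggest group left after asking about one character."""
    counts = Counter(states[character] for states in taxa.values())
    return max(counts.values())


def rank_characters(taxa, field_visible):
    """Order characters: easy-to-see first, then by how evenly they split."""
    characters = {c for states in taxa.values() for c in states}
    return sorted(
        characters,
        key=lambda c: (c not in field_visible, largest_remaining_group(taxa, c), c),
    )


print(rank_characters(TAXA, FIELD_VISIBLE))
```

In this toy data set the bract veins split the taxa just as evenly as flowerhead colour does, but they are still ranked last because the user cannot see them in the field.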

The end-users of your research will thank you.


  1. Supposedly the ideal key divides the group in half with each couplet. On the other hand, there is something to be said for getting the easy, obvious ones out of the way first. I have had experience with helping a colleague develop a fish key by having my students use it. We did the same with our key in "Fishes of the Continental Waters of Belize". Having non-experts use the key readily exposes problems.

  2. My hunch is that the taxonomists in question never even consider the idea that their keys might serve some purpose for non-experts. Otherwise I cannot make sense of the way they are written.