Sunday, May 29, 2016

Botany picture #228: Ledum palustre


It seems as if I have rather neglected posting plant images over the last few weeks, and given how busy the last few days were, something simple like that fits very well at the moment.

Above is Ledum palustre (Ericaceae), from a swamp in eastern Germany in 2008, when I was co-supervising a student excursion. On that occasion one of the students stepped onto some floating moss and sank waist-deep into the water. Fun times.

---

On a totally unrelated note, I really wonder why flight attendants have developed the habit of looking at everybody's ticket when they enter and then explaining where the seat is. And yes, I mean the seat explaining part. Two or three times in the past I have already tried covering that (and only that) information with my finger, so that they could still see that I am on the right plane, only not read the seat number out to me. And in two cases they got really upset about that, trying to pull the ticket from my hand, shouting at me, stuff like that.

I mean, when I am flying Canberra - Brisbane or something like that we are not actually using large and complicated aeroplanes. There's A-C on one side and D-F on the other; there aren't even two aisles. I don't need somebody to explain to me that A is a window seat on this side. All seats are clearly labelled. Surely anybody who needs that kind of instruction to find their seat would not have made it to the aeroplane in the first place because they'd still be at home wondering why their trousers don't fit over their head?

Wednesday, May 25, 2016

Species delimitation once more

Today I attended a kind of mini-symposium featuring five short talks from an ecology-focused university department. The main topic was the use of molecular data in ecology, and I learned a lot of fascinating things, especially about the recovery of animal populations after bush fires and about Antarctic ice age refugia.

The discussion, however, turned very strongly towards species delimitation, and this is where things got a bit weird. It was remarkable how self-confidently some, say, colleagues specialising in the behaviour of a few selected fluffy mammals pronounced black and white opinions on a problem that systematists have been struggling with for centuries.

Let's just say that if the situation were reversed - perhaps me being asked about fire ecology after giving a talk on plant systematics - I hope I would be able to stick to something on the lines of "well, this is what I think, but really you should ask an ecologist". I sure hope I would not simply say "based on what I observed in one plant species, fire has no effects whatsoever on the genetic diversity of animals". Some of the ecologists here had no such reservations.

At one end of the spectrum several audience members appeared to expect that now that we have genomic data we should finally figure out an objective cut-off for how much genetic difference between two individuals puts them into different species. This is one of those ideas that look superficially attractive but reveal new layers of wrongness the longer one thinks about them.

The easiest answer was provided by one of the speakers: Difference in what molecular marker(s)? Different genes or regions evolve at very different speeds. Taking the whole genome also seems a bit suspect as most of it is junk DNA, the exact amount varying by species. But again, there are layers. The speed of molecular evolution would also differ vastly between different lineages depending on their generation times, mode of reproduction, perhaps other life history traits, and the environment they find themselves in. The genetic diversity within populations is also vastly different from species to species.
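
To make the marker problem a bit more concrete, here is a minimal sketch in Python. Everything in it is invented for illustration: the two toy alignments, the 10% threshold, and the helper names p_distance and same_species are my assumptions, not anything presented at the symposium. It simply computes the proportion of differing sites for the same two hypothetical individuals at a slowly and a quickly evolving marker.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of aligned sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    differences = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return differences / len(seq_a)


# Invented toy alignments: the same two individuals, two different markers.
slow_marker = ("ACGTACGTACGTACGTACGT", "ACGTACGTACGTACGTACGA")
fast_marker = ("ACGTACGTACGTACGTACGT", "ATGTACCTACGAACGTTCGA")

CUTOFF = 0.10  # hypothetical "objective" threshold, chosen arbitrarily


def same_species(pair, cutoff: float = CUTOFF) -> bool:
    """Apply the fixed cut-off to one marker for one pair of individuals."""
    return p_distance(*pair) < cutoff


for name, pair in (("slow", slow_marker), ("fast", fast_marker)):
    print(name, p_distance(*pair), same_species(pair))
```

With these made-up sequences the very same pair of individuals falls below the cut-off at one marker and above it at the other, so the verdict depends entirely on which marker happened to be sequenced.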

But what mostly blows the idea out of the water is quite simply that there is necessarily a grey zone between being one species and having split into two species. ANY cut-off would have to be entirely arbitrary.

That brings me to the other end of the spectrum. After the talks, one of the speakers declared over snacks all of the following: (a) we should use the Biological Species Concept, (b) species are arbitrary human constructs and have no empirical reality, and (c) genetic data will always give you two clear groups, only we must be careful how we interpret that.

My first observation is that it should not really be possible to believe these things at the same time, because (a) and (c) are in direct contradiction to (b).

The second is that I consider all three to be wrong. It is clear that the BSC simply does not apply in many cases, especially asexual species and fossils.

As for the arbitrariness of species, well, it depends. (This is also the only answer to the species problem that I find useful.) I wrote above that there is a grey area, that a cut-off for "moment of speciation" is arbitrary. That being said, the same is true for many other things that we happily classify. We can probably all agree that the cut-off between child and adult is arbitrarily placed at the eighteenth birthday; it could just as well be a month earlier or three years later.

Now here is the question: Would you say that toddlers should be drafted into the army? No? Seems as if the difference is not so arbitrary after all, even if there is a gradient between the two categories. Likewise, there is nothing arbitrary about the question whether humans and horses are the same species or not.

Finally, genetic data always giving two discrete clusters? Ha, I wish.

In summary, species are complicated. I would be very skeptical of any claim that somebody has sorted it all out.

Thursday, May 19, 2016

Life, consciousness, free will: words were defined to describe something

From time to time, even in groups where you'd think people would be a bit more sceptical, somebody will suddenly pipe up and say something on the lines of:

"Is matter really alive? Maybe the distinction between inorganic and organic is false. If we push further into the nature of matter, would we find that matter is alive and conscious in a monistic, naturalistic, materialistic sense?"

Or perhaps:

"I don't know about 'alive' - I would think not - but 'conscious', yes, that has been my hunch for many years. I think fundamental particles simply must be conscious in some sense. Not in the same full sense as humans, presumably; but to build that full-sense consciousness in organisms like humans, it seems to me that you need building blocks that have some sort of consciousness themselves."

Subatomic particles alive and/or conscious? Right. The easy answer to this is somewhere on the spectrum from hysterical laughter to shocked silence. But as so often, isn't it much more interesting to explore where exactly the reasoning has gone wrong here?

I think what is going on here, and in some other cases, at least one of which I will come to, is a fundamental confusion about what concepts are good for and how they arise. There seems to be this idea that there is an essence of life-ness or consciousness, and we just randomly happen to have a name for this essence, and now we can start speculating about whether this or that item is imbued with the relevant essence.

As far as I can tell that is not how it works.

It was surely not the case that, when language developed to the degree of complexity that concepts like "alive" and "conscious" were first expressed and named, some metaphysician named Ugh sat down, deduced from first principles that such an alive-ness essence must exist, made up a word, and then went about applying it to stuff. It seems fairly clear that it must have been the other way around (not least because that other way is how we still coin new terms today):

Pragmatic hunter-gatherer Ugh would one day have observed that his grandfather, Ogh, who just yesterday was still walking around, eating, drinking, and having a chat about the weather, has overnight gone cold, stiff, and unresponsive, and will now start to decompose. Ugh is growing concerned about these goings-on that he has previously observed in other older people (and younger people who have been clubbed on the head), and he wants to discuss the matter with his partner Agh. So he needs a word or two to describe the difference between, in this case, being able to walk around and have a conversation, and being forever incapable of doing that. So he must have a term for alive and another for dead.

Considering the matter further, we can then start arguing about how these terms apply to other things. Rocks are certainly quite like dead-Ogh. Are plants more like alive-Ogh or more like dead-Ogh? They don't walk around or talk, okay, but obviously plants can also die, just look what happens to the tree uprooted by the storm three weeks ago. Perhaps movement and speech are not the best criteria after all? Perhaps it should be growth and being able to make children? Maybe once we consider viruses (which would not have been known to Ugh) we find that it is really hard to find a clear cut-off between the two categories. But the starting point, the rationale for having a word like "alive", was the need to describe an observed sharp difference.

Same for "conscious". Ugh-awake versus Ugh-asleep, or perhaps Ugh-having-eaten-too-many-fermenting-berries. Useful concept. Then we look around us and ask if a beetle or chicken is ever conscious like we are. Maybe we find that consciousness is not a binary state but a smooth gradient, with chickens having a bit of it. Maybe we can even develop weird speculative thought experiments about people being just like us in every regard except consciousness. But the starting point, the rationale for having a word like "conscious", was the need to describe an observed sharp difference.

And that is the problem with going "duuude, what if all matter is conscious?" Yes, we could redefine these terms to apply in that way. Only then they would become empty, content-free, utterly useless for their original purpose. We would at best have to discard them and invent new terms to describe the very same concepts over again. In the worst case we'd just confuse our language.

The same situation obtains, by the way, in the perennial discussion whether Free Will is an illusion. At least in the everyday languages I am familiar with, the concept was invented to describe the difference between being able to act on your own true desires or rational preferences and being forced, compelled or tricked into acting against them. If the term is discarded because of the entirely irrelevant observation that our brains follow cause-and-effect, we will need a new term to describe the same difference.

Eliminating "out of my own free will" is as unhelpful as watering down "alive" and "conscious". They weren't invented as abstract essences that we now realise are everywhere or nowhere. They were invented to describe empirically observed differences, differences that very definitely do exist.

Monday, May 16, 2016

Biogeography with or without Pan: Science or not?

Quite some time ago I wrote about two papers by panbiogeographer Michael Heads in the journal Australian Systematic Botany. In the latest issue of the same journal, Matt McGlone has now published a rebuttal.

He focuses on arguing the following:

First, it does not appear to him as if there was institutional or hiring bias against panbiogeographers in New Zealand in the 1980s and 1990s, as claimed by Michael Heads. Instead he sees a general tightness of the job market as the likely reason that panbiogeographic PhD students went overseas, as did many others. (I am of course entirely unable to assess this either way.)

Second, there is a lot of indirect evidence for long distance dispersal: time-calibrated phylogenies, observed dispersal events happening around us today, and the fact that groups with easily dispersed propagules are more likely to show trans-oceanic disjunctions than groups with heavy and/or short-lived propagules. Panbiogeographers tend to doubt that the first are reliable, argue the second away as 'normal dispersal' that does not demonstrate that speciation will follow, and explain the third with widespread ancestors.

Third, McGlone argues that the model of speciation underlying the panbiogeographic assumption of the widespread ancestor is at best ill-defined and at worst incompatible with the current best understanding of speciation processes in evolutionary biology.

What I find most interesting about the paper is, however, its analysis of the motivation behind panbiogeographers' rejection of long distance dispersal as an explanation in biogeography. It cites Michael Heads himself as follows: long distance dispersal "is unfalsifiable and explains all distributions and none at the same time". In other words, mainstream biogeography is not a science because it is at heart ad-hoccery. No matter where something occurs, it can just be assumed to have got there through dispersal, end of story.

Conversely, and as seen above, mainstream biogeographers do not consider panbiogeography a science because it always assumes the ancestor to be widespread, building the conclusion of vicariant speciation into its premises. Another way of saying the same thing is that the panbiogeographic method has no way of falsifying the hypothesis of vicariance; instead it is designed to merely find a rationalisation of that assumption.

So which is it? Or is all of biogeography just-so stories? Unsurprisingly, given what I wrote before on this issue, I feel that the panbiogeographic analysis method at least as presented in the two Australian Systematic Botany papers by Heads does indeed come across as circular reasoning, and thus I find it hard to accept it as scientific.

On the other hand, I believe that a rejection of dispersal as ad-hoc misses an important point. It sees the issue as binary: either dispersal or no dispersal, for any given group of organisms. Instead, it becomes a matter of statistics when we shift our perspective to larger numbers of cases, and here it should become clear that we can again formulate testable hypotheses and even models or 'laws'. For example about how certain traits of a species influence its range size or its likelihood of establishing in a distant area, or how the distance and size of two landmasses and the wind and current patterns between them influence the likelihood of organismal exchange in both directions.
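
A back-of-the-envelope sketch of what I mean, in Python. The per-attempt probabilities and the number of attempts are pure assumptions made up for illustration, as is the function name prob_at_least_one; the only real content is the elementary formula that the chance of at least one success in n independent attempts is 1 - (1 - p)^n.

```python
def prob_at_least_one(p_per_event: float, n_events: int) -> float:
    """Chance that at least one of n independent attempts succeeds."""
    return 1.0 - (1.0 - p_per_event) ** n_events


# Hypothetical numbers only: chance per propagule of crossing an ocean
# and establishing, for a light wind-dispersed seed versus a heavy one.
P_LIGHT = 1e-6
P_HEAVY = 1e-8
ATTEMPTS = 1_000_000  # assumed number of propagules over some long time span

light = prob_at_least_one(P_LIGHT, ATTEMPTS)
heavy = prob_at_least_one(P_HEAVY, ATTEMPTS)

print(f"light propagules: {light:.3f}")
print(f"heavy propagules: {heavy:.3f}")
```

No single crossing can be predicted, but under these assumptions the model still makes a checkable claim: across many lineages, trans-oceanic disjunctions should be far more frequent in the light-seeded groups than in the heavy-seeded ones. If the data showed no such difference, the dispersal hypothesis would be in trouble, which is exactly what one asks of a falsifiable idea.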

In summary, the panbiogeographic critique of mainstream biogeography as unscientific does not convince me. One could just as well argue that science cannot study radioactivity if it permits a random element to when exactly an atom decays.

Monday, May 9, 2016

When do we need voucher specimens?

Yay, in the last few days this blog passed 100k views! And just when I had a whole week of not being able to find the time to add anything...

Anyway, on Friday I was considering voucher specimens. First, in case somebody from outside the field reads this, what are they?

Imagine somebody did a study of essential oils found in the South American mint genus Minthostachys, and they published a paper reporting a pulegone-dominated oil for Minthostachys glabrescens, a menthone-rich oil for M. verticillata, and a carvone-rich oil for M. mollis. Twenty years later, a taxonomist revises the genus and finds, for example, that the name M. glabrescens had for decades been misapplied to a completely wrong species (as per the type), and that the circumscription of M. mollis needed to be changed.

A new taxonomic treatment of the genus is published, and you might now, if you were interested in its ethnobotany, biochemistry or commercial exploitation, be interested in knowing how the old oil data relates to species as currently circumscribed. What those guys who did the oil study called glabrescens definitely wasn't true glabrescens, but what was it instead? Which currently accepted name applies to the sample that had the pulegone-rich oil?

If all there was in the oil paper were names and biochemical data you're stuck. This is where voucher specimens come in. For good scientific practice, the authors of that study should have deposited a herbarium specimen of each sample they analysed in an officially recognised and accessible research herbarium (the kind of institution serious enough to be listed in the Index Herbariorum), so that we can examine them even fifty years later and figure out what exactly it was that they had in their study.

So, in short: a voucher specimen is a herbarium / museum / biodiversity collection specimen that is cited in a publication to allow later scientists to verify the taxonomic affiliation of a sample used in a scientific study. It could be a dried and pressed plant, a pinned insect, a fish skeleton or a stuffed bird; it could be connected to a morphological data set, a DNA sequence, a biochemical profile or a new species name. (In the last case it would not be a mere voucher but a type. Although types are even more valuable, the principle is the same.)

Among biodiversity researchers the importance of vouchers is well understood. It is, or should be, virtually impossible to publish a study in a good botanical journal without citing a list of voucher specimens underlying your data either in a table, in an appendix, or as part of the paper's online supplement. And we are often rather exasperated that not all colleagues in related fields have the same approach. The biochemical example from above was chosen deliberately, as I have run into many essential oil or ethnobotanical studies that neglected to cite vouchers, meaning that their results are pretty much unreproducible and scientifically near worthless.

That being said, however, I have started to wonder whether some colleagues don't go a bit overboard with this. The occasion is a discussion about a seed reference collection, about which several people have asked me "is it vouchered?" So the idea is, we can only use the seed samples that have a herbarium specimen as a reference somewhere.

But we are not talking here about biochemical data, a DNA sequence uploaded to GenBank, or a morphological description. The seeds themselves are biological specimens, aren't they? Can't they be their own voucher?

Granted, there may be many cases where only having the seeds is not good enough to narrow taxonomic affiliation down to species level. But is that really different in principle from a perfectly acceptable herbarium specimen of a flowering plant ... in a genus where you need fruits to identify to species?

So to me the point of a voucher is that it is a lasting, biological reference specimen for a piece of data. But a lasting biological specimen should not necessarily need another lasting biological specimen as its reference. In some cases it may be good enough to be a specimen in its own right. Or to look at it another way, there can be botanical specimens that are not dried and pressed whole plants.

Sunday, May 1, 2016

That editorial in BMC Evolutionary Biology

On Friday I looked at the website of the open access journal BMC Evolutionary Biology, after a colleague mentioned it as an option. Apart from the whopping article processing fee I noticed the little field "submitting a phylogenetic study? Please consult our editorial for guidance on the methodologies we consider to be of a suitable standard". That sounded interesting.

The editorial published in 2013 lists "common pipeline steps" as follows:
  1. Detecting homologs
  2. Multiple Sequence Alignment
  3. Quality control
  4. Model selection
Ah. What if one is not using a model-based approach? At that point I pressed ctrl + F and entered "parsimony" to see what they had to say on it. I found this:
"Until the early nineties, parsimony and distance-based tree-building methods were preferred. More recently, probabilistic model-based methods, namely the maximum likelihood (ML) and the Bayesian approaches have grown to prominence due to their statistical properties and inferential powers. Moreover, these approaches go beyond simple phylogeny inference, providing a convenient statistical framework for further model selection and biological hypothesis testing. While parsimony is sometimes justified as model-free, it has mathematical properties and is not assumption-free; therefore explicit models should be generated for many biological problems. Likewise, distance-based methods may be unreliable for highly diverged data, yet they are often model-based and have nice mathematical properties and thus they may enable very fast and relatively accurate estimation of relevant biological parameters. Distance-based methods for tree reconstruction, such as neighbor joining, are extremely fast, and can provide reasonable solutions for extremely large data sets, something that would be much more computationally challenging with ML or Bayesian methods, even with recent computational advances."
Well, they say "many" biological problems instead of all, and maybe I am missing some nuances here - I am not a native speaker of English, after all - but to the best of my understanding this seems to say that BMC Evol Biol accepts any phylogenetic method except parsimony analysis.

I want to make perfectly clear that personally I have nothing against model-based, statistical approaches. My first instinct when faced with a single, small DNA sequence alignment would be to run it through PhyML as packaged in my version of SeaView. For large supermatrices I use RAxML, and for smaller multi-gene datasets BEAST. For morphological datasets parsimony analysis in PAUP is my default approach, and for population genetic type data I would use distance methods. Really, I am a methods pragmatist and not irrationally attached to parsimony analysis as the proverbial hammer that makes everything look like a nail.

So that being out of the way, I have to say that I just do not see how the above section is anything but the mirror image of the much-maligned Cladistics editorial from earlier this year.

Thursday, April 28, 2016

KeyBase

Although I had been aware of its existence for quite some time, I have only recently seriously tried out KeyBase. Under the motto "teaching old keys new tricks", this website hosted by the Royal Botanic Gardens Victoria is a repository of dichotomous identification keys. The following post is a quick note on my experiences so far.

What is in KeyBase?

Coverage so far seems to be mostly Australian plants, but there are also some keys to plants of California, Sri Lanka and New Zealand, and very few non-botanical projects. Maybe coverage will expand over time; it all depends on whether taxonomists working on other groups will take it up.

At least for Australia, many of the keys appear to be either previously published in traditional print floras or draft keys meant to be published in upcoming floras, then together with full descriptions, drawings and suchlike. Having them on KeyBase is of course great. In the former case, it makes identification aids freely available to people who cannot afford to buy, say, a four volume state flora for several hundred dollars, or to people who are currently in the field and did not bring those books along, as long as they have an internet connection. In the latter case, it makes keys available that one would otherwise have to wait several years for, and in the ideal case they could be tested and improved in the meantime.

How does it look from the end-user perspective?

If you want to use the key repository, you can enter the scientific name of the group you are interested in into the search field in the upper right corner. Usually it works well. If, for example, you were to enter "Asteraceae", you would be presented with links to keys covering the states of Victoria and New South Wales, all of Australia, and California. Sometimes, however, it doesn't. I know that there is a separate subkey to only the Gnaphalieae tribe in Australia, but searching for "Gnaphalieae" currently draws a blank. You have to go into the Australian Asteraceae key, search the web page for that name, and then click on the little triangle next to it.

This brings me to the next point. Many keys are very helpfully inter-connected. You can use a family key to get to a genus, and then you may find such a little triangle leading you to the genus key. At the moment a species may link to other, external resources, for Australian plants often the Atlas of Living Australia page of the species, which is very helpful as they usually feature a photo and a distribution map.

Each key can be presented in three different ways. The obvious ones are bracketed and indented, terms that I have explained in a separate post. Unfortunately, Firefox has a tendency to freeze on me when I try to have a very large key displayed as indented, but bracketed always seems to work, and indented still works well for smaller keys.

Given that all the keys in KeyBase are dichotomous, it is a bit surprising to find that the third display option is "interactive". As far as I can tell, there is nothing interactive about it though, and it is certainly not multi-access. It just seems to walk through the questions in the usual pre-determined order, displaying only a single question at a time. It thus seems to have only downsides compared to the other two options, and I see no point in using it.
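
For readers who have never thought about a key as a data structure, here is a minimal sketch in Python of why the one-question-at-a-time view adds nothing. It is emphatically not how KeyBase stores or renders its keys, which I do not know; the Couplet class, the functions bracketed and identify, and the toy key with its "Taxon A" to "Taxon C" are all invented for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Union


@dataclass
class Couplet:
    """One question with exactly two leads; each lead ends in a taxon or another couplet."""
    first_lead: str
    first_result: Union["Couplet", str]
    second_lead: str
    second_result: Union["Couplet", str]


def bracketed(key: Couplet) -> List[str]:
    """Lay the whole key out in bracketed style, numbering couplets as they are found."""
    numbered = [key]  # position in this list + 1 is the couplet number
    lines = []
    i = 0
    while i < len(numbered):
        couplet = numbered[i]
        for lead, result in ((couplet.first_lead, couplet.first_result),
                             (couplet.second_lead, couplet.second_result)):
            if isinstance(result, Couplet):
                numbered.append(result)
                target = str(len(numbered))  # number of the couplet to go to
            else:
                target = result  # a taxon name
            lines.append(f"{i + 1}. {lead} ... {target}")
        i += 1
    return lines


def identify(key: Couplet, answers: Sequence[int]) -> Optional[str]:
    """Step through the key in its fixed order; each answer is 1 or 2."""
    node: Union[Couplet, str] = key
    for choice in answers:
        if isinstance(node, str):
            break  # already arrived at a taxon
        node = node.first_result if choice == 1 else node.second_result
    return node if isinstance(node, str) else None


# Invented toy key, not taken from any real flora.
toy_key = Couplet(
    "Plant annual", "Taxon A",
    "Plant perennial", Couplet(
        "Leaves opposite", "Taxon B",
        "Leaves alternate", "Taxon C",
    ),
)

print("\n".join(bracketed(toy_key)))
print(identify(toy_key, [2, 1]))
```

The point of the sketch is that identify walks exactly the same fixed path through the same tree that bracketed prints in full. Showing one couplet at a time only hides the rest of the key; it does not let you start from whichever character you happen to be able to observe, which is what a genuinely multi-access key would offer.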

As for the keys themselves, there is obviously very little quality control going on, but that is to be expected in a crowd-sourced project, even if the crowd is largely professional taxonomists. For example, I recently found that a daisy genus containing perennial species can only be reached by answering a couplet as "annual", and then there are the usual problems of hard to judge characters and user-unfriendly, arcane terminology.

How does it look from the contributor perspective?

That is a very good question. I have two small genus keys that I might consider contributing, so I got myself an account today and started looking around for information on how to submit a key and what format it needs to have. The experience was a bit disappointing.

The Help page reads "coming your way soon". Under Manage Account I found the following: "Sorry, we ran out of time and have not been able to create this page yet. In the not too distant future, you'll be able to reset your password and apply to become a contributor to a KeyBase project here." Yeah, that is not very helpful. That way the repository will not really grow very much.

In conclusion

Although more and more online keys in the future will be multi-access, I see a market niche for KeyBase, especially for collecting in one place lots of previously published, otherwise hard to obtain dichotomous keys from paper floras and obscure taxonomic publications. And there are also people who actually prefer dichotomous keys.

Already I have made some use of the keys that are available in the repository, so that is good. On the other hand, it definitely needs to become easier to contribute, otherwise it will largely remain restricted to Australian plants, as it is now.