Thursday, April 30, 2015

CBA Conference 2015, last day

The final day of the 2015 CBA conference on species delimitation featured another set of very interesting talks.

Sally Potter presented her and colleagues' exome capture pipeline for the study of skinks. Admittedly this was already so specific and applied that one would not expect to walk away with many concrete ideas for one's own work unless one wants to use pretty much the same method, but she also mentioned a species tree method that had so far escaped my attention.

ASTRAL is supposedly very fast for large numbers of loci; it is said to be "improving on MP-EST and the population tree from BUCKy, two statistically consistent leading coalescent-based methods". Statistical consistency sounds nice, but when I played around with those precursor methods a while back they didn't really convince me. Still, ASTRAL may be worth a try at some point in the future.

Wednesday, April 29, 2015

CBA Conference, day 2

Continuing with the 2015 CBA conference on species delimitation.

Today's first speaker was Sasha Mikheyev, who presented genome sequencing data on hybridisation zones in social insects. Most of his talk focused on Africanised bees in America, where he was lucky enough to find collaborators who had a time series of samples in the freezer capturing exactly the moment when the African genes swamped populations in the southern United States. Important research, but the results were still preliminary, so he could not go into what he ultimately wants to find out about the dynamics of hybridisation.

He ended his talk with a really weird story about an ant species in which queens are clones of their mothers, males are clones of their fathers (how does that work - is there a zygote but the female chromosomes are discarded?), and only the workers are half-half. This means that the males and females of the same "species" are genetically completely isolated, and indeed the male lineage is more closely related to a different species than to the females they have sex with.

You just can't make anything up that is more bizarre than what reality holds in stock for us.

Tuesday, April 28, 2015

CBA Conference 2015, day 1

Most of this week I am at the 2015 CBA conference Species delimitation in the age of genomics.

Although the title suggests that you'd die of alcohol poisoning in short order if you took a drink every time you heard somebody claim that genomic data are going to solve all our problems, the actual talks happily do not fit that stereotype. Maybe people have seen enough genomic data now to dispense with the hyperbole.

Today started off with a talk by the Australian philosopher of science John Wilkins. He argued that the traditional approach taken by most people who are explicitly grappling with the species problem is to declare a theory-based species concept, try to force it onto reality, and then consider the species-ness to be an explanation for what we see in nature: This individual is the way it is because it is a member of that species. In Wilkins' opinion, that is precisely the wrong way around.

Sunday, April 26, 2015

Botany picture #200: Xeranthemum inapertum

Xeranthemum inapertum (Asteraceae), France, 2014. Here in Australia we have lots of Asteraceae in the paper daisy tribe Gnaphalieae that have lost the otherwise fairly daisy-typical petal-like ray florets but then went through the evolutionary equivalent of regretting that loss, so they evolved papery colourful radiating bracts to serve as pseudo-petals. In the Mediterranean area we can find two other groups of Asteraceae that have done precisely the same thing, only they belong to the thistle tribe Cardueae instead.

The above genus Xeranthemum consequently looks very unlike a thistle (and it doesn't even have spines). In fact the well-known golden everlasting Xerochrysum bracteatum, a typical Australian Gnaphalieae, was originally described as a Xeranthemum, the systematic equivalent of describing an Acacia as part of the clover genus Trifolium. Of course that was in 1803, and the mistake was corrected only two years later.

Thursday, April 23, 2015

Giving taxonomists recognition through citations

Today we discussed the recent Zootaxa paper Fried spicy Linnaeus (Kottelat, 2015) in our journal club. It is open access and a fun read, well worth checking out.

Kottelat discusses the pros and cons and potential variants of mentioning the taxonomic authority after Latin names of plants and animals. Well, mostly animals, but the same applies in all of biology. In botany, the usual format is as follows. Anemocarpa podolepidium (F.Muell.) Paul G.Wilson means that the species was originally described by the 19th century botanist Ferdinand Mueller in a different genus, as Helichrysum podolepidium F.Muell., and then later transferred into the genus Anemocarpa by Paul Wilson.
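To make the structure of such a name explicit, here is a minimal Python sketch that splits a botanical name into genus, epithet, the parenthetical author of the original description, and the author of the name as it stands. The function name and the regular expression are my own illustration, not anything from Kottelat's paper, and the pattern only covers simple species names like the two in the paragraph above.

```python
import re

# Hypothetical pattern: "Genus epithet (Original author) Author".
# The parenthetical part is optional; infraspecific ranks, hybrids
# and "ex" authors are deliberately not handled.
NAME_PATTERN = re.compile(
    r"^(?P<genus>[A-Z][a-z-]+) "
    r"(?P<epithet>[a-z-]+)"
    r"(?: \((?P<basionym_author>[^)]+)\))?"
    r"(?: (?P<author>.+))?$"
)

def parse_botanical_name(name):
    """Split a simple species name with authority into its parts."""
    match = NAME_PATTERN.match(name.strip())
    if match is None:
        raise ValueError(f"Cannot parse name: {name!r}")
    return match.groupdict()

print(parse_botanical_name("Anemocarpa podolepidium (F.Muell.) Paul G.Wilson"))
print(parse_botanical_name("Helichrysum podolepidium F.Muell."))
```

For the first name the parenthetical author is F.Muell. and the author of the current combination is Paul G.Wilson; for the second there is no parenthetical author, because the name is still in the genus in which it was described.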

Botanical journals usually demand the author of a paper to give this full taxonomic authority the first time any species is mentioned but to leave it out afterwards, which can lead to rather odd looking sentences in which some species have the authority and others don't. In general, Kottelat argues convincingly, these authorities make the text clunky and do not usually add any information relevant to the reader.

On the other hand, many taxonomists are concerned that taxonomists should be given more, not less, recognition for the work they do. In botany at least the authorities as part of plant names do not count as bibliographic references and thus do not increase a taxonomist's number of citations. Because we are unfortunately evaluated based on how often people cite our work (as opposed to, say, how often they use it without citing it), many colleagues would like to see them turned into proper bibliographic references and would certainly not like them to disappear altogether.

Kottelat ultimately comes down in favour of turning them into full bibliographic references in taxonomic research papers and doing away with them in all other publications, a compromise that does not entirely convince me.

However, I really want to make a different point. I believe that the taxonomists who push for a rule requiring a bibliographic reference every time a species name is used vastly overestimate the utility of such a change.

Tuesday, April 21, 2015

Subsidence of the Pacific

A few days ago I expressed skepticism about Michael Heads' claim that the Pacific has undergone thousands of metres of subsidence over tens of millions of years, a claim advanced in defence of the idea that the plants and animals of Hawaii did not get there through recent long distance dispersal but instead evolved in place over that time. Intuitively I was confident that this would have been impossible because most of the continents would have had to be under water.

I now stand corrected. I made a back of the envelope calculation using admittedly very conservative parameters, and it turns out that there is more continent than I thought:

The Pacific Ocean covers 165.25 million km2 = 1.6525 x 10^14 m2. Assuming that a quarter of that area needed to have a sea floor 2,000 metres higher, the water displacement would be 0.25 x 2,000 m x 1.6525 x 10^14 m2 = 8.2625 x 10^16 m3. Assuming further that this needs to be distributed evenly across the surface of the planet, the sea level would be higher by 8.2625 x 10^16 m3 / 5.1 x 10^14 m2 = 162 m.

Punching that number into, we see that by far most of the land is still above the sea:

Remember, of course, that I am leaning over backwards here. I have no idea if 25% of the Pacific and 2,000 metres would be enough or if Heads' claim involves larger numbers; that might change something (if you assume all of the Pacific, or half the Pacific and 4,000 metres, there is very little land left). Further, the Pacific was larger 65 million years ago than it is now, with the continents having drifted further apart. Of course the water would not be equally distributed either because the land displaces it too, so it would raise the oceans by more, but because most of the world is ocean that wouldn't matter so much. Perhaps more importantly though, if the floor is higher under the Pacific it has to be lower elsewhere, as there is a limited amount of Earth, and that would likely mean lower land masses at least somewhere.
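For anyone who wants to try other numbers, the back-of-the-envelope calculation can be sketched in a few lines of Python. The two constants are the figures used above; the function name is just an illustration, and the model is the same crude one: displaced water is spread evenly over the whole surface of the planet, ignoring that land displaces water too.

```python
PACIFIC_AREA_M2 = 1.6525e14   # 165.25 million km2, the figure used in the text
EARTH_SURFACE_M2 = 5.1e14     # total surface of the planet, as used in the text

def sea_level_rise_m(fraction_of_pacific, uplift_m):
    """Sea level rise if part of the Pacific floor sat uplift_m higher.

    Crude model: the displaced volume is spread evenly over the
    entire surface of the Earth, land included.
    """
    displaced_volume_m3 = fraction_of_pacific * PACIFIC_AREA_M2 * uplift_m
    return displaced_volume_m3 / EARTH_SURFACE_M2

# The scenarios mentioned in the text.
for fraction, uplift in [(0.25, 2000), (0.5, 4000), (1.0, 2000)]:
    rise = sea_level_rise_m(fraction, uplift)
    print(f"{fraction:.0%} of the Pacific, {uplift} m: {rise:.0f} m")
```

A quarter of the Pacific at 2,000 metres gives about 162 m, while both the whole Pacific at 2,000 metres and half the Pacific at 4,000 metres come out at about 648 m.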

For various other reasons I remain unconvinced of the vicariance scenario for Hawaii, but the subsidence of the Pacific is not as implausible as I naively assumed at the beginning, at least when using the above conservative estimates.

Saturday, April 18, 2015

Botany post #199: Dawsonia superba

Dawsonia superba (Polytrichaceae), Melbourne Museum, just two weeks ago. The Polytrichaceae have always been my favourite moss family, perhaps because they are the mosses that have got closest to becoming vascular plants; you just have to appreciate somebody who tries to shoulder into a niche space already occupied by hundreds of thousands of seed plant and fern species. Polytrichaceae have simple vascular bundles in their stems and rather complex leaves with lamellae on top. And thanks to these adaptations they can also become quite large.

Dawsonia is the largest of them all and certainly the moss that produces the most robust leaves and the tallest living stems. (Peat mosses - Sphagnum - can form massive spongy cushions but everything but the uppermost layers is dead.) Most people would not recognise it as a moss; it is often taken for a vascular plant, for a pine seedling for example.

Having only ever seen it as a herbarium specimen at university (in an accordingly oversized cryptbox), I was extremely gratified to see a living colony in the Melbourne Museum's forest gallery when we visited the city over Easter.

Friday, April 17, 2015

Wow. It worked

Finally got fastStructure installed a few days ago, so that supersedes my previous post about, well, not being able to install it. Also, it is really, really, really fast, and the clusters it infers seem to make more sense than the last time I used it, then on a work computer.

So that's good.

Still, I stand by what I wrote: if it needs somebody with my level of computer knowledge to make it work on the fourth attempt and if it only works on Unix/Linux anyway, then the number of end-users will be sharply limited compared to the original STRUCTURE software that comes with Windows executables and a GUI.

And there are a few other issues:

Just like the last time I used it, it infers basically no admixture. All samples but one are assigned to a single population at >99%, and the remaining one is assigned to one population at >95%. And this is a dataset where everything else - morphology, neighbor joining phenogram, old STRUCTURE - screams that one population of our samples is a hybrid swarm. So fastStructure may be of limited use to those studying introgression.

In addition, the output is fairly user-unfriendly. First, to know which of several numbers of clusters (K) to accept, one has to compare the likelihood values for each. The STRUCTURE GUI can display them in a nice table, but when using fastStructure you have to look at each log file individually and perhaps copy-paste the relevant values out into a table of your own making. I wrote a Python script to automate that, but not every user can do that.
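For what it is worth, such a script does not need to be long. The following is a minimal sketch rather than my actual script; it assumes that each log file contains a line of the form "Marginal Likelihood = -0.9123" and that the logs are named prefix.K.log, so check both against your own output before relying on it. The function names and the run name are made up for illustration.

```python
import re

# Assumption: each log contains a line such as "Marginal Likelihood = -0.9123".
LIKELIHOOD_PATTERN = re.compile(
    r"Marginal Likelihood\s*=\s*([-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)"
)

def extract_likelihood(log_text):
    """Return the first likelihood value found in the log text, or None."""
    match = LIKELIHOOD_PATTERN.search(log_text)
    return float(match.group(1)) if match else None

def likelihood_table(logs_by_k):
    """Turn {K: log text} into a tab-separated table, sorted by K."""
    lines = ["K\tlikelihood"]
    for k in sorted(logs_by_k):
        value = extract_likelihood(logs_by_k[k])
        lines.append(f"{k}\t{'NA' if value is None else value}")
    return "\n".join(lines)

def read_logs(prefix, k_values):
    """Read hypothetical files named <prefix>.<K>.log into a {K: text} dict."""
    logs = {}
    for k in k_values:
        with open(f"{prefix}.{k}.log") as handle:
            logs[k] = handle.read()
    return logs
```

Something like print(likelihood_table(read_logs("myrun", range(1, 11)))) would then print one row per K, and because the columns are tab-separated the result pastes cleanly into a spreadsheet.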

Second, now that you know that K = 6, for example, is your preferred result, you want to see what that population structure looks like. fastStructure comes with a little tool called 'distruct' that helpfully draws one of those typical little STRUCTURE bar plots. Without sample names.

Yes. Without sample names. You can supply it with a little file containing the population names, but how do you know which population should have which name without knowing which samples belong to each? So you could just as well have a toddler draw a random bunch of colourful patches, because that would have the same information content as the distruct output.

So time to go back to the actual output and draw a bar plot with other means. Here we hit another snag. Where STRUCTURE outputs a file like so:
Yoursample1  0.340  0.660  0.000
Yoursample2  0.999  0.001  0.000
Yoursample3  0.000  0.000  1.000
... the equivalent fastStructure output file with the extension meanQ looks like this:
0.340  0.660  0.000
0.999  0.001  0.000
0.000  0.000  1.000
Again, which sample is which? To make sense of the results I really need to know that, don't I? Well, they are in the same order as in your input file, but as that is usually going to be a '.str' file with two rows for each sample, it is not a trivial exercise to copy the results and the sample names next to each other in a way that you can use for producing the bar plot. Also, the columns are separated by two spaces instead of one tab, making it even harder to copy into an Excel or LibreOffice sheet, but if I remember correctly that was already the case in STRUCTURE.
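Matching the names to the rows can be scripted, too. Here is a minimal Python sketch under a few assumptions that you should check against your own data: the '.str' file has no header row, the sample name is in the first column, every sample occupies exactly two consecutive rows, and the meanQ rows are in the same order as the samples. The function names are made up for illustration.

```python
def sample_names_from_str(str_lines, rows_per_sample=2):
    """Take the first column of every first row of a sample."""
    data_lines = [line for line in str_lines if line.strip()]
    return [line.split()[0] for line in data_lines[::rows_per_sample]]

def label_meanq(str_lines, meanq_lines, rows_per_sample=2):
    """Prefix each meanQ row with its sample name, tab-separated."""
    names = sample_names_from_str(str_lines, rows_per_sample)
    # split() without arguments copes with the two-space separators.
    rows = [line.split() for line in meanq_lines if line.strip()]
    if len(names) != len(rows):
        raise ValueError(
            f"{len(names)} samples in input but {len(rows)} rows in meanQ"
        )
    return ["\t".join([name] + row) for name, row in zip(names, rows)]
```

Reading both files with open(...).readlines() and passing the lists to label_meanq gives rows like those in the STRUCTURE output above, with tabs instead of double spaces so that they can be pasted into a spreadsheet directly.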

From a user friendliness perspective this all looks rather poorly thought through. But again, the program is free and several orders of magnitude faster than STRUCTURE, so one can't complain too much.

Monday, April 13, 2015

Botany picture #198: Stenocarpus sinuatus

Stenocarpus sinuatus (Proteaceae), the Firewheel Tree, in the Royal Botanic Gardens Melbourne, just a few days ago. Our non-botanist hosts called it the 'bird tree' because its flowers are apparently very attractive to birds.

Friday, April 10, 2015

Chance dispersal, 'normal' dispersal, and long distance dispersal ... confused yet?

Having now picked up Michael Heads' second contribution to the recent issue of Australian Systematic Botany, Biogeography by revelation: investigating a world shaped by miracles, I am glad that I read the other one first. On the one hand, the present paper is really just a 23-page criticism of a single book, Alan de Queiroz' The Monkey's Voyage: How Improbable Journeys Shaped the History of Life; on the other, its argumentation is remarkably redundant with the first paper. It even contains another discussion of the ratite birds! Was it really necessary to write both of these papers, and for the same journal issue at that, considering that this criticism of de Queiroz is merely a special case of the criticism of mainstream biogeography that has been expressed in the first one?

And because it is all about one book, the paper is also, at least in my eyes, a remarkably uninteresting contribution to the discussion. I have no intention of reading The Monkey's Voyage, so I cannot judge if any quote mining is going on or not. But let us assume for the sake of argument that Heads' criticisms are, in this case, right on the mark; that de Queiroz really is that most elusive of straw men, a biogeographer who not merely sees an important role for founder effect but who actually rules out the possibility of vicariance as a speciation mechanism. Would that make the panbiogeographic approach, here strangely called 'vicariance theory', any more defensible?

Of course not. If you could find a person who irrationally rejects the possibility of vicariance a priori, that would still not in any way whatsoever make the a priori rejection of speciation after long distance dispersal less irrational. So in the grand scheme of things there is really not much point in addressing anything the present paper argues about de Queiroz.

Similarly, a very large part of the paper is taken up by case studies where the author argues for vicariance scenarios. Again, let us assume for the sake of argument that vicariance is the correct answer in all these cases, every single one of them. Now would that make the panbiogeographic approach any more defensible?

Again, no. The real question of interest is whether a panbiogeographic analysis is science, and because it always builds the conclusion into its starting assumptions (the ancestor was already distributed everywhere) there are good reasons to argue it isn't.

As much as Heads tries to frame the controversy as 'dispersal theorists' versus 'vicariance theorists', to the best of my knowledge there are hardly any, if any, dispersal theorists anywhere, at least none fitting his characterisation. Those I have met and read are numerous mainstream biogeographers who accept vicariance, range extension, long distance dispersal, local extinction, range shifts and (depending on whom you ask and what geographic scale we are talking about) sympatric speciation as possible biogeographic processes, deciding between them case by case depending on the available evidence.

And on the other side "the panbiogeographic approach, as illustrated here, explains allopatry by vicariance and overlap by dispersal" (Heads, 2015). Nota bene: one single process to explain speciation; all others are forbidden. There is no symmetry to these two positions.

Still, there are a few things that occurred to me when reading this paper that didn't when I read the first one.

Thursday, April 9, 2015

The ideal software tool is one that no end user can install

Did I write that I would not comment any more on fastStructure after my other two posts?

I just spent much of the evening trying to install it at home, and banging my head onto the desk for two hours would probably have been a more pleasant experience. After all that effort I got all the prerequisites - NumPy, SciPy, Cython and GSL - installed, and even running the build seemed to have worked. At least when I try to do it over, it says, nope, I'm already done, no need.

But then when I try to run fastStructure, it says that it cannot find the libraries that are just sitting there in the same folder.

To all potential end users: Don't try this at home. Preferably ask your institution's IT people to put it onto some machine for you. Or use different software.

To all bioinformaticians who hope that scientists will use their software: This is how not to do it. If you write a program that only another programmer can even so much as install, then hardly anybody will use it, and thus hardly anybody will cite your paper. Wasted effort. How about providing executables, like BEAST? Or at least something that can be installed by running a simple makefile, like r8s? Or an R package perhaps, like BioGeoBEARS? It's not as if there aren't people who get that right.

Update: It worked!

Tuesday, April 7, 2015

Royal Botanic Gardens Melbourne

We spent the Easter Weekend in Melbourne, visiting family. One of the attractions of the city we visited was the Royal Botanic Gardens. They are free to enter and easily accessible from the city centre, especially via tram. The gardens are next to the Shrine of Remembrance, a war monument.

One of the main entrances is the Observatory Gate, named after an old astronomical telescope.

Behind this gate on the right you will soon find the Ian Potter Children's Garden. It features sand and a watercourse, hedge labyrinths of aromatic plants, a vegetable and herb garden, a lookout in a grove of massive bamboos, and a hidden elephant sculpture. I am told it is not usually as full as it is in the above picture; again, it was Easter.

Botanic gardens can serve a myriad of functions - education and training, research, science communication, plant conservation, recreation, and much more - and accordingly they may look very different. The RBGM looks at first sight very much like a public park, so one feels that the recreation aspect is a big focus. Shown above is the central lake, and when we were there several weddings were taking place around it, probably because it was 4 April, an easy day to remember.

However, there were also science communication activities going on; we came past a big stall presenting different woody fruits, ethnobotanical information, and photographs of the Amorphophallus that was unfortunately just past flowering. And of course the RBGM is also one of Australia's top botanical research institutions and features one of the largest herbaria on the continent. I have yet to visit it, because so far I have only made it to Melbourne for a conference or on public holidays...

Finally, this is what I find particularly fascinating about the vistas of the RBGM: The city centre of Melbourne is so close that one will often see the towering skyline directly behind an open park landscape. It looks downright photoshopped sometimes, like one of those surreal SF book covers where a futuristic city sits directly in the middle of wilderness.

Saturday, April 4, 2015

How not to convince me that panbiogeography is science

A few days ago, the newest issue of Australian Systematic Botany appeared, and I was rather surprised by the contents, for two of the four papers of the issue were opinion pieces by panbiogeographer Michael Heads. One of them is even, to the degree that it discusses anything specific at all, about ratites. Why would a journal of systematic botany accept a zoological paper from a fringe school of biogeography? And why would a panbiogeographer submit his manuscripts to a regional taxonomic journal as opposed to an international one focusing on biogeography? One wonders.

I have written about panbiogeography before, and my perception was and still is that its proponents are mostly characterised by an irrational hostility to long distance dispersal. In my second post I discussed an example of Heads himself trying to force a biogeographic story into vicariance although that would mean that the plant family in question would have had to originate before the evolution of multi-cellular life.

But as in the case of 'evolutionary' systematics I remain interested in the arguments of the other side, and thus I decided to read the two papers. I have started with and will make this post about Heads' Panbiogeography, its critics, and the case of the ratite birds because it promises to provide an example of the panbiogeographic method in action.