Wednesday, April 30, 2014

Leptokurtic distributions do not magically end at a water's edge

For some reason, my post on panbiogeography has become quite popular over the last two weeks or so; no idea why. Maybe it is accessed by a lot of people who want to obtain information on this school of biogeography, or maybe it has been discovered by panbiogeographers, although in the latter case I would expect them to be unamused and express themselves in the comments. Be that as it may, it may be useful to expand on an argument that I made in that original post.

As pointed out then, vicariance as the only explanation for current biogeographic patterns, or even just as the null model, is utterly self-defeating because dispersal is indispensable for the ancestor to cover the wide area that can subsequently be sub-divided by vicariance events. Another way of putting it is that if there were no dispersal, all of life would still be sitting in whatever single spot in the ocean it originated some three billion years ago. (And of course it is likely that that one spot has long disappeared under layers of silt, been subducted into the mantle of the planet, or undergone some other traumatic events.)

So really there is no way to tell an intellectually defensible biogeographic story without dispersal. The panbiogeographer now has two options to justify their rejection of dispersal: First, dispersal was only possible in the far past when various ancestors spread across the world, and then somehow became impossible. I hope it is obvious that does not make any sense whatsoever, and sincerely hope that no panbiogeographer would argue for it. Second, dispersal is admitted to be possible on the same land mass but suddenly stops completely once the land mass has broken up. (Subsequently I will focus on terrestrial organisms, but in the case of aquatic organisms simply swap "land mass" and "water body" and you have the same situation.) This option seems at least somewhat plausible.

In the previous post I cited numerous observable cases of organisms crossing from one land mass to another across appreciable distances. Today I want to make a more general point: Do we have any reason to assume that dispersal suddenly drops off completely at the edge of a land mass? Not really.

We would expect dispersal to follow a leptokurtic distribution: most of the propagules end up very close to the parent, and soon there is a steep decline followed by a long tail of increasingly rare dispersal events across longer distances. What we would not expect is a sudden and complete drop-off the moment a bit of water gets in the way; certainly not for seeds or flying animals, but even land animals can generally swim a bit, and tiny ones can be blown across by the wind.
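To make the "long tail" a bit more tangible, here is a minimal Python sketch comparing a normal dispersal kernel with a leptokurtic one of the same standard deviation. The choice of the Laplace distribution as the leptokurtic kernel, the value of `sigma` and the distances are all assumptions for illustration; real dispersal kernels are fitted to data and may have far heavier tails.

```python
import math

def gaussian_tail(d, sigma):
    """P(|X| > d) for a normal kernel with standard deviation sigma."""
    return math.erfc(d / (sigma * math.sqrt(2)))

def laplace_tail(d, sigma):
    """P(|X| > d) for a Laplace kernel with the same standard deviation."""
    b = sigma / math.sqrt(2)  # Laplace variance is 2 * b**2
    return math.exp(-d / b)

sigma = 1.0  # hypothetical unit: one "typical" dispersal distance
for d in (1, 5, 10, 20):
    # fraction of propagules travelling further than d under each kernel
    print(d, gaussian_tail(d, sigma), laplace_tail(d, sigma))
```

Close to the parent the leptokurtic kernel actually puts fewer propagules beyond a given distance than the normal one, but far out it retains a probability that is many orders of magnitude larger. Neither kernel ever reaches exactly zero.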

Sure, the panbiogeographer could reply that the larger the distance, the less likely it becomes that a colonization event can take place across it. But that is just the point: less likely, not impossible. For their dogma of vicariance to make sense, we would have to assume that water forms a magical barrier at some point.

But where do you draw the line? If you accept that a group of organisms can sometimes cross 10 m of water, why would they be completely unable to ever get across 100 m? If you accept that a group of organisms can sometimes cross 100 km of water, why would they be completely unable to ever get across 500 km? There is simply no reason whatsoever to accept the magical barrier model unless you assume that some life forms turn to ash the moment they lose contact with land. Instead, it is all about probabilities, not about dispersal being impossible.
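The difference between "unlikely" and "impossible" can be sketched with a little arithmetic: if each propagule has some tiny chance p of completing a crossing and founding a population, the chance that at least one out of n succeeds is 1 - (1 - p)^n. The numbers below are entirely hypothetical, and treating attempts as independent with a constant p is a simplifying assumption.

```python
import math

def prob_at_least_one(p, n):
    """P(at least one success in n independent attempts with success chance p).

    Uses log1p/expm1 so that very small p and very large n stay accurate.
    """
    return -math.expm1(n * math.log1p(-p))

p = 1e-9  # hypothetical per-propagule chance of a successful crossing
for n in (10**6, 10**9, 10**12):  # hypothetical numbers of propagules over time
    print(n, prob_at_least_one(p, n))
```

With these made-up values a one-in-a-billion event is hopeless for a million attempts, more likely than not for a billion, and practically certain for a trillion.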

Of course, it depends on the organisms in question. Some groups definitely have a harder time surviving long travel, and thus their leptokurtic distributions may have a relatively short tail and they will not be able to cross 500 km of ocean with a sufficient probability of survival for a successful colonization event to happen within a realistic time window. This is especially true of large animals with high metabolic rates - where a lizard may survive rafting, a mammal of the same size is likely to fatally starve or dehydrate.

On the other hand, if a species is capable of powered flight, such as bats, birds and larger insects, it becomes reasonable to assume that the likelihood of them colonizing distant islands is actually higher than what we would expect from the leptokurtic distribution because individuals accidentally blown into the general area of such an island may actively seek it out. We would then get a bump on the distribution wherever there is an island. This ability would also extend to animal-dispersed plants whose seeds they carry in their stomachs or on their bodies.

A very illustrative example is New Zealand, ironically one of the strongholds of panbiogeography. The native vertebrate fauna before human arrival did not include any mammals except oceanic ones ... and bats. So we can assume here that its isolation has for a long time been too great for a rat or possum to make the trip. But unless it is assumed that bats had evolved before New Zealand separated from Gondwana, which given their phylogenetic position means that many other groups of mammals would also have been there and then rather strangely all died out while only bats survived, the only possible explanation is that bats arrived in the country through long-distance dispersal.

Tuesday, April 29, 2014

Some news items

Before I came to Australia, I conducted most of my field work in South America, and most of my time there I spent in Bolivia. In total, I travelled to Bolivia three times, once for my undergraduate thesis, once for my Ph.D. project, and once as a postdoc, always entering the country via La Paz. Although it has been several years since I was last there, I still have a soft spot in my heart for this chaotic, sprawling, high-elevation metropolis, and fond memories of the Bolivian National Herbarium and its friendly staff.

Only now have I heard of the most astounding infrastructure project planned by the Bolivian government: a cable car system of three lines to connect the centre of La Paz (ca. 3,700 m.a.s.l.) with El Alto (ca. 4,000 m.a.s.l.). This is just awesome. The lines will reduce traffic pressure and pollution, and imagine the view across the city and the peaks of the Andes! The video in the linked article gives a bit of an impression, but you must have seen the city yourself to really appreciate what this means.


In news that might interest those dealing with creationists, scientists at the University of Cambridge have demonstrated that complex biochemical reactions can be catalysed by metal ions under conditions potentially similar to ancient oceans.

In the article, one researcher expresses scepticism because "the reactions observed so far only go in one direction; from complex sugars to simpler molecules like pyruvate". However, the latter reaction is still one of the core pathways used by living cells, and as one of the authors correctly points out all such reactions are in principle reversible, given the right conditions.

Sunday, April 27, 2014

Mushrooms in Tidbinbilla

On the weekend we made another trip first to the Canberra Deep Space Communication Complex, and then on to Tidbinbilla Nature Reserve.

The CDSCC is financed by NASA and operated for the Australian government by the country's premier science agency CSIRO. It is one of three stations worldwide that maintain communications between Earth and space vessels. The dish above is the largest of the complex, and the largest in the southern hemisphere. It is about 70 m in diameter, a bit more than 70 m tall, and the movable part weighs approximately 4,000 tons (!).

At Tidbinbilla, there were more mushrooms and other fungal fruiting bodies out than I have ever seen before. I have no idea what any of them are called, so I will just post a bunch of pictures. Enjoy!

It must simply have been the right time of the year...

Saturday, April 26, 2014

Methods overload

The conference I attended this week was a very methodological one, with lots of modellers in attendance. Looking back over the talks I heard and the workshops that were offered, two things have occurred to me.

First, while everybody appears to assume that we are in the era of big data, especially due to genomic sequencing and the increased availability of biodiversity databases, it seems to me that we are instead currently drowning in methods.

Thursday, April 24, 2014

That is very forward of them...

As mentioned previously I get a lot of science spam soliciting the submission of manuscripts to some fifth rate for-profit journals, as probably does everybody in science. Still, this is probably a first for me: They have gone to the trouble of getting themselves set up in Editorial Manager, a legitimate manuscript management system also used by many good journals. And then they actually already registered an account for me, with username and password and all.

This is most unusual. Generally the decent journals that you want to publish in sit there waiting for you to sign up and submit. If they do set up an account for you it is because they want you to peer-review for them. The bogus journals, however, are normally happy with submission of the manuscript as an e-mail attachment. The approach taken in the spam message above is kind of half-half, and thus a bit weird; it looks a bit more professional, but at the same time it betrays their desperation to get somebody to submit something, anything, so that they can earn a few bucks, something that would be beneath professional scientific journals.

And of course, how can you take a journal seriously that spams a systematic botanist with a solicitation for articles in proteomics?

Tuesday, April 22, 2014

Conference today

I am visiting a conference this week, Understanding Biodiversity Dynamics Using Diverse Data Sources. But as it takes place in the city where I work I don't have to go far - the talks are just down the hill from my office. Some of the talks today were inspiring and utterly fascinating.

Two in particular I want to mention very quickly, although for very different reasons. The first was, from my perspective, perhaps the highlight of the day, but more importantly it was highly relevant to my recent discussion of the merits of phylogenetic diversity. Among other things, Professor Marc Cadotte talked about the claim, already advanced by Charles Darwin, that, all else being equal, a biome containing more phylogenetic diversity (PD) is more productive than a biome with less.

And he presented an ingeniously simple field experiment to test this claim quantitatively: They planted numerous experimental plots with one, two or four species of varying evolutionary relatedness, taking care to replicate close relatedness in various groups (two grasses, two Lamiaceae, and so on). The result was a beautiful and clear positive correlation of grassland productivity with PD.

As I wrote above, this is highly relevant to the question of whether PD is correlated with feature diversity, only in this case there is another mental step involved. The underlying idea is that a community of four distantly related species would be more productive than one of four closely related ones because it would have higher feature diversity in the sense that the species are more divergent and can thus utilise more different resources than the same number of supposedly more similar, closely related species.

The positive result of the experiment can thus be read as another vindication of PD, and surely the unseen characters that allow these plants to complement each other in the utilisation of local resources are "conservation relevant", to use the words of the PD critics I discussed a few days ago.

The other talk that I briefly want to mention was sadly somewhat less sparkly. The thing is, there are many snobbish molecular and evolutionary biologists who consider natural history collections and the research they do as mere "stamp collecting" and a tedious description of patterns instead of something that increases our understanding of relevant processes.

I will admit there are some colleagues who fit that description, who would go and bore the socks off an audience by showing them half an hour worth of insect mouthparts or leaf shapes without betraying any underlying research question.

But bizarrely, at the moment I have examples showing the exact opposite. In my backpack is a paper I am peer reviewing where the authors examined a highly topical but supposedly hard to test evolutionary question through the simple expedient of measuring a few macroscopic characters on dusty old herbarium specimens and doing some simple statistics. Conversely, today at the conference I had to sit through a 30 minute talk by a scientist working with cutting-edge genomics and bioinformatics techniques who filled virtually the entire time with one "and this transcription factor has this effect on the insect wing" after the other. It was purely descriptive, and I never understood why anybody should care to know this stuff.

That just goes to show that it is not what tools you have but what you do with them.

Monday, April 21, 2014

Parsimony reconstruction of species trees from gene trees

Continuing my series on the uses of parsimony analysis in phylogenetics and biogeography, we come to the inference of species trees from gene trees.

I have written before about the problems with inferring species relationships directly from the relationships of genes that are sampled from these species. In short, healthy species contain genetic diversity with potentially several different alleles for any given locus (e.g. how human eyes can be different colours). The same was true for all ancestral species in evolutionary history, and at first their two descendant species in a speciation event may each have inherited a part of that genetic diversity.

Because there is limited space available for alleles in any given species, even merely through the random process of genetic drift some of them will be lost in the descendant species. However, it takes some time for this loss to happen, and so it is possible that by the next lineage split resulting in more descendant species one gene may still have alleles that diverged in the previous ancestor. If that is the case, then the descendants may inherit a random selection of alleles that show different relationships to each other than the real species relationships, potentially misleading our phylogenetic inference.

For example, although we know from multiple lines of evidence (including most genetic data) that the chimpanzees are our closest living relatives, a minority of our genes is more closely related to those of the gorilla than to those of the chimpanzees. So if only one of those genes were sampled and all other evidence ignored, one might mistakenly infer that the gorillas are our sister species. And in some plant and animal groups mistakes like this can easily be made.
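How often a single gene should be expected to mislead in this way can be sketched for the simplest case of three species. Under the standard neutral coalescent model (an assumption brought in here for illustration), the gene tree matches the species tree with probability 1 - (2/3)e^(-T), where T is the length of the internal branch of the species tree in coalescent units, and each of the two discordant gene trees has probability (1/3)e^(-T). The values of T below are hypothetical and not estimates for any real group.

```python
import math

def gene_tree_probs(t):
    """Gene tree probabilities for three species under the neutral coalescent.

    t is the internal branch length of the species tree in coalescent units.
    """
    discordant_each = math.exp(-t) / 3
    return {
        "concordant": 1 - 2 * discordant_each,
        "discordant_each": discordant_each,
    }

for t in (0.0, 0.5, 1.0, 3.0):  # hypothetical branch lengths
    print(t, gene_tree_probs(t))
```

With a very short internal branch all three gene trees are nearly equally likely, whereas with a long one almost every gene agrees with the species tree. This is why sampling many genes helps.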

The solution is to use more samples per species than one, to use more genes for the molecular analysis than one, and to use species tree methods. As indicated in my earlier post on species tree software, there is a parsimony approach to this issue. In fact there are two different ways of doing species tree parsimony, depending on what kind of gene trees we are dealing with.

Sunday, April 20, 2014

Easter at Guerilla Bay

Happy pagan fertility festival everybody!

Over the Easter weekend, we were very generously invited to a colleague's holiday home on Guerilla Bay. The weather was absolutely perfect, and we had a great time.

Guerilla Bay itself. The surf was fantastic, but admittedly I did not make much use of it myself because I have a cold.

A rock arch in the bay; I just love the weird shapes produced by erosion on the Australian coasts.

My and a colleague's daughter floating pumice in a bucket. The beach was covered in the material. I was told that a volcano had erupted under the ocean near New Zealand a few weeks ago, and that the rocks had drifted ashore ever since.

There were few native plants still in flower at the coast, but on the way there I photographed this specimen of the paper daisy Coronidium scorpioides. Actually, it may have a different name now because the species group it belongs to has just been revised by Neville Walsh of the Royal Botanic Garden in Melbourne, but I do not have a copy of the manuscript at home so I cannot check...

Wednesday, April 16, 2014

Another good one

Because I show a "level of intricacy in my work" which makes the journal "even more proud", they believe that my works "should be known to the mankind of science". Aha. No idea why I should "capitulate" my manuscript though. In terms of the language, this is one of the worst journal spam messages I have ever seen. Perhaps the chap who runs this business from a garage in Pakistan or Nigeria or whatever should consider taking an introductory English course before pretending to be a scientific publisher?

Oh, and peer review of only seven days from submission to publication? Yeah, right...

Tuesday, April 15, 2014

Is phylogenetic diversity flawed?

As indicated in my previous post on phylogenetic diversity (PD), what I really want to discuss is a recent paper, Kelly et al. (2014), that casts doubt on the utility of PD for conservation decisions. Again, understanding what this is about will require some digressions and explanations, but as we will see the main point is ultimately surprisingly banal.

As illustrated with the example of many species of grasses versus fewer species of oaks, lilies, grasses and ferns, the background is that PD is supposed to provide a metric for a form of diversity that we want to conserve. Now many of us would say that this form of diversity is evolutionary distinctness and be happy; we intuitively consider isolated lineages to be worthy of special protection. However, PD is often advertised as a proxy of what Kelly et al. call "feature diversity" (subsequently FD). That is, what we want to conserve in an area is not phylogenetic diversity per se but instead maximum diversity of some (morphological? ecological? genetic?) features. However, it is assumed that the more distant two species are in their evolutionary relationships, the more different they are likely to be in any given feature, and that is why we assume that high PD means high diversity of conservation relevant features.
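As a reminder of what the metric does, here is a minimal Python sketch of PD computed as the sum of the branch lengths connecting a set of species to the root of the tree. The tree, its branch lengths and the species labels are all made up for illustration, and including the path to the root is one common convention rather than the only one.

```python
# Hypothetical tree: child -> (parent, branch length); lengths are invented.
TREE = {
    "fern": ("root", 10.0),
    "angiosperms": ("root", 6.0),
    "oak": ("angiosperms", 4.0),
    "monocots": ("angiosperms", 1.0),
    "lily": ("monocots", 3.0),
    "grasses": ("monocots", 2.0),
    "grass_a": ("grasses", 1.0),
    "grass_b": ("grasses", 1.0),
    "grass_c": ("grasses", 1.0),
    "grass_d": ("grasses", 1.0),
}

def phylogenetic_diversity(tips, tree=TREE):
    """Sum the lengths of all branches on the paths from the tips to the root."""
    branches = set()
    for tip in tips:
        node = tip
        while node in tree:  # walk up until the root is reached
            branches.add(node)  # each branch is named after its lower node
            node = tree[node][0]
    return sum(tree[node][1] for node in branches)

print(phylogenetic_diversity(["grass_a", "grass_b", "grass_c", "grass_d"]))  # 13.0
print(phylogenetic_diversity(["fern", "oak", "lily", "grass_a"]))  # 27.0
```

In this toy tree, four grasses share most of their history and so add up to far less PD than one species each of fern, oak, lily and grass.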

Kelly et al. set out to test this assumption. Maybe one of their thoughts was that it may not actually be true because of convergence. One could argue that it doesn't make a lot of difference whether we protect a grass in the Poaceae family as long as there is a very similar looking grass-like Cyperaceae around. They have converged on the same ecological niche, so same thing really, right? But that is already the interpretation; perhaps it behoves me to stay with the methodology for the moment.

Friday, April 11, 2014

Phylogenetic diversity

Yesterday we discussed in our local journal club a recent paper arguing that the concept of phylogenetic diversity is flawed, or at a minimum not useful as a proxy for what the authors call "feature diversity".

Obviously to make sense of what I just wrote, a bit of background is needed. I will therefore use this post to explain what phylogenetic diversity is, and then discuss the actual problem (if there is one) the next time.

Thursday, April 10, 2014

Botany picture #152: Mentha requienii

Due to a conversation with somebody at work, the true mints have recently featured on my mind. They have always been one of my favourite genera, and I actually had a nice collection of species and hybrids on my balcony before I came to Australia. Due to quarantine restrictions, I had to leave them all behind and gave them away to colleagues. This one is the rather unusual Mentha requienii (Lamiaceae), a very small, creeping species with a pungent minty scent. The picture was taken in 2009 in a botanic garden in Europe; unfortunately I don't remember which.

Tuesday, April 8, 2014

Character optimisation in parsimony phylogenetics

As mentioned in my last post on parsimony analysis, there are different forms of parsimony that are used in the reconstruction of phylogenetic relationships. We could describe them as different ways of counting the necessary number of character changes to explain a given phylogenetic tree.
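To illustrate how the way of counting changes the result, here is a minimal Python sketch that scores one multistate character on one tree under two cost schemes: unordered (every change costs one step, often called Fitch parsimony) and ordered (a change from 0 to 2 costs two steps, often called Wagner parsimony). The tree and the character states are hypothetical, and the dynamic programming used is the general Sankoff approach rather than anything specific to this series of posts.

```python
def sankoff(tree, states, cost):
    """Minimum total cost of one character on a rooted tree.

    tree is a nested tuple whose leaves are observed states;
    cost(a, b) is the price of changing from state a to state b.
    """
    def node_costs(node):
        if not isinstance(node, tuple):  # leaf: only the observed state is allowed
            return {s: (0 if s == node else float("inf")) for s in states}
        child_costs = [node_costs(child) for child in node]
        return {
            s: sum(min(cost(s, t) + cc[t] for t in states) for cc in child_costs)
            for s in states
        }
    return min(node_costs(tree).values())

def unordered(a, b):
    return 0 if a == b else 1  # any change costs one step

def ordered(a, b):
    return abs(a - b)  # 0 -> 2 has to pass through 1

tree = ((0, 2), (2, (0, 1)))  # hypothetical tree with five terminals
print(sankoff(tree, (0, 1, 2), unordered))  # 3
print(sankoff(tree, (0, 1, 2), ordered))    # 4
```

The same character on the same tree needs three steps if the states are unordered but four if they are ordered, so the choice of optimisation can change which tree comes out as shortest.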

Monday, April 7, 2014

Botany picture #151: Tagetes lemmonii

Tagetes lemmonii (Asteraceae), Royal Botanic Gardens of Sydney, 2011. This species comes from North America and is apparently a popular ornamental in some areas. What I found particularly striking about it was the pungent aromatic smell of the foliage; I am always fond of aromatic shrubs, probably because I worked on a genus of Lamiaceae in my Ph.D. project.

Friday, April 4, 2014

An addendum on Zander's Framework

Back in January and February I wrote a few posts on Richard Zander's A Framework for Post-Phylogenetic Systematics while I was reading the book:

Why we don't consider supraspecific taxa to be ancestral

No arguments from authority please, even if it is Charles Darwin

Two possible meanings of the term "pseudoextinction"

Can parsimony analyses be misled by 'budding' speciation?

Can we trust molecular phylogenetics?

Although I admit that my reading got a bit less attentive towards the end, partly due to the rather repetitive style of the work, I considered myself to be done with the book. Today on the bus, however, I took it with me to deposit it on the bookshelf at work, and while stuck in a traffic jam I read once more over the chapter titled Contributions of Molecular Systematics.

On page 58, Zander argues that one should assign a support value to entire phylogenetic trees (which he strangely insists on calling cladograms although most of them have branch lengths):

Whole cladograms are seldom provided with confidence intervals (here including posterior probabilities) that reflect their perceived chance of being correct. In the literature, however, many cladograms are used in their entirety to model broad conclusions, e.g., many genera grouped into multiple families. These cladograms are commonly viewed as "mostly correct." But what does "mostly correct" mean? The binominal confidence interval (BCI) is here advanced to provide a measure of confidence in whole cladograms that are used for broad conclusions. It provides the proportion of nodes (or internodes) with Bayesian support measures that one can expect to be correct all at once than total nodes being correct at once, defining "correct" as joint probability of at least 0.99.

Maybe it is a language issue because I am not a native speaker of English, but my understanding of what the terms "confidence interval" and "to model" mean differs from how they are used here, and the last sentence does not appear to be complete. After mentioning some example numbers, he continues as can be expected:

Therefore, for most cladograms that are published and used for broad conclusions, the confidence in those cladograms, each used as a whole, seldom reaches 0.95, a standard for confidence in statistics.

I have given this issue some thought and I really do not understand why I should care about the overall support for the phylogeny as a whole. A simple thought experiment should get the point across.
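The arithmetic behind such whole-tree numbers is easy to sketch. The following minimal Python example is not Zander's binomial confidence interval itself; it simply multiplies per-node support values, which assumes that the nodes are independent of each other. That assumption and the example values are mine and purely illustrative.

```python
import math

def joint_support(node_supports):
    """Probability that all nodes are correct at once, assuming independence."""
    return math.prod(node_supports)

def max_nodes(per_node, threshold):
    """Largest number of equally supported nodes whose joint support stays >= threshold."""
    return math.floor(math.log(threshold) / math.log(per_node))

print(joint_support([0.99] * 20))  # about 0.82
print(max_nodes(0.99, 0.95))       # 5
```

Under these assumptions even a tree in which every single node has a support of 0.99 drops below 0.95 as a whole once it has more than five nodes, which says more about multiplying many numbers smaller than one than about the quality of any individual clade.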

Thursday, April 3, 2014

Botany picture #150: Mniodendron comosum

Another picture from our holiday in Tasmania last year. Patrick Dalton, bryologist at the University of Tasmania, was kind enough to identify this pleurocarpous moss as Mniodendron comosum, commonly known as 'palm moss'. He added that it was one of the "most handsome mosses in our flora", and I think I agree.

Tuesday, April 1, 2014

Phylogenetic analysis using the parsimony criterion

One of the simplest ways to reconstruct the phylogenetic relationships between different organisms is parsimony analysis. As explained in the previous post of this series, the principle as applied to tree inference is very straightforward: compare possible solutions by counting the number of events in each and accept the solution that needs the smallest such number.

Now what are the events, and how does that work in practice?
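As a first taste, here is a minimal Python sketch of the principle: count the character changes that each candidate tree requires and keep the tree with the smallest count. The counting uses the Fitch algorithm for unordered characters, and the four taxa, their character matrix and the three candidate trees are hypothetical.

```python
def fitch_length(tree, states):
    """Minimum number of changes for one character on a tree of nested 2-tuples."""
    changes = 0

    def visit(node):
        nonlocal changes
        if isinstance(node, str):  # terminal: its observed state
            return {states[node]}
        left, right = (visit(child) for child in node)
        if left & right:  # children can agree, no change needed here
            return left & right
        changes += 1  # children disagree, one change is required
        return left | right

    visit(tree)
    return changes

def tree_length(tree, matrix):
    """Total number of changes over all characters in the matrix."""
    n_chars = len(next(iter(matrix.values())))
    return sum(
        fitch_length(tree, {taxon: row[i] for taxon, row in matrix.items()})
        for i in range(n_chars)
    )

# Hypothetical data: two characters group A with B, one groups B with C.
MATRIX = {"A": "110", "B": "111", "C": "001", "D": "000"}
CANDIDATES = {
    "AB|CD": (("A", "B"), ("C", "D")),
    "AC|BD": (("A", "C"), ("B", "D")),
    "AD|BC": (("A", "D"), ("B", "C")),
}

lengths = {name: tree_length(tree, MATRIX) for name, tree in CANDIDATES.items()}
best = min(lengths, key=lengths.get)
print(lengths, best)
```

In this toy data set the tree grouping A with B needs four changes, against six and five for the alternatives, so it is the most parsimonious solution.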