Monday, September 18, 2017

Some quick, handy references

I just read that an Australian Senator called Fierravanti-Wells said the following:
I believe that marriage is between a man and a woman ... coming together in one unique union. That is what it has been for every culture, in every ethnicity, in every faith in every corner of the world for thousands and thousands of years.
Suggested reading:
The primary literature is cited in those entries. Also:
It would appear that the premise of this particular argument is demonstrably false. It would further appear that it is also not a logically valid argument, regardless of the truth or falseness of its premise:

Thursday, September 14, 2017

Botany picture #253: Coronidium waddelliae


Coronidium waddelliae (Asteraceae), Blue Mountains, 2016. This is a very pretty perennial everlasting paper daisy that likes a bit of elevation. It can, for example, also be found in the lower parts of the Australian Alps.


Although only created in 2008 as part of the dismantling of the formerly polyphyletic Helichrysum, the genus Coronidium is unfortunately polyphyletic itself; the species that can be seen in an earlier post is a representative of the group that makes it so. What is more, even if that group were kicked out, the rest (to which C. waddelliae belongs) would still be paraphyletic to the well known golden everlasting genus Xerochrysum.

We are working on it.

Monday, September 11, 2017

What exactly is new about New Atheism?

On Sunday I went back to the book fair with my family, and of course I bought another few books. One of them is a collection of essays written by Bertrand Russell. As its title is Why I am not a Christian it is unsurprising that its first chapter is his talk of the same title, which he originally gave in 1927.

Summarising in order, the talk makes the following points:

He starts by giving his definition of Christian. For Russell this requires at a minimum belief in the existence of a god, in immortality, and that Jesus Christ was "the best and wisest of men".

Next, Russell disposes of several common arguments for the existence of God, observing along the way that the most frequently used arguments have become less respectable over time. The first cause argument falls flat the moment somebody asks "who made God?", because if God is allowed not to have an explanation then one could just as well allow the universe not to have an explanation.

The natural law argument does not work because it conflates human laws, which are prescriptive and indeed have law-givers, with natural laws, which are merely descriptive, merely scientific descriptions of what happens instead of prescriptions of what should happen. As such they do not need a law-giver. Russell also points out that science has shown them to be largely "statistical averages such as would emerge from the laws of chance; and that makes this whole business of natural law much less impressive than it formerly was." Finally he adds a Euthyphro style argument, that laws are not really laws if God just made them up, but that God is not required if they are truly laws of nature.

The argument from design was destroyed by Charles Darwin, and in that context Russell also introduces the argument from evil to show that the world does not look as if it was created by a benevolent, omnipotent being.

The moral argument is quickly disposed of by applying the Euthyphro dilemma.

Russell calls the argument for the remedying of injustice, i.e. the idea that god must exist or else there would be no ultimate justice in the world, very "curious", and I can only agree. I have only once seen it used in seriousness, and it is such blatant wishful thinking that it hardly needs refutation.

Having dealt with the existence of God, Russell transitions to the character of Christ. He calls "excellent" several of Jesus' teachings that I would consider unrealistic, for example 'turn the other cheek', but sarcastically points out that Christians do not actually follow those teachings. ("I have no doubt that the present Prime Minister, for instance, is a most sincere Christian, but I should not advise any of you to go and smite him on one cheek. I think you might find that he thought this text was intended in a figurative sense.")

As an aside, Russell mentions that "historically it is quite doubtful whether Christ ever existed at all".

More importantly, Russell says that there are "defects" in the teachings of Jesus, the character of the gospels, most prominently that he mistakenly believed that the end of the world was imminent and that he believed in and took "a certain pleasure" in hell, i.e. eternal torture. The undeserved killing of a fig tree also gets a mention.

At this point Russell has explained why he is not a Christian. He now deals with the idea that even if religion is wrong it should still be promoted because it makes people behave morally by pointing out that it does the exact opposite. "You find as you look around the world that every single bit of progress in humane feeling, every improvement in the criminal law, every step towards the diminution of war, every step towards better treatment of the coloured races, or every mitigation of slavery, every moral progress that there has been in the world, has been consistently opposed by the organised Churches of the world."

The talk ends by arguing that fear of the unknown and of death is the foundation of religion, and that it is time to dispose of it and build a good world on a new foundation: "Science can help us to get over this craven fear in which mankind has lived for so many generations."

-----

Russell was certainly an excellent writer, at least to my taste. He was concise, clear, and to the point. But what struck me most when I read this talk / essay is that there really is no New to what has been called New Atheism these past fifteen years or so, i.e. the movement often considered personified by Richard Dawkins, Sam Harris, Daniel Dennett and Christopher Hitchens.

Because what really is its claim to novelty? Perhaps the first thing that comes to mind is the claim that religion is not just wrong but harmful, and that its influence should be reduced. But go back a few paragraphs and you will see that Russell said the same in 1927.

Another idea is that its novelty might be in the view that science in particular has made belief in gods untenable, a position that is often derided as 'scientism' by philosophers who believe that they have a monopoly on refuting religious beliefs. Again, nothing new: where today some New Atheist might argue from evolution, astrophysics and neuroscience, a hundred years ago an atheist like Russell argued from evolution and astrophysics. And to be honest, neuroscience has found nothing in the last thirty years that refutes the concept of an immaterial soul more thoroughly than what people could already observe in the bronze age, for example that a strike to the head or drinking alcohol confuses our thinking.

Even rather specific side-issues have remained surprisingly unchanged. Richard Carrier et al. have in recent years made a lot of waves with the argument that Jesus never existed, and would you not know it, ninety years ago Russell mentioned this idea in a tone that suggests it was fairly widely accepted among educated people.

Really I don't think that arguments for or against gods have made much progress since 1859, and if somebody wanted a short but reasonably thorough introduction to atheist thought they would even today be well served with reading Bertrand Russell's Why I am not a Christian.

Saturday, September 9, 2017

Book fair o'clock

It is the time of the Lifeline charity book fair again. Unfortunately I had to go alone today, but tomorrow we hope to get the whole family there. The loot so far:

Susanna Clarke, Jonathan Strange & Mr Norrell. Fantasy alternate history, as in the Napoleonic times with magic. I have read good things about this book, so I'm happy to give it a try.

Bertrand Russell's Best, edited by Robert Egner

Terry Jones, Douglas Adams's Starship Titanic. If I understand correctly this is a book based on a computer game with which I am not familiar.

Anne McCaffrey, Dragonflight. First novel of the Dragonriders of Pern series. The series has lots of volumes, and I could have bought more of them, but who knows how they are?

Anne McCaffrey, Dragonquest. Second novel in that series.

Kirk Mitchell, Cry Republic. As a teenager I read the trilogy of which this is the third novel in German translation. It is an alternate history story in which the Roman Empire never collapsed and has discovered electricity, steam and flight. Just noticed that the praise blurb on the front cover quotes Anne McCaffrey.

Great Dialogues of Plato, translated by W.H.D. Rouse. Continuing my education in classics, which perhaps should have happened in late high school but didn't.

Sean Williams, The Stone Mage and the Sea. I am afraid part of the reason I bought it is that I got confused and thought the author was Tad Williams. Ah well.

And for work:

H.T. Clifford & Gwen Ludlow, Keys to the Families and Genera of Queensland Flowering Plants. From the 1970s, but will still be useful.

Nicholas Gotelli, A Primer of Ecology. Having been trained as a systematist I am hoping to get a bit more insight into ecological modeling, and it includes a chapter on island biogeography that looks promising.

Andrew Young & Geoffrey Clarke, Genetics, Demography and Viability of Fragmented Populations. Because of a project I am currently involved with.

Two observations. First, as always I come home with loads of books but could only bring myself to donate two. Some day we will have to expand our book shelves, and I have no idea where. Second, it is astonishing how there are numerous copies of some books (e.g. McCaffrey's Nerilka's Story) but none whatsoever of other, one would think, equivalent books (e.g. the third volume of the same series).

Also, are the frequent ones frequent because they were so much more popular when they were published, or because nobody wants them now? On that note, it was interesting to see that the most frequent book in the "all faiths" section was The God Delusion. It was all the rage a decade ago, and I assume now lots of people think they don't need it anymore.

Tuesday, September 5, 2017

Botany picture #252: Stapelia


With spring in the air we bought some plants lately, including some succulents, and that brought my mind back to some of the plants I had in Europe but gave away when moving to Australia. This is a Stapelia, presumably S. grandiflora (Apocynaceae), in the Old Botanic Garden of Göttingen, Germany, 2016. I had a plant of the same species but of course much smaller.

Unfortunately they seem to be hard to get in Australia. The closest I saw were online stores that had them in the catalogue in theory but not actually in stock.

What is so amazing about these cactus-like plants that are not cacti at all is, of course, their flower morphology, and that morphology is dictated by their pollination ecology. The flowers mimic rotting meat in colour, scent and sometimes even hairiness and are pollinated by confused carrion flies. Some people wonder why one would want to have a stinking flower, but I believe that is an indispensable part of its wonderful weirdness.

Saturday, September 2, 2017

Having fun with biodiversity databases

If you have ever used a biodiversity database professionally, you will soon have noticed that we still have a long way to go before they are as reliable as we would like them to be.

Today I looked into the Atlas of Living Australia records for Senecio australis (Asteraceae). Except for a rather odd specimen from South Africa the distribution records look like this:


What do we have here? First, the four Australian mainland records all appear to be misapplications of the name. The Flora of New South Wales, for example, does not even mention the species, so I think we can safely assume it does not occur in Australia at all.

Second, the record in the middle of the top of the map is right in the ocean, no matter how closely we zoom in. If we look into its details, we see that it was collected on Norfolk Island, which is the cluster of red dots to its right, so somebody must have got the coordinates rather wrong.

Third, there is a cluster around Auckland, on New Zealand's North Island. I am not sure if Norfolk Island and North Island is a plausible area of distribution for this species, but it may well be. Zooming in closer to Norfolk Island, however, ...


... it looks as if somebody had played darts after having had a few too many beers. ALA informs us dryly under the section data quality tests, "habitat incorrect for species". No kidding. Or as my wife joked, unwilling to believe that the coordinates would be so badly off for such a large percentage of the specimens, "is there a fish that is also called Senecio australis?"

These are the problems that we are dealing with, more generally.
  • Whenever we do a study using data from biodiversity databases, as we increasingly do, we have to be very careful about cleaning the data. The main issues are outdated taxonomy, misidentifications, spatial data entry errors (which are particularly easy to recognise if an outlier record is exactly ten degrees away from a known occurrence), and imprecise spatial data. Just think of what it would do to species distribution modeling if we uncritically accepted all the records for Senecio australis.
  • While we can identify obvious mistakes while using a database, the data are "ground-truthed" in the actual specimens in some herbarium or museum, and the policy is usually (and quite sensibly) that the database won't update until a correction is made to that specimen in its home institution and then filters through from there. But many institutions do not have the resources to update data just because somebody sent them an email pointing out that their specimen is misidentified or that they made a data entry error; many herbaria on the planet are so understaffed that even the word understaffed is a euphemism. What is more, even if a database allows a registered user to annotate a record with corrections, the information may not necessarily flow back to the institution holding it, depending on whether somebody thought to set up such procedures or not.
  • Overall, Australia actually has excellent data quality, the Atlas of Living Australia actually allows annotations to be made, and several important Australian herbaria actually have the staffing to update their data. What I am saying is that this is as good as you can have it at the moment. It is much more difficult in many other parts of the world, and of course it would be good if we could have the same or better data quality for those areas.
Also, perhaps it is best not to think too much about records that are not specimen-based but "human observations" or photos submitted by random people. There are, obviously, non-taxonomists whose knowledge of the flora is extensive, and citizen science can be awesome, but I have also seen several cases along the lines of "aaargh ... this is so misidentified it is not even the right tribe of the family, and now the database is using it as the profile picture of this nationally significant weed species!"
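The "exactly ten degrees away" check mentioned in the list above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function name, the tolerance, and the approximate coordinates for Norfolk Island and Auckland are made up for the example and are not taken from the ALA records.

```python
def looks_like_offset_typo(record, known, offset=10.0, tol=0.01):
    """Return True if `record` (lat, lon) matches a known occurrence in one
    coordinate and is `offset` degrees away from it in the other.

    `tol` is an arbitrary tolerance in degrees (an assumption of this sketch).
    """
    lat, lon = record
    for known_lat, known_lon in known:
        dlat = abs(lat - known_lat)
        dlon = abs(lon - known_lon)
        shifted_lon = dlat <= tol and abs(dlon - offset) <= tol
        shifted_lat = dlon <= tol and abs(dlat - offset) <= tol
        if shifted_lon or shifted_lat:
            return True
    return False


# Approximate, illustrative coordinates (Norfolk Island, Auckland).
known = [(-29.03, 167.95), (-36.85, 174.76)]

# A hypothetical record with the longitude mistyped by ten degrees.
suspect = (-29.03, 157.95)
flagged = looks_like_offset_typo(suspect, known)
```

A check like this only catches one specific kind of data entry error; records scattered in the sea around an island, as in the map above, would need a separate test against a coastline or habitat layer.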

Wednesday, August 30, 2017

Botany picture #251: Schizaea bifida


I have been meaning to post something more substantive, but I have been very busy with other things. So here is another plant, and a strange-looking one: Schizaea bifida (Schizaeaceae), a little fern from the Blue Mountains, 2011. Yes, except for the rootstock that is it, that's the whole fern. Some species of Schizaea have finely divided sterile leaves, but as the Flora of NSW Online remarks, in this species they are "rarely present", so what you see is a cluster of green lines with sporangia on top, the fertile leaves in all their splendour. But who is complaining? I love weird plants.

This was one of the plant groups I had only read about before coming to Australia, so when I saw it during a family holiday I was very happy. Sadly I have not again run into the genus since that one time six years ago.

Friday, August 25, 2017

Phylogenetic trees in XML format

I recently had the extremely frustrating experience of having had to look into how phylogenetic trees are coded in XML format. To illustrate why this was frustrating, let us start by considering a small phylogenetic tree as an example. I got one sequence each for the genera Nassauvia, Erigeron, Xerochrysum, Matricaria, Lactuca, Senecio, Ursinia, Calycera, Kippistia, and Synedrella from Genbank, and produced a likelihood tree in PAUP. In graphical representation it looks as follows.



So how is it generally saved? The most concise way of storing a phylogenetic tree in plain text files is the venerable and widely accepted Newick standard. It consists of OTU names separated by commas and grouped into clades by round brackets. There may be numbers after colons, which are branch lengths, and if there is a number directly after a closing bracket it indicates some kind of support value, such as bootstrap or Bayesian posterior probability. The Newick representation of my little example tree is as follows.
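To show how little is needed to read this format, here is a minimal Python sketch of a Newick reader. It is not the example tree from this post but a hypothetical four-taxon tree with made-up names, branch lengths and support values, and it assumes that any label directly after a closing bracket is a numeric support value; quoted names and comments are not handled.

```python
def parse_newick(text):
    """Parse a Newick string into nested dicts with the keys
    name, support, length and children."""
    text = text.strip().rstrip(";")
    pos = 0

    def parse_node():
        nonlocal pos
        node = {"name": None, "support": None, "length": None, "children": []}
        if text[pos] == "(":
            pos += 1  # skip the opening bracket
            while True:
                node["children"].append(parse_node())
                if text[pos] == ",":
                    pos += 1  # another sibling follows
                    continue
                pos += 1  # skip the closing bracket
                break
        # Label: a taxon name for tips, a support value for clades.
        start = pos
        while pos < len(text) and text[pos] not in ",():":
            pos += 1
        label = text[start:pos]
        if label:
            if node["children"]:
                node["support"] = float(label)
            else:
                node["name"] = label
        # Optional branch length after a colon.
        if pos < len(text) and text[pos] == ":":
            pos += 1
            start = pos
            while pos < len(text) and text[pos] not in ",()":
                pos += 1
            node["length"] = float(text[start:pos])
        return node

    return parse_node()


def leaf_names(node):
    """Collect the tip names of a parsed tree from left to right."""
    if not node["children"]:
        return [node["name"]]
    return [name for child in node["children"] for name in leaf_names(child)]


# Hypothetical tree, not the one produced in PAUP above.
tree = parse_newick("((A:0.1,B:0.2)95:0.05,(C:0.3,D:0.4)80:0.1);")
tips = leaf_names(tree)
```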



Again, very concise. If we want just a tiny bit more bells and whistles we can use the Nexus format. In the context of phylogenetic trees it is just the Newick format plus "#nexus [line break] begin trees;" at the beginning and "end;" after the trees, and then each Newick tree has "tree [name of tree] =" in front of it and is terminated by a semicolon. The main advantage is that multiple trees in the same file can now have informative names, whereas in a Newick file they cannot.
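The wrapping described in this paragraph is simple enough to sketch in Python. The function name and the two tree names are hypothetical, and the sketch only writes the bare trees block described above, without a taxa block or translate table.

```python
def newick_to_nexus(trees):
    """Wrap named Newick strings in a minimal Nexus trees block.

    `trees` maps a tree name to its Newick string.
    """
    lines = ["#NEXUS", "begin trees;"]
    for name, newick in trees.items():
        newick = newick.strip()
        if not newick.endswith(";"):
            newick += ";"  # every tree statement ends with a semicolon
        lines.append(f"  tree {name} = {newick}")
    lines.append("end;")
    return "\n".join(lines) + "\n"


# Two hypothetical trees with informative names.
nexus_text = newick_to_nexus({
    "likelihood_tree": "((A,B),(C,D));",
    "parsimony_tree": "(A,(B,(C,D)))",
})
```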

If we want to find out how this would look in XML format, we can head over to the nexml.org website, where we will find an online tool that can transform our boring old Newick or Nexus trees into shiny, exciting, newfangled NeXML trees (for Nexus-inspired XML I guess, although as we will soon see there isn't really any similarity at all). Of course for this post I have done that with the example tree.



So, what do we see? As the name XML implies, the format is similar to HTML in that it consists largely of nested sets of tags starting with is-smaller-than signs and ending with is-larger-than signs. But those are just the optics. What about functionality?

As a Newick file, my phylogenetic tree was 448 bytes in size. After transformation into NeXML, the new tree file is now 2645 bytes in size, an increase of 490%. This has several obvious benefits, in particular for the results of Bayesian analyses where thousands of trees have to be saved and may take up megabytes even in Newick format; for example, I can't think of any right now.

And I am not even going to go into how NeXML encodes data matrices beyond observing that it appears to require a tag assigning character type for every individual character. In other words, instead of saying something like "characters 1-9000 are anonymous genome-wide SNPs with the possible states 0, 1, 2 and ?", as in Nexus files, you would have 9000 lines of code (!) each saying "character 4306 is a SNP character" and then "character 4307 is a SNP character", and so on, wasting enormous amounts of disk space and/or bandwidth. Efficiency!

More generally, the structure of the tree coded as NeXML is extremely convoluted compared to what it looks like in Newick format. Newick is, as mentioned above, a set of nested brackets indicating clades; consequently it can be examined and read relatively easily, and even allows the user to copy subtrees in or out in manual editing (it helps if you have a text editor like SciTE that shows which brackets belong together). In fact I have often produced hypothetical example trees to illustrate a point on this blog by typing them out in Newick format and then opening them in a tree viewer. NeXML, however, has a list of nodes and edges that are referring to each other via obscure identifiers, making it virtually impossible to read, type out and edit manually, especially for larger trees. But I am sure XML makes life easier for the end user because please insert reasons here.

Next, imagine writing a program that should be able to read a phylogeny. If you want it to read a Newick tree, you merely need to parse nested brackets, recognise taxon names, and deal with branch length and support value annotations; this is relatively straightforward. If you want it to be able to read NeXML trees, on the other hand, it needs to be able to handle a large number of possible tags in varying order, plus various parameters in each tag that can appear in varying order (<node id="ne16" otu="ou27" label="Senecio_vulgaris"/> could just as well be <node otu="ou27" label="Senecio_vulgaris" id="ne16"/>, for example). This makes life easier for programmers because I'm sorry I really have no idea. But I mean, the nexml.org website says that this format is "more easily validated and processed", so that must be true, right? Otherwise they wouldn't claim so, would they?
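For comparison, here is a minimal Python sketch of what reading a node-and-edge list involves. The XML fragment is a simplified, namespace-free stand-in that I made up for illustration; the element and attribute names (node, edge, id, label, root, source, target, length) are assumptions modelled on the node tag quoted above, not a complete or validated NeXML file. A standard XML library takes care of attribute order, but the tree still has to be reassembled by following the identifiers.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified fragment; attribute order varies on purpose.
FRAGMENT = """
<tree id="tree1">
  <node id="n1" root="true"/>
  <node id="n2"/>
  <node id="n3" label="A"/>
  <node label="B" id="n4"/>
  <node id="n5" label="C"/>
  <edge id="e1" source="n1" target="n2" length="0.05"/>
  <edge id="e2" source="n2" target="n3" length="0.1"/>
  <edge target="n4" source="n2" id="e3" length="0.2"/>
  <edge id="e4" source="n1" target="n5" length="0.3"/>
</tree>
"""


def node_edge_xml_to_newick(xml_text):
    """Rebuild a Newick string from a list of nodes and edges."""
    tree = ET.fromstring(xml_text.strip())
    nodes = tree.findall("node")
    labels = {n.get("id"): n.get("label") for n in nodes}
    children = {}  # parent id -> list of child ids
    lengths = {}   # child id -> branch length leading to it
    for edge in tree.findall("edge"):
        children.setdefault(edge.get("source"), []).append(edge.get("target"))
        lengths[edge.get("target")] = edge.get("length")
    root = next(n.get("id") for n in nodes if n.get("root") == "true")

    def render(node_id):
        kids = children.get(node_id, [])
        if kids:
            text = "(" + ",".join(render(k) for k in kids) + ")"
        else:
            text = labels[node_id]
        if lengths.get(node_id) is not None:
            text += ":" + lengths[node_id]
        return text

    return render(root) + ";"


newick = node_edge_xml_to_newick(FRAGMENT)
```

The result is the 27-character string ((A:0.1,B:0.2):0.05,C:0.3); which carries the same information as the nine tags it was built from.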

While on the topic of phylogenetics software, to the best of my knowledge none of the programs that I currently use or have seriously used in the past can read or write phylogenies in XML format. BEAST, PAUP, and MrBayes produce Nexus files, TNT exports its own idiosyncratic format or Nexus, and RAxML produces Newick files. (BEAST uses famously convoluted XML input files, but even here the assumption is that most users import Nexus data matrices into the GUI BEAUti. At any rate it does not save its output as NeXML.) Mesquite, which uses Nexus as its default format, is supposed to be able to export into NeXML format once we install a certain add-on library, but when I tried to do such a conversion I merely got an incomprehensible crash report.

Perhaps more to the point, if NeXML phylogenies produced by some obscure phylogenetics software that I never employ myself are supposed to be of use they have to be displayed, so how are we doing for tree viewers? The very popular cross-platform software FigTree expects Nexus or Newick phylogenies, and as far as I know the same is true for TreeView. DendroScope claims to read NeXML files but then only gave me an error message when I tried to import the simple example phylogeny after conversion by the official nexml.org website. To quote from that same website, "the future data exchange standard is here!"

While on that topic, standardisation is one of the main benefits claimed by NeXML or by XML more generally. As Simon St. Laurent wrote already in 1998:
XML allows developers to set standards defining the information that should appear in a document, and in what sequence. XML, in combination with other standards, makes it possible to define the content of a document separately from its formatting, making it easy to reuse that content in other applications or for other presentation environments. Most important, XML provides a basic syntax that can be used to share information between different kinds of computers, different applications, and different organizations without needing to pass through many layers of conversion.
I guess at this stage it should come as no surprise at all that there are already at least two different XML standards for phylogenetic trees, which is another way of saying that there is no XML standard for phylogenetic trees. In addition to NeXML, which I have discussed in detail above, there is phyloXML. Where NeXML describes trees using lists of nodes and edges, phyloXML uses nested clade tags, which I find more intuitive and useful because it allows easier parsing and easier manual editing, and which is also more similar in spirit to Newick and Nexus and would thus be more deserving of a name like NeXML than NeXML. Otherwise it appears to be just as inefficient and convoluted though.
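The nested-clade approach maps directly onto Newick's brackets, which a short Python sketch can illustrate. Again the fragment is a simplified, namespace-free stand-in of my own; the tag names (phylogeny, clade, name, branch_length) are assumptions about the general shape of such files, not a validated phyloXML document.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified fragment with nested clade tags.
FRAGMENT = """
<phylogeny rooted="true">
  <clade>
    <clade>
      <branch_length>0.05</branch_length>
      <clade><name>A</name><branch_length>0.1</branch_length></clade>
      <clade><name>B</name><branch_length>0.2</branch_length></clade>
    </clade>
    <clade><name>C</name><branch_length>0.3</branch_length></clade>
  </clade>
</phylogeny>
"""


def clade_to_newick(clade):
    """Turn one clade element and everything nested in it into Newick."""
    kids = clade.findall("clade")  # direct child clades only
    if kids:
        text = "(" + ",".join(clade_to_newick(k) for k in kids) + ")"
    else:
        text = clade.findtext("name") or ""
    length = clade.findtext("branch_length")
    if length is not None:
        text += ":" + length.strip()
    return text


def nested_xml_to_newick(xml_text):
    phylogeny = ET.fromstring(xml_text.strip())
    return clade_to_newick(phylogeny.find("clade")) + ";"


newick = nested_xml_to_newick(FRAGMENT)
```

Because each clade tag contains its own descendants, one recursive function is enough and no identifiers need to be looked up.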

So concerning standardisation I guess the reality is that XML is flexible enough that anybody could come up with a new, XML-based standard. Just think of a few words, put is-smaller-than and is-larger-than signs around them, convince a handful of colleagues to adopt this standard, and off you go. Yes, if it is so easy to do then everybody will do it, and then we achieve the exact opposite of standardisation, but I guess that is where XML proponents can switch to touting its "flexibility". Heads XML wins, tails all other data standards lose.

As far as I can see Newick and Nexus work just fine. Compared to XML phylogenies they are easier to parse, are already standardised, are accepted by virtually every phylogenetics software and tree viewer, and take up a fraction of the disk space. Why fix what isn't broken?

Sunday, August 20, 2017

I still don't get area cladistics, and 'geographic paralogy' in particular

Since I started looking into panbiogeography and area cladistics, I have been curious about the concept of geographic paralogy. The word is used by area cladists (in the widest sense), and I have so far been doubtful about whether the analogy to gene paralogy fits.

To recap, area cladistics attempts to infer biogeographic area relationships from the patterns that species' areas of distribution show on a phylogenetic tree. If, for example, several plant or animal groups show distributions on a phylogeny that are (Africa, (South America, Australia)), i.e. sister lineages are endemic to South America and Australia, and more distantly related lineages are endemic to Africa, then an area cladist would conclude that South America and Australia are "more closely related" biogeographically than either is to Africa, or even that they form a "monophyletic biogeographic area".

Whatever that is supposed to mean, given that the word monophyletic only applies if we presuppose tree-like relationships. But I am getting ahead of myself.

The problem is now that phylogenies do not necessarily show such a simple pattern. Some species may be widespread and occur in several of the areas in the analysis, and of course the same area may occur repeatedly in different parts of the phylogeny. This is what area cladists call 'geographic paralogy', and they 'solve' the problem it poses for their analyses by selecting 'paralogy-free' subtrees from a phylogeny.
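To make the two situations concrete, here is a small Python sketch that replaces the species on a phylogeny with their areas and lists the nodes at which the same area turns up on both sides. The species names and distributions are invented, the trees are written as nested tuples, and the functions are my own illustration of the pattern; this is not the subtree-selection procedure that area cladists actually use.

```python
def areas_of(tree, distribution):
    """Return the set of areas occupied by all terminals under `tree`."""
    if isinstance(tree, str):
        return set(distribution[tree])
    return set().union(*(areas_of(child, distribution) for child in tree))


def to_area_cladogram(tree, distribution):
    """Replace each terminal by its area(s); widespread taxa are joined by &."""
    if isinstance(tree, str):
        return "&".join(sorted(distribution[tree]))
    return tuple(to_area_cladogram(child, distribution) for child in tree)


def overlapping_nodes(tree, distribution):
    """List internal nodes whose child clades share at least one area."""
    if isinstance(tree, str):
        return []
    found = []
    seen = set()
    for child in tree:
        areas = areas_of(child, distribution)
        if seen & areas:
            found.append(to_area_cladogram(tree, distribution))
            break
        seen |= areas
    for child in tree:
        found.extend(overlapping_nodes(child, distribution))
    return found


# Simple case: every area occurs exactly once (hypothetical species).
simple_tree = ("sp1", ("sp2", "sp3"))
simple_dist = {
    "sp1": {"Africa"},
    "sp2": {"South America"},
    "sp3": {"Australia"},
}
simple_areas = to_area_cladogram(simple_tree, simple_dist)

# Messier case: a repeated area and one widespread species.
messy_tree = (("sp1", "sp2"), ("sp3", "sp4"))
messy_dist = {
    "sp1": {"South America"},
    "sp2": {"Australia"},
    "sp3": {"South America"},
    "sp4": {"Africa", "South America"},
}
messy_nodes = overlapping_nodes(messy_tree, messy_dist)
```

In the simple case no node is flagged. In the messy case two nodes are flagged, the root and the clade containing the widespread species, and the code has no way of telling whether duplication before breakup, a widespread ancestor or later dispersal produced the pattern.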

Again, two questions: Does it make sense to call this geographic paralogy, in analogy to gene paralogy? And does it make sense to do area cladistics by cherry-picking 'paralogy-free' subtrees, effectively ignoring these patterns?

I started a conversation with a colleague at the IBC, and he recommended I read Ladiges (1998, "Biogeography after Burbidge", Australian Systematic Botany 11: 231-242) as an introduction to the relevant concepts and approaches. So this I have now done. Unfortunately, the paper did not really solve my conceptual problems. I will start with a few quotes:
In cladistic biogeography, nodes of a cladogram for organisms (1,2 and 3) are potentially informative about the geographic areas (A, B and C) in which they occur: node 2 in Fig. 3 shows that areas B and C are related more closely to each other than to area A.

Such statements of relationship, the nodes of the cladogram, are explained by a variety of historical theories. One is dispersal from a restricted ancestral area, for example from area A to areas B and C, a pattern that may match fossil ages and distribution. An alternative explanation is vicariance of a widespread ancestral species coincident with physical breakup or climatic differentiation of the general area. A vicariance explanation is favoured by evidence of biogeographic congruence: finding the same pattern for other groups of organisms.
So far so good, although I do wonder whether the concept of area relationships makes sense if dispersal is the right answer. It seems to me that even calling it relationships only makes sense if there is no frequent floristic or faunal exchange, if near-everything is due to vicariance. And as I have mentioned before, there are good alternative explanations for congruence that do not imply vicariance, in particular prevailing directions of wind or ocean currents, common routes of migratory birds, etc.

Now come the complications:
Data for any one group of organisms are rarely as simple as the example shown (...). Some taxa are widespread, and some areas have more than one taxon. When combining data for different groups of organisms, not all areas are represented in each taxonomic group. Such complications are obstacles to development of analytical methods for determining area cladograms and general area cladograms.
Well yes, either that or, alternatively, they prove that the concept of an area cladogram is as incoherent as a 'species-level phylogeny' with only human populations as the terminals, and that the research program of area cladistics is a non-starter. Two pages on, the term at the centre of this post is introduced.
I offer two conclusions: (1) that evidence of historical geographic relationship is associated with nodes (not the distribution per se of terminal taxa) and (2) that some nodes of cladograms of organisms are paralogous. (...)

What is geographic paralogy? It is evidenced by duplication or overlap in geographic distribution of taxa related at a node (references). The term has its origin in molecular biology, geographic paralogy being analogous to gene duplication, with each gene copy subsequently tracking a separate evolutionary history.

(...) There is duplication of biogeographic regions across the clades (e.g. South America is in three), which is evidence of geographic paralogy. In other words, the major lineages shown in the cladogram existed prior to the breakup of Gondwana and each potentially reflects that geological history.
Consider what is claimed here. First, as we have seen earlier, simple area relationships that are congruent across lineages are claimed as evidence for vicariance. Now the fact that the same area shows up in several parts of a phylogeny is seen as evidence for paralogy; and this paralogy is also seen as evidence for vicariance and against dispersal. I cannot say that this makes a lot of sense to me.

Having gone through these quotes, I now want to carefully examine the analogy between gene paralogy and geographic paralogy. Let's start with the former. It works like this:



In this and the following figures, we see a grey species tree with species 1, 2 and 3. Within it we see the gene trees, as genes evolve inside the species. Here an originally single gene lineage (blue) was duplicated in the common ancestor of all three species, creating a red gene and a black gene. We now call the alleles A and Y paralogues of each other, because while they are distantly related they are not really the same gene anymore. In contrast, A and B are orthologues of each other. They are really the same gene, only in two different species.



The above figure now shows the problem that gene paralogy can cause in phylogeny reconstruction. If in this case Z is wrongly assumed to be an orthologue of A and B, we will infer the wrong species relationships, i.e. ((1,2),3) instead of the true (1,(2,3)). However, there are also other causes why we may get conflicting or complicated patterns.



In the above case we have the gene tree contradicting the species tree, but nonetheless there is no paralogy because there is only one gene involved. What has happened here is that two versions of the gene arose in an ancestral population, and that subsequent populations were large enough and/or speciation events happened so soon after each other that both copies were carried through to the ancestor of 2 and 3. We call this incomplete lineage sorting (ILS) or ancestral polymorphism. We could also still find all gene variants in all three species. Point is, this is not paralogy.



Something different has happened in the above scenario. We get the same pattern of a gene tree showing ((1,2),3) despite the species phylogeny of (1,(2,3)), but this time because of a hybridisation or introgression event between 1 and 2. Of course, we could also still find the original gene variant in species 2 along with the introgressed one. Again, this is not paralogy.



Now the same for biogeography. Above the scenario where I think the analogy works: There are two clades that arose before continental breakup, and they both independently trace the 'area relationships'. In this case it makes sense to use the two clades or subtrees as independent data points for inference in area cladistics.



Here is the same problem for area cladistics as for phylogenetic inference. If we do not realise that we are treating paralogues as orthologues, we may get species phylogenies and, by analogy, area relationships wrong. So in the case of phylogenetics, people have developed methods for orthologue inference and to exclude paralogues from the data.

What I don't really see is how area cladists do the same. They claim they pick 'paralogue-free subtrees', but that merely means that they search for a statement like ((1,2),3) and remove statements like (1&2&3,(1&2,2&3)). It does not mean that they actually have any way of recognising that ((1,2),3) is an instance of paralogy while (1,(2,3)) isn't. They can merely hope that it comes out in the wash because the true relationship will be more frequent than the wrong ones.

This appears to be rather problematic, unless I am missing something equivalent to orthology inference in phylogenetics. But on top of that we have the other scenarios, those where there really is no paralogy.



The above is the biogeographic equivalent of incomplete lineage sorting. We could imagine here that species C stayed endemic to a part of South America while its sister species was more widespread. If we now also had some species occurring in two areas, area cladists would speak of paralogous nodes, but again, there does not appear to be any paralogy involved.



But really crucial is the biogeographic parallel to gene introgression: dispersal. The above scenario shows what area cladists call paralogy and, as we saw in the quotes above, consider evidence of vicariance, but what reason is there to exclude dispersal as a possible explanation? This is, of course, precisely the pattern that dispersal would produce!

And it is clearly not in any way comparable to gene paralogy anyway, because there are no paralogues involved. It makes no sense to use a term that assumes the existence of two genes independently tracing the species phylogeny (and, by analogy, two species-lineages independently tracing 'area relationships') to refer to any difficult pattern, even where there are no such two deep species-lineages.

In summary, I am still not exactly convinced that area cladistics makes sense. The assumption that pretty much any pattern - congruence as well as the contradictory data from paralogy! - is evidence of vicariance seems particularly hard to swallow.

Tuesday, August 15, 2017

This is not how this works, religious freedom edition

Sometimes it is interesting what raises one's hackles. Today one could get upset about domestic terrorism in the USA or the level of discourse regarding dual citizenships in Australian politics, but what really annoyed me was an opinion piece on the ABC website with the title "same-sex marriage is more complex than the Yes campaign admits".

Basically, the author, one Peter Kurti identified as "a research fellow at the Centre for Independent Studies and the author of The Tyranny of Tolerance: Threats to Religious Liberty in Australia", argues that any changes to laws that will allow same-sex marriage will have to ensure that those who don't like that change can still discriminate against homosexuals:
Freedom of religion extends far beyond the walls of a church or a synagogue.

Schools, charities, and other faith-based not-for-profits, as well as ordinary business people such as bakers, florists, and photographers who wish to uphold the traditional meaning of marriage need to be protected from discrimination and attack if the law on marriage does change. [...]
   
If the law is eventually changed to allow same-sex couples to marry, it should not create an additional entitlement enabling some citizens to force other citizens to act against their religious beliefs or conscience by making them help celebrate same-sex marriages.
The usual disclaimer applies here as to how what I write is my private, non-professional opinion and not necessarily that of any other person or institution that I may be associated with, but I believe I am not stating anything particularly controversial or revolutionary when I now write:
This is not how freedom of religion works.

Let's consider how this would have to work in other cases, if it wasn't complete nonsense.
  • If the law is eventually changed to allow mixed-race couples to marry, it should not create an additional entitlement enabling some citizens to force other citizens to act against their religious beliefs or conscience by making them help celebrate mixed-race marriages.
  • If the law is eventually changed to allow women to seek employment, it should not create an additional entitlement enabling some citizens to force other citizens to act against their religious beliefs or conscience by making them hire women.
  • If the law is eventually changed to allow people to wear yellow shirts, it should not create an additional entitlement enabling some citizens to force other citizens to act against their religious beliefs or conscience by making them provide services to customers wearing yellow shirts.
The logic, and I am using this word loosely because any more appropriate alternative would be impolite, is exactly the same in all four cases. There is not one iota of difference between them.

Freedom of religion, unless it is intended to destroy all personal freedom and tolerance, cannot mean that people get to discriminate against whatever they personally don't like. It means that they are allowed to follow their religious rules, for example by not marrying somebody of the same sex themselves, but it cannot mean that they are allowed to discriminate against others who do not follow those rules.

What really frustrates me is that this is not some random dude who has never looked into how rights and freedoms work and made some thoughtless off the cuff remark during lunch break. This is somebody who has carefully written an opinion piece and got it published by Australia's public broadcasting company on the strength of their authority as a scholar. I assume it would be silly to ask if the piece underwent peer review.

Monday, August 14, 2017

Botany picture #250: Aristolochia clematitis


Aristolochia clematitis (Aristolochiaceae), the only member of its genus in Germany, 2008. I rather like this genus with its weird flowers, but unfortunately I very rarely see them in the wild. It took me years before I ran into this one in Europe, and otherwise I have only seen the odd one or two species in South America. There are certainly none near where I live now.

Friday, August 11, 2017

Does philosophy produce knowledge?

Recently I jumped, rather rashly, into a discussion about the purpose of philosophy. On his blog, philosopher Daniel Kaufman commented on the suggestion by a different philosopher that philosophy PhD students should be banned from publishing and took the opportunity to argue that there were too many PhDs hunting too many jobs, partly because graduate students were exploited as cheap labour, everybody was publishing too much, many of the leading philosophers weren't doing enough undergraduate teaching, and the field had gone down a dangerous and misguided path by assuming that it was a knowledge-generating exercise like science.

In Kaufman's telling, (a) philosophy does not generate knowledge ("these are not the sorts of questions that will ever admit of conclusive answers") but its purpose is to enrich our lives, (b) universities should fund things that enrich our lives even when they do not generate knowledge, and (c) admitting that philosophy does not generate knowledge is the best strategy for minimising future funding cuts, while trying to play scientist will backfire.

I could clearly have expressed myself better, especially in my first comment, but my position is that (a) philosophy can generate knowledge, (b) I do not see why universities should teach things that are self-admittedly non-academic, and (c) I strongly doubt that pitching philosophy as intellectually futile is going to work in its favour.

Note that I am not saying in any way whatsoever that universities should only train people for jobs, quite the opposite. I am merely saying that they are there to produce, manage and transfer knowledge (e.g. history of music or theory of music), while mere amusement or practical skills (e.g. appreciating music or learning to play the piano) are better accommodated in other ways, for example by buying a music CD or paying a private piano teacher. I am also not saying that anything that doesn't produce, manage or transfer knowledge is useless, merely that such a non-academic activity could perhaps better be accommodated outside of the university.

The main point I want to discuss now is, however, the first: can philosophy produce knowledge? The example that I would like to use is that of divine command theory and the Euthyphro dilemma. It may be said that that is very low-hanging fruit, but well, if somebody wanted to show how science can produce knowledge they would also choose something simple like the shape of the earth as opposed to the minutiae of population genetics or quantum physics.

As most people will know, divine command theory is the claim that "what is moral is determined by what God commands, and that for a person to be moral is to follow his commands" (quoted from Wikipedia, 11 Aug 2017). As most people will also know, Plato challenged this idea with the Euthyphro dilemma, which in modern terms is perhaps best summarised as follows:

There are two possibilities. Either the gods command that some action is moral because it is moral by an independent, objective standard. If that is the case, then we can cut out the middleman and conclude that morality does not actually flow from the gods. Alternatively, whatever random thing the gods declare moral is moral merely because the gods say so. If that is how it works, what if the gods commanded you to torture an innocent person to death? Clearly the first option is incompatible with divine command theory, but if the alternative is accepted then the theory is shown to have absurd consequences.

Religious people have, of course, tried to find answers to this dilemma. They seem to fall mostly into two categories, either stating that god would never command something evil, which even if they do not realise it grants that there is an independent standard and thus divine command theory is false, or claiming that the answer is "both", that there is no dilemma. As I have written on this blog before, that latter rebuttal does not work because you can't avoid a bad outcome by accepting two bad options. If a judge asks whether you have murdered your neighbour or whether you got him killed through recklessness, replying "both, your honour" won't clear you either.

I would consequently argue that Plato's philosophy has in this case generated a piece of knowledge: divine command theory does not work. And it was generated through philosophy as opposed to science, as no empirical data were involved.

What possible objections could be raised?

First, this is merely what we might call 'negative' knowledge. We still don't know what to base our moral reasoning on, merely that we cannot base it on "but the gods said so". To this I would respond that there is a clear parallel in science, which can also test and reject hypotheses and models but only ever tentatively (!) keep the ones that are currently not superseded and rejected.

Second, it could be argued that somebody could find a solution to the dilemma or perhaps already has found one. Again the parallel to science should be clear: knowledge is always tentative until somebody comes along and disproves it and/or suggests an even better idea. The fact that we are never omniscient does not mean that we are as ignorant after somebody has thought through a problem as we were before.

Third, it could be observed that there are still plenty of people working as philosophers who accept divine command theory. And once more I would like to point towards the example of science. There are plenty of creationists, even some (if few) biologists; does that mean that biology does not produce knowledge? The only difference is that even the professionals in philosophy share less consensus on what is right than professional biologists.

Here I would argue that how much of a consensus can be achieved in a field depends on two main factors: whether the knowledge produced by the field is important in some kind of practice, and whether there is a lot of motivation to continue accepting a falsehood. Engineering, for example, has immediate and crucial practical applications. If an engineer accepts nonsense, they may construct something that fails embarrassingly, and consequently engineers are very likely to reject nonsense in their field of expertise (this qualifier is obviously important).

Economists, on the other hand, work in a field where things do not just work or fail, but they generally work in favour of either this interest group or that interest group. Even if raising wages would be "better" for the economy as a whole, it might still not be in the interest of individual investors; and even if lowering wages would be "better" for the competitiveness of an economy, it might still not be in the interest of an individual employee who wants more money now. It therefore seems entirely unsurprising to me that there would be a lot of motivated reasoning in economics, making it hard to discard false beliefs.

Philosophy has no immediate applications on the lines of keeping a bridge up, but it certainly deals with a lot of questions that are dear to people or affect closely held beliefs, for example ethics or epistemology. It therefore seems entirely unsurprising to me that the field would find it harder to discard false beliefs than chemistry or geography, for example. The take-home message here is that individual practitioners disagreeing does not demonstrate that there is no knowledge to be had; it may merely indicate that some practitioners reject that knowledge due to personal biases.

In conclusion, I remain convinced that philosophy does, or at least can, generate knowledge. It does so, among other approaches, by thought experiment, showing claims to lead to absurd consequences, or showing claims to involve a self-contradiction. Much of that may be rejection as opposed to proving of claims, but again, science is also mostly rejection of false ideas. The (always tentative) understanding we have now is what remained after myriads of mistakes were corrected.

Wednesday, August 9, 2017

Discussions of diversity and equality are generally very depressing

Somebody at Google circulated an opinion piece on Google's diversity efforts, which was ultimately published by Gizmodo. A public discussion ensued. And as always with what are called "cultural" issues I find the way it goes very depressing. Perhaps surprisingly that is not because of some particularly backwards or intolerant position taken by this or that participant (although that too, see #6 below), but rather because much of what goes on in these kinds of discussions seems so futile.

One of the most fundamental problems is that there is not actually one controversy, there are numerous controversies going on at the same time, and people mix them all up. Just checking out two articles or posts and following their links to perhaps another three, it seems to me as if at least all of this is being discussed at the same time, in no particular order:

1. Whether there are psychological differences between men and women.

2. If such differences exist, to what degree they are genetic/developmental or socially conditioned.

3. Whether there are cognitive differences between men and women to the degree that the average man is objectively better at abstract problem solving and thus more suited for being a software engineer than the average woman.

4. Whether there are cognitive differences between men and women to the degree that the average man is objectively better at abstract problem solving than the average woman, but because software engineering is really a collaborative and thus people-oriented activity, at which women are said to excel, the average woman makes a better software engineer than the average man.

5. Whether different levels of representation of men and women in different fields of work are now largely due to job preferences as opposed to discrimination, meaning that trying to achieve parity in all fields is futile.

6. Whether women are, and I quote, "inferior" in sports. Yeah, I have no idea what that has to do with anything either, but I believe the choice of terminology speaks volumes.

7. Whether Google (and by extension many other companies) now has been captured by "the left" and has adopted "political correctness" to the degree that nobody dares to speak their mind for fear of being shamed, ostracised, and fired.

8. Whether Google was justified in firing the author of the memo for being disruptive and/or violating its code of conduct.

9. Whether circulating this memo to colleagues falls under the Free Speech guarantee of the US constitution.

And I am sure I have missed some. For what it is worth, the way I understand the original memo it was clumsily trying to argue mostly #5 and #7 and potentially #3, or at least it is widely read as arguing the latter.

In light of this it is unsurprising that so little is achieved and that so many people are at each other's throats. Of course there are many other topics where people will have heated discussions, but it is because their opinions differ very strongly (e.g. economic policy, environment, energy), not merely because they are completely talking past each other.

But with these equality / diversity issues I regularly see people go ballistic at each other who seem to pretty much agree on policy goals (e.g. better representation of currently underrepresented groups), general political outlook and acceptance of empirical reality (e.g. differences in mean innate cognitive abilities between groups of humans are negligible compared to variance within those groups) and should consequently be able to hash their differences out in a more rational manner.

One person says "maybe it is mostly job preferences now" but the other hears it as "I want to excuse under-payment and harassment of women"; or one person says "what he wrote could be read as if women don't make good engineers, and that creates a hostile work environment" and the other hears it as "nobody is allowed to have a different opinion than me; burn, heretic!" Makes me despair of political discourse.

Tuesday, August 8, 2017

Botany picture #249: Nothofagus of Patagonia


Deciduous Nothofagus (Nothofagaceae) trees near Puerto Blest, Argentina, in 2009. Or whatever the current genus name for this subgroup of southern beeches is after Nothofagus has been split up. In this case the reason for taxonomic changes was not phylogenetic systematics, because Nothofagus in its wider circumscription was also monophyletic. If I understand correctly, the idea was to make the age of the genera more comparable to Quercus, Fagus and suchlike. Either way, I like how this picture came out.

Sunday, August 6, 2017

Undergraduate resumes / CVs

I don't seem to have one of those files on my current computers any more, but I know that my CV as an undergraduate looked something like this:
Name
Address

Picture taken by professional photographer while I was wearing a formal jacket and perhaps a tie

Formation

Studying biology at [university], 1996 - now
Non-military service, 1995 - 1996
[Public grammar school] (high school & college in one), 1988 - 1997
[Yet another public school], 1986 - 1988
[Public primary school], 1982 - 1986

Undergraduate scholarship of [foundation], 1997 - now
And... that was that. Black on white, Times New Roman size 11 point, 1.2 line spacing, one page, easy to see all relevant information at a glance.

Now, an Australian undergraduate's CV today appears to look something like the following:
Name
Address, e-mail

Either no picture (which is what is expected in Australia) or a selfie taken at a party

Personal details

My name is [name], I am 24 years old and live in Woolalla, New South Wales. I am currently in my third year at Ned Kelly University studying a combined degree Bachelor of Science / Bachelor of Arts majoring in biology and journalism. I hope to pursue a career in science and apply what I learned in university to better the world.

Personal attributes

Effective communicator
Reliable and trustworthy
Ability to work in team as well as independently
Hard worker
Leadership skills demonstrated by frying burgers at McDonalds
Organisation talent demonstrated during waitressing by correctly taking customer's orders

Skills

Word, Excel, PowerPoint, Internet Explorer, Google

Employment history

Sales assistant at some supermarket, 2009 - now
Waitressing at Happy Hogan's bar, 2008-2012
Frying burgers at McDonalds, 2013 - now

Volunteer work and leadership

Friends of the State Zoo, 2011 - now
Church Youth, 2007 - 2009

Education

Ned Kelly University, Bachelor of Science / Bachelor of Arts majoring in biology and journalism, 2014 - now
Catholic College of South-eastern Western North Sydney, 2012 - 2014
Little Sisters of Perpetual Misery Private Catholic High School, 2008 - 2012

Other activities

Raising money for YUZN charity
Wildlife rescue
Greening Australia
Surfing
Blood donor for Red Cross
Debate club

Achievements

Consistently excellent marks in university*
Award for high placement
Mentor for other students
Talent Award
Award for outstanding job as house head
President of debate club
Dean's letter of recommendation
They are often carefully formatted in a fancy sans serif font with about 50% white space, perhaps a red bar at the top or a blue bar along the left margin of the page. They are often three to four pages long.

A few thoughts. First, it is not as if we didn't have extracurricular activities and hobbies back then in Germany. It just would never have occurred to most of us that a potential employer or scholarship provider would care the least bit about our participation in a badminton club. And as far as I can tell they wouldn't have, and I certainly don't. This is wasted space that merely makes it harder to find the truly relevant information.

Second, the personal attributes also seem a bit pointless. Will anybody actually truthfully write "I am lazy" or "I am a poor communicator"? Presumably not, everybody will claim the positives, honestly or not. So this is wasted space that merely makes it harder to find the truly relevant information.

Third, I assume somebody tells Australian students to put all their work experience in there to demonstrate ... well, this is where it breaks down for me. That they will show up for work if you give them a contract? That's kind of a low hurdle to clear. But beyond that, how is flipping burgers or waiting tables a relevant qualification for a job or scholarship in science? I don't get it. This is wasted space that merely makes it harder to find the truly relevant information.

Fourth, all those achievements? When I was a school or university student in Germany, we did not have even just a tenth of those awards. Here half the students seem to have lists of awards that look seriously impressive; but given how many of them have lists like that I do wonder how easy they are to get. If there is no term like award inflation (in analogy to grade inflation) then we need to create it.

Of course, given the length of time since I left I also wonder how German undergraduate students' CVs look these days. Do they now also mention every little thing they did, no matter how irrelevant to the job or scholarship they are applying for? Do they also now try to look as if they had been written by a graphic design graduate?

Footnote

*) From what I can tell the likelihood of somebody explicitly claiming to have consistently high marks in the achievement list seems negatively correlated with the actual quality of their marks. The people who actually have near-straight high distinctions tend to have only an understated line in the CV providing their point average.

Thursday, August 3, 2017

Basal and transitional taxa

Shortly before I left for China I received an alert on an interesting paper:

Bronzati M, 2017. Should the terms 'basal taxon' and 'transitional taxon' be extinguished from cladistic studies with extinct organisms? Palaeontologia Electronica 20.2.3E: 1-12.

As can be expected from this title, Bronzati argues that the terms are misleading and confusing, and that they should not be used. I find myself tending to disagree, at least in part, and not only because of an allergic reaction to being told what words I am not supposed to use because it might confuse 'the public' (cf. free will debate). Before I go over the arguments, however, I would like to clarify where I agree:

First, there are clearly cases where it would be desirable not to use a concept or term because it is really wrong or incoherent, and in some cases even because it is misleading. At the recent conference I flinched at a speaker who said "this individual is paraphyletic". Although I understand what they meant (the utterly trivial and commonplace observation that an individual had two different alleles at a gene locus they had sequenced) such a sentence is Not Even Wrong and has to be based on confusion about, well, pretty much everything that matters in molecular and phylogenetic systematics beyond perhaps how to hold a pipette the right way and click "run analysis" in a few programs. But it is not necessarily the case that the terms basal and transitional suffer from the same problems.

Second, I obviously agree that supraspecific taxa should be monophyletic.

Third, I also agree that evolution is not teleological (with a caveat I will go into below) and that terms such as primitive or advanced are to be avoided, in particular when talking about organisms that live(d) in the same time-slice. And in fact there are very few people left who still think that e.g. mosses are primitive compared to seed plants. Both lineages as they exist today have evolved for precisely the same time. The mosses are certainly not more primitive as mosses than seed plants would be as mosses, they just went completely elsewhere in terms of morphospace and adaptive peaks.

Evolution is a story not of progress but of diversification, and it only looks to us as if there was progress from morphospace position A to position B because life necessarily had to start in some position, and even after a pure random walk some extant organism may still (or again) occupy that starting position or something close to it. A good analogy I once read is to imagine a bunch of people all starting in front of a wall and then milling about aimlessly. Although their movement is random the group will still expand in one direction, away from the wall, because they cannot go in the opposite direction; conversely then, the fact that they are now further away from the wall does not mean that they meant to move in that direction specifically.
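The wall analogy is easy to try out as a toy simulation. The following is a minimal sketch, not a model of any real evolutionary process; the function name `mean_distance_from_wall`, the number of walkers and steps, and the fixed seed are all assumptions made for illustration.

```python
import random


def mean_distance_from_wall(n_walkers=1000, n_steps=200, seed=1):
    """Average position of walkers after an unbiased walk next to a wall at 0."""
    rng = random.Random(seed)
    positions = [0] * n_walkers  # everybody starts right at the wall
    for _ in range(n_steps):
        for i in range(n_walkers):
            step = rng.choice((-1, 1))  # no preferred direction
            # The wall blocks any step that would lead to a negative position.
            positions[i] = max(0, positions[i] + step)
    return sum(positions) / n_walkers


print(mean_distance_from_wall(n_steps=10))
print(mean_distance_from_wall(n_steps=200))
```

Every single step is equally likely to go towards or away from the wall, yet the average distance from the wall grows with the number of steps, simply because positions behind the wall are unavailable.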

It is consequently important to keep in mind the "studies with extinct organisms" part of the paper title, because on the question of sorting extant organisms into a ladder of progress all competent evolutionary biologists are agreed anyway. Okay, but what now of extinct taxa, which had in their time not yet undergone the same amount of evolution as the taxa we have today? Are they basal or transitional to the latter?

Bronzati starts by examining whether basal taxa are those that are older than the non-basal ones and observes that fossil ages do not necessarily reflect the ages of the lineages they belong to. He suggests instead to use "'early' and 'late' in an explicit comparative framework". That is very clear, but I do not think that this is how the word basal is meant by most people anyway, and it is certainly not how I would use it. As Bronzati soon observes himself, "'basal' is a relative term regarding the base of the tree" and thus refers to the relative age of lineages, not to the age of fossils.

I am not quite sure I understand the next part, where he writes that "different people certainly have different assumptions of what a 'basal' taxon is" and discusses whether something outside of clade A can be a basal A or not. I'd say not, but again, I think basal is a relative term along a tree topology and not something that I would use in this way.

Now Bronzati turns to the question that I consider the most relevant: "Basal taxa are closer to the root" - precisely that is how I understand the word - "but how to measure it?" But this is also where I think the argumentation becomes a bit odd, because he argues against the use of the term by comparing apples and oranges, and then throwing incomplete sampling into the mix. This will now need an illustration. Consider the following phylogeny:



Bronzati argues against the use of 'basal' by looking at species A, which people would supposedly (?) consider to be basal because they read the tree like a ladder from left to right, and then observing that this species is actually more distant from the root in terms of internal tree nodes than species B. I hope the problem is immediately obvious: species A is not the unit we would be talking about when saying "more basal than B". Would anybody ever actually say that A is basal in the tree? It is clearly fairly nested. Instead, the only use of such terminology that makes sense would be to say that the entire genus Ales (the red box) is basal in the entire family also consisting of the other genera Beles, Celes and Deles (the other three boxes), or more basal than Beles.

And this is where I am willing to be convinced otherwise but at the moment happy to continue using the term basal: if and only if we are talking about the branching order along a phylogeny backbone, along a grade. I will be the first to agree that all supraspecific ranks are arbitrary, but we also have to appreciate that we are using them, or alternatively unranked clade names, nonetheless. This is not so much about evolutionary theory as about having at our disposal non-atrocious language to describe a tree topology. When talking about these genera, what is so problematic about saying "Ales is basal in its family" compared with "Ales is sister to the rest of its family"? At least in my eyes the two statements are equivalent and neither is more misleading than the other. Making it about species A feels like a red herring.

And this is then also all that needs to be said about sampling, because it is based on a similar argument. Bronzati describes a hypothetical tree of all dinosaurs with all of the huge bird clade represented only by the chicken and then jokes he "would hope that no one would suggest that the bird is a basal dinosaur ... based on the number of intervening nodes to the root". No, I don't think anybody would. But maybe we would say the birds (!) are. If, hypothetically, part of the topology were (birds, (dinos2, (dinos3, dinos4) ) ) then yes, I would not have any problem saying that the birds as a whole are more basal in the tree than that other named clade dinos2, for example, because the birds as a whole are quite simply branching off one more inclusive ancestor closer to the root than dinos2.
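Counting "intervening nodes to the root" can be made concrete with a small sketch. The topology is the hypothetical one given above; encoding it as nested tuples and the function name `nodes_to_root` are my own assumptions for the purpose of illustration.

```python
# Topology taken from the text; the nested-tuple encoding is an assumption.
TREE = ("birds", ("dinos2", ("dinos3", "dinos4")))


def nodes_to_root(tree, depth=0):
    """Map each terminal to the number of internal nodes on its path from the root (root included)."""
    if isinstance(tree, str):
        return {tree: depth}
    result = {}
    for child in tree:
        result.update(nodes_to_root(child, depth + 1))
    return result


print(nodes_to_root(TREE))
```

Here the terminal "birds" is one node from the root, dinos2 two, and dinos3 and dinos4 three each. The count only compares named clades along the backbone; resolving the bird clade into its thousands of species would push individual birds far from the root without changing where the clade as a whole branches off.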

What is really puzzling to me is that Bronzati himself makes the same point two paragraphs later: "it is not terminal taxa (...) that can be 'more basal' in relation to other terminal taxa, but the nodes (i.e. hypothetical ancestors) of the tree in relation to other nodes".

Concluding his discussion of 'basal', Bronzati examines the question whether basal taxa have more plesiomorphic traits and concludes no, but again this is based on considering in isolation a very derived descendant of the entire clade I would call 'basal'.

He then turns to the term 'transitional'. Here he appears to make two main arguments against its use. First, that evolution has no goal, and second, that phylogenetic trees are branching diagrams instead of ladders.

I have already mentioned above that I agree completely that evolution is non-teleological, but with one caveat, which is this: lineages may discover, for the first time, a new peak in the adaptive landscape, and when that happens we can expect them to evolve up that peak, so that earlier forms would be more poorly adapted to the new situation than their later descendants. Bronzati himself mentions the colonisation of dry land, focusing of course on vertebrates, his specialty. Using the group that I am more familiar with as an example, it seems clear that the early vascular plants started out without roots, and that the lineages that descended from them evolved roots because having those was a pretty good idea on dry land. In fact there are none left that are primarily without roots, presumably because they were out-competed (although there are a few secondary losses under unusual circumstances, e.g. Cuscuta).

I would argue that this, and only this, and only along a time axis, is where we can perhaps meaningfully speak of primitive and advanced, but that is not even the point here, because the term we are dealing with is transitional. More important seems the second argument. Yes, phylogenetic trees are branching diagrams, but they do not merely consist of terminals, they also consist of hypothetical ancestors. It is a bit unclear to me where Bronzati stands on the question of those; on page 6 of his paper, as mentioned, he talks about hypothetical ancestors himself, but here he spends considerable time arguing in a way that suggests that he does not want to identify actual species or fossils as ancestral:
It is important to stress that the absence of autapomorphies in taxa [sic] B does not indicate that it is transitional between A and C-F. Firstly, this might be just a reflex [sic] of the lack of ability to translate different morphologies into phylogenetic characters. Furthermore, the study of living species shows us that even if there is no recognisable morphological difference between [sic], they can differ at the genetic level.
Of course they can, but it remains unclear to me what should keep us from tentatively concluding that some fossil may represent an ancestor until we get additional evidence that shows otherwise, just like pretty much every other conclusion in science is also tentative. And if we have a presumed ancestor we can say that it is transitional between an even earlier presumed ancestor and descendants further down the line. There is no teleology involved here, but the internal nodes of a phylogeny can indeed be read as a ladder of ancestor-descendant relationships.



I am sorry to say I just don't see the problem here either.

Bronzati ends with making four recommendations:

Tree topologies should be described with sister-group statements, avoiding terms like basal or early diverging. My concern is that this will lead to very ugly and repetitive language when describing anything but a very small phylogeny: "our results indicate that A is sister to the rest of the study group. B is then sister to the rest of that rest, and then C is sister to the rest of that rest we just mentioned; now D is sister to the rest of that last rest ...", and so on for another four clades. That is just not very aesthetic. So why not a much more concise "the earliest diverging lineage is A, followed by B, C, and D"?

Instead of calling a terminal taxon a basal member of clade A, we should say it is a non-A member of the next larger named clade around it, as in non-avian dinosaurs. That makes sense, but again, I would never have used basal for a deeply nested terminal anyway but only to discuss the relative position of several clades along a grade.

We should say "this taxon fills a gap in the fossil record" instead of "this taxon is transitional". As mentioned above, I don't see it, perhaps because I have a different approach to internal nodes and species without autapomorphies.

Finally, we should avoid teleological language. No disagreement from me on this one!

Monday, July 31, 2017

IBC and Shenzhen impressions

Seen at my hotel.

Shenzhen Convention and Exhibition Centre.

Exhibition area, with poster area in upper right corner of picture.

One of the larger lecture halls.

After dark.

Poster area.

Tea break in the hall between the smaller lecture rooms.

Wednesday, July 26, 2017

International Botanical Congress 2017, Shenzhen

A few thoughts after the first two full conference days of the IBC 2017. (Sunday was only the opening and a few public events, and today, Wednesday, is a bit of a break without symposia.)

Shenzhen has really pulled out all the stops to host the conference. It is huge, it has a huge presence in the city, and everything is well organised. The city itself is also a good location. I am not personally a fan of skyscrapers, but of course only a large city like this has the infrastructure for a congress of more than 6,000 delegates. The subway system in particular is amazing.

What I am less convinced of at the moment is the way the scientific program has been organised. In what is apparently custom for IBCs, all the symposia were set in advance, and for each symposium the organisers were required to arrange four set talks and fill the remaining two slots competitively from non-invited submissions.

I find that unfortunate. Why not set perhaps half the symposia, call for free submissions, and then group all the latter thematically into additional symposia? If you get twelve talk submissions for palaeobotany, for example, simply arrange for one of the more accomplished speakers to chair the whole thing. I have not yet heard an argument against such an approach beyond "but that means that they would have less time to organise the symposia", but other meetings do not seem to have any problems with it, from what I hear even very large ones.

I have also heard from many others, and observe myself, that there are too many parallel symposia, and often symposia that would clearly attract the same clientèle are placed in parallel to each other. One of the reasons this happens to such a high degree is that only a small part of the day - four hours in the afternoon and, again, even excluding today - is reserved for the general symposia, while the entire morning is given to keynotes and plenaries.

Now I may be in the minority here, but I am not to be counted among those who primarily go to a conference to hear a famous person give a keynote lecture. At their best keynotes are the oral presentation equivalent of review papers, but I see the benefit of meetings mostly in networking and learning about new results and methods, the oral presentation equivalent of research papers. And for this the general symposia are the places to be. I am consequently wondering if it would not be more productive to increase current symposium time slots by 50% and to cut the number of keynotes and plenaries in half. This would also mean that fewer symposia would have to run in parallel.

Overall I have seen some very amazing talks, but also some that appeared a bit uninspiring and could have benefited from a more explicit explanation of what the research question even was and why the audience should care. I have heard of software that I should look into and learned about tools that have become more powerful since I last tried them, but also groaned about "paraphyletic individuals" and the assumption that the traits of the smaller of two clades automatically represent the ancestral states. And of course I have met colleagues from overseas who in many cases I haven't seen since the last IBC in 2011.

Now I just have got to get my own talk down from a rambling 25 min to 15-20 and all will be good...

Saturday, July 22, 2017

The Prince

I have arrived in Shenzhen, China, for the International Botanical Congress. I meant to upload a few pictures of the Luohu district today but it seems as if my cell phone does not want to talk to my laptop, so perhaps I can do that when I am back.

On the flights I was unable to get much work done beyond making corrections to a manuscript, so I read a book and watched movies of varying quality: Star Wars Rogue One, Suicide Squad, and Throne of Elves. It is all a matter of expectations; they weren't high, so I enjoyed all three, although the last of them partly for being so different from how a European would have done it, and while happily ignoring the humongous plot holes of the second. The funny thing about Rogue One is that it is actually in part a reasonably good attempt at rationalising why the heck the Empire would have built the Death Star with such an idiotic weakness, although it still remains implausible that nobody else noticed it during construction and just added another wall on the way.

Ah well. Anyway, the book I read nearly through on the flights - because it is not actually all that long - is Machiavelli's Il Principe. The book needs no introduction as it is a classic, but I had never read it until I happened to pick it up in a German retranslation at the last book fair I visited.

The scholar who wrote the foreword stresses that Machiavelli's reputation is undeservedly bad, that his work is really a groundbreaking piece of political philosophy. With Il Principe and its sister work on republics he is considered to have pioneered political writing that sees humans as capable of influencing history within certain realistic limitations instead of being the passive objects of divine providence, and that argues for a pragmatic approach to politics instead of an unachievable spiritual ideal or political utopia.

And yes, I can see where that is coming from, although given my political socialisation I always remain sceptical of seeing history as a chain of outstanding people having influential ideas. (I think it is much more likely that if Machiavelli had not written this book others would still have organically moved towards more pragmatic political philosophy, as that was simply the Zeitgeist.)

But I can also see clearly where his bad reputation comes from. Not only is he fairly open about criticising past politicians and military leaders, including popes, for their personal and public failures, which would obviously invite opprobrium. He also matter-of-factly advises the audience to betray their allies for political gain and to murder the entire family of a previous ruler so that their bloodline is extinguished and no remaining heir can challenge the new order.

Again, both Machiavelli and the author of the foreword argue that this is just realistic. If you want to secure power and strengthen your state then this is what you must do. Machiavelli also doesn't see any issues with such behaviour because he has a very dim view of humanity in general. For example, to him it is no problem to break treaties because your treaty partners are, well, humans and as such should be expected to break the treaty themselves at the first good opportunity. That's just how dastardly humans are, fide Machiavelli at least.

Now realism is one thing. I can understand Machiavelli's advice in many cases, for example when he considers whether it is more important and easier to have the general population on one's side or the nobility (in today's context, the one percenters), and how to achieve either. And I also understand that one has to be realistic about the established rules one is subject to; if everybody habitually lies then a single honest person will indeed perish where another liar may have prospered. But I think he and that modern scholar miss to what a large degree opportunistic breaking of rules changes the rules for the worse, and what the consequences are.

Be it keeping true to treaties or showing mercy to one's enemies, the point of following rules or gentlemen's agreements is that only then can you expect that others will follow them to your benefit. When, for example, it became customary in Germany in the early to high Middle Ages that nobles competing for the crown would not eradicate the opposing family but instead force the male members to retire into monasteries, and only kill them if they blew that chance by coming back and raising another army, the idea was presumably that once the shoe is on the other foot one would also be given the chance to leave politics instead of coming home to find one's wife and underage children face down in puddles of blood, as Machiavelli would have it.

In other words, I am coming away from Il Principe with the impression that he was too clever by half. He took pragmatism just far enough to come out on the other side and fall back into short-sightedness.

Wednesday, July 19, 2017

How to use the reference manager Zotero

(Updated 19 July 2017 regarding Zotero 5.)

Inspired by an email exchange with a colleague I thought I would write a longer post on how to use the open source reference manager Zotero. Obviously all the information will in some form be available in its documentation, but at least to me the things I really need to look up are often the needle in the haystack of the obvious and the irrelevant. So here is what I think one needs to know to start using Zotero, in what is hopefully a logical order.

All of this is based on Zotero 4, as I have not yet used the newest version, 5. From version 5 onwards the standalone program is required instead of Zotero running through the browser alone. See the relevant comment under installation below.

How it works

Zotero is available for Win, Mac and Linux and for LibreOffice or MS Word. It integrates into the browser (I use Firefox, but it also seems to work with Chrome, Safari and Opera), and the usual way to use it is through the browser. This means that the browser has to be open at the same time as the word processor, but there is also a standalone version that I am not familiar with.

Installation

To be honest, this is the only point on which I am a bit confused at the moment because it has been a bit since I installed one of my instances. There are three items that may have to be installed: Zotero itself on the download page, for which there are specific instructions. Then there is the "connector" for the browser you have. Finally, it may be necessary to install a word processor plug-in for your browser. What confuses me is that I seem to remember only installing the latter two last time, so either I misremember or something has changed with the newest Zotero version (?).

Either way, while the installation of Zotero into the browser itself is easy, I have noticed that sometimes the word processor plug-in does not take on the first attempt. In that case I merely repeated the installation and restarted everything, and then it worked.

Update: The colleague who has now started using Zotero has, of course, installed the newest version and kindly adds the following:
The new version of Zotero needs the standalone program installed. This is because Mozilla has dropped the engine that supported a lot of extensions (like Zotero). It was deemed that allowing the browser to carry out low-level functions on the host computer introduced inherent vulnerabilities, and so Firefox versions after 48 have very much restricted what the browser is allowed to do (no longer can it communicate directly with databases, and carry out a lot of file handling functions). The problem is not restricted to Zotero, for example Gnome desktop extensions used this functionality and have also had to change the way they do things. The long and the short is that nowadays you need the standalone application installed.

Using Zotero in the browser and building your reference library

When you have Zotero installed there are two new buttons in the browser: a "Z" that opens your reference library and a symbol right next to it that you can click to import a journal article into that library. Given that the library will at first be empty let's look at that latter function first.

The Zotero buttons in the browser
For example, I may have found a journal article through a Google Scholar search. Ideally, I am now looking at the abstract or HTML fulltext on the journal website, because that page will have all the metadata I want. I now click on the paper symbol to the right of the Z, and Zotero automatically grabs all the fields it can find and saves a new entry into my reference library; if it can get a PDF it will even download that.
Viewing paper abstract
Now I click on the Z button to bring up my library, if I didn't have it open yet. In some cases I may notice that something went wrong. The typical scenarios are that the title of the paper is in Title Case or ALL CAPS. This is easily rectified by right-clicking on the title field and selecting "sentence case".
A reference has been added to the library
If something needs to be edited manually we can do so by left-clicking onto the relevant field. For paper titles, the usual problems would be having to re-capitalise names after correcting for Title Case or adding the HTML tags for italics round organism names. In the present case, however, I find that the title contains HTML codes for single quotation marks instead of the actual quotation mark characters, so I quickly correct that. Manual entry is, of course, also possible for an entire reference, for example if it isn't available online. In that case simply click on the plus in the green circle and select the appropriate publication type.
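For anybody who would rather clean such titles in bulk outside of Zotero, the two problems above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function names and the example titles are made up, and the sentence-case function is a deliberately naive stand-in, not how Zotero itself implements its "sentence case" option.

```python
import html


def clean_title(raw_title: str) -> str:
    """Replace HTML character references (e.g. &#8216;) with the
    actual characters they stand for."""
    return html.unescape(raw_title)


def naive_sentence_case(title: str) -> str:
    """Capitalise the first character and lower-case everything else.
    Deliberately naive: proper names are lower-cased too, which is why
    they have to be re-capitalised by hand afterwards."""
    return title[:1].upper() + title[1:].lower()


# Hypothetical title with HTML codes for single quotation marks
print(clean_title("The use of &#8216;basal&#8217; in tree descriptions"))

# Hypothetical Title Case title containing a name
print(naive_sentence_case("Phylogeny Of Australian Daisies"))
```

The second call returns the title with "australian" in lower case, which is exactly the kind of name that still needs manual correction after the automatic conversion.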

So much for importing references. It is also possible to bulk-import from the Google Scholar search results, but I would not recommend that as Google sometimes mixes up the metadata.

The style repository

The first time we try to add a reference to a manuscript, we are asked what reference style should be used. Zotero comes with only a few standard styles installed, but many more are available at the Zotero style repository. One of the few downsides of Zotero, in my eyes, is that it has fewer styles than Endnote, but often it is possible to get the relevant one under a different name. If, for example, you are preparing a manuscript for PhytoTaxa the ZooTaxa style should serve just as well.

Installing a new style is as easy as finding it in the style repository, clicking on its name, and confirming that it should be installed.

Selecting a reference style

Using Zotero in the word processor

Again, note that the browser needs to be running while we are adding references to a paper. The following assumes LibreOffice, but except for where to find the buttons everything is the same in MS Word.

In LibreOffice you will have new buttons for inserting and editing references, for inserting the reference list, and for changing document settings, in particular the reference style. To insert a reference, click on the button that seems to read r."Z. You can now enter an author name or even just a word from the title, as in my example here, and Zotero will suggest anything that fits.
Adding a reference to a manuscript
Another downside of Zotero, at least as of version 4 which I am still using, is that it doesn't do a reference like "Bronzati (2017)". Instead you can either have "(Bronzati 2017)" or reduce the reference to "(2017)". For this click on the reference in the field where you were asked to select it (if you have already entered it simply use the edit reference button showing r." and pencil) and select "suppress author". Then you have to type the author name(s) yourself outside of the brackets, which is obviously a bit annoying.

Author names outside of brackets have to be added manually
Once we have added a few references, we obviously need to add the reference list. This is as easy as clicking the third button in the Zotero field. The only others that are usually important are the two arrows (refresh) to update the reference list (although it does so automatically when the document is reloaded) and the cogwheel (document preferences) that allows changing the reference style across the document.

In LibreOffice I have sometimes found that adding or updating references changes the format of the entire paragraph they are embedded in. This seems to happen if the default text style is at variance with the text format actually used in the manuscript. Selecting a piece of manuscript text and setting the default style to fit its format has always rectified the situation for me.

Syncing

It is useful to get an account at the Zotero website and use it to sync one's reference library across computers. Again, this works cross-platform. I do it between a Windows computer at work and my personal Linux computer at home. Note, however, that it only syncs the metadata, not any fulltext PDFs that have been saved.

To sync, go into the browser and click on the Z symbol to open Zotero. Now click the cogwheel and select preferences. The preferences window has a sync tab where you can enter your username and password. Do the same on two computers and they should share their reference libraries.