When we want to know whether a taxon, for example a genus, is monophyletic (and thus acceptable in a phylogenetic classification), the first thing to do is to infer a phylogeny. We may then find a topology like the following:
Genus A is non-monophyletic on this tree because B is sister to A2. However, the support value for clade A2 + B (the red number, 63 of 100) is not exactly stunning; if this is Bayesian Posterior Probability, you would want 95 or higher, and if it is bootstrap or jackknife you would still want to see at least 80 or so, preferably more. With so little support it could just as well be that these relationships are wrong and genus A is monophyletic after all.
(It is amazing, by the way, how many people find it hard to intuit that when discussing the status of A the red support value is indeed the relevant one. If it is high then precisely that number provides support for the non-monophyly of A while the black value is irrelevant - it merely shows that A1, A2 and B belong together but doesn't tell us anything about A versus B. Some time ago I even had a peer reviewer who got that wrong at first. Perhaps people are just so trained to look for how strong the evidence for monophyly is that they can get confused when they need to look for evidence for non-monophyly.)
So yes, the tree shows A as non-monophyletic, but we can't be sure if the evidence is strong enough. Is there another way of testing whether, let us say, A is "significantly" non-monophyletic?
This is where, for parsimony analysis, the Templeton Test comes in, which by the way doesn't have anything to do with the Templeton Foundation.
It works like this. First, you need the overall best tree (or one randomly chosen from all the equally most parsimonious ones) and an alternative phylogenetic hypothesis, in our case the hypothesis that A is monophyletic. In the original publication, Templeton was dealing with very small trees, so he just compared several set topologies. In contemporary practice, however, we may be dealing with a large phylogeny of dozens of species and be interested in the status of a genus of twenty species without caring about the relationships inside that genus.
What we do then is conduct a parsimony analysis under a constraint: we tell the phylogenetics software to retrieve the best tree in which the genus in question is forced into monophyly. Again we may randomly choose one of several equally parsimonious trees from the constrained search, so that we have exactly two trees to compare. We now want to know if the tree from the constrained search is significantly worse than the one from the unconstrained search.
The second step is to go through all characters one by one and to calculate the difference in character score between the two trees, which in parsimony terms means the number of changes. The difference is always taken as the score on the constrained tree minus the score on the unconstrained tree. Some character may have 1 change on the unconstrained tree and 2 on the constrained tree, so the difference is 2 - 1 = 1. Another character may have 4 on the former and 2 on the latter, so the difference is 2 - 4 = -2. In other words, you get negative differences in the few cases where the score is actually better for the overall worse (constrained) tree. Differences of zero are ignored.
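To make the second step concrete, here is a minimal sketch in Python. The step counts are made up for illustration, and the function name character_differences is my own invention, not something taken from any phylogenetics package. It only assumes that you have already obtained, for each character, the number of changes on both trees.

```python
# Hypothetical per-character step counts (number of changes) on each tree.
# In practice these would come from your phylogenetics software.
unconstrained_steps = [1, 4, 2, 3, 1, 2]
constrained_steps = [2, 2, 2, 4, 2, 1]


def character_differences(unconstrained, constrained):
    """Return constrained minus unconstrained steps, dropping zero differences."""
    if len(unconstrained) != len(constrained):
        raise ValueError("both trees must be scored for the same characters")
    diffs = [c - u for u, c in zip(unconstrained, constrained)]
    # Characters that fit both trees equally well carry no information here.
    return [d for d in diffs if d != 0]


differences = character_differences(unconstrained_steps, constrained_steps)
print(differences)  # [1, -2, 1, 1, -1]
```

The third character has the same score on both trees and therefore drops out, leaving five differences for the following steps.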
Third, sort the differences in ascending order by their absolute size, i.e. all the ones go together, then all the twos, and so on.
Fourth, assign ranks to the differences. Here tied differences get the average rank number. For example, if you have four differences of 1 or -1, they cover ranks 1 to 4, so each gets the rank score (1 + 2 + 3 + 4) / 4 = 2.5. If they are followed by two differences of 2 and -2, they have ranks 5 and 6, so they each get rank score 5.5, and so on.
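The sorting and ranking of steps three and four can be sketched as follows. This is only an illustration under the assumption that the zero differences have already been removed; the function name average_ranks is hypothetical. The example input reproduces the case just described, with four differences of absolute size 1 and two of absolute size 2.

```python
def average_ranks(differences):
    """Pair each difference with its rank by absolute size, averaging ties."""
    ordered = sorted(differences, key=abs)
    ranked = []
    start = 0
    while start < len(ordered):
        # Find the end of the run of differences with the same absolute size.
        end = start
        while end < len(ordered) and abs(ordered[end]) == abs(ordered[start]):
            end += 1
        # The run covers ranks start+1 .. end, so the mean rank is their midpoint.
        mean_rank = (start + 1 + end) / 2
        ranked.extend((d, mean_rank) for d in ordered[start:end])
        start = end
    return ranked


ranked = average_ranks([2, 1, -1, -2, 1, 1])
print(ranked)
# [(1, 2.5), (-1, 2.5), (1, 2.5), (1, 2.5), (2, 5.5), (-2, 5.5)]
```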
Fifth, add up the rank scores of all negative differences to get the test statistic. The positive differences (the ones where the unconstrained tree has the better score) are ignored.
Finally, this test statistic is compared against the table of critical values of the Wilcoxon signed rank test. If the statistic is less than or equal to the critical value for the total number of differences (negative and positive ones), the difference between the trees is significant. The test is one-sided because it is already clear which tree is better, and we just want to know if it is significantly better than the one that has genus A as monophyletic. If it isn't, if the test comes up as non-significant, then unfortunately we do not have strong evidence that the genus is really non-monophyletic despite it being non-monophyletic on the best tree.
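Putting the last steps together, the whole calculation might look like the sketch below. All function names are hypothetical, and the critical value is deliberately passed in by hand because it has to be looked up in a published Wilcoxon table for your number of differences and chosen significance level. The value of 2 used here is my assumption for a one-sided test at the 0.05 level with six differences; please check it against your own table.

```python
def average_ranks(differences):
    """Pair each non-zero difference with its tie-averaged rank by absolute size."""
    ordered = sorted((d for d in differences if d != 0), key=abs)
    ranked = []
    start = 0
    while start < len(ordered):
        end = start
        while end < len(ordered) and abs(ordered[end]) == abs(ordered[start]):
            end += 1
        mean_rank = (start + 1 + end) / 2  # mean of ranks start+1 .. end
        ranked.extend((d, mean_rank) for d in ordered[start:end])
        start = end
    return ranked


def templeton_statistic(differences):
    """Return (sum of ranks of negative differences, number of non-zero differences)."""
    ranked = average_ranks(differences)
    statistic = sum(rank for d, rank in ranked if d < 0)
    return statistic, len(ranked)


def is_significant(statistic, critical_value):
    """Significant if the statistic does not exceed the tabulated critical value."""
    return statistic <= critical_value


# Differences are constrained minus unconstrained steps, one per character.
differences = [1, -1, 1, 1, 2, -2]
statistic, n = templeton_statistic(differences)
critical_value = 2  # assumption: one-sided, alpha = 0.05, n = 6, from a table
print(statistic, n, is_significant(statistic, critical_value))  # 8.0 6 False
```

In this toy example the negative differences have rank scores 2.5 and 5.5, so the statistic is 8, well above the critical value, and the constrained tree would not be significantly worse. With so few informative characters that is hardly surprising.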
I have read people arguing that one should not see this as a test of hypotheses because only trees are compared, but I am not quite sure what to make of that. As far as I can see, a phylogeny is a phylogenetic hypothesis, so it seems like a distinction without much practical relevance.
Anyway, at the moment the Templeton Test is, as far as I know, only implemented in PAUP, software that not everybody has access to. I will make another post on the practicalities of how to conduct it with PAUP and, hopefully, with an alternative.
By the way, for those who think that parsimony analysis is outdated there are also likelihood counterparts to the Templeton Test. The most popular is the Kishino-Hasegawa Test. It is likewise implemented in PAUP but also in RAxML.