# Maximum parsimony

Maximum parsimony (or MP) is a simple but popular technique used in cladistics to predict an accurate phylogenetic tree for a set of taxa (commonly a set of species or reproductively-isolated populations of a single species).

The input data used in a maximum parsimony analysis is in the form of "characters" for a range of taxa. A character could be a binary value for the presence or absence of a feature (such as the presence of a tail), or it could be the protein or nucleic acid residue at a particular site in the organism's genome.

There are a number of probabilistic and deterministic algorithms available for constructing phylogenetic trees. However, what we describe here is the criterion that is used to select the best tree, not the specific algorithm used to find that answer.

The trees used in maximum parsimony analysis are, in a general way, unrooted trees (there is no indication of time in the tree, only the relations between taxa). All the taxa used in the analysis are leaf nodes (often called tips) in the tree (so have only one edge into them). Internal nodes are inserted into the tree to represent the inferred ancestral species. Each internal node has exactly three edges into it. Transitions between characters are placed on any of the edges on the tree.

The maximum parsimony tree is the tree which explains the characters, and which has the less than or equal to the number of transitions on any other tree that also explains the characters. To do this, all trees are given a length, equal to the minimum number of transitions which can explain the tree. The tree with the lowest length is the maximum parsimony tree. This assumption that "the simplest possible explanation is the best" is a generalisation of Occam's Razor.

## Problems with maximum parsimony

Maximum parsimony is a very simple approach, and is popular for this reason. However, it is not always very accurate.

Although decreasingly of concern as geneotyping technology provides more characters, a major problem for paleontology with maximum parsimony is that it assumes that the only way two species can share the same character is if they are genetically related. This is not always the case. For example, birds and bats have wings, while crocodiles and humans do not. Based on this data, maximum parsimony would tend to group crocodiles with humans, and birds with bats. However, humans are actually more closely related to bats (a mammal) than crocodiles (a reptile) or birds. A similar problem occurs to back-mutation, where, due to a small mutation, a character is removed, but then it later comes back. Because maximum parsimony uses unrooted trees, independent innovation and back-mutation are actually mathematically equivalent.

As demonstrated in 1978 by Felsenstein, maximum parsimony can systematically fail on some types of trees, where there are a limited number of characters. This situations is called long branch attraction, and occurs, for example, where you have long branches (a high level of substitutions) for two characters (A & C), but short branches for another two (B & D). A and B diverged from a common ancestor, as did C and D.

Assume for simplicity that we are considering a single binary character (it can either be + or -). Because the distance from B to D is small, in the vast majority of all cases, B and D will be the same. Here, we will assume that they are both + (+ and - are assigned arbitrarily and swapping them is only a matter of definition). If this is the case, there are four remaining possibilities. A and C can both be +, in which case all taxa are the same and all the trees have the same length. A can be + and C can be -, in which case only one character is different, and we cannot learn anything, as all trees have the same length. Similarly, A can be - and C can be +. The only remaining possibility is that A and C are both -. In this case, however, we group A and C together, and B and D together.

As a consequence, when we have a tree of this type, the more data we collect (i.e. the more characters we study), the more we tend towards the wrong tree.

To overcome this problem, a different approach, such as maximum likelihood, neighbour joining or the quartet method, can be used.

Another problem is that finding the most parsimonious tree is a NP-Hard problem, thus the only available, at the moment, efficient way of obtaining a solution, given a arbitrarily large set of taxa, is by using an approximation algorithm, a heuristic method or just trying to restrict the problem to a subset of taxa.