Bootstrapping

From Christoph's Personal Wiki
Revision as of 01:27, 31 December 2005 by Christoph (Started article)


Bootstrapping, when applied to phylogenetics, tests whether the dataset as a whole supports a phylogenetic tree, or whether the tree is only a marginal winner among many nearly equal alternatives.

"[Bootstrapping is accomplished] by taking random subsamples of the dataset, building trees from each of these and calculating the frequency with which the various parts of your tree are reproduced in each of these random subsamples. If group X is found in every subsample tree, then its bootstrap support is 100%; if it is found in only two-thirds of the subsample trees, its bootstrap support is 67%. Each of the subsamples is the same size as the original, which is accomplished by allowing repeat sampling of sites; that is, random sampling with replacement. It is a simple test, but bootstrap analyses of known phylogenies (viral populations evolved in the laboratory) show that it is a generally dependable measure of phylogenetic accuracy, and that values of 70% or higher are likely to indicate reliable groupings." (Sandra L. Baldauf, 2003)
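The procedure in the quote can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the alignment is a plain dict of equal-length sequences, and `closest_pair` is a hypothetical toy stand-in for a real tree-building method (it reports only the two most similar taxa as a single "group"). The function names `resample_sites` and `bootstrap_support` are likewise made up for this example and do not come from any phylogenetics package.

```python
import random
from collections import Counter


def resample_sites(alignment, rng):
    """Build a pseudo-alignment of the original length by drawing
    columns (sites) at random with replacement."""
    n_sites = len(next(iter(alignment.values())))
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {taxon: "".join(seq[i] for i in columns)
            for taxon, seq in alignment.items()}


def closest_pair(alignment):
    """Toy stand-in for tree building (an assumption of this sketch):
    return the pair of taxa with the fewest mismatches as the only group."""
    taxa = sorted(alignment)
    pairs = [(a, b) for i, a in enumerate(taxa) for b in taxa[i + 1:]]
    best = min(pairs, key=lambda p: sum(
        x != y for x, y in zip(alignment[p[0]], alignment[p[1]])))
    return {frozenset(best)}


def bootstrap_support(alignment, build_groups, n_replicates=200, seed=0):
    """Percentage of bootstrap replicates in which each group is recovered."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_replicates):
        # Each replicate: resample sites, rebuild the groups, tally them.
        counts.update(build_groups(resample_sites(alignment, rng)))
    return {group: 100.0 * n / n_replicates for group, n in counts.items()}


if __name__ == "__main__":
    toy_alignment = {
        "A": "ACGTACGTAC",
        "B": "ACGTACGTTC",
        "C": "TCGAACCTAC",
        "D": "TCGATCCTAG",
    }
    for group, support in bootstrap_support(toy_alignment, closest_pair).items():
        print(sorted(group), f"{support:.0f}%")
```

In practice `build_groups` would be replaced by a function that infers a full tree from the resampled alignment and returns all of its groups, so that every part of the tree receives a support value.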

Statistics

In statistics, bootstrapping is a method for estimating the sampling distribution of an estimator by resampling with replacement from the original sample. It is distinct from the jackknife procedure, which recomputes the estimate while leaving out one observation at a time and is used mainly to estimate the bias and variance of an estimator, and from cross-validation, which checks whether results hold up on data that were not used to obtain them. There are more complicated bootstraps for sampling without replacement, two-sample problems, regression, time series, hierarchical sampling, and other statistical problems.
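As a rough illustration of the basic (one-sample, with-replacement) bootstrap described above, the following Python sketch approximates the sampling distribution of an estimator and reads a simple percentile interval from it. The helper names `bootstrap_distribution` and `percentile_interval`, the default of 2000 resamples, and the example data are assumptions made for this example only; the percentile interval is one common choice among several.

```python
import random
import statistics


def bootstrap_distribution(sample, estimator, n_resamples=2000, seed=0):
    """Approximate the sampling distribution of `estimator` by applying it
    to resamples of the same size, drawn with replacement from `sample`."""
    rng = random.Random(seed)
    n = len(sample)
    return [estimator(rng.choices(sample, k=n)) for _ in range(n_resamples)]


def percentile_interval(values, level=0.95):
    """Simple percentile interval taken from the sorted bootstrap estimates."""
    ordered = sorted(values)
    alpha = (1.0 - level) / 2.0
    last = len(ordered) - 1
    return ordered[int(alpha * last)], ordered[int((1.0 - alpha) * last)]


if __name__ == "__main__":
    data = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8, 4.7]  # made-up observations
    estimates = bootstrap_distribution(data, statistics.mean)
    print("bootstrap standard error:", statistics.stdev(estimates))
    print("95% percentile interval:", percentile_interval(estimates))
```

Any function of a sample, such as `statistics.median`, can be passed in place of `statistics.mean`.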

See also particle filter for the general theory of Sequential Monte Carlo methods, as well as details on some common implementations.

References

Phylogenetics

  • Baldauf SL (2003). Phylogeny for the faint of heart: a tutorial. Trends in Genetics 19(6):345-351.
  • Hillis DM and Bull JJ (1993). An empirical test of bootstrapping as a method for assessing confidence in phylogenetic analyses. Systematic Biology 42:182-192.
  • Felsenstein J (1985). Confidence limits on phylogenies: an approach using the bootstrap. Evolution 39:783-791.

Statistics