Difference between revisions of "Superpose"
Revision as of 23:39, 31 July 2007
superpose - a structural alignment program based on the Secondary Structure Matching (SSM) graph-matching algorithm. It is part of the CCP4 package and was written by Eugene Krissinel of the European Bioinformatics Institute, Cambridge, UK.
Background
"While high sequence similarity almost always implies structural similarity, the opposite is not true. It is therefore expected that three-dimensional alignment will provide more significant clues to protein function and properties than sequence alignment alone".[1]
Most similarity measures are based on the evaluation of the size of common substructures, for example the length of alignment (the longer, the better), and a measure of the distance between them, such as r.m.s.d. (the lower, the better).
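As a minimal sketch of the distance measure mentioned above, the r.m.s.d. between two sets of paired atoms can be computed as follows. The function name `rmsd` and the coordinates are illustrative assumptions; the sketch assumes the atoms are already paired and the two structures already superposed, and it is not taken from the superpose source code.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of (x, y, z) points.

    Assumes coords_a[i] is paired with coords_b[i] and that the two
    structures have already been superposed.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

# Hypothetical coordinates, for illustration only:
# every point in b is shifted by 1 unit along z relative to a.
a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (2.0, 0.0, 1.0)]
print(rmsd(a, b))  # 1.0
```

A longer alignment with a lower r.m.s.d. indicates a better match, so the two numbers are usually reported together.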
The graph-theoretical approach typically includes three major steps:
- graph representation of the objects in question;
- matching the graphs representing the objects; and
- evaluating the common subgraphs found in order to form conclusions about similarity.
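The three steps above can be sketched on a toy example. This is an illustration under stated assumptions, not the SSM algorithm itself: each SSE is reduced to a type label and a single point, edges carry only the distance between points, matching is done by exhaustive backtracking, and the final score (matched fraction of the smaller graph) is a hypothetical choice. All names (`make_graph`, `largest_common_subgraph`, the element names) are made up for this sketch.

```python
import math
from itertools import combinations

def make_graph(elements):
    """Step 1: build a complete labelled graph from {name: (type, (x, y, z))}."""
    labels = {name: kind for name, (kind, _) in elements.items()}
    dist = {
        frozenset((p, q)): math.dist(elements[p][1], elements[q][1])
        for p, q in combinations(elements, 2)
    }
    return {"labels": labels, "dist": dist}

def compatible(g1, g2, mapping, v1, v2, tol):
    """True if v1 -> v2 agrees in type and in distances with all pairs in mapping."""
    if g1["labels"][v1] != g2["labels"][v2]:
        return False
    for u1, u2 in mapping.items():
        d1 = g1["dist"][frozenset((u1, v1))]
        d2 = g2["dist"][frozenset((u2, v2))]
        if abs(d1 - d2) > tol:
            return False
    return True

def largest_common_subgraph(g1, g2, tol=1.0):
    """Step 2: exhaustive backtracking search for the largest vertex matching."""
    verts1, verts2 = list(g1["labels"]), list(g2["labels"])
    best = {}

    def extend(i, mapping):
        nonlocal best
        if len(mapping) > len(best):
            best = dict(mapping)
        # Stop when no vertices remain or the current best cannot be beaten.
        if i == len(verts1) or len(mapping) + len(verts1) - i <= len(best):
            return
        v1 = verts1[i]
        for v2 in verts2:
            if v2 not in mapping.values() and compatible(g1, g2, mapping, v1, v2, tol):
                mapping[v1] = v2
                extend(i + 1, mapping)
                del mapping[v1]
        extend(i + 1, mapping)  # also try leaving v1 unmatched

    extend(0, {})
    return best

# Hypothetical SSEs: "H" = helix, "E" = strand; coordinates are invented.
g1 = make_graph({"H1": ("H", (0, 0, 0)), "E1": ("E", (5, 0, 0)), "E2": ("E", (5, 5, 0))})
g2 = make_graph({"A": ("H", (10, 10, 10)), "B": ("E", (15, 10, 10)),
                 "C": ("E", (15, 15, 10)), "D": ("H", (30, 30, 30))})

match = largest_common_subgraph(g1, g2)
# Step 3: evaluate the common subgraph (hypothetical score).
score = len(match) / min(len(g1["labels"]), len(g2["labels"]))
print(match, score)
```

Exhaustive search is exponential in the worst case, so it is only workable here because SSE graphs are small; a real implementation would need a more efficient matching procedure.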
Several approaches to protein structure alignment have been explored over the past decade. The techniques used include:
- comparison of distance matrices (DALI; Holm & Sander, 1993);
- analysis of differences in vector distance plots (Orengo & Taylor, 1996);
- minimization of the soap-bubble surface area between two protein backbones (Falicov & Cohen, 1996);
- dynamic programming on pairwise distances between the proteins' residues (Subbiah et al., 1993; Gerstein & Levitt, 1996, 1998);
- secondary-structure elements (SSEs) (Singh & Brutlag, 1997);
- three-dimensional clustering (Vriend & Sander, 1991; Mizuguchi & Go, 1995);
- graph theory (Mitchell et al., 1990; Alexandrov, 1996; Grindley et al., 1993);
- combinatorial extension of alignment path (CE; Shindyalov & Bourne, 1998);
- vector alignment of SSEs (VAST; Gibrat et al., 1996);
- depth-first recursive search on SSE (DEJAVU; Kleywegt & Jones, 1997); and
- many others (Zuker & Somorjai, 1989; Taylor & Orengo, 1989; Godzik & Skolnick, 1994; Russell & Barton, 1992; Sali & Blundell, 1990; Barakat & Dean, 1991; Leluk et al., 2003; Jung & Lee, 2000; Kato & Takahashi, 2001).
Most details of a protein fold may be expressed in terms of just two types of SSEs, namely helices (including the helix type) and strands.
Usually the connectivity of SSEs is significant; however, there are situations where it may or should be neglected (e.g. comparison of mutated or engineered proteins, or geometry of active sites). This is the case of interest here: two three-dimensional SSE graphs can be geometrically identical yet differ in the connectivity between their SSEs. Connectivity is handled in one of the following ways:
- Connectivity of SSEs is neglected;
- "Soft" connectivity: The general order of matched SSEs along their protein chains is the same in both structures, but any number of missing or unmatched SSEs between the matched ones is allowed; and
- "Strict" connectivity: Matched SSEs follow the same order along their protein chains and may be separated only by an equal number of matched or unmatched SSEs in both structures.
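The three connectivity options above can be expressed as a check on a list of matched SSE pairs. The sketch below is an assumption-laden illustration, not code from superpose: it assumes a single chain per structure, numbers SSEs by their position along the chain, and takes a one-to-one list of `(i, j)` pairs meaning "SSE `i` of structure A is matched to SSE `j` of structure B". The function name and the mode names `"none"`, `"soft"` and `"strict"` are invented here.

```python
def satisfies_connectivity(pairs, mode):
    """Check a one-to-one SSE matching against a connectivity rule.

    pairs: list of (i, j) chain positions of matched SSEs in structures A and B.
    mode:  "none"   - connectivity is neglected;
           "soft"   - matched SSEs keep the same order in both chains;
           "strict" - same order, and consecutive matched SSEs are separated
                      by the same number of SSEs in both chains.
    """
    if mode not in ("none", "soft", "strict"):
        raise ValueError(f"unknown mode: {mode}")
    if mode == "none":
        return True
    ordered = sorted(pairs)  # sort by position in structure A
    for (i0, j0), (i1, j1) in zip(ordered, ordered[1:]):
        if j1 <= j0:
            return False  # order along chain B is reversed
        if mode == "strict" and (i1 - i0) != (j1 - j0):
            return False  # different number of SSEs in between
    return True

# Hypothetical matchings, for illustration only.
swapped = [(0, 1), (2, 0)]          # order differs between the chains
gapped = [(0, 0), (1, 2), (3, 3)]   # same order, unequal gaps
regular = [(0, 1), (2, 3), (3, 4)]  # same order, equal gaps

for name, pairs in (("swapped", swapped), ("gapped", gapped), ("regular", regular)):
    print(name, [satisfies_connectivity(pairs, m) for m in ("none", "soft", "strict")])
```

Each rule is a restriction of the previous one, so any matching accepted under strict connectivity is also accepted under soft connectivity.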
Keywords
secondary-structure elements (SSEs), singular value decomposition (SVD; of the correlation matrix, following the method described by Lesk, 1986), RMSD
See also
Web servers
- DALI
- VAST
- CE
- DEJAVU
Related
- PROMOTIF algorithm (Hutchinson & Thornton, 1996) — aids in calculating SSEs
- CONTACT bricking algorithm (e.g., Tadeusz Skarzynski in CCP4 suite) — computes various types of contacts in protein structures.
- NCONT — analyses contacts between subsets of atoms in a PDB file.
References
- [1] Krissinel E, Henrick K (2004). "Secondary-structure matching (SSM), a new tool for fast protein structure alignment in three dimensions". Acta Cryst, D60:2256-2268.
Further reading
- Rouvray et al., 1979, and references therein — addresses the problems of structure comparison and recognition by the graph-theoretical approach.