Difference between revisions of "Skittles"
Revision as of 02:33, 9 May 2012
Skittles (aka SKTTLS) is a program I helped to write that runs several kinds of tests on a TLS model, comparing the ADPs of the atoms in bonds between residues.[1] The code was written in C, Fortran, and Python. It generates a "SKTTLS Report", which is written to the log file after the analysis of tensors. If a SKTOUT file is given, the same report is also written to that file, which can be used for plotting with programs such as gnuplot.
Description
Abstract: The use of TLS (translation/libration/screw) models to describe anisotropic displacement of atoms within a protein crystal structure has become increasingly common. These models may be used purely as an improved methodology for crystallographic refinement or as the basis for analyzing inter-domain and other large-scale motions implied by the crystal structure. In either case it is desirable to validate that the crystallographic model, including the TLS description of anisotropy, conforms to our best understanding of protein structures and their modes of flexibility. A set of validation tests has been implemented that can be integrated into ongoing crystallographic refinement or run afterwards to evaluate a previously refined structure. In either case validation can serve to increase confidence that the model is correct, to highlight aspects of the model that may be improved or to strengthen the evidence supporting specific modes of flexibility inferred from the refined TLS model. Automated validation checks have been added to the PARVATI and TLSMD web servers and incorporated into the CCP4i user interface.[1]
SKTTLS Report
Several kinds of tests on the TLS model (SKTTLS) are reported following the tensor analysis; if a SKTOUT file is given, a copy of this SKTTLS report is also written to that file. Three residuals are calculated for each bond between protein or nucleic acid residues:
- CCuij: the correlation of anisotropic ADPs from Merritt (1999)[2] (ranges from 1 down);
- rSIMU: residual of the SIMU (BFAC) restraint, rmsd of ANISOU values (ranges from 0 up); and
- rDELU: residual of the DELU (RBON) Rosenfeld rigid bond restraint (ranges from 0 up).
The 95th and 99th percentile values of these three residuals were calculated from a survey of the REFMAC refinements with segmented TLS in the PDB as of September 2009. Extreme values of these residuals, e.g. CCuij below the 99th percentile, may indicate problems with the ADPs for the atoms in the bond: the TLS segment boundary was misassigned, or something went wrong during refinement, or one or both atoms have non-positive definite ADPs.
The SKTTLS report has summaries and the full table of these residuals plus a table of the distribution of anisotropies. Summaries include:
- the number of bonds for which residuals were counted, and the number of bonds with any residual beyond the 95th or 99th percentile;
- the mean, standard deviation and worst value of each residual for this structure;
- the mean and standard deviation anisotropy for all protein, nucleic acid and other non-solvent atoms in this structure;
- a list of up to 100 outliers, i.e. bonds in this structure that are beyond the 95th percentile for any residual. If there are more than 100 such outliers, this analysis should be run again after whatever caused those problems is fixed.
The table of all residuals, SKTTLS TABLE 1, and the table of anisotropy distributions, SKTTLS TABLE 2, are formatted for loggraph or xloggraph plotting. In addition, the files are formatted so that simple extracts of these tables can be plotted by programs such as gnuplot: all non-blank lines other than table data lines are prefixed with '#', and each table is bracketed by lines of the form "### START OF SKTTLS TABLE N ###" and "### END OF SKTTLS TABLE N ###".
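The bracketing convention described above lends itself to a small extraction script. The following is a minimal sketch in Python, not part of Skittles itself: the function name `extract_table` is hypothetical, and it assumes only what is stated above, namely that each table sits between its START and END marker lines and that every non-data line inside is either blank or prefixed with '#'. The column layout of the data lines is not assumed; fields are simply split on whitespace.

```python
import re

# Marker lines as described in the text; N is the table number.
START = re.compile(r"^### START OF SKTTLS TABLE (\d+) ###")
END = re.compile(r"^### END OF SKTTLS TABLE (\d+) ###")


def extract_table(lines, table_number):
    """Return the data lines of one SKTTLS table as lists of string fields.

    Hypothetical helper: markers are checked before the '#' test because
    the marker lines themselves start with '#'.
    """
    rows = []
    inside = False
    for line in lines:
        stripped = line.strip()
        start = START.match(stripped)
        if start and int(start.group(1)) == table_number:
            inside = True
            continue
        end = END.match(stripped)
        if end and int(end.group(1)) == table_number:
            break
        # Keep only table data: skip blank lines and '#'-prefixed lines.
        if inside and stripped and not stripped.startswith("#"):
            rows.append(stripped.split())
    return rows
```

To use it on a real report, pass the lines of the SKTOUT file, for example `extract_table(open(path).read().splitlines(), 1)`, where `path` is wherever the SKTOUT file was written. Whether the percentile lines at the top of TABLE 1 are '#'-prefixed is not stated here, so check the first rows before plotting.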
- SKTTLS TABLE 1: contains bond identifiers, TLS group identifiers, and residual values. Bond identifiers are residue number, chain ID as text and as a number (to facilitate plotting of separate chains in gnuplot) and bond name, e.g. C1-N2 or, if there are alternate atoms, C(A)1-N(A)2. TLS group IDs are TLS group number (numbered sequentially from 1 by their order in TLSIN, rather than the given group number) and a point at each segment break. The opposite of rDELU is included after the residual values so that all 3 residuals can be plotted by loggraph with minimal overlapping. The first two lines of the table contain values for the 95th and 99th percentile of each residual.
- SKTTLS TABLE 2: contains anisotropy bin minimum, maximum and center, and fractions and counts for protein, nucleic acid or other non-solvent atoms. Non-positive definite ADPs are counted in the first bin, -0.05 to 0.00; isotropic ADPs are counted in the last bin, 1.00 to 1.00.
- SKTTLS TABLE 3: is the list of outliers, containing bond identifiers and all three residual values for each outlying bond. Marks (? or !) show which residuals were beyond the 95th or 99th percentile. This table is not formatted for plotting.
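The outlier marking can be illustrated with a short Python sketch. This is an illustration under assumptions, not the Skittles code: the function name `outlier_mark` is hypothetical, it assumes that '?' corresponds to the 95th percentile and '!' to the 99th (the order in which the marks and percentiles are listed suggests this, but it is not stated explicitly), and the threshold values must come from the report itself. Because CCuij ranges from 1 down while rSIMU and rDELU range from 0 up, "beyond" means lower for CCuij and higher for the other two.

```python
def outlier_mark(value, p95, p99, higher_is_worse=True):
    """Return '!' if value is beyond p99, '?' if beyond p95, else ''.

    Hypothetical helper. Use higher_is_worse=False for CCuij, where
    worse values are lower; the comparison is flipped by negating.
    """
    if not higher_is_worse:
        value, p95, p99 = -value, -p95, -p99
    if value > p99:
        return "!"
    if value > p95:
        return "?"
    return ""
```

A bond would then appear in the outlier list if any of its three residuals gets a non-empty mark.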
NOTE: This is a change to the previous behaviour (CCP4 6.1.1 and earlier) where no ANISOU records from XYZIN were written to XYZOUT.
Equations
- $r_{\mathrm{SIMU}} = \left[ \frac{1}{6} \sum_{ij} (U_{ij} - V_{ij})^2 \right]^{1/2}$
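As a worked example, the rSIMU formula can be written in a few lines of Python. This is a sketch under an assumption: it reads the formula as the root mean square difference over the six unique ADP components of the two bonded atoms, with U and V taken to be those two atoms' tensors, consistent with the "rmsd of ANISOU values" description and the 1/6 factor. The function name `r_simu` is hypothetical.

```python
import math


def r_simu(u, v):
    """Root mean square difference between two sets of six ADP components.

    u and v are sequences of the six unique Uij values of the two atoms
    in a bond (assumed to be given in the same order and units).
    """
    if len(u) != 6 or len(v) != 6:
        raise ValueError("expected six ADP components per atom")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)) / 6.0)
```

Two atoms with identical ADPs give a residual of 0, and the value grows as the components diverge, matching the "ranges from 0 up" description.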
See also
References
- [1] Zucker F, Champ PC, Merritt EA (2010). "Validation of crystallographic models containing TLS or other descriptions of anisotropy". Acta Cryst., D66:889-900. DOI:10.1107/S0907444910020421
- [2] Merritt EA (1999). "Comparing anisotropic displacement parameters in protein structures". Acta Crystallogr D Biol Crystallogr., 55(Pt 12):1997-2004. pmid:10666575.