PDB - Revision history: Christoph at 06:45, 25 April 2007 (2007-04-25T06:45:42Z)<p></p>
<p><b>New page</b></p><div>The '''Protein Data Bank''' ('''PDB''') is a repository for 3-D structural data of proteins and nucleic acids. This data, typically obtained by [[:Category:Crystallography|X-ray crystallography]] or NMR spectroscopy, is submitted by biologists and biochemists from around the world, is released into the public domain, and can be accessed for free.<br />
<br />
==ATOM coordinates format overview==<br />
The ATOM records present the atomic coordinates for standard residues (see http://deposit.pdb.org/public-component-erf.cif). They also present the occupancy and temperature factor for each atom. Heterogen coordinates use the HETATM record type. The element symbol is always present on each ATOM record; segment identifier and charge are optional.<br />
<br />
*Record Format<br />
<pre><br />
COLUMNS      DATA TYPE       FIELD         DEFINITION<br />
--------------------------------------------------------------------------------<br />
 1 -  6      Record name     "ATOM  "<br />
 7 - 11      Integer         serial        Atom serial number.<br />
13 - 16      Atom            name          Atom name.<br />
17           Character       altLoc        Alternate location indicator.<br />
18 - 20      Residue name    resName       Residue name.<br />
22           Character       chainID       Chain identifier.<br />
23 - 26      Integer         resSeq        Residue sequence number.<br />
27           AChar           iCode         Code for insertion of residues.<br />
31 - 38      Real(8.3)       x             Orthogonal coordinates for X in Angstroms.<br />
39 - 46      Real(8.3)       y             Orthogonal coordinates for Y in Angstroms.<br />
47 - 54      Real(8.3)       z             Orthogonal coordinates for Z in Angstroms.<br />
55 - 60      Real(6.2)       occupancy     Occupancy.<br />
61 - 66      Real(6.2)       tempFactor    Temperature factor.<br />
77 - 78      LString(2)      element       Element symbol, right-justified.<br />
79 - 80      LString(2)      charge        Charge on the atom.<br />
</pre><br />
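Because the record is column-based rather than whitespace-delimited, a reader has to slice each line at fixed positions. The following is a minimal Python sketch of that idea, following the column table above; the names <code>AtomRecord</code> and <code>parse_atom_line</code> are illustrative and not part of any PDB library.<br />

```python
from typing import NamedTuple


class AtomRecord(NamedTuple):
    record: str
    serial: int
    name: str
    alt_loc: str
    res_name: str
    chain_id: str
    res_seq: int
    i_code: str
    x: float
    y: float
    z: float
    occupancy: float
    temp_factor: float
    element: str
    charge: str


def parse_atom_line(line: str) -> AtomRecord:
    """Parse one ATOM/HETATM line by fixed columns (1-based columns in the table)."""
    line = line.rstrip("\n").ljust(80)  # pad so that short lines slice safely
    return AtomRecord(
        record=line[0:6].strip(),        # columns  1 -  6
        serial=int(line[6:11]),          # columns  7 - 11
        name=line[12:16].strip(),        # columns 13 - 16
        alt_loc=line[16].strip(),        # column  17
        res_name=line[17:20].strip(),    # columns 18 - 20
        chain_id=line[21].strip(),       # column  22
        res_seq=int(line[22:26]),        # columns 23 - 26
        i_code=line[26].strip(),         # column  27
        x=float(line[30:38]),            # columns 31 - 38
        y=float(line[38:46]),            # columns 39 - 46
        z=float(line[46:54]),            # columns 47 - 54
        occupancy=float(line[54:60]),    # columns 55 - 60
        temp_factor=float(line[60:66]),  # columns 61 - 66
        element=line[76:78].strip(),     # columns 77 - 78
        charge=line[78:80].strip(),      # columns 79 - 80 (usually blank)
    )


example = "ATOM    149  CB AVAL A  25      30.385  17.437  57.230  0.28 13.88           C"
atom = parse_atom_line(example)
print(atom.name, atom.alt_loc, atom.res_name, atom.res_seq, atom.x, atom.element)
```

Splitting on whitespace instead would fail on lines such as the one above, where the atom name, alternate location indicator, and residue name run together as <code>CB AVAL</code>.<br />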
# ATOM records for proteins are listed from amino to carboxyl terminus.<br />
# Nucleic acid residues are listed from the 5' to the 3' terminus.<br />
# No ordering is specified for polysaccharides.<br />
# The list of ATOM records in a chain is terminated by a TER record.<br />
# If more than one model is present in the entry, each model is delimited by MODEL and ENDMDL records.<br />
# If an atom is provided in more than one position, a non-blank alternate location indicator must be used for each of the positions. Within a residue, all atoms that are associated with each other in a given conformation are assigned the same alternate location indicator.<br />
# For atoms in alternate sites, sorting of atoms in the ATOM/HETATM list uses the following general rules:<br />
#*In the simple case that involves a few atoms or a few residues with alternate sites, the coordinates occur one after the other in the entry.<br />
#*In the case of large heterogen groups that are disordered, the atoms for each conformer are listed together.<br />
# The insertion code is commonly used in sequence numbering to distinguish residues inserted relative to a reference numbering scheme (e.g., residues 52, 52A, and 52B).<br />
# If the depositor provides the data, then the isotropic B value is given for the temperature factor.<br />
# If there are neither isotropic B values from the depositor, nor anisotropic temperature factors in ANISOU, then the default value of 0.0 is used for the temperature factor.<br />
# Columns 77 - 78 contain the atom's element symbol (as given in the periodic table), right-justified.<br />
# Columns 79 - 80 indicate any charge on the atom, e.g., 2+, 1-. In most cases these are blank.<br />
<br />
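The note above on MODEL and ENDMDL records can be sketched in Python as a simple grouping pass. This is only an illustration: the function name <code>split_models</code> is hypothetical, the model serial number is assumed to sit in columns 11 - 14 of the MODEL record (taken from the PDB format guide, not from the table on this page), and an entry without MODEL records is assumed to be a single model numbered 1.<br />

```python
def split_models(lines):
    """Group ATOM/HETATM lines by the model they belong to."""
    models = {}
    current = 1  # assumption: no MODEL record means a single model, numbered 1
    for line in lines:
        record = line[0:6].strip()
        if record == "MODEL":
            current = int(line[10:14])  # assumed columns 11 - 14: model serial
        elif record in ("ATOM", "HETATM"):
            models.setdefault(current, []).append(line)
        # TER and ENDMDL only mark chain/model ends; nothing to collect here.
    return models


# Tiny hand-made entry; MODEL lines are built so the serial lands in columns 11 - 14.
entry = [
    "MODEL".ljust(10) + "1".rjust(4),
    "ATOM      1  N   GLY A   1",
    "TER",
    "ENDMDL",
    "MODEL".ljust(10) + "2".rjust(4),
    "ATOM      1  N   GLY A   1",
    "ATOM      2  CA  GLY A   1",
    "TER",
    "ENDMDL",
]
models = split_models(entry)
print({number: len(atoms) for number, atoms in models.items()})  # {1: 1, 2: 2}
```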
*Verification/Validation/Value Authority Control<br />
<br />
The PDB checks <code>ATOM/HETATM</code> records for conformance to the PDB format, for sequence information, and for packing. The PDB reserves the right to return deposited coordinates to the author for transformation into PDB format.<br />
<br />
*Relationships to Other Record Types<br />
<br />
The <code>ATOM</code> records are compared to the corresponding sequence database. Residue discrepancies appear in the <code>SEQADV</code> record. Missing atoms are annotated in the remarks. <code>HETATM</code> records are formatted in the same way as <code>ATOM</code> records. The sequence implied by <code>ATOM</code> records must be identical to that given in <code>SEQRES</code>, except that residues without coordinates (e.g., due to disorder) must still appear in <code>SEQRES</code>.<br />
<br />
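As a rough illustration of the relationship described above, the sequence implied by the <code>ATOM</code> records can be collected per chain and checked against a <code>SEQRES</code> sequence. The sketch below is a simplification under stated assumptions: the helper names <code>implied_sequence</code> and <code>is_subsequence</code> are hypothetical, the <code>SEQRES</code> sequence is supplied by hand as a list rather than parsed, and the check only tests that the observed residues appear in order, not where any gaps fall.<br />

```python
def implied_sequence(atom_lines, chain_id):
    """Residue names for one chain, one entry per residue, in file order."""
    sequence = []
    last_key = None
    for line in atom_lines:
        line = line.ljust(80)  # pad so that short lines slice safely
        if line[0:6].strip() != "ATOM" or line[21] != chain_id:
            continue
        key = (line[22:26], line[26])  # resSeq plus iCode identify a residue
        if key != last_key:
            sequence.append(line[17:20].strip())
            last_key = key
    return sequence


def is_subsequence(observed, full):
    """True if `observed` can be obtained from `full` by deleting residues."""
    remaining = iter(full)
    return all(residue in remaining for residue in observed)


# Hand-made records: residue 2 has no coordinates (e.g., it is disordered).
atom_lines = [
    "ATOM      1  N   GLY A   1",
    "ATOM      2  CA  GLY A   1",
    "ATOM      3  N   VAL A   3",
]
seqres = ["GLY", "ALA", "VAL"]  # hypothetical SEQRES content for chain A

observed = implied_sequence(atom_lines, "A")
print(observed, is_subsequence(observed, seqres))
```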
*Example<br />
<pre><br />
         1         2         3         4         5         6         7         8<br />
12345678901234567890123456789012345678901234567890123456789012345678901234567890<br />
ATOM    145  N   VAL A  25      32.433  16.336  57.540  1.00 11.92           N<br />
ATOM    146  CA  VAL A  25      31.132  16.439  58.160  1.00 11.85           C<br />
ATOM    147  C   VAL A  25      30.447  15.105  58.363  1.00 12.34           C<br />
ATOM    148  O   VAL A  25      29.520  15.059  59.174  1.00 15.65           O<br />
ATOM    149  CB AVAL A  25      30.385  17.437  57.230  0.28 13.88           C<br />
ATOM    150  CB BVAL A  25      30.166  17.399  57.373  0.72 15.41           C<br />
ATOM    151  CG1AVAL A  25      28.870  17.401  57.336  0.28 12.64           C<br />
ATOM    152  CG1BVAL A  25      30.805  18.788  57.449  0.72 15.11           C<br />
ATOM    153  CG2AVAL A  25      30.835  18.826  57.661  0.28 13.58           C<br />
ATOM    154  CG2BVAL A  25      29.909  16.996  55.922  0.72 13.25           C<br />
</pre><br />
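The example above lists two alternate locations (A and B) for the side-chain atoms of VAL 25. One common simplification, sketched below in Python, is to keep a single position per atom by choosing the one with the highest occupancy. The function name <code>keep_highest_occupancy</code> is hypothetical, and the per-atom choice is an assumption of this sketch: if occupancies differ between atoms of one residue, it could mix conformers, so a stricter tool would pick one alternate location per residue instead.<br />

```python
def keep_highest_occupancy(atom_lines):
    """Keep one line per atom, preferring the alternate location with the
    highest occupancy. Atoms without alternates pass through unchanged."""
    best = {}
    for line in atom_lines:
        # chain, resSeq + iCode, and atom name identify an atom across altLocs
        key = (line[21], line[22:27], line[12:16])
        occupancy = float(line[54:60])
        if key not in best or occupancy > float(best[key][54:60]):
            best[key] = line
    return list(best.values())


records = [
    "ATOM    145  N   VAL A  25      32.433  16.336  57.540  1.00 11.92           N",
    "ATOM    149  CB AVAL A  25      30.385  17.437  57.230  0.28 13.88           C",
    "ATOM    150  CB BVAL A  25      30.166  17.399  57.373  0.72 15.41           C",
    "ATOM    151  CG1AVAL A  25      28.870  17.401  57.336  0.28 12.64           C",
    "ATOM    152  CG1BVAL A  25      30.805  18.788  57.449  0.72 15.11           C",
]
kept = keep_highest_occupancy(records)
print([int(line[6:11]) for line in kept])  # [145, 150, 152]
```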
<br />
*Known Problems<br />
No distinction is made between ribo- and deoxyribonucleotides in the SEQRES records. These residues are identified with the same residue name (i.e., <code>A, C, G, T, U</code>). <br />
<br />
==External links==<br />
*[http://www.rcsb.org/ RCSB PDB]<br />
*[http://www.wwpdb.org/documentation/format23/v2.3.html Protein Data Bank Contents Guide: Atomic Coordinate Entry Format Description] &mdash; version 2.3; 1998-07-09.<br />
*[[wikipedia:Protein Data Bank]]<br />
*[[wikipedia:Protein Data Bank (file format)]]<br />
<br />
[[Category:Bioinformatics]]<br />
[[Category:Crystallography]]</div>Christoph