Amino acid

In chemistry, an amino acid is a molecule that contains both amine and carboxyl functional groups. In biochemistry, the term usually refers to alpha-amino acids, which have the general formula NH₂CHRCOOH, where R is the side chain.^[1]

Amino acid atoms

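The table below lists the heavy (non-hydrogen) atoms expected for each amino acid, identified by one-letter code. Columns 1 to 14 give the position of each atom name within the residue: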

 aa   1 2  3 4 5  6   7   8   9   10  11  12  13  14 
 A:   N CA C O CB                                      : Alanine
 V:   N CA C O CB CG1 CG2                              : Valine
 L:   N CA C O CB CG  CD1 CD2                          : Leucine
 I:   N CA C O CB CG1 CG2 CD1                          : Isoleucine
 P:   N CA C O CB CG  CD                               : Proline
 M:   N CA C O CB CG  SD  CE                           : Methionine
 F:   N CA C O CB CG  CD1 CD2 CE1 CE2 CZ               : Phenylalanine
 W:   N CA C O CB CG  CD1 CD2 NE1 CE2 CE3 CZ2 CZ3 CH2  : Tryptophan
 G:   N CA C O                                         : Glycine
 S:   N CA C O CB OG                                   : Serine
 T:   N CA C O CB OG1 CG2                              : Threonine
 C:   N CA C O CB SG                                   : Cysteine
 Y:   N CA C O CB CG  CD1 CD2 CE1 CE2 CZ  OH           : Tyrosine
 N:   N CA C O CB CG  OD1 ND2                          : Asparagine
 Q:   N CA C O CB CG  CD  OE1 NE2                      : Glutamine
 D:   N CA C O CB CG  OD1 OD2                          : Aspartic acid
 E:   N CA C O CB CG  CD  OE1 OE2                      : Glutamic acid
 K:   N CA C O CB CG  CD  CE  NZ                       : Lysine
 R:   N CA C O CB CG  CD  NE  CZ  NH1 NH2              : Arginine
 H:   N CA C O CB CG  ND1 CD2 CE1 NE2                  : Histidine
 X:   N CA C O CB                                      : Nonstandard (ATOM or HETATM records)
 #:   N CA C O                                         : Unknown (ATOM records)
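
The table above lends itself to a simple completeness check for a residue. The following is a minimal sketch, not a reference implementation: the names `ATOM_TABLE`, `parse_atom_table`, `EXPECTED_ATOMS` and `missing_atoms` are hypothetical, and only the 20 standard residues are included (the `X` and `#` rows are left out).

```python
# Rows copied from the table above: one-letter code, then the expected atom names.
ATOM_TABLE = """\
A: N CA C O CB
V: N CA C O CB CG1 CG2
L: N CA C O CB CG CD1 CD2
I: N CA C O CB CG1 CG2 CD1
P: N CA C O CB CG CD
M: N CA C O CB CG SD CE
F: N CA C O CB CG CD1 CD2 CE1 CE2 CZ
W: N CA C O CB CG CD1 CD2 NE1 CE2 CE3 CZ2 CZ3 CH2
G: N CA C O
S: N CA C O CB OG
T: N CA C O CB OG1 CG2
C: N CA C O CB SG
Y: N CA C O CB CG CD1 CD2 CE1 CE2 CZ OH
N: N CA C O CB CG OD1 ND2
Q: N CA C O CB CG CD OE1 NE2
D: N CA C O CB CG OD1 OD2
E: N CA C O CB CG CD OE1 OE2
K: N CA C O CB CG CD CE NZ
R: N CA C O CB CG CD NE CZ NH1 NH2
H: N CA C O CB CG ND1 CD2 CE1 NE2
"""


def parse_atom_table(text):
    """Turn 'X: atom atom ...' rows into {one-letter code: [atom names]}."""
    table = {}
    for line in text.splitlines():
        code, _, atoms = line.partition(":")
        table[code.strip()] = atoms.split()
    return table


EXPECTED_ATOMS = parse_atom_table(ATOM_TABLE)


def missing_atoms(residue, observed):
    """Return the expected atom names absent from `observed`, in table order."""
    seen = set(observed)
    return [atom for atom in EXPECTED_ATOMS[residue] if atom not in seen]


# A serine whose side-chain oxygen was not resolved:
print(missing_atoms("S", ["N", "CA", "C", "O", "CB"]))  # ['OG']
```

Keeping the table as text and parsing it, rather than retyping it as a dictionary, makes it easier to compare the code against the table row by row.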

Reduced (redundant or simplified) alphabets for proteins

Two letters alphabet^[2]^[3]

AGTSNQDEHRKP => P: Hydrophilic
CMFILVWY     => H: Hydrophobic
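
Applying a reduced alphabet amounts to replacing each residue by the letter of its group. The sketch below illustrates this with the two-letter alphabet above; the function names, the example sequence and the choice of `X` for unrecognized characters are assumptions made for illustration.

```python
# Two-letter alphabet from the listing above: group letter -> member residues.
HP_GROUPS = {
    "P": "AGTSNQDEHRKP",  # hydrophilic
    "H": "CMFILVWY",      # hydrophobic
}


def build_lookup(groups):
    """Invert {letter: members} into {residue: letter}, rejecting overlaps."""
    lookup = {}
    for letter, members in groups.items():
        for residue in members:
            if residue in lookup:
                raise ValueError(f"{residue} is assigned to more than one group")
            lookup[residue] = letter
    return lookup


def reduce_sequence(sequence, lookup, unknown="X"):
    """Rewrite a one-letter protein sequence in the reduced alphabet."""
    return "".join(lookup.get(residue, unknown) for residue in sequence.upper())


HP_LOOKUP = build_lookup(HP_GROUPS)
print(reduce_sequence("MKTAYIAKQR", HP_LOOKUP))  # HPPPHHPPPP
```

The same two functions work for any of the alphabets on this page, provided each group has its own letter.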

Five letters alphabet: Chemical / structural properties^[4]

IVL   => A: Aliphatic
FYWH  => R: Aromatic
KRDE  => C: Charged
GACS  => T: Tiny
TMQNP => D: Diverse

Six letters alphabet: Chemical / structural properties #2^[4]

IVL   => A: Aliphatic
FYWH  => R: Aromatic
KR    => C: Pos. charged
DE    => C: Neg. charged
GACS  => T: Tiny
TMQNP => D: Diverse

3 IMGT amino acid hydropathy alphabet^[5]

IVLFCMAW => H: Hydrophobic
GTSYPH   => N: Neutral
DNEQKR   => P: Hydrophilic

5 IMGT amino acid volume alphabet (volumes in Å³)^[5]

GAS   => G: 60-90
CDPNT => C: 108-117
EVQH  => E: 138-154
MILKR => M: 162-174
FYW   => F: 189-228

11 IMGT amino acid chemical characteristics alphabet^[5]

AVIL => A: Aliphatic
F    => F: Phenylalanine
CM   => C: Sulfur
G    => G: Glycine
ST   => S: Hydroxyl
W    => W: Tryptophan
Y    => Y: Tyrosine
P    => P: Proline
DE   => D: Acidic
NQ   => N: Amide
HKR  => H: Basic

Murphy et al., 2000; 15 letters alphabet^[6]

LVIM => L: Large hydrophobic
C    => C
A    => A
G    => G
S    => S
T    => T
P    => P
FY   => F: Hydrophobic/aromatic sidechains
W    => W
E    => E
D    => D
N    => N
Q    => Q
KR   => K: Long-chain positively charged
H    => H

Murphy et al., 2000; 10 letters alphabet^[6]

LVIM => L: Large hydrophobic
C    => C
A    => A
G    => G
ST   => S: Polar
P    => P
FYW  => F: Hydrophobic/aromatic sidechains
EDNQ => E: Charged / polar
KR   => K: Long-chain positively charged
H    => H

Murphy et al., 2000; 8 letters alphabet^[6]

LVIMC => L: Hydrophobic
AG    => A
ST    => S: Polar
P     => P
FYW   => F: Hydrophobic/aromatic sidechains
EDNQ  => E
KR    => K: Long-chain positively charged
H     => H

Murphy et al., 2000; 4 letters alphabet^[6]

LVIMC   => L: Hydrophobic
AGSTP   => A
FYW     => F: Hydrophobic/aromatic sidechains
EDNQKRH => E

Murphy et al., 2000; 2 letters alphabet^[6]

LVIMCAGSTPFYW => P: Hydrophobic
EDNQKRH       => E: Hydrophilic
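
The five Murphy et al. alphabets listed above appear to be nested: each smaller alphabet is obtained by merging groups of the next larger one. The following sketch checks this property on the groupings exactly as they are listed here; the names `MURPHY`, `is_partition` and `refines` are hypothetical.

```python
STANDARD = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard one-letter codes

# Groupings copied from the listings above, keyed by alphabet size.
MURPHY = {
    15: ["LVIM", "C", "A", "G", "S", "T", "P", "FY", "W",
         "E", "D", "N", "Q", "KR", "H"],
    10: ["LVIM", "C", "A", "G", "ST", "P", "FYW", "EDNQ", "KR", "H"],
    8: ["LVIMC", "AG", "ST", "P", "FYW", "EDNQ", "KR", "H"],
    4: ["LVIMC", "AGSTP", "FYW", "EDNQKRH"],
    2: ["LVIMCAGSTPFYW", "EDNQKRH"],
}


def is_partition(groups, alphabet=STANDARD):
    """True if every residue of `alphabet` occurs in exactly one group."""
    return sorted("".join(groups)) == sorted(alphabet)


def refines(fine, coarse):
    """True if every group of `fine` lies entirely inside one group of `coarse`."""
    coarse_sets = [set(group) for group in coarse]
    return all(any(set(group) <= c for c in coarse_sets) for group in fine)


sizes = sorted(MURPHY, reverse=True)  # [15, 10, 8, 4, 2]
for size in sizes:
    print(size, "letters: partition =", is_partition(MURPHY[size]))
for fine, coarse in zip(sizes, sizes[1:]):
    print(fine, "->", coarse, ": nested =", refines(MURPHY[fine], MURPHY[coarse]))
```

A partition check of this kind is also a quick way to spot a residue that is listed twice or left out of an alphabet.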

Wang & Wang, 1999; 5 letters alphabet^[7]

CMFILVWY => I
ATH      => A
GP       => G
DE       => E
SNQRK    => K

Wang & Wang, 1999; 5 letters variant alphabet^[7]

CMFI => I
LVWY => L
ATGS => A
NQDE => E
HPRK => K

Wang & Wang, 1999; 3 letters alphabet^[7]

CMFILVWY => I
ATHGPR   => A
DESNQK   => E

Wang & Wang, 1999; 2 letters alphabet^[7]

CMFILVWY     => I
ATHGPRDESNQK => A

Li et al., 2003; 10 letters alphabet^[8]

C   => C
FYW => Y
ML  => L
IV  => V
G   => G
P   => P
ATS => S
NH  => N
QED => E
RK  => K

Li et al., 2003; 5 letters alphabet^[8]

CFYW    => Y
MLIV    => I
G       => G
PATS    => S
NHQEDRK => E

Li et al., 2003; 4 letters alphabet^[8]

CFYW    => Y
MLIV    => I
GPATS   => S
NHQEDRK => E

Li et al., 2003; 3 letters alphabet^[8]

CFYWMLIV => I
GPATS    => S
NHQEDRK  => E

References

[1] Proline is an exception to this general formula: its side chain cyclizes onto the backbone nitrogen, so it has a secondary amine (NH) rather than an NH₂ group.
[2] Chan HS, Dill KA (1989). "Compact polymers". Macromolecules, 22:4559-4573.
[3] Lau KF, Dill KA (1989). "A lattice statistical mechanics model of the conformational and sequence spaces of proteins". Macromolecules, 22:3986-3997.
[4] Betts MJ, Russell RB (2003). "Amino acid properties and consequences of substitutions". In: Bioinformatics for Geneticists, M.R. Barnes, I.C. Gray eds, Wiley.
[5] Pommié C, Levadoux S, Sabatier R, Lefranc G, Lefranc MP (2004). "IMGT standardized criteria for statistical analysis of immunoglobulin V-REGION amino acid properties". Journal of Molecular Recognition, 17:17-32. PMID: 14872534
[6] Murphy LR, Wallqvist A, Levy RM (2000). "Simplified amino acid alphabets for protein fold recognition and implications for folding". Protein Eng, 13:149-152. PMID: 10775656
[7] Wang J, Wang W (1999). "A computational approach to simplifying the protein folding alphabet". Nat Struct Biol, 6(11):1033-1038. PMID: 10542095
[8] Li T, Fan K, Wang J, Wang W (2003). "Reduction of protein sequence complexity by residue grouping". Protein Eng, 16(5):323-330. PMID: 12826723
