Nomenclature and Symbolism for Amino Acids and Peptides

3AA-14 to 3AA-16


Part 2. Symbolism

Part 2, Section A: THE THREE-LETTER SYSTEM (a revision and updating of [10])

3AA-14. GENERAL CONSIDERATIONS ON THREE-LETTER SYMBOLS

14.1. The symbol chosen for an amino acid (Table 1) is derived from its trivial name, and is usually the first three letters of this name. It is written as one capital letter followed by two lower-case letters, e.g. Gln (not GLN or gln), regardless of its position in a sentence or structure. If any other convention is used in representing residues, e.g. to emphasize homology, this should be stated clearly whenever it is used. When the symbol is used for a purpose other than representing an amino-acid residue, e.g. to designate a genetic factor, three lower-case italic letters may be used, e.g. gln.
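
As an illustration of the capitalization rule in 14.1, the following is a minimal sketch in Python. The function names are hypothetical and not part of the recommendations. It checks only the letter case of a plain three-letter string; it does not check that the symbol is listed in Table 1, and it does not cover modified symbols such as 4Hyp or aIle.

```python
import re

# Pattern from 14.1: one capital letter followed by two lower-case letters.
_SYMBOL = re.compile(r"[A-Z][a-z]{2}")


def is_residue_symbol(text: str) -> bool:
    """Return True if text has the letter case recommended for residue symbols."""
    return _SYMBOL.fullmatch(text) is not None


def to_residue_symbol(text: str) -> str:
    """Recase a three-letter string such as 'GLN' or 'gln' to the form 'Gln'."""
    if len(text) != 3 or not text.isalpha():
        raise ValueError(f"not a three-letter symbol: {text!r}")
    return text[0].upper() + text[1:].lower()
```

For example, to_residue_symbol("GLN") gives "Gln", which is_residue_symbol accepts, whereas "GLN" and "gln" themselves are rejected.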

14.2. The main use of the symbols is in representing amino-acid sequences. Inasmuch as the symbols by themselves represent the unsubstituted amino acids, they are modified (3AA-16) by hyphens to represent residues. We do not recommend use of the symbols to represent free amino acids in textual material, but such use may be desirable in tables, diagrams or figures. It may also be convenient to use them for indicating residue numbers, e.g. Tyr-110 for tyrosine residue 110. For substituents, supplementary symbols are used (3AA-17 and 3AA-18).
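
A residue designation such as Tyr-110 can be separated into its symbol and number mechanically. The sketch below assumes the form shown in 14.2 (a plain three-letter symbol, a hyphen, and a decimal residue number); the function name is hypothetical.

```python
import re

# A three-letter symbol, a hyphen, then the residue number (e.g. Tyr-110).
_RESIDUE_REF = re.compile(r"([A-Z][a-z]{2})-(\d+)")


def parse_residue_reference(text: str):
    """Split a designation such as 'Tyr-110' into ('Tyr', 110)."""
    match = _RESIDUE_REF.fullmatch(text)
    if match is None:
        raise ValueError(f"not a residue designation: {text!r}")
    return match.group(1), int(match.group(2))
```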

14.3. A symbol may represent either the name or the formula of a compound.

14.4. Heteroatoms of amino-acid residues (e.g. O-3 of serine, N-6 of lysine) do not explicitly appear in the symbol, as it represents the whole molecule including them (but see 3AA-17.4).

14.5. Amino-acid symbols denote the L configuration of chiral amino acids unless otherwise indicated by the presence of D or DL before the symbol and separated from it with a hyphen (see also 3AA-19.2). L may similarly be inserted for emphasis.
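
The default stated in 14.5 can be sketched as follows. This is an illustrative Python fragment with a hypothetical function name; it reads an optional D-, DL- or L- prefix and otherwise assumes L. It makes no attempt to recognize achiral amino acids such as glycine, for which the returned configuration has no meaning.

```python
from typing import Tuple


def split_configuration(symbol: str) -> Tuple[str, str]:
    """Return (configuration, bare symbol); L is assumed when no prefix is present."""
    # Test "DL" before "D" so that "DL-Leu" is not misread.
    for prefix in ("DL", "D", "L"):
        if symbol.startswith(prefix + "-"):
            return prefix, symbol[len(prefix) + 1:]
    return "L", symbol
```

Thus split_configuration("D-Ala") returns ("D", "Ala"), and split_configuration("Ala") returns ("L", "Ala").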

14.6. Structural formulas may be used together with symbols to make complicated features or reactions clear (for examples see 3AA-17.4).

3AA-15. SYMBOLS FOR AMINO ACIDS

3AA-15.1. Symbols for Common Amino Acids

The symbols for the amino acids that are coded for by mRNA are listed in column 2 of Table 1.

3AA-15.2. Symbols for Less Common Peptide Constituents

Symbols for less common amino acids should be defined in each publication in which they appear. See Addendum and JCBN/NC-IUBMB Newsletter 1999 for selenocysteine. The following principles and notations are recommended.

15.2.1. Hydroxyamino Acids

The symbol 5Hyl is recommended for 5-hydroxylysine, and 4Hyp for 4-hydroxyproline (the numbers may be omitted, especially when limiting the symbols to three letters helps alignment of sequences, provided that the position of substitution is made clear in the text). Similarly 3Hyp would represent 3-hydroxyproline. Alternatively, symbols may be formed as shown in 3AA-17.3 below for substituted residues, so that 4-hydroxyproline may be written as:

Pro(4-OH) or

15.2.2. Alloisoleucine and Allothreonine

Alloisoleucine and allothreonine (3AA-4.4) may be symbolized by aIle and aThr respectively.

15.2.3. 'Nor' Amino Acids

Since 'nor' in 'norvaline' and 'norleucine' is not used in its systematic sense of denoting a lower homologue, but to change the trivial name of a branched-chain compound to designate a straight-chain compound, its use for amino acids should be progressively abandoned (3AA-2.4), along with the earlier symbols Nva and Nle. Appropriate symbols for these compounds, 2-aminopentanoic and 2-aminohexanoic acids, based on symbols proposed for the unsubstituted acids [19], are Ape and Ahx (see also 3AA-15.2.5).

15.2.4. 'Homo' Amino Acids

The prefix 'homo', used in the sense of a higher homologue, is commonly used for two amino acids (3AA-2.3). They are symbolized as follows:

   Homoserine     Hse
   Homocysteine   Hcy

15.2.5. Higher Unbranched Amino Acids

The functional prefix 'amino' is included in the symbol as the letter 'A' and 'diamino' as 'A2'. The trivial name of the parent acid is abbreviated to two letters, based, when possible, on the symbols for lipid nomenclature [19]. Unless otherwise indicated single groups are on C-2, two amino groups are in the 2 and terminal positions for monocarboxylic acids, and each is geminal with a carboxyl group for dicarboxylic acids. The location of amino groups other than these is shown by appropriate prefixes.

   Examples                                                         Symbol         Note
   β-Alanine (3-aminopropanoic acid)                                βAla
   2-Aminobutyric acid (2-aminobutanoic acid)                       Abu
   2-Aminopentanoic acid (2-aminovaleric acid)                      Ape
   2-Aminohexanoic acid                                             Ahx
   6-Aminohexanoic acid                                             εAhx           i
   2-Aminoadipic acid (2-aminohexanedioic acid)                     Aad
   3-Aminoadipic acid (3-aminohexanedioic acid)                     βAad
   2-Aminopimelic acid (2-aminoheptanedioic acid)                   Apm
   2,3-Diaminopropionic acid (2,3-diaminopropanoic acid)            A2pr or Dpr    ii, iii, iv
   2,4-Diaminobutyric acid (2,4-diaminobutanoic acid)               A2bu or Dab    ii
   Ornithine (2,5-diaminovaleric acid, 2,5-diaminopentanoic acid)   Orn
   2,6-Diaminopimelic acid (2,6-diaminoheptanedioic acid)           A2pm or Dpm    ii, iii

Notes

(i) This symbol is recommended in place of the previous εAcp, in which 'cp' stood for caproic, which may be confused with capric and caprylic.

(ii) The previous edition of these recommendations [10] discouraged abbreviations starting 'D' for 'di' or 'T' for 'tri' or 'tetra', because these letters were overused. We concur in preferring subscripts when these can be applied to well-known symbols, so that Me2SO is preferable to DMSO, Me3Si- to TMS-, and H4 to TH. Nevertheless we are not convinced that 'A2' easily suggests 'diamino', so alternative symbols are presented.

(iii) 'Dap' should not be used as a symbol, since it could be construed to mean either diaminopropanoic acid or diaminopimelic acid.

(iv) 2,3-Diaminopropanoic acid can be regarded as 3-aminoalanine, and so may be symbolized by 'side-chain substitution' (3AA-17.3 below) as Ala(NH2) or , but users should beware of the possibility that the former may be confused with Ala-NH2 (3AA-17.1), the symbol for alaninamide.
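
The construction described in 15.2.5 ('A' or 'A2', followed by a two-letter code for the parent acid, with an optional locant prefix) can be sketched in Python as below. The function name and the dictionary are hypothetical; the two-letter codes are simply those visible in the symbols of the table above. The sketch writes the '2' of 'A2' on the line, and it does not reproduce the symbols βAla and Orn, which are based on trivial names, nor the alternative 'D' symbols of note (ii).

```python
# Two-letter parent-acid codes, read off the symbols listed in 15.2.5.
PARENT_CODES = {
    "propanoic": "pr",
    "butanoic": "bu",
    "pentanoic": "pe",
    "hexanoic": "hx",
    "hexanedioic": "ad",   # adipic acid
    "heptanedioic": "pm",  # pimelic acid
}


def unbranched_symbol(parent: str, amino_groups: int = 1, locant: str = "") -> str:
    """Build a symbol such as 'Ahx', 'εAhx' or 'A2pr' from its parts."""
    if amino_groups not in (1, 2):
        raise ValueError("only amino (1) and diamino (2) are covered")
    prefix = "A" if amino_groups == 1 else "A2"
    # The locant is given only when the amino group is not in the default position.
    return f"{locant}{prefix}{PARENT_CODES[parent]}"
```

For example, unbranched_symbol("hexanoic", locant="ε") gives "εAhx" for 6-aminohexanoic acid, and unbranched_symbol("propanoic", 2) gives "A2pr" for 2,3-diaminopropanoic acid.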

15.2.6. Carboxylated and Oxidized Amino Acids

Symbols are recommended for two amino acids that have an additional acidic group and may occur in polypeptide sequences. They are:

  4-Carboxyglutamic acid   Gla
  Cysteic acid             Cya

15.2.7. Non-Amino-Acid Residues in Peptides

Symbols for sugar residues (e.g. Glc, Gal) have been proposed [22], as have ones for nucleoside residues (e.g. Ado, Cyd) [23], and these may be combined with amino-acid symbols to represent glycopeptides, etc. These symbols include [22] Neu for neuraminic acid, Neu5Ac for N-acetylneuraminic acid, and Mur for muramic acid. Depsipeptides (3AA-19.6) contain hydroxyacid residues; when symbols are used for these they should be defined.

3AA-16. SYMBOLISM OF AMINO-ACID RESIDUES

3AA-16.1. General Principles for Symbolizing Residues

The peptide glycylglycylglycine is symbolized as Gly-Gly-Gly. This involves modifying the symbol Gly for glycine, NH2-CH2-COOH, by adding hyphens to it, in three ways:

(i) Gly- = NH2-CH2-CO- (normally as NH3+-CH2-CO-)
(ii) -Gly = -NH-CH2-COOH (normally as -NH-CH2-COO-)
(iii) -Gly- = -NH-CH2-CO-

Thus the hyphen, which represents the peptide bond, removes OH from the 1-carboxyl group of the amino acid (written in the conventional un-ionized form) when it is placed on the right of the symbol (i), and removes H from the 2-amino group of the amino acid when it is placed on the left of the symbol (ii); both modifications can apply to one symbol (iii).

Thus the peptide Gly-Glu (without hyphens at its ends) is distinguished from the sequence -Gly-Glu- (with hyphens at its ends).
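
The bookkeeping implied by the hyphens can be checked with a short calculation. The Python sketch below is illustrative only: the function name and table are hypothetical, the table holds just the two amino acids used in the examples above (un-ionized forms), and only plain symbols joined by hyphens are handled, with no configuration prefixes or side-chain lines.

```python
from collections import Counter

# Elemental composition of the free, un-ionized amino acids.
FREE_ACID = {
    "Gly": Counter(C=2, H=5, N=1, O=2),  # NH2-CH2-COOH
    "Glu": Counter(C=5, H=9, N=1, O=4),
}


def composition(sequence: str) -> Counter:
    """Sum the atoms of a hyphenated symbol string such as 'Gly-Gly-Gly'."""
    leading = sequence.startswith("-")
    trailing = sequence.endswith("-")
    symbols = sequence.strip("-").split("-")
    total = Counter()
    for i, symbol in enumerate(symbols):
        atoms = Counter(FREE_ACID[symbol])
        if i > 0 or leading:
            # Hyphen on the left: H removed from the 2-amino group.
            atoms["H"] -= 1
        if i < len(symbols) - 1 or trailing:
            # Hyphen on the right: OH removed from the 1-carboxyl group.
            atoms["H"] -= 1
            atoms["O"] -= 1
        total.update(atoms)
    return total
```

With this sketch, composition("Gly-Gly-Gly") corresponds to C6H11N3O4, and the sequence -Gly-Glu- differs from the peptide Gly-Glu by one H and one OH.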

3AA-16.2. Lack of Hydrogen on the 2-Amino Group

A hyphen on the left of the symbol signifies removal of a hydrogen atom from the 2-amino group, as well as representing the bond formed by the group thus produced. If it should prove necessary to draw a bond to N-2 on the right of the symbol (e.g. in a cyclic peptide, 3AA-19.4 below), then the hyphen must be replaced by an arrow, which points from CO to NH within the peptide bond.

If both atoms on N-2 are replaced, two lines can be drawn on the left of the symbol, e.g.

3AA-16.3. Lack of Hydroxyl on the 1-Carboxyl Group

A hyphen on the right of the symbol signifies removal of hydroxyl from the 1-carboxyl group as well as representing the bond formed by the group produced. If it is not possible to draw this bond on the right of the symbol, as in a cyclic peptide (3AA-19.4) then the hyphen must be replaced by an arrow, which has the same effect.

3AA-16.4. Removal of Groups from Side Chains

16.4.1. Monocarboxylic Acids

A vertical line drawn above or below the symbol for a monocarboxylic amino acid represents removal of hydrogen from the side chain so that a radical is formed. Replacement of this hydrogen by a substituent is treated in 3AA-17.2 below. Unless indicated by a locant placed beside the line, the hydrogen is assumed to be removed from a heteroatom in the residue. Examples:

Notes. (a) H is removed from N-ω rather than N-δ of arginine unless otherwise indicated; (b) a locant, π or τ (3AA-2.2.4), is always required for histidine.

16.4.2. Dicarboxylic Acids

A vertical line drawn above or below either of the symbols Asp and Glu represents removal of OH from the side-chain carboxyl group, as well as representing a bond to a substituent. If a hydrogen has to be removed from a saturated carbon of the side chain, then a vertical line may be used, but it must be accompanied by a locant. Examples:

3AA-16.5. Cyclic Derivatives of Amino Acids

Combination of horizontal lines, indicating removal of H from N-2 (3AA-16.1, 3AA-16.2) or OH from C-1 (3AA-16.1, 3AA-16.3), with the vertical lines that indicate removal of side-chain atoms (3AA-16.4) allows formation of symbols for 5-oxoproline (systematically 5-oxopyrrolidine-2-carboxylic acid, also known as pyroglutamic acid or pyrrolidonecarboxylic acid) and for homoserine lactone, as follows:

See the Addendum for cyclic amides formed by elimination of water between a β-carboxyl group and an α-NH group.


References

7. International Union of Biochemistry (1978) Biochemical Nomenclature and Related Documents, The Biochemical Society, London.

10. IUPAC-IUB Commission on Biochemical Nomenclature (CBN), Symbols for Amino-Acid Derivatives and Peptides, Recommendations 1971, Arch. Biochem. Biophys. 150, 1-8 (1972); Biochem. J. 126, 773-780 (1972), corrected 135, 9 (1973); Biochemistry 11, 1726-1732 (1972); Biochim. Biophys. Acta, 263, 205-212 (1972); Eur. J. Biochem. 27, 201-207 (1972), corrected 45, 2 (1974); J. Biol. Chem. 247, 977-983 (1972); Pure Appl. Chem. 40, 315-331 (1974); also pp. 78-84 in [7].

19. IUPAC-IUB Commission on Biochemical Nomenclature (CBN), The Nomenclature of Lipids, Recommendations 1976, Biochem. J. 171, 21-35 (1978); Eur. J. Biochem. 79, 11-21 (1977); Hoppe-Seyler's Z. Physiol. Chem. 358, 617-631 (1977); Lipids, 12, 455-468 (1977); also pp. 122-132 in [7].

22. IUPAC-IUB Joint Commission on Biochemical Nomenclature (JCBN). Abbreviated Terminology of Oligosaccharide Chains, Recommendations 1980, Eur. J. Biochem. 126, 433-437 (1982); J. Biol. Chem. 257, 3347-3351 (1982); Pure Appl. Chem. 54, 1517-1522 (1982).

23. IUPAC-IUB Commission on Biochemical Nomenclature (CBN), Abbreviations and Symbols for Nucleic Acids, Polynucleotides and their Constituents, Recommendations 1970, Arch. Biochem. Biophys. 145, 425-436 (1971); Biochem. J. 120, 449-454 (1970); Biochemistry, 9, 4022-4027 (1970); Biochim. Biophys. Acta, 247, 1-12 (1971); Eur. J. Biochem. 15, 203-208 (1970), corrected 25, 1 (1972); Hoppe-Seyler's Z. Physiol. Chem. 351, 1055-1063 (1970) (in German); J. Biol. Chem. 245, 5171-5176 (1970); Mol. Biol. 6, 166-174 (1972) (in Russian); Pure Appl. Chem. 40, 277-290 (1974); also pp. 116-121 in [7].