BioPerl Alphabets

BioPerl alphabets

BioPerl modules use the standard extended single-letter genetic alphabets to represent nucleotide and amino acid sequences.

In addition to the standard alphabet, the following symbols are also acceptable in a biosequence:

Symbol  Meaning
?       a missing nucleotide or amino acid
-       gap in sequence
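
The two extra symbols can be handled with ordinary string operations. The following is a minimal Python sketch, independent of BioPerl; the helper names `ungap` and `count_missing` are hypothetical and chosen only for illustration.

```python
# Illustrative helpers only; these names are not part of BioPerl.
GAP = "-"       # gap in sequence
MISSING = "?"   # a missing nucleotide or amino acid


def ungap(seq: str) -> str:
    """Return the sequence with every gap symbol removed."""
    return seq.replace(GAP, "")


def count_missing(seq: str) -> int:
    """Count the positions marked as missing."""
    return seq.count(MISSING)
```

For example, `ungap("AC--GT?A")` drops the two gap characters and keeps the `?`, which `count_missing` would then report as a single missing position.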

Extended DNA / RNA alphabet

Symbol  Meaning           Nucleic Acid
A       A                 Adenine
C       C                 Cytosine
G       G                 Guanine
T       T                 Thymine
U       U                 Uracil
M       A or C            aMino
R       A or G            puRine
W       A or T            Weak
S       C or G            Strong
Y       C or T            pYrimidine
K       G or T            Keto
V       A or C or G       not T (V)
H       A or C or T       not G (H)
D       A or G or T       not C (D)
B       C or G or T       not A (B)
X       G or A or T or C  any (not recommended)
N       G or A or T or C  aNy

IUPAC-IUB SYMBOLS FOR NUCLEOTIDE NOMENCLATURE:
Cornish-Bowden (1985) Nucl. Acids Res. 13: 3021-3030.
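
One common use of the extended alphabet is to turn an ambiguous motif into a pattern that matches every concrete sequence it stands for. The sketch below is plain Python, not BioPerl code; the dictionary `IUPAC_DNA` simply transcribes the table above, and the function name `iupac_to_regex` is an assumption made for this example.

```python
import re

# Transcription of the table above: symbol -> bases it may represent.
IUPAC_DNA = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "U",
    "M": "AC", "R": "AG", "W": "AT", "S": "CG", "Y": "CT", "K": "GT",
    "V": "ACG", "H": "ACT", "D": "AGT", "B": "CGT",
    "X": "ACGT", "N": "ACGT",
}


def iupac_to_regex(pattern: str) -> str:
    """Translate an IUPAC nucleotide pattern into a regular expression.

    Hypothetical helper for illustration. Symbols outside the table
    (including '?' and '-') raise KeyError.
    """
    parts = []
    for symbol in pattern.upper():
        bases = IUPAC_DNA[symbol]
        # Unambiguous symbols stay literal; ambiguous ones become a character class.
        parts.append(bases if len(bases) == 1 else f"[{bases}]")
    return "".join(parts)


def matches(pattern: str, sequence: str) -> bool:
    """True if the whole concrete sequence is consistent with the IUPAC pattern."""
    return re.fullmatch(iupac_to_regex(pattern), sequence.upper()) is not None
```

Under these assumptions, the pattern `GAYW` becomes the expression `GA[CT][AT]`, so `GACT` is accepted and `GAGT` is rejected.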

Amino Acid alphabet

Note that every letter of the alphabet is now used in the amino acid code.

Symbol  Meaning
A       Alanine
B       Aspartic Acid or Asparagine
C       Cysteine
D       Aspartic Acid
E       Glutamic Acid
F       Phenylalanine
G       Glycine
H       Histidine
I       Isoleucine
J       Leucine or Isoleucine
K       Lysine
L       Leucine
M       Methionine
N       Asparagine
O       Pyrrolysine
P       Proline
Q       Glutamine
R       Arginine
S       Serine
T       Threonine
U       Selenocysteine
V       Valine
W       Tryptophan
X       Unknown
Y       Tyrosine
Z       Glutamic Acid or Glutamine
*       Terminator

IUPAC-IUB AMINO ACID SYMBOLS:
Biochem J. 1984 Apr 15; 219(2): 345-373
Eur J Biochem. 1993 Apr 1; 213(1): 2
G. Srinivasan, C. M. James, J. A. Krzycki. Pyrrolysine encoded by UAG in Archaea: charging of a UAG-decoding specialized tRNA. Science 2002, 296:1459-1462.
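
The ambiguous amino acid symbols B, J and Z can be checked against concrete residues in the same spirit. This is a small Python sketch rather than BioPerl code; the names `AMBIGUOUS_AA` and `residue_compatible` are hypothetical, and treating X (Unknown) as compatible with any symbol is an assumption of this example, not something the table states.

```python
# Ambiguity codes from the amino acid table: symbol -> residues it may represent.
AMBIGUOUS_AA = {
    "B": {"D", "N"},  # Aspartic Acid or Asparagine
    "J": {"L", "I"},  # Leucine or Isoleucine
    "Z": {"E", "Q"},  # Glutamic Acid or Glutamine
}


def residue_compatible(code: str, residue: str) -> bool:
    """True if `residue` is one of the amino acids that `code` can stand for.

    Hypothetical helper for illustration only.
    """
    code, residue = code.upper(), residue.upper()
    if code == "X":
        # Assumption: X (Unknown) is treated as compatible with any symbol.
        return True
    # Unambiguous codes are compatible only with themselves.
    return residue in AMBIGUOUS_AA.get(code, {code})
```

With this mapping, B is compatible with N but not with E, and an unambiguous code such as A is compatible only with itself.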