FASTQ sequence format

From BioPerl


Description

FASTQ is a file format used frequently at the Wellcome Trust Sanger Institute to bundle a FASTA sequence together with its quality data.

In a FASTQ file, the sequence and its quality data each occupy a single line, and each quality value is encoded as a single byte. To recover the decimal quality values, subtract 33 (octal 41) from the ASCII code of each byte; the results can then be written out as integers in a '2 digit + 1 space' layout. You can check that 33 is the right offset against the example below: its first quality byte is '!' (ASCII 33), which corresponds to a quality value of 0.
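The decoding step can be illustrated with a minimal Python sketch. The function names decode_qualities and to_qual_text are hypothetical and are not part of BioPerl, and the reading of '2 digit + 1 space' as right-aligned two-character fields separated by single spaces is an assumption. The input is only the visible start of the quality line from the example on this page.

```python
def decode_qualities(quality_line, offset=33):
    """Return integer quality values for a FASTQ quality string.

    Each character encodes one value as its ASCII code minus the offset
    (33 for the encoding described on this page).
    """
    return [ord(ch) - offset for ch in quality_line]


def to_qual_text(values):
    """Format values as two-character fields separated by one space.

    Assumption: this is one plausible reading of '2 digit + 1 space'.
    """
    return " ".join(f"{v:2d}" for v in values)


# Visible start of the example quality line (the full line is truncated).
prefix = "!)))))****"
scores = decode_qualities(prefix)
print(scores)                # [0, 8, 8, 8, 8, 8, 9, 9, 9, 9]
print(to_qual_text(scores))
```

The leading '!' decodes to 0, which is the offset check described above.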

This file format can be parsed by the Bio::SeqIO system using the Bio::SeqIO::fastq module.

See also the Qual format.

Example

@NCYC361-11a03.q1k bases 1 to 1576
GCGTGCCCGAAAAAATGCTTTTGGAGCCGCGCGTGAAAT...
+NCYC361-11a03.q1k bases 1 to 1576
!)))))****(((***%%((((*(((+,**(((+**+,-...
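The four-line layout of the record above can be read with a short Python sketch. It is only an illustration of the layout and not how Bio::SeqIO::fastq is implemented; the function name parse_fastq_record and the returned dictionary keys are hypothetical. Because the example lines are truncated, the sketch uses only their first ten characters.

```python
def parse_fastq_record(lines):
    """Parse one four-line FASTQ record into a dictionary.

    Assumes exactly four lines: '@' header, sequence, '+' separator,
    and a quality line of the same length as the sequence.
    """
    header, sequence, separator, quality = (ln.rstrip("\n") for ln in lines)
    if not header.startswith("@"):
        raise ValueError("header line must start with '@'")
    if not separator.startswith("+"):
        raise ValueError("separator line must start with '+'")
    if len(sequence) != len(quality):
        raise ValueError("sequence and quality lengths differ")
    # Split the header into an identifier and an optional description.
    parts = header[1:].split(None, 1)
    return {
        "id": parts[0],
        "description": parts[1] if len(parts) > 1 else "",
        "sequence": sequence,
        "quality": quality,
    }


# First ten characters of the truncated example lines above.
record = parse_fastq_record([
    "@NCYC361-11a03.q1k bases 1 to 1576\n",
    "GCGTGCCCGA\n",
    "+NCYC361-11a03.q1k bases 1 to 1576\n",
    "!)))))****\n",
])
print(record["id"])           # NCYC361-11a03.q1k
print(record["description"])  # bases 1 to 1576
```

Checking that the sequence and quality lines have equal length is a cheap way to catch a truncated or misaligned record.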