FASTQ sequence format
Description
A file format used frequently at the Wellcome Trust Sanger Institute to bundle a FASTA sequence and its quality data.
FASTQ files have the sequence and the quality data each on a single line, and the quality values are single-byte encoded. For the standard Sanger version of FASTQ, to retrieve the decimal values for qualities you need to subtract 33 (octal 41) from each byte and then convert the result to a '2 digit + 1 space' integer.
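The subtraction described above can be sketched in a few lines of Python. This is a minimal illustration, not part of BioPerl; the function names are hypothetical, and the '2 digit + 1 space' layout is interpreted here as each score right-aligned in two columns followed by one space.

```python
SANGER_OFFSET = 33  # decimal 33 == octal 41 == ASCII '!'


def decode_sanger_qualities(quality_line):
    """Return the decimal quality value for each byte of a Sanger FASTQ quality line."""
    return [ord(ch) - SANGER_OFFSET for ch in quality_line]


def format_qualities(scores):
    """Render scores as '2 digit + 1 space' integers (assumed layout, see lead-in)."""
    return "".join(f"{score:2d} " for score in scores)


if __name__ == "__main__":
    # First ten quality characters of the example record on this page.
    scores = decode_sanger_qualities("!)))))****")
    print(scores)                    # [0, 8, 8, 8, 8, 8, 9, 9, 9, 9]
    print(format_qualities(scores))
```

Because the offset is 33, the lowest quality (0) is written as '!', the first printable ASCII character after the space.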
The original FASTQ file format can be parsed by the Bio::SeqIO system using the Bio::SeqIO::fastq module. The most recent version (BioPerl v. 1.6.1) can also parse the Solexa and Illumina variations of FASTQ [fastq_paper] and interconvert them if the proper variants are designated (either 'sanger', 'illumina', or 'solexa').
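To show what interconverting the variants involves, here is a rough Python sketch that rewrites a quality line into the Sanger encoding. It is not the Bio::SeqIO::fastq implementation, and the function names are hypothetical. It assumes the encodings described in the referenced FASTQ paper: 'sanger' stores Phred scores with an ASCII offset of 33, 'illumina' (1.3+) stores Phred scores with an offset of 64, and 'solexa' stores Solexa-scaled scores with an offset of 64.

```python
import math

# ASCII offsets per variant (assumption: as described in the FASTQ paper).
OFFSETS = {"sanger": 33, "illumina": 64, "solexa": 64}


def solexa_to_phred(solexa_score):
    """Map a Solexa-scaled score to the nearest Phred score."""
    return int(round(10 * math.log10(10 ** (solexa_score / 10) + 1)))


def to_sanger(quality_line, variant):
    """Re-encode a quality line from the given variant into Sanger (offset 33)."""
    if variant not in OFFSETS:
        raise ValueError(f"unknown FASTQ variant: {variant!r}")
    scores = [ord(ch) - OFFSETS[variant] for ch in quality_line]
    if variant == "solexa":
        scores = [solexa_to_phred(q) for q in scores]
    return "".join(chr(q + OFFSETS["sanger"]) for q in scores)


if __name__ == "__main__":
    print(to_sanger("hhh", "illumina"))  # III
```

The variant has to be named explicitly because the Illumina and Solexa encodings overlap in their character ranges, so the encoding cannot always be inferred from the data alone.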
See also:
- the Qual format.
- Sanger FASTQ format and the Solexa/Illumina variants
- FASTQ on Wikipedia
- MAQ page on FASTQ format
Example
@NCYC361-11a03.q1k bases 1 to 1576
GCGTGCCCGAAAAAATGCTTTTGGAGCCGCGCGTGAAAT...
+NCYC361-11a03.q1k bases 1 to 1576
!)))))****(((***%%((((*(((+,**(((+**+,-...
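A record laid out like the one above can be read four lines at a time. The following Python sketch is only an illustration of that layout, not the Bio::SeqIO parser. It assumes the sequence and the quality string each occupy exactly one line, uses the Sanger offset of 33, and the record in the usage section is made up for the demonstration.

```python
def parse_fastq(text):
    """Parse four-line FASTQ records from a string into a list of dicts."""
    lines = [line for line in text.splitlines() if line.strip()]
    if len(lines) % 4 != 0:
        raise ValueError("expected four lines per FASTQ record")
    records = []
    for i in range(0, len(lines), 4):
        header, sequence, plus, quality = lines[i:i + 4]
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed record starting at line {i + 1}")
        if len(sequence) != len(quality):
            raise ValueError("sequence and quality lengths differ")
        # The header is '@', then an identifier, then an optional description.
        parts = header[1:].split(None, 1)
        records.append({
            "id": parts[0] if parts else "",
            "description": parts[1] if len(parts) > 1 else "",
            "sequence": sequence,
            "qualities": [ord(ch) - 33 for ch in quality],  # Sanger offset
        })
    return records


if __name__ == "__main__":
    # Hypothetical record, not taken from a real data set.
    example = "@read1 bases 1 to 4\nACGT\n+read1 bases 1 to 4\n!)*+\n"
    for record in parse_fastq(example):
        print(record["id"], record["sequence"], record["qualities"])
        # read1 ACGT [0, 8, 9, 10]
```

Reading in fixed groups of four lines avoids mistaking a quality line that happens to begin with '@' for the start of a new record.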
See also
- Bio::Seq::Quality
- An example of creating FASTQ from FASTA and QUAL has been scrapbooked here.
References
<biblio>
- fastq_paper pmid=20015970
</biblio>