FASTA alignment program
This entry refers to the FASTA alignment program [1, 2]. It produces output that can be parsed in BioPerl by Bio::SearchIO. There is also a FASTA sequence format, which is the sequence file format initially designed as input to these tools. A simple extension of the sequence format gives the FASTA multiple alignment format, which is different from the database search result format output by the FASTA applications.
FASTA is Bill Pearson's package for sequence database searching.
(Wanted: someone to add some history of FASTA here)
Tips and Hints
BioPerl can parse both the default output and the -m 9 output, which is much more compact and produces smaller files because alignments are not included. If all you need from SSEARCH or FASTA are the E-values, you can use the following options to produce a small tab-delimited file with the fastam9_to_table.PLS script:
fasta34 -H -E 1e-5 -m 9 -d 0 QueryFile SearchDatabase | fastam9_to_table > results.tab
This keeps the output file small, which reduces disk usage and may speed up your analysis.
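If a stricter cutoff is needed later, the table can be filtered again without rerunning the search. The following is a minimal sketch and not part of BioPerl or FASTA: filter_by_evalue is a hypothetical helper name, and it assumes that the E-value is in column 11 of the tab-delimited file (as in BLAST-style tabular output). Check the column order of your own results.tab and pass a different column number if needed.

```shell
# Keep only rows whose E-value is at or below a threshold.
# Usage: filter_by_evalue FILE [THRESHOLD] [COLUMN]
# COLUMN defaults to 11; this default is an assumption about the table layout.
filter_by_evalue() {
    file=$1
    threshold=${2:-1e-5}
    column=${3:-11}
    awk -F '\t' -v thr="$threshold" -v col="$column" '
        /^#/ { next }                                  # skip comment lines
        NF >= col && ($col + 0) <= (thr + 0) { print } # force numeric comparison
    ' "$file"
}
```

For example, filter_by_evalue results.tab 1e-10 > strict.tab would keep only the rows whose E-value is 1e-10 or smaller.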
The release notes describe how to search a sequence profile against a database using the Smith-Waterman (SW) algorithm:
>>June 16, 2003 version: fasta34t22
ssearch34 now supports PSI-BLAST PSSM/profiles. Currently, it only supports the "checkpoint" file produced by blastall, and only on certain architectures where byte-reordering is unnecessary. It has not been tested extensively with the -S option.
ssearch34 -P blast.ckpt -f -11 -g -1 -s BL62 query.aa library
will use the frequency information in the blast.ckpt file to do a position-specific scoring matrix (PSSM) search using the Smith-Waterman algorithm. Because ssearch34 calculates scores for each of the sequences in the database, we anticipate that PSSM ssearch34 statistics will be more reliable than PSI-Blast statistics.
The Blast checkpoint file is mostly double-precision frequency numbers, which are represented in a machine-specific way. Thus, you must generate the checkpoint file on the same machine on which you run ssearch34 or prss34 -P query.ckpt. To generate a checkpoint file, run:
blastpgp -j 2 -h 1e-6 -i query.fa -d swissprot -C query.ckpt -o /dev/null
(This searches swissprot for 2 iterations ("-j 2") using an E() threshold of 1e-6, saving the resulting position-specific frequencies in query.ckpt. Note that the original query.fa and query.ckpt must match.)
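The two steps from the release notes can be chained in a small wrapper, which makes it harder to pair a checkpoint file with the wrong query. This is only a sketch under assumptions: pssm_search and run_or_print are hypothetical names, the checkpoint file name is derived from the query file name, and the flags are copied from the commands above rather than tuned. With DRY_RUN=1 it prints the commands instead of running them.

```shell
# Print the command when DRY_RUN=1, otherwise execute it.
run_or_print() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Usage: pssm_search QUERY BLASTDB LIBRARY
# Builds QUERY's checkpoint with blastpgp, then searches LIBRARY with ssearch34.
pssm_search() {
    query=$1      # protein query in FASTA format
    blastdb=$2    # BLAST database used to build the profile, e.g. swissprot
    library=$3    # sequence library searched by ssearch34
    ckpt="${query%.*}.ckpt"   # query.fa -> query.ckpt (naming is an assumption)

    # Step 1: two PSI-BLAST iterations, saving the frequencies to the checkpoint.
    run_or_print blastpgp -j 2 -h 1e-6 -i "$query" -d "$blastdb" -C "$ckpt" -o /dev/null || return 1
    # Step 2: Smith-Waterman PSSM search with the same query and its checkpoint.
    run_or_print ssearch34 -P "$ckpt" -f -11 -g -1 -s BL62 "$query" "$library"
}
```

Running DRY_RUN=1 pssm_search query.fa swissprot library first is a cheap way to inspect the two command lines. Both steps should run on the same machine, because the checkpoint file is machine specific.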