seg is a program for filtering low-complexity regions in amino acid sequences [1, 2]. Residues that have been masked are represented as "X" in an alignment. seg can also be run as a standalone program. The module Bio::Tools::Run::Seg can assist.
It is useful to run seg on sequences before running FASTA, so that FASTA's -S option can be used to mask the low-complexity portions during the search phase while still allowing them to participate in the final alignment. An example of running seg on a FASTA-format sequence database is shown here:
$ seg database -q -z 1 > database.seg
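To illustrate the "X" representation of masked residues, here is a minimal Python sketch. It assumes that the masked file marks low-complexity residues in lower case, which is an assumption about the output of the command above and not something this page states. The function name `mask_lowercase` and the example record are hypothetical.

```python
def mask_lowercase(fasta_text: str, mask_char: str = "X") -> str:
    """Replace lower-case residues with mask_char in FASTA-formatted text.

    Assumption: low-complexity residues are marked in lower case.
    Header lines (starting with '>') are left untouched.
    """
    masked_lines = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            # Header line: keep as-is, it may legitimately contain lower case
            masked_lines.append(line)
        else:
            # Sequence line: swap every lower-case character for the mask
            masked_lines.append(
                "".join(mask_char if c.islower() else c for c in line)
            )
    return "\n".join(masked_lines)


if __name__ == "__main__":
    # Hypothetical record with a lower-case (masked) run of glutamines
    example = ">seq1 hypothetical example\nMKVLAAqqqqqqqqqqGHTRE"
    print(mask_lowercase(example))
```

Keeping the lower-case form on disk and converting to "X" only when needed preserves the original residues, so they remain available for the final alignment.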
seg can be found here
- Wootton JC and Federhen S. Analysis of compositionally biased regions in sequence databases. Methods Enzymol. 1996;266:554-71.
- Wootton JC and Federhen S. Statistics of local complexity in amino acid sequences and sequence databases. Computers & Chemistry. 1993;17:149-163.