Module:Bio::SearchIO::blastxml
From BioPerl
| Pdoc documentation: Bio::SearchIO::blastxml | CPAN documentation: Bio::SearchIO::blastxml |
|---|
This is a Bio::SearchIO plugin module which parses BLAST XML output. This module requires XML::SAX and one of the following backend parsers:
- XML::SAX::ExpatXS - SAX2-compliant, XS-based interface to expat
- XML::LibXML - comes with a SAX2-compliant module (XML::LibXML::SAX)
- XML::SAX::PurePerl - if you have installed XML::SAX, you already have this module.
Of the above, XML::SAX::ExpatXS is by far the fastest; the default XML::SAX::PurePerl parser is the slowest.
Note
- BioPerl currently requires the latest version of XML::SAX (v 0.15). Older versions have a known bug (Issue #2159) in XML::SAX::PurePerl which affects encoded characters; this does not cause XML parsing to fail but leads to other errors.
- Only the latest version of XML::SAX::Expat (v. 0.38) works; it fixes a bug which required external entity validation (not possible with BLAST XML). Switching to XML::SAX::ExpatXS, which is actively maintained, is highly recommended.
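To see how a given installation compares with the notes above, a short script along these lines may help. It is only a sketch: it prints the installed XML::SAX version, lists the backends that XML::SAX currently has registered, and tries to load each of the three backends named on this page. The output format is arbitrary.

```perl
use strict;
use warnings;
use XML::SAX;

# Installed XML::SAX version, for comparison with the version named in the notes.
print "XML::SAX version: ", XML::SAX->VERSION, "\n";

# parsers() returns an array reference of hashes, one per registered backend.
for my $parser (@{ XML::SAX->parsers }) {
    print "registered: $parser->{Name}\n";
}

# Check whether each backend mentioned on this page can be loaded.
for my $backend (qw(XML::SAX::ExpatXS XML::LibXML::SAX XML::SAX::PurePerl)) {
    my $ok = eval "require $backend; 1";
    printf "%-20s %s\n", $backend, $ok ? 'available' : 'not installed';
}
```

A backend can be installed without being registered, which is why the script reports the two lists separately.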
If you are interested in speed-testing a particular SAX2-based XML::SAX backend, you can set $XML::SAX::ParserPackage to that backend before creating the parser, as in the following script:
    use Bio::SearchIO;

    # Force a specific SAX backend
    $XML::SAX::ParserPackage = 'XML::SAX::PurePerl';

    my $parser = Bio::SearchIO->new(-verbose => 1,
                                    -format  => 'blastxml',
                                    -file    => shift);
    my $ct = 0;
    while (my $res = $parser->next_result) {
        # do whatever here
    }
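The script above selects a backend but does not measure elapsed time itself. One possible way to compare backends is sketched below using the core Time::HiRes module. The helper name `time_backend` is hypothetical, and the sketch assumes each new Bio::SearchIO object picks up whatever `$XML::SAX::ParserPackage` holds when it is created.

```perl
use strict;
use warnings;
use Time::HiRes qw(gettimeofday tv_interval);
use Bio::SearchIO;

# Hypothetical helper: parse $file with the given SAX backend and
# return (number of results, elapsed seconds).
sub time_backend {
    my ($backend, $file) = @_;
    no warnings 'once';
    # Assumption: the parser created below honours this setting.
    local $XML::SAX::ParserPackage = $backend;

    my $start  = [gettimeofday];
    my $parser = Bio::SearchIO->new(-format => 'blastxml', -file => $file);
    my $count  = 0;
    $count++ while $parser->next_result;
    return ($count, tv_interval($start));
}

# Run the comparison only when a BLAST XML file is given on the command line.
if (my $file = shift @ARGV) {
    for my $backend (qw(XML::SAX::ExpatXS XML::LibXML::SAX XML::SAX::PurePerl)) {
        # Skip backends that cannot be loaded on this system.
        unless (eval "require $backend; 1") {
            print "$backend: not installed, skipped\n";
            next;
        }
        my ($count, $secs) = time_backend($backend, $file);
        printf "%-20s %d result(s) in %.3f s\n", $backend, $count, $secs;
    }
}
```

Timings from a single run can be noisy, so repeating the run on a reasonably large report gives a fairer comparison.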
If ParserDetails.ini doesn't exist, you can create it and register XML::SAX::ExpatXS with the following command (run it only if ParserDetails.ini doesn't exist!):
    perl -MXML::SAX -e "XML::SAX->add_parser(q(XML::SAX::ExpatXS))->save_parsers()"
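Before running that command, it can be worth checking whether the file is already present. The sketch below assumes ParserDetails.ini lives in the SAX/ directory next to the installed XML/SAX.pm. That location is an assumption about how XML::SAX is laid out on disk, so confirm it against your own installation.

```perl
use strict;
use warnings;
use File::Basename qw(dirname);
use File::Spec;
use XML::SAX;

# Assumption: ParserDetails.ini sits in the SAX/ directory beside XML/SAX.pm.
my $ini = File::Spec->catfile(dirname($INC{'XML/SAX.pm'}), 'SAX', 'ParserDetails.ini');

if (-e $ini) {
    print "Found $ini; do not run the add_parser one-liner.\n";
} else {
    print "No ParserDetails.ini at $ini; the add_parser one-liner can be used.\n";
}
```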