ListSummary:May 10-31,2006
May 10-31, 2006
Oh me oh my. I go to a conference and come back to a pretty busy mailing list. Of course I added a few comments myself to everything, but that's just because I can't keep anything to myself. Looks like lots of changes and proposed changes to modules, a few new ideas, numerous bug reports, and some good ol' Perl hackery. There's so much now that I'll have to break this list summary into a couple of updates, which should be complete by June 6 (fingers crossed!).
First things first...
Announcements
The Deobfuscator
At the top of the list is the availability of the Deobfuscator interface on the BioPerl server, thanks to Mauricio and Dave. What's the Deobfuscator? It's an easy way 'to determine the methods that are available from a given BioPerl module' (directly from the page below):
http://www.bioperl.org/wiki/Deobfuscator
http://bioperl.org/pipermail/bioperl-l/2006-May/021516.html
Second call for abstracts - BOSC 2006
Darin London has made the second call for abstracts for BOSC 2006. Details here...
http://bioperl.org/pipermail/bioperl-announce-l/2006-May/thread.html
CDAT - Integrative objects for character trees and data
On the main mail list, Arlin Stoltzfus has proposed an integrative object (called CDAT) for character trees and data. The second link below points to the proposal outline in PDF format...
http://bioperl.org/pipermail/bioperl-l/2006-May/021481.html
http://bioperl.org/pipermail/bioperl-l/2006-May/021509.html
Modware - BioPerl-based OO API for Chado
Eric Just has posted Modware, a BioPerl-based, object-oriented API, written in Perl, for Chado. More information can be found here...
Eric's mailing list and news posts. Here is a direct link to the Modware site.
Bioperl-based Applications for "Free Software" Session
Andreas Bender announces a Computational Life Sciences Conference to which users can submit BioPerl-based applications or cool uses for BioPerl...
http://bioperl.org/pipermail/bioperl-l/2006-May/021636.html
Project OpenLab
Jay Hannah proposes a new project (Project OpenLab) to 'provide a "point and click" toolset allowing researchers to quickly build arbitrary databases of sequences, primer groups, and primers.' Fernan Aguero and Rutger Vos give their opinions...
http://bioperl.org/pipermail/bioperl-l/2006-May/021638.html
http://omaha.pm.org/kwiki/?BioPerl
Bio::Phylo Mail List
Rutger Vos has announced the availability of the Bio::Phylo mail list, hosted by the Open Bioinformatics Foundation... hey, that's us!
http://lists.open-bio.org/mailman/listinfo/bio-phylo-l
http://bioperl.org/pipermail/bioperl-l/2006-May/021740.html
And now to the mail lists!
Bioperl-l
Bio::SeqIO oddness
YT points out issues with the way strains and subspecies designations are handled within Bio::SeqIO (via Bio::Species). Torsten chimes in as well; the resolution (though indirect): pull out the information from the 'source' tag from the feature table and look into Bio::Species yet again....
http://bioperl.org/pipermail/bioperl-l/2006-May/021474.html
More fun with primer3
Chen Li and friends (?) continue the discussion of the idiosyncrasies of using Bio::Tools::Primer3 and Bio::Tools::Run::Primer3, and of just what the difference is between the two (hint: it's in the name)...
http://bioperl.org/pipermail/bioperl-l/2006-May/021476.html
http://bioperl.org/pipermail/bioperl-l/2006-May/021488.html
Problems grabbing the PubMed ID
Yu Zhou responds to an older post about grabbing the PubMed ID from a file; Brian O. reveals that Yu may be using old code...
http://bioperl.org/pipermail/bioperl-l/2006-May/021477.html
Fun(?), annoyances, and confusion with Bio::Species/Bio::Taxonomy::Node/Bio::DB::Taxonomy
Sendu Bala writes in to state his confusion with Bio::Taxonomy::Node and how it handles names. Jason and YT chime in, discussing issues which also relate to Bio::DB::Taxonomy and Bio::Species parsing...
http://bioperl.org/pipermail/bioperl-l/2006-May/021479.html
BioPerl and converting gene predictions to GFF
Torsten responds to an off-list question asking how one can convert several results containing gene predictions to GFF format, giving a bit of code to demonstrate...
http://bioperl.org/pipermail/bioperl-l/2006-May/021502.html
More fun(?) with Bio::DB::Taxonomy
Back for more, Sendu Bala shows that Bio::DB::Taxonomy has problems with certain names at the species level (and possibly below) and posts some potential code as an example. YT agrees and, in a caffeine-induced haze, goes off on all things taxonomic, born of frustrations with species/strain mangling from Bio::Species and Bio::DB::Taxonomy. Nadeem Faruque gives his thoughts as well, including other possible avenues to fix the issue...
http://bioperl.org/pipermail/bioperl-l/2006-May/021505.html
http://bioperl.org/pipermail/bioperl-l/2006-May/021542.html
How not to use BioPerl - Lesson #1
Saurabh Maheshwari asks for help with a complicated script that is meant to look for protein-protein interactions using Bio::Graph modules. Of course, this wasn't immediately apparent from the original post, so Marc Logghe and Sean Davis prodded Saurabh for more information. Sean Davis makes further recommendations, while YT gets honest when scrutinizing the submitted code. Lesson learned here? Do not cut-and-paste or copy from various scripts and expect the result to work the way you want...
http://bioperl.org/pipermail/bioperl-l/2006-May/021506.html
http://bioperl.org/pipermail/bioperl-l/2006-May/021524.html
Parsing output files from other tools
Hubert Prielinger wants to know if Bio::SearchIO can parse mpsearch format. YT points out that it can't, but one could either build a module or use ssearch and Bio::SearchIO::fasta instead...
http://bioperl.org/pipermail/bioperl-l/2006-May/021515.html
How to get the Reverse Complement from a sequence
Chen Li has problems getting the reverse complement from a sequence file; YT and Marc lend a hand...
http://bioperl.org/pipermail/bioperl-l/2006-May/021518.html
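For readers who just want the idea behind the question, here is a minimal, language-neutral sketch in Python of what reverse-complementing a DNA string involves. It is not the code from the thread and not a BioPerl API; the function name `reverse_complement` is made up for illustration, and it assumes the sequence contains only A, C, G, T, or N.

```python
# Hypothetical helper for illustration only (not a BioPerl call).
# Assumes plain DNA letters A/C/G/T/N in either case.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Complement every base, then reverse the whole string."""
    return seq.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    print(reverse_complement("AAGT"))
```

The two steps (complement, then reverse) can be done in either order; the result is the same.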
UniProt and MySQL
Hilmar responds to, and requests thoughts on, issues with getting the UniProt database into MySQL (originally posted on BioSQL, see below)...
http://bioperl.org/pipermail/bioperl-l/2006-May/021531.html
Possible memory leak in Bio::SearchIO?
Wayne Clark is worried about the possibility of a memory leak in Bio::SearchIO::blast. YT and Torsten give their thoughts, including the possibility that MySQL is to blame...
http://bioperl.org/pipermail/bioperl-l/2006-May/021540.html
http://bioperl.org/pipermail/bioperl-l/2006-May/021562.html
Bio::DB::Query::GenBank and ID's
Bernd Web finds some potential problems with the way Bio::DB::Query::GenBank and Bio::DB::GenBank handle sequence ID's, including several instances where thrown errors would be helpful and problems with the example in the POD. YT explains that this is mainly due to differences in the methods used by the two modules and decides to look into it in the future...
http://bioperl.org/pipermail/bioperl-l/2006-May/021553.html
Where's Bio::SeqIO::entrezgene?
Kenny Daily wants to know where to find Bio::SeqIO::entrezgene; YT points out where. Kenny replies with some problems and Stefan Kirov helps out...
http://bioperl.org/pipermail/bioperl-l/2006-May/021550.html
http://bioperl.org/pipermail/bioperl-l/2006-May/021552.html
Six Frame Translation
Chen Li wants to know how to translate in all six frames; Scott Markel and Brian O. help out...
http://bioperl.org/pipermail/bioperl-l/2006-May/021554.html
http://bioperl.org/pipermail/bioperl-l/2006-May/021560.html
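As a rough illustration of what "all six frames" means (and not the BioPerl code suggested in the thread), the Python sketch below translates the three forward offsets and the three offsets of the reverse complement. The names `translate` and `six_frame_translation` are hypothetical, the standard genetic code is assumed, and the frame labels (+1 to +3, -1 to -3) follow one common convention in which -1 starts at the last base of the input; other tools number the reverse frames differently.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(_BASES, repeat=3), _AMINO)
}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(seq: str) -> str:
    """Translate from the first base; incomplete trailing codons are dropped
    and codons with unknown letters become 'X'."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translation(seq: str) -> dict:
    """Return a dict mapping frame label to protein string (hypothetical helper)."""
    forward = seq.upper()
    reverse = forward.translate(_COMPLEMENT)[::-1]
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(forward[offset:])
        frames[f"-{offset + 1}"] = translate(reverse[offset:])
    return frames


if __name__ == "__main__":
    for label, protein in sorted(six_frame_translation("ATGAAATAG").items()):
        print(label, protein)
```

Stop codons are reported as `*` rather than ending the translation, which keeps every frame the same shape and leaves ORF finding to the caller.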
Where's Bio::ASN1::EntrezGene?
Ryan Golhar looks for Bio::ASN1::EntrezGene but can't find it in BioPerl; YT points out that it isn't in BioPerl, but Ryan figures this out anyway...
http://bioperl.org/pipermail/bioperl-l/2006-May/021556.html
Formatting sequence output
Chen Li chimes in again with a question about a module for sequence formatting. Malcolm Cook gives him some advice...
http://bioperl.org/pipermail/bioperl-l/2006-May/021565.html
http://bioperl.org/pipermail/bioperl-l/2006-May/021579.html
Bio::Map enhancements
Sendu Bala has posted proposed enhancements to Bio::Map modules in Bugzilla...
http://bioperl.org/pipermail/bioperl-l/2006-May/021570.html
Performance problems and proposed enhancements for BioPerl
David Waner uncovered some pretty significant performance issues with BioPerl and Perl 5.8 on Windows and suggested several fixes that reduced the parsing time dramatically. YT responded enthusiastically (he's seen issues himself) and Brian O. asked how the relevant tests responded...
Getting at features and annotation
Nick Staffa wanted to know how to get features and annotation in a GenBank genome file; Brian O., Torsten, and Chandan Singh all give advice (specifically, check the FAQ and the HOWTO's)...
http://bioperl.org/pipermail/bioperl-l/2006-May/021575.html
BioPerl-ext, alignments, and Inline C/XS
Adam Kraut posted a question about how to best 'wrap' a C library for aligning sequences. Aaron Mackey points him towards bioperl-ext and gives some pointers on getting started. Stephen Lenk also chips in with some more advice and a little code...
http://bioperl.org/pipermail/bioperl-l/2006-May/021576.html
http://bioperl.org/pipermail/bioperl-l/2006-May/021587.html
How to parse BLAST XML output
Hubert Prielinger asked how to parse BLAST XML output. Turns out, according to Warren Gish, that WU-BLAST XML and BLAST XML should both be parsed by Bio::SearchIO::blastxml. YT pointed out recent changes to Bio::SearchIO::blastxml that require XML::SAX and XML::SAX::ExpatXS. The next issue was trying to get at taxonomic information, which Jason and YT address...
http://bioperl.org/pipermail/bioperl-l/2006-May/021574.html
http://bioperl.org/pipermail/bioperl-l/2006-May/021595.html
Guess the sequence format
Wijaya Edward wants to know if there is a way to guess the sequence format. Jason points out Bio::Tools::GuessSeqFormat...
http://bioperl.org/pipermail/bioperl-l/2006-May/021597.html
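To give a feel for how format guessing can work in principle, here is a deliberately tiny Python sketch that looks only at the first non-blank line of a file. It is an assumption-laden illustration, not the algorithm used by Bio::Tools::GuessSeqFormat; the function name `guess_seq_format` and the returned labels are made up, and only a handful of formats are covered.

```python
def guess_seq_format(text: str) -> str:
    """Guess a sequence file format from its first non-blank line.

    Hypothetical heuristic for illustration; real guessers check far
    more formats and more than one line.
    """
    for line in text.splitlines():
        if not line.strip():
            continue  # skip leading blank lines
        if line.startswith(">"):
            return "fasta"
        if line.startswith("@"):
            return "fastq"
        if line.startswith("LOCUS"):
            return "genbank"
        if line.startswith("ID   "):
            # EMBL and SwissProt flat files both open with an ID line,
            # so this sketch cannot tell them apart.
            return "embl_or_swiss"
        return "unknown"
    return "unknown"


if __name__ == "__main__":
    print(guess_seq_format(">seq1 example\nACGTACGT\n"))
```

A first-line check like this is cheap but easy to fool, which is why a guess should be treated as a hint and the format stated explicitly whenever it is known.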
Bio::Graphics issues
Our old friend Chen Li comes forward with problems concerning Bio::Graphics. Brian O., Torsten, and Lincoln lend a hand...
http://bioperl.org/pipermail/bioperl-l/2006-May/021600.html
Bio::Graph questions
Neil Saunders has problems with using Bio::Graph modules, specifically Bio::Graph::IO and Bio::Graph::ProteinGraph. Brian O. suggests a different (though experimental) set of modules he's donating, called bioperl-network. Brian, has there been an announcement yet?
http://bioperl.org/pipermail/bioperl-l/2006-May/021602.html
Bio::SeqIO::swiss versioning problems
Michael Rogoff points out that Bio::SeqIO::swiss doesn't parse out the sequence version correctly (it's on the same line as the date) and proposes a patch. Jason directs him to the proper place to submit patches, but then points out that the date lines need to have the extra versioning information stripped out. YT also says that the patch doesn't address recent changes by UniProt to the SwissProt format.
http://bioperl.org/pipermail/bioperl-l/2006-May/thread.html
Bio::SeqFeature::Tools::Unflattener problems
Barry Moore uncovers a nasty bug with Unflattener, and YT confirms. Another one for Bugzilla...
http://bioperl.org/pipermail/bioperl-l/2006-May/021610.html
Bio::Graphics problems (it's like I'm hearing an echo...)
Chen Li tries to figure out why he can't get a .png file to work correctly. Lincoln, Torsten, Brian O., and Marc Logghe offer up suggestions, with Lincoln providing the solution...
http://bioperl.org/pipermail/bioperl-l/2006-May/021600.html
Bio::Restriction::IO...
Derek Fairley asks about using Bio::Restriction::IO::bairoch. Though he got no direct answer from the mail list, the same question was posed again (see below).
http://bioperl.org/pipermail/bioperl-l/2006-May/021617.html
Bio::Assembly::IO::ace output
Rowan Mitchell wants to know if write output is to be implemented for Bio::Assembly::IO::ace. Robson Francisco de Souza replies back that there are no current plans to add a write method for this module since it wasn't needed when originally submitted...
http://bioperl.org/pipermail/bioperl-l/2006-May/021618.html
Getting files in EMBL format
Chen Li wants to know how to download files directly from EMBL. No one responds. Poor Chen...
http://bioperl.org/pipermail/bioperl-l/2006-May/021622.html
Bio::Graphics::Panel negative position numbering
Kevin Lam Koiyau wants to know how to get negative position numbering with Bio::Graphics::Panel. Lincoln indicates that this is possible under bioperl 1.5.1. Jonathan Epstein adds a second question on how to add directional arrows to BLAST hits, but no one answers.
http://bioperl.org/pipermail/bioperl-l/2006-May/021625.html
http://bioperl.org/pipermail/bioperl-l/2006-May/021640.html
...and Bio::Restriction::IO again
Jelena Obradovic points out problems with converting a REBASE file into a Bio::Restriction::EnzymeCollection object. YT points out the format is not capitalized, but finds that the normal format ('bairoch') dies with an error. Brian points out that the format name should be more forgiving of case, but YT indicates that case is the least of the issues here...
http://bioperl.org/pipermail/bioperl-l/2006-May/021632.html
Bio::Tools::Run::StandAloneBlast
Genevieve DeClerck is trying to work out how to combine several analyses while parsing through a BLAST report. Brian O. and Torsten lend a hand, and Sendu Bala gives some input on a possible solution.
http://bioperl.org/pipermail/bioperl-l/2006-May/021642.html
Problems with bioperl-ext
Orphan : Simon Rayner is having issues getting bioperl-ext running on SUSE Linux...
http://bioperl.org/pipermail/bioperl-l/2006-May/021646.html
Using an integrated server to grab output from multiple servers
Orphan : Shameer Khadar wants to know how to get a system set up for grabbing output from multiple servers for display on an integrated server. Shameer, never start your question with 'My query may not be directly related to BioPERL', as we might lose interest...
http://bioperl.org/pipermail/bioperl-l/2006-May/021648.html
Bio::Graphics::Panel background color
Orphan : Jelena Obradovic wants to know how to turn off the background color of the panel.
CVS and code auditing
Torsten Seemann stirs up a hornet's nest with an audit regarding the use of 'return undef;' in the core code, based on a suggestion made by Damian Conway in Perl Best Practices. Pretty much everybody (where's Jason?!?) gives an opinion on the issue; YT kicks the nest around a bit more by pointing out how many 'throw_not_implemented' statements are found in regular (non-interface) code. How fun!
http://bioperl.org/pipermail/bioperl-l/2006-May/021651.html
Bio::TreeIO "Collapse" function
Lucia Peixoto ponders on how to collapse any node in a tree. Aaron Mackey and Jason suggest solutions. Lucia has a few more questions. I have a question Lucia: what exactly is the "ass_Descendent" method? Couldn't find that one in POD...
http://bioperl.org/pipermail/bioperl-l/2006-May/021658.html
Moving the tutorial to the wiki
Jay Hannah tries moving the Bioperl tutorial instructions to the wiki and finds out how tricky it can be. Brian O., YT, and Mauricio all join in on trying to decide whether to keep the entire script or to split the documentation from the script completely...
http://bioperl.org/pipermail/bioperl-l/2006-May/021670.html
How to add methods to a module
Hongyu Zhang wants to donate code to add to Bio::SimpleAlign but doesn't know how. YT modifies the FAQ (sneaky!) and points him to it...
http://bioperl.org/pipermail/bioperl-l/2006-May/021683.html
BioSQL-l
Problems loading uniprot release 49.6 into mysql
S. Rayner (if that is your real name) relays problems getting UniProt loaded into MySQL. The issue is traced to the 'RL' line in SwissProt files, where the annotation isn't unique. Hilmar suggests a few workarounds and several possible fixes...
http://lists.open-bio.org/pipermail/biosql-l/2006-May/000977.html
load_seqdatabase.pl errors and GenBank-mangled UniProt files
Gerben Menschaert submits a problem with loading a GenBank sequence. Hilmar points out that the file in question is a UniProt file and that the GenBank parser has problems with these files, then suggests loading the UniProt files using the SwissProt format.
http://lists.open-bio.org/pipermail/biosql-l/2006-May/000983.html
BioSQL and gbrowse
Genevieve DeClerck says she's having problems getting BioSQL to work with GBrowse. Hilmar has a few suggestions...
Problem with adding features under BioSQL
Michael Cipriano finds an issue with BioSQL and adding features under GBrowse. Hilmar is confused (it IS BioSQL-l, not the GBrowse list). Lincoln posts a fix for the issue.
http://lists.open-bio.org/pipermail/biosql-l/2006-May/000987.html
Bioperl-guts-l (for the diehards)
Note: Significant module changes and additions to CVS are normally announced on the main bioperl-l list if they are in decent enough condition for production work. If modules listed below have not been announced, then there may be a very good reason for it. If you plan on trying to use these, consider contacting the author(s). Many of the modules discussed in this section are highly experimental and are in various stages of development. They may or may not work at all. Therefore, we are not responsible for any problems faced with using this code.
I'm trying out a new layout this week which will hopefully enable me to keep up with all the changes a bit more easily (I'm using a script to consolidate this stuff). Let me know what you think.
CVS Commits
Brian has contributed bioperl-network, which is used to analyze protein-protein interactions.
Module changes:
- Core package : Bio::Index::EMBL, Bio::Index::Fasta, Bio::Index::GenBank, Bio::Index::Qual, Bio::Index::SwissPfam
- Network package : Bio::Network::Edge, Bio::Network::IO, Bio::Network::Interaction, Bio::Network::Node, Bio::Network::ProteinNet
- Network package : Bio::Network::IO::dip_tab, Bio::Network::IO::psi_xml
- Network package : Bio::Network::ProteinNet
- Network package : Bio::Network::Edge, Bio::Network::IO, Bio::Network::Interaction, Bio::Network::ProteinNet
Test file changes and additions:
- Modified tests :Edge.t, IO_dip_tab.t, IO_psi_xml.t, Interaction.t, Node.t, ProteinNet.t
- Added/Modified data :00001.xml, bovin_small_intact.xml, psi_xml.dat, sv40_small.xml, tab1part.tab, tab2part.tab, tab3part.tab, tab4part.tab
- Modified tests :IO_psi_xml.t
- Modified tests :ProteinNet.t
- Added/Modified data :arath_small-02.xml
Modified or added scripts/examples:
Module changes:
- Core package : Bio::SearchIO::blastxml
- Core package : Bio::SeqIO::genbank
- Core package : Bio::Annotation::Reference
- Core package : Bio::Restriction::IO::bairoch, Bio::Restriction::IO::base, Bio::Restriction::IO::withrefm
- Core package : Bio::Restriction::IO
- Core package : Bio::SeqIO::genbank
Test file changes and additions:
- Modified tests :RestrictionIO.t
Module changes:
- Core package : Bio::Index::GenBank
- Core package : Bio::Index::AbstractSeq
- Core package : Bio::SeqIO::genbank
Lincoln continues on with GFF3 integration into Bioperl
Module changes:
- Core package : Bio::DB::SeqFeature
- Core package : Bio::DB::SeqFeature::Store::DBI::mysql
- Core package : Bio::Graphics::Glyph::xyplot
- Core package : Bio::DB::SeqFeature::NormalizedFeature, Bio::DB::SeqFeature::Store
- Core package : Bio::Graphics::FeatureFile
- Core package : Bio::DB::SeqFeature::NormalizedFeature
- Core package : Bio::DB::SeqFeature
- Core package : Bio::DB::SeqFeature::Store::GFF3Loader
- Core package : Bio::DB::SeqFeature::NormalizedFeature, Bio::DB::SeqFeature::NormalizedFeatureI, Bio::DB::SeqFeature::NormalizedTableFeatureI, Bio::DB::SeqFeature::Store
- Core package : Bio::DB::SeqFeature::Store::DBI::mysql
- Core package : Bio::DB::SeqFeature::Store::GFF3Loader
- Core package : Bio::DB::SeqFeature::Store::DBI::mysql
- Core package : Bio::DB::SeqFeature::NormalizedFeature
- Core package : Bio::Graphics::Glyph
- Core package : Bio::DB::SeqFeature::Store::DBI::mysql
- Core package : Bio::DB::SeqFeature::NormalizedFeature, Bio::DB::SeqFeature::Segment, Bio::DB::SeqFeature::Store
- Core package : Bio::Graphics::Glyph::gene, Bio::Graphics::Glyph::generic, Bio::Graphics::Glyph::transcript, Bio::Graphics::Glyph::transcript2
- Core package : Bio::DB::SeqFeature::NormalizedFeature, Bio::DB::SeqFeature::Segment, Bio::DB::SeqFeature::Store
- Core package : Bio::DB::SeqFeature::Store::DBI::mysql
- Core package : Bio::Graphics::FeatureBase
- Core package : Bio::DB::SeqFeature::Store::GFF3Loader
- Core package : Bio::DB::SeqFeature
- Core package : Bio::DB::SeqFeature::NormalizedFeature, Bio::DB::SeqFeature::Segment, Bio::DB::SeqFeature::Store
- Core package : Bio::Graphics::Glyph::translation
- Core package : Bio::DB::SeqFeature::NormalizedFeature, Bio::DB::SeqFeature::Segment
- Core package : Bio::Graphics::FeatureBase, Bio::Graphics::Glyph
Modified or added scripts/examples:
- scripts/Bio-SeqFeature-Store/bp_seqfeature_load.PLS
- scripts/Bio-SeqFeature-Store/bp_seqfeature_load.PLS
Scott Cain
Module changes:
Sohel has contributed the OBO format parser for ontology parsing:
http://article.gmane.org/gmane.comp.lang.perl.bio.general/11196/
Module changes:
- Core package : Bio::Ontology::Xref
- Core package : Bio::Ontology::Term
- Core package : Bio::Ontology::SimpleGOEngine
- Core package : Bio::Ontology::OBOterm
- Core package : Bio::Ontology::OBOEngine
- Core package : Bio::OntologyIO::simplehierarchy
- Core package : Bio::OntologyIO::dagflat
olivier at dev.open-bio.org
Some movement on bioperl-pipeline!!
Module changes:
- Pipeline package : Bio::Pipeline::Manager
- Pipeline package : Bio::Pipeline::Transformer
- Pipeline package : Bio::Pipeline::Runnable::Blast
- Pipeline package : Bio::Pipeline::Runnable::Blat
- Pipeline package : Bio::Pipeline::Runnable::Phylip
- Pipeline package : Bio::Pipeline::Runnable::ProteinAnnotation
Bug Reports
Jason went on a bug hunt and stomped a few along the way, along with the rest of the BioPerl gang...
Changes
- Issue #843 :_readline() in SeqIO.pm
- Issue #1829 :-ve xyplot.pm - gff data must be in order
- Issue #1915 :Tranditional bootstrap in newick format tree not accesible
- Issue #1926 :Missing sections in Pdoc HTML
- Issue #1945 :tblastn reports the wrong frame and hit position
- Issue #1953 :Inconsistency between Bio::Factory::FTLocationFactory->from_string and Bio::Location::Split->to_FTstring
- Issue #1954 :SeqIO::game does not write or read <computational_analysis> elements
- Issue #1955 :Add Bio::AnnotatableI inheretance to Bio::Map::SimpleMap
- Issue #1957 :Add some tests for fpc.pm to t/MapIO.t
- Issue #1998 :Bio::Map::Marker incomplete implementation
- Issue #2002 :Infinite recursive loop in Unflattener.pm
- Issue #2003 :swiss.pm doesn't parse sequence version out of swissprot files
- Issue #2011 :New: Bio::Restriction::IO enhancements/code issues
Resolved
- WORKSFORME Issue #1988 :SearchIO::blast parses bit score incorrectly when score is reported in scientific notation
- WORKSFORME Issue #2001 :RemoteBlast does not return HTML reports properly
- LATER Issue #1539 :Fasta.pm cannot be told to read sequence ID as accession number
- LATER Issue #1924 :scansite
- INVALID Issue #2004 :swiss.pm parsing error with certain locations
- INVALID Issue #2009 :Bio::Restriction::IO object fails to load enzymes from local file; default set always loaded instead.
- FIXED Issue #1830 :strange dashes on the negative axis when using xyplot
- FIXED Issue #1985 :PSI-BLAST parsing fails on Windows
- FIXED Issue #1989 :GI identifier missing when using Bio::Index::GenBank
- FIXED Issue #2000 :AlignIO clustal/fasta parsing
- FIXED Issue #2006 :Bio::SeqIO::genbank does not parse CONSRTM field
- FIXED Issue #2007 :Remote webblast with input file
- FIXED Issue #2008 :Bio::Location::Split produces erroneous start/end coordinates with multiple Fuzzy sublocations
- FIXED Issue #2010 :Bio::Restriction::IO::bairoch has problems with enzymes with multiple sites