BioPerl has switched (as of mid-April, 2014) to GitHub Issues. We still have access to the Redmine-based tracking system, but this will be effectively read-only. We no longer support access to the original Bugzilla instance.
Please submit bugs or enhancement requests to BioPerl GitHub Issues. The older BioPerl Redmine tracking system remains but is will no longer be used, and the oldest Bugzilla-based system is not supported and addition of new bugs has been disabled.
We really do want you to submit bugs, even though it means more things for us to do! It means there is something we didn't think of or test in the particular module and we won't know about this unless you tell us. The Redmine system requires you register to avoid spam and to allow us to contact you again when the bug is fixed or to clarify the problem and solutions.
It is important that you record the version of BioPerl you are running (if you don't know, see the FAQ the question is addressed there). You can also include the version of Perl you are using and the Operating System you are running.
When submitting new bugs on BioPerl on GitHub, enter a brief description and other general information. You can use Markdown to add links, syntax highlighting, and so on; see the GitHub Markdown docs for more.
You can paste example code in the description, but we suggest submitting as a GitHub Gist. If you have example fixes, we highly suggest using the tools GitHub has in place, namely the ability to fork the code and create a pull request with the relevant fix. This will show up as an issue automatically, so there isn't a need to file one separately.
Note that attachments on GitHub issues only work for images. If the example is a text file then use a GitHub Gist; alternatively, if the file is something available publicly then please provide a link to the file.
We gladly welcome patches. Patches for bioperl code should be created as described in the SubmitPatch HOWTO. Try to ensure the patch is derived against the latest code checked out from Git, particularly if the patch is large.
Briefly, you can generate the patch using the following command:
diff -u old new
For best results, follow this example:
cd $YOUR_WORKING_DIST_DIRECTORY/Bio/Frobnicator git pull git diff GrokFrobnicator.pm > my-patch.dif
Submitting New Modules and Code Snippets
We also accept new code, either as full-fledged modules or as snippets of code (snippets work better as a patch). New code must include documentation, example code (typically listed in a SYNOPSIS section), and tests with decent test coverage following our testing standards in our Writing_BioPerl_Tests HOWTO). Because we are moving to a more modular scheme for future Bioperl installations we highly suggest individual submission of modules to CPAN, primarily to help lower the barrier to submitting bug fixes.
Good bug reports are ones which provide a small amount of code and the necessary test files to reproduce your bug. By doing this work up front you insure the developer spends most of his or her time actually working on the problem. Pasting your entire 600 line program into the comment buffer is probably not going to get an enthusiastic response. In addition, isolating the problem down to a small amount of your code will help ensure that the bug is not on your end before we dive in and start working on it.
Open Issues (GitHub)
This list details the open Bugs on GitHub for BioPerl.
- Issue 101: Expand RemoteBlast synopsis sample code
- This illustrates how to send BLAST searches to a cloud service provider.
- Issue 100: Bio::DB::Taxonomy::flatfile needs rewrite
- We need to rethink the index structure here now that the number of taxonomy items is so large, DB_File hash interface is not working for so many entries. Can we explore either BerkeleyDB or perhaps a NOSQL option that can still run as a flatfile indexed file?
- Issue 99: bioperl-guts-l notifications
- Would like to get bioperl-guts-l notifications for github projects. Will look into this.
- Issue 91: Support FTS attribute table and WITHOUT ROWID optimization for Bio::DB::SeqFeature::Store::DBI::SQLite
- Several of our GBrowse instances had issues with full-text (attribute) searches timing out. Profiling revealed that the execution time of Bio::DB::SeqFeature::Store::search_attributes on our Bio::DB::SeqFeature::Store::DBI::SQLite databases contributed significantly to this problem. The proposed changes, which add support for indexing the attribute table using SQLite's FTS (full-text search) extension, resolved the issue (when used in conjunction with Scott Cain's recent commit that removed the use of CGI::Pretty in GBrowse: https://github.com/GMOD/GBrowse/commit/298cab3ece8f68b7baf2e9d085fab9055772fa3c) To demonstrate the performance impact of an FTS attribute table on search_attributes(), consider the following script: ``` # search_attributes.pl use strict; use warnings; use Bio::DB::SeqFeature::Store; my $db = Bio::DB::SeqFeature::Store->new(-adaptor => 'DBI::SQLite', -dsn => $ARGV); my @features = $db->search_attributes($ARGV, ['arabidopsis_defline', 'arabidopsis_symbol', 'pfam', 'go', 'panther', 'kegg_enzyme', 'kegg_orthology', 'cog_cluster']); print 'Features: ' . scalar(@features) . "\n"; ``` Given a Bio::DB::SeqFeature::Store::DBI::SQLite database with gene models & annotation created from a GFF3 file with 692,300 features, the following were typical observed execution times in our environment: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ $ time perl search_attributes.pl track-orig.db iron Features: 1283 real 0m1.61s user 0m0.71s sys 0m0.89s $ time perl search_attributes.pl track-fts.db iron Features: 1280 real 0m0.27s user 0m0.20s sys 0m0.06s ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The difference in number of matches is due to the difference in behavior between the two methods: FTS (with the MATCH operator) searches for tokens, while LIKE '%iron%' finds substrings. The extra three results returned with "LIKE '%iron%'" contained a spurious match containing "iron" as a substring: acclimation of photosynthesis to environment and two occurrences of "diiron", which may be relevant to a user: dicarboxylate diiron protein, putative (Crd1) OTOH, if the user searches for "Fe" instead, they get "real" hits with an FTS attribute table, whereas the non-FTS search returns thousands of spurious hits where "fe" is a substring. Because of this difference in behavior (and possible portability issues to systems with old DBD::SQLite instances---see below), I thought that FTS should be op-in rather than the default. Also, note that FTS support depends on the version of DBD::SQLite. The current DBD::SQLite by default supports two versions: FTS3 since sometime before 1.30_04 (2010-08-25), and FTS4 since 1.36_01 (2012-01-19). At least one design decision I made while implementing this change should be considered/debated before accepting this pull request: the -fts option is just a boolean flag. My initial implementation supported the creation of an FTS attribute table using a user-specified FTS version, but at the last minute I decided to KISS and just use the most recent version supported by the installed DBD::SQLite. This isn't a problem if someone decides to implement an FTS attribute table for MySQL, which supports only one such index type (FULLTEXT: http://dev.mysql.com/doc/refman/5.6/en/fulltext-search.html). However, it's conceivable that one might want to implement FTS for PostgreSQL and have control over whether GIN or GiST indexing is used (http://www.postgresql.org/docs/9.3/static/textsearch-indexes.html), or, with SQLite, specify FTS3 at database (instead of the most recent FTS4) at creation time to allow its use on a host with an older DBD::SQLite.
- Issue 87: Issues with bioperl versioning
- Blocker on new 1.6/1.7 releases, see: https://github.com/andk/pause/issues/75
- Issue 83: GenBank parsing CONTIG issues
- See: http://mailman.open-bio.org/pipermail/bioperl-l/2014-September/088945.html Basically, this works with the August patch for GenBank parsing (so the bug isn't there) but some change since 1.6.922 has caused parsing to slow dramatically. We'll need to bisect this.
- Issue 79: Added script to extract DNA sequences (as well as 5' or 3' regions if specified) from a FASTA file using a BLAST output file
- Added a personal script to extract a DNA sequence from a FASTA file using a BLAST output file. Expects at least two arguments, the BLAST file and the FASTA file. There are a number of optional arguments that are explained in the script. This script is especially useful when trying to extract sequences with variance (hence the BLAST search beforehand) from FASTA files. For example, say that you are trying to extract a given gene and 2000 base pairs 5' to it from 20 different genomes. All you have is one gene sequence, however. By doing a BLAST search between each of the genomes and the gene and then using this script, you can extract the sequences that you are interested in. The script also has options to extract a specified 3' or 5' sequence from the FASTA file, as well as an e-value cut off. The final output is the extracted sequence in FASTA format. Is this useful/generic enough to be included in the scripts directory? The script is well tested and takes command-line arguments.
- Issue 61: new modules Bio::Tools::Alignment::Overview and Bio::DB::NextProt
- Some time ago I made the mistake of uploading to CPAN a module called Bio::Tools::Alignment::Overview, the module wasn't associated to bioperl but for some reason I decided to use the namespace anyway (yeah, I know...) Now I'm organizing some projects and I decided to include the module to bioperl, so if you accept this pull request I will remove from PAUSE/CPAN the current module, so that you can upload it under bioperl. Sorry for the noobish mistake =) Cheers.
- Issue 46: Simplealign
- Hi, Chris, This is my implementation of Bio::SimpleAlign. I have a detailed report describing the improvement to the code, and explaining the failed tests using t/Align/SimpleAlign.t. If you need any help, please just let me know. Cheers, Jun