From David.Messina at sbc.su.se Sun Feb 1 08:47:59 2009
From: David.Messina at sbc.su.se (Dave Messina)
Date: Sun, 1 Feb 2009 14:47:59 +0100
Subject: [Bioperl-l] trunk version vs. branch version
In-Reply-To: <2C5C023C-5443-4A52-A397-B2EC2F1C2080@illinois.edu>
References: <2C5C023C-5443-4A52-A397-B2EC2F1C2080@illinois.edu>
Message-ID: <628aabb70902010547m42995097k57703d6444641a69@mail.gmail.com>

I would think we'd want to set trunk to 1.006001 -- right past 1.6 -- because
- it needs to be >1.6
- it works for the other distros which need to be set to require something
  after 1.6.0

BUT
- it won't be >1.6.1 (i.e. 1.006100, I think), which is likely to be the next
  release

I'm presuming that it might be problematic to have trunk at or close to 1.7
and then bump all the requirements backwards to make them work with the 1.6.x
releases.

But perhaps there are compelling reasons to make the trunk version peri-1.7?
If so, I'd still argue for something like 1.006900 so we have room for
1.7-alpha releases that are <1.00700, similar to what you did for the
1.6-alphas.

Dave

From martin.senger at gmail.com Sun Feb 1 20:20:13 2009
From: martin.senger at gmail.com (Martin Senger)
Date: Mon, 2 Feb 2009 09:20:13 +0800
Subject: [Bioperl-l] Need some bioperl-run tests
In-Reply-To:
References:
Message-ID: <4d93f07c0902011720o30cffc9dr34fc6e999e128e30@mail.gmail.com>

> I have also noticed that Pise.t and AnalysisFactory_soap.t tests are
> failing. I have updated the tests so they run to completion, but (if
> possible) I need some indication whether these web services are still
> available

The AnalysisFactory_soap could be used to access various tools - but the
primary target was EMBOSS tools running at EBI.
The software providing them at EBI - the Soaplab wrapper - went to its second
version (Soaplab2), and even though it is still possible to provide backward
compatible services, I am not sure that these old (Soaplab1) services are
still maintained at EBI (I am cc-ing this to Mahmut who would be the best
person to know).

Martin

--
Martin Senger
email: martin.senger at gmail.com,m.senger at cgiar.org
skype: martinsenger

From cjfields at illinois.edu Mon Feb 2 11:30:36 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 2 Feb 2009 10:30:36 -0600
Subject: [Bioperl-l] trunk version vs. branch version
In-Reply-To: <628aabb70902010547m42995097k57703d6444641a69@mail.gmail.com>
References: <2C5C023C-5443-4A52-A397-B2EC2F1C2080@illinois.edu> <628aabb70902010547m42995097k57703d6444641a69@mail.gmail.com>
Message-ID: <07C7B69D-730D-431F-BA55-985464D62458@illinois.edu>

On Feb 1, 2009, at 7:47 AM, Dave Messina wrote:

> I would think we'd want to set trunk to 1.006001 -- right past 1.6 --
> because
> - it needs to be >1.6
> - it works for the other distros which need to be set to require
>   something after 1.6.0
>
> BUT
> - it won't be >1.6.1 (i.e. 1.006100, I think), which is likely to be
>   the next release

(Actually, that would be 1.6.100, or the 100th point release, if we stick to
using numeric version nomenclature. The vstring version v1.6.1 would be
1.006001. But I see your point.)

The reason I bring this up is simple semantics: we need a way to distinguish
main trunk from the 1.6 branch for developers. bioperl-live's VERSION should
always be ahead of any CPAN releases so we can add in a
'use Bio::Root::Foo #.######' if we want to indicate whether one should use
the latest code (trunk) vs. latest release.

One possibility: we could have trunk's VERSION follow directly after the
latest CPAN release as an alpha, so it would be 1.006000_001.
If we decide that an alpha is necessary for 1.6.1, we release 1.006000_001
and increment trunk to 1.006000_002, otherwise we would release 1.006001 and
trunk would increment to 1.006001_001.

That has a significant downside: bioperl-live becomes a moving target, so any
code reliant on changes only in main trunk would have to constantly change
any version requirements (ick). Another issue that pops up is scripts
expecting trunk ('use Bio::Root::Version 1.006000_001') that would pass for a
later point release (1.006001, for instance, would pass the '1.006000_001'
requirement). Double-ick. Lots of potential confusion best avoided
completely, so I think that's out as an option.

> I'm presuming that it might be problematic to have trunk at or close
> to 1.7 and then bump all the requirements backwards to make them work
> with the 1.6.x releases.

We could continue on with the alternating dev/stable (even/odd minor)
versioning but not release the odd-numbered 'dev' versions to CPAN. That
would allow us to designate trunk as 1.007.

The downside: do we want to risk another long wait between minor 'stable'
releases?

> But perhaps there are compelling reasons to make the trunk version
> peri-1.7? If so, I'd still argue for something like 1.006900 so we
> have room for 1.7-alpha releases that are <1.00700, similar to what
> you did for the 1.6-alphas.
>
> Dave

Kind of what I'm thinking, which is a nicer middle ground (it is pre-1.7, and
is a stable target). Anyone else?

chris

From maj at fortinbras.us Mon Feb 2 11:25:47 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Mon, 2 Feb 2009 11:25:47 -0500
Subject: [Bioperl-l] little wiki bug?
Message-ID:

Folks-
Is the ... tag processing within our purview? I was looking closely and
noticed that an expression like

for (0..$#a) { print $a[$_]; }

will display

#a { print $a[$_]; }

formatted as a comment.
cheers,
MAJ

From cjfields at illinois.edu Mon Feb 2 11:40:43 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 2 Feb 2009 10:40:43 -0600
Subject: [Bioperl-l] [ANNOUNCEMENT] Alpha 1.6 releases of BioPerl-db
In-Reply-To: <0215FF33-7DB8-4988-86EE-9371223C8814@gmx.net>
References: <0215FF33-7DB8-4988-86EE-9371223C8814@gmx.net>
Message-ID: <5CB87F44-74B0-42FC-95C2-46A4228FDC65@illinois.edu>

So, for this particular test, should it pass with either an empty string or
undef? It apparently works for all other tests.

chris

On Jan 30, 2009, at 11:13 AM, Hilmar Lapp wrote:

> I think this has something to do with the database connection
> configuration file (t/DBHarness.conf). I suspect in your version it
> sets 'port' to an empty string, rather than not setting it at all,
> or setting it explicitly to undef.
>
> The test that's failing is one of a series of tests that verify that
> the connection attributes can be round-tripped through a DSN string.
> Specifying 'port' in the DSN string as an empty value is illegal
> though, so it gets left out whether it's undef or an empty string.
> When parsed back, the port isn't found in the string and hence left
> undef (meaning, use whatever your DBD driver thinks is the right
> default).
>
> -hilmar
>
> On Jan 29, 2009, at 7:28 AM, Johann PELLET wrote:
>
>> Dear Chris,
>>
>> I have the following error on my Mac machine: (BioPerl 1.6, BioPerl-run
>> 1.6) when I try to install Bioperl-db ( biosql-1.0.1):
>>
>> t/01dbadaptor.....1/23
>> # Failed test in t/01dbadaptor.t at line 44.
>> # got: undef
>> # expected: ''
>> # Looks like you failed 1 test of 23.
>> t/01dbadaptor..... Dubious, test returned 1 (wstat 256, 0x100)
>> Failed 1/23 subtests
>> t/02species.......ok
>> t/03simpleseq.....ok
>> t/04swiss.........ok
>> t/05seqfeature....ok
>> t/06comment.......ok
>> t/07dblink........ok
>> t/08genbank.......ok
>> t/09fuzzy2........5/23
>> # Failed (TODO) test in t/09fuzzy2.t at line 64.
>> # got: undef
>> # expected: 'Q9QYG8'
>> t/09fuzzy2........ok
>> t/10ensembl.......ok
>> t/11locuslink.....ok
>> t/12ontology......ok
>> t/13remove........ok
>> t/14query.........ok
>> t/15cluster.......ok
>> t/16obda..........ok
>>
>> Test Summary Report
>> -------------------
>> t/01dbadaptor (Wstat: 256 Tests: 23 Failed: 1)
>> Failed test: 16
>> Non-zero exit status: 1
>> Files=16, Tests=1479, 15 wallclock secs ( 0.27 usr 0.10 sys +
>> 11.15 cusr 1.11 csys = 12.63 CPU)
>> Result: FAIL
>> Failed 1/16 test programs. 1/1479 subtests failed.
>>
>> -- --
>>
>> Johann Pellet
>> IE Bioinformatique
>> INSERM U851, I-MAP CERVI
>> 21, Avenue Tony Garnier
>> 69365 Lyon cedex 07 France
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> --
> ===========================================================
> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
> ===========================================================

From jason at bioperl.org Mon Feb 2 11:59:11 2009
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 2 Feb 2009 08:59:11 -0800
Subject: [Bioperl-l] trunk version vs. branch version
In-Reply-To: <07C7B69D-730D-431F-BA55-985464D62458@illinois.edu>
References: <2C5C023C-5443-4A52-A397-B2EC2F1C2080@illinois.edu> <628aabb70902010547m42995097k57703d6444641a69@mail.gmail.com> <07C7B69D-730D-431F-BA55-985464D62458@illinois.edu>
Message-ID:

> Kind of what I'm thinking, which is a nicer middle ground (it is
> pre-1.7, and is a stable target). Anyone else?

Wordpress designates the main trunk as "2.8-bleeding-edge" while the latest
stable release was 2.7.x

So pre-1.7 or some sort of 1.7 designate seems sensible to me.
-jason

> chris

Jason Stajich
jason at bioperl.org

From johann.pellet at inserm.fr Mon Feb 2 12:09:42 2009
From: johann.pellet at inserm.fr (Johann PELLET)
Date: Mon, 2 Feb 2009 18:09:42 +0100
Subject: [Bioperl-l] [ANNOUNCEMENT] Alpha 1.6 releases of BioPerl-db
In-Reply-To: <5CB87F44-74B0-42FC-95C2-46A4228FDC65@illinois.edu>
References: <0215FF33-7DB8-4988-86EE-9371223C8814@gmx.net> <5CB87F44-74B0-42FC-95C2-46A4228FDC65@illinois.edu>
Message-ID: <67AA0448-C388-410C-894B-6EFA5B0A6C02@inserm.fr>

On 2 Feb 09, at 17:40, Chris Fields wrote:

> So, for this particular test, should it pass with either an empty
> string or undef? It apparently works for all other tests.
>
> chris

Here is my t/DBHarness.biosql.conf:

{
    'port' => '',
    'schema_sql' => ['../biosql-schema/sql/biosqldb-mysql.sql'],
    'dbname' => 'biosql',
    'host' => 'localhost',
    'database' => 'biosql',
    'password' => '****',
    'user' => '****',
    'driver' => 'Pg',
}

As I said, I get this error:

>>> t/01dbadaptor.....1/23
>>> # Failed test in t/01dbadaptor.t at line 44.
>>> # got: undef
>>> # expected: ''
>>> # Looks like you failed 1 test of 23.
>>> t/01dbadaptor..... Dubious, test returned 1 (wstat 256, 0x100)
>>> Failed 1/23 subtests

If I change the postgresql port in the t/DBHarness.biosql.conf

'port' => '5432',

./Build test

t/01dbadaptor.....ok
t/02species.......80/67
# Looks like you planned 67 tests but ran 54 extra.
t/02species....... Dubious, test returned 255 (wstat 65280, 0xff00)
All 67 subtests passed
t/09fuzzy2........7/23
# Failed (TODO) test in t/09fuzzy2.t at line 64.
# got: undef
# expected: 'Q9QYG8'
t/09fuzzy2........ok

Tt/02species (Wstat: 65280 Tests: 121 Failed: 54)
Failed tests: 68-121
Non-zero exit status: 255
Parse errors: Bad plan. You planned 67 tests but ran 121.
Files=16, Tests=1533, 120 wallclock secs ( 0.29 usr 0.10 sys + 11.51 cusr 1.14 csys = 13.04 CPU)
Result: FAIL
Failed 1/16 test programs. 54/1533 subtests failed.

So for the first test, if the port is not empty, it's ok now. But I still
have other errors t02 and t09.

Johann

From hlapp at gmx.net Mon Feb 2 12:45:40 2009
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 2 Feb 2009 12:45:40 -0500
Subject: [Bioperl-l] [ANNOUNCEMENT] Alpha 1.6 releases of BioPerl-db
In-Reply-To: <67AA0448-C388-410C-894B-6EFA5B0A6C02@inserm.fr>
References: <0215FF33-7DB8-4988-86EE-9371223C8814@gmx.net> <5CB87F44-74B0-42FC-95C2-46A4228FDC65@illinois.edu> <67AA0448-C388-410C-894B-6EFA5B0A6C02@inserm.fr>
Message-ID:

On Feb 2, 2009, at 12:09 PM, Johann PELLET wrote:

> [...] Here, my t/DBHarness.biosql.conf :
> {
> 'port' => '',

Just remove this line. Or replace '' with undef.

> [...]
> So for the first test, if the port is not empty, it's ok now. But I
> still have other errors t02 and t09.
t09 is an expected failure (the test is TODO). t02 seems to pass, though it
looks like the #tests expected is wrong. Should be inspected, but is probably
harmless.

-hilmar
--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================

From hlapp at gmx.net Mon Feb 2 12:45:42 2009
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 2 Feb 2009 12:45:42 -0500
Subject: [Bioperl-l] [ANNOUNCEMENT] Alpha 1.6 releases of BioPerl-db
In-Reply-To: <5CB87F44-74B0-42FC-95C2-46A4228FDC65@illinois.edu>
References: <0215FF33-7DB8-4988-86EE-9371223C8814@gmx.net> <5CB87F44-74B0-42FC-95C2-46A4228FDC65@illinois.edu>
Message-ID: <98B91B75-76D2-4752-9B66-8CECDE9A9027@gmx.net>

On Feb 2, 2009, at 11:40 AM, Chris Fields wrote:

> So, for this particular test, should it pass with either an empty
> string or undef?

It could. Though strictly speaking, setting the port to an empty string isn't
correct. It needs to be a number or undef.

But who wants to be strict these days ...

-hilmar
--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================

From cjfields at illinois.edu Mon Feb 2 13:46:14 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 2 Feb 2009 12:46:14 -0600
Subject: [Bioperl-l] [ANNOUNCEMENT] Alpha 1.6 releases of BioPerl-db
In-Reply-To: <98B91B75-76D2-4752-9B66-8CECDE9A9027@gmx.net>
References: <0215FF33-7DB8-4988-86EE-9371223C8814@gmx.net> <5CB87F44-74B0-42FC-95C2-46A4228FDC65@illinois.edu> <98B91B75-76D2-4752-9B66-8CECDE9A9027@gmx.net>
Message-ID:

Here are the relevant lines in bioperl-db's Build.PL that seem to be causing
the problem.

$config{port} = $build->prompt("Port the server is running on (optional, '' for none)?", $config{port});
$config{port} = '' if $config{port} eq "''";

I have changed that to set the conf file port to undef instead (unquoted).
It will be in the next RC (next few days).

chris

From hlapp at gmx.net Mon Feb 2 14:10:09 2009
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 2 Feb 2009 14:10:09 -0500
Subject: [Bioperl-l] [ANNOUNCEMENT] Alpha 1.6 releases of BioPerl-db
In-Reply-To:
References: <0215FF33-7DB8-4988-86EE-9371223C8814@gmx.net> <5CB87F44-74B0-42FC-95C2-46A4228FDC65@illinois.edu> <98B91B75-76D2-4752-9B66-8CECDE9A9027@gmx.net>
Message-ID: <7832F57F-19E7-433B-A9EB-FF8E554B1FEF@gmx.net>

On Feb 2, 2009, at 1:46 PM, Chris Fields wrote:

> Here are the relevant lines in bioperl-db's Build.PL that seem to be
> causing the problem.
>
> $config{port} = $build->prompt("Port the server is running on
> (optional, '' for none)?", $config{port});
> $config{port} = '' if $config{port} eq "''";
>
> I have changed that to set the conf file port to undef instead
> (unquoted). It will be in the next RC (next few days).

Ah - cool, thanks Chris! I hadn't realized that you can now actually generate
the config file as part of the build process. It used to be done manually.
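
The port handling discussed in this thread could be sketched roughly as follows. This is only an illustration, not the actual Build.PL change: `normalize_port` and `dsn_for` are hypothetical helper names, the DBD::Pg-style DSN layout is an assumption, and so is treating whitespace-only input as "no port".

```perl
use strict;
use warnings;

# Hypothetical helper: map "no port" answers (an empty string or a
# literal '') to undef, and accept only all-digit strings as ports.
sub normalize_port {
    my ($answer) = @_;
    return undef if !defined $answer;
    $answer =~ s/^\s+|\s+$//g;    # trim surrounding whitespace
    return undef if $answer eq '' || $answer eq "''";
    die "Port must be a number, got '$answer'\n" unless $answer =~ /^\d+$/;
    return $answer;
}

# Hypothetical helper: build a DSN, leaving the port out entirely when
# it is undef so that the driver falls back to its own default.
sub dsn_for {
    my (%c) = @_;
    my $dsn = "dbi:$c{driver}:dbname=$c{dbname};host=$c{host}";
    $dsn .= ";port=$c{port}" if defined $c{port};
    return $dsn;
}

my %config = (driver => 'Pg', dbname => 'biosql', host => 'localhost');

$config{port} = normalize_port("''");
print dsn_for(%config), "\n";    # no port component

$config{port} = normalize_port('5432');
print dsn_for(%config), "\n";    # ends in ;port=5432
```

With this shape, a port that was never set and a port entered as '' both come back as undef, which is the round-trip behaviour the failing test in t/01dbadaptor.t was checking.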

-hilmar
--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================

From David.Messina at sbc.su.se Mon Feb 2 14:51:12 2009
From: David.Messina at sbc.su.se (Dave Messina)
Date: Mon, 2 Feb 2009 20:51:12 +0100
Subject: [Bioperl-l] trunk version vs. branch version
In-Reply-To: <07C7B69D-730D-431F-BA55-985464D62458@illinois.edu>
References: <2C5C023C-5443-4A52-A397-B2EC2F1C2080@illinois.edu> <628aabb70902010547m42995097k57703d6444641a69@mail.gmail.com> <07C7B69D-730D-431F-BA55-985464D62458@illinois.edu>
Message-ID: <628aabb70902021151x76d7f191u6d964fa1bdd9a26e@mail.gmail.com>

> We could continue on with the alternating dev/stable (even/odd minor)
> versioning but not release the odd-numbered 'dev' versions to CPAN. That
> would allow us to designate trunk as 1.007.
>
> The downside: do we want to risk another long wait between minor 'stable'
> releases?

Double-ick to this, too, I think. We've just shrugged off this yoke -- let's
not put it on again.

> Kind of what I'm thinking, which is a nicer middle ground (it is pre-1.7,
> and is a stable target).

Yes, that makes sense.

So in this case then, devs wanting to require trunk (as opposed to release)
could say

use Bio::Root::Foo 1.6.9;

to explicitly target trunk (assuming 1.6.9 is the VERSION given to trunk)

or they could say

use Bio::Root::Foo 1.6.1;

which would mean trunk OR any 1.6.x point release that might be released
along the way to 1.7, thereby avoiding the need to reset the version
requirement when a point release comes out.

D

From Russell.Smithies at agresearch.co.nz Mon Feb 2 15:52:38 2009
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Tue, 3 Feb 2009 09:52:38 +1300
Subject: [Bioperl-l] little wiki bug?
In-Reply-To:
References:
Message-ID: <18DF7D20DFEC044098A1062202F5FFF32162A374FE@exchsth.agresearch.co.nz>

It looks like the wiki uses the "GeSHiHighlight" extension
http://www.mediawiki.org/wiki/Extension:GeSHiHighlight .
We use this extension as well and I don't think there's a simple way around
it apart from tweaking your Perl code slightly. How about this instead?

for (0.. @a - 1) { print $a[$_]; }

Of course, having to adjust your Perl to get the highlighting "correct" isn't
really a solution either :-(

--Russell

=======================================================================
Attention: The information contained in this message and/or attachments
from AgResearch Limited is intended only for the persons or entities
to which it is addressed and may contain confidential and/or privileged
material. Any review, retransmission, dissemination or other use of, or
taking of any action in reliance upon, this information by persons or
entities other than the intended recipients is prohibited by AgResearch
Limited. If you have received this message in error, please notify the
sender immediately.
=======================================================================

From cjfields at illinois.edu Mon Feb 2 16:28:41 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 2 Feb 2009 15:28:41 -0600
Subject: [Bioperl-l] trunk version vs. branch version
In-Reply-To: <628aabb70902021151x76d7f191u6d964fa1bdd9a26e@mail.gmail.com>
References: <2C5C023C-5443-4A52-A397-B2EC2F1C2080@illinois.edu> <628aabb70902010547m42995097k57703d6444641a69@mail.gmail.com> <07C7B69D-730D-431F-BA55-985464D62458@illinois.edu> <628aabb70902021151x76d7f191u6d964fa1bdd9a26e@mail.gmail.com>
Message-ID:

On Feb 2, 2009, at 1:51 PM, Dave Messina wrote:

>> ...Kind of what I'm thinking, which is a nicer middle ground (it is
>> pre-1.7, and is a stable target).
>
> Yes, that makes sense.
>
> So in this case then, devs wanting to require trunk (as opposed to
> release) could say
>
> use Bio::Root::Foo 1.6.9;
>
> to explicitly target trunk (assuming 1.6.9 is the VERSION given to
> trunk)
>
> or they could say
>
> use Bio::Root::Foo 1.6.1;
>
> which would mean trunk OR any 1.6.x point release that might be
> released along the way to 1.7, thereby avoiding the need to reset the
> version requirement when the a point release comes out.
>
> D

Alright, then, I've set trunk to what we'll call 'pre-1.7', or v1.6.900
(1.006900); that gives us more than enough room for point releases (I hope we
don't need more than 900). ;>

The main trunk code of the other svn distributions likewise requires core
1.6.900.

If there is overwhelming support we can rename it 1.006009 (1.6.9).
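
As a quick check of the numbering used in this thread, here is a minimal sketch with Perl's core `version` module. It only illustrates how the dotted and decimal forms map onto each other; it is not taken from the BioPerl source, and the variable names are illustrative.

```perl
use strict;
use warnings;
use version;

# Dotted (vstring-style) and decimal forms of the same version compare
# equal: each dotted component occupies three decimal digits.
my $trunk = version->parse('v1.6.900');
print "v1.6.900 equals 1.006900\n" if $trunk == version->parse('1.006900');

# v1.6.1 is 1.006001, whereas 1.006100 is v1.6.100 (the 100th point release).
print version->parse('1.006001')->normal, "\n";    # v1.6.1
print version->parse('1.006100')->normal, "\n";    # v1.6.100

# A requirement of v1.6.1 is met by a later point release and by trunk alike.
my $required = version->parse('v1.6.1');
for my $candidate ('v1.6.2', 'v1.6.900') {
    my $verdict = version->parse($candidate) >= $required ? 'satisfies' : 'fails';
    print "$candidate $verdict the v1.6.1 requirement\n";
}
```

This is why a requirement of 1.6.1 cannot single out trunk, while a requirement of 1.6.900 can, as long as no point release reaches that number.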

chris

From johnsonm at gmail.com Mon Feb 2 16:31:18 2009
From: johnsonm at gmail.com (Mark Johnson)
Date: Mon, 2 Feb 2009 15:31:18 -0600
Subject: [Bioperl-l] [ANNOUNCEMENT] Alpha 1.6 releases of BioPerl-run, BioPerl-db, BioPerl-network
In-Reply-To: <3A45218B-B17C-49D1-85F8-3296EB6BFCE7@illinois.edu>
References: <151FBF26-C5FE-4388-875A-CC678B447318@illinois.edu> <73C90E2C-32D1-40FC-BC82-A9EC510BE384@illinois.edu> <3A45218B-B17C-49D1-85F8-3296EB6BFCE7@illinois.edu>
Message-ID:

On Thu, Jan 29, 2009 at 7:13 PM, Chris Fields wrote:
> gmhmmp ver. 2.6p
>
> Maybe my model is out-of-date; I got them from here:
>
> http://exon.biology.gatech.edu/modelDownload.html

We're still on 2.6c. I think you've got the right model.

From David.Messina at sbc.su.se Mon Feb 2 17:19:37 2009
From: David.Messina at sbc.su.se (Dave Messina)
Date: Mon, 2 Feb 2009 23:19:37 +0100
Subject: [Bioperl-l] little wiki bug?
In-Reply-To: <18DF7D20DFEC044098A1062202F5FFF32162A374FE@exchsth.agresearch.co.nz>
References: <18DF7D20DFEC044098A1062202F5FFF32162A374FE@exchsth.agresearch.co.nz>
Message-ID: <628aabb70902021419r691ec2acqa959a2367d7f06ae@mail.gmail.com>

This seems to be fixed in the latest version (1.0.8.2) of the GeSHi library,
which underlies the plugin.

I think we can swap that in and resolve the problem.

I've uploaded the new library to the bioperl website, but I don't have perms
to move the existing geshi library dir. If one of the core devs has a sec for
this, email me off-list and I'll tell you where I put it.

Dave

From mauricio at open-bio.org Mon Feb 2 19:43:01 2009
From: mauricio at open-bio.org (Mauricio Herrera Cuadra)
Date: Mon, 02 Feb 2009 18:43:01 -0600
Subject: [Bioperl-l] little wiki bug?
In-Reply-To: <628aabb70902021419r691ec2acqa959a2367d7f06ae@mail.gmail.com>
References: <18DF7D20DFEC044098A1062202F5FFF32162A374FE@exchsth.agresearch.co.nz> <628aabb70902021419r691ec2acqa959a2367d7f06ae@mail.gmail.com>
Message-ID: <49879315.2070103@open-bio.org>

I found the 'NEW_geshi_in_here' directory inside the cross-site-stuff area
and overwrote the previous Geshi installation with this one, reconfigured
MediaWiki and other appropriate bits and restarted the webserver. Syntax
highlighting seems even more screwed up now, just take a look at the
Graphics HOWTO:

http://bioperl.org/wiki/HOWTO:Graphics

Are we sure this newer version fixes things? :)

Mauricio.

From mauricio at open-bio.org Tue Feb 3 00:21:24 2009
From: mauricio at open-bio.org (Mauricio Herrera Cuadra)
Date: Mon, 02 Feb 2009 23:21:24 -0600
Subject: [Bioperl-l] little wiki bug?
In-Reply-To: <4987A221.4000009@open-bio.org>
References: <18DF7D20DFEC044098A1062202F5FFF32162A374FE@exchsth.agresearch.co.nz> <628aabb70902021419r691ec2acqa959a2367d7f06ae@mail.gmail.com> <49879315.2070103@open-bio.org> <18DF7D20DFEC044098A1062202F5FFF32162A376C6@exchsth.agresearch.co.nz> <4987A221.4000009@open-bio.org>
Message-ID: <4987D454.3080705@open-bio.org>

The problem was the GeSHiHighlight.php file, which needed to be updated to
its latest version too (aside from the Geshi library), though no errors were
thrown to the Apache logs...
Syntax highlighting is back to normal and apparently with the bug fixed, take
a look at the Sandbox with Mark's example.

Thanks,
Mauricio.

Mauricio Herrera Cuadra wrote:
> Permissions are fine (0755), in fact, PHP files don't need more than
> 0644 when they're used as includes. If they need to run from the command
> line, that's a different story. And yeah, highlighting seems to be
> pretty bad on our side :(
>
> Smithies, Russell wrote:
>> I tried the latest version of Geshi on our wiki but saw no changes
>> i.e. the coloring bug Mark demonstrated was still there. We certainly
>> didn't lose _all_ our highlighting as you seem to have. Did you check
>> file permissions on geshi.php? Should be 0755 I think.

From sac at bioperl.org Tue Feb 3 02:00:12 2009
From: sac at bioperl.org (Steve Chervitz)
Date: Mon, 2 Feb 2009 23:00:12 -0800
Subject: [Bioperl-l] What is happing to perl language
In-Reply-To:
References:
Message-ID: <8f200b4c0902022300u72ff71e0we6d08342f6414da6@mail.gmail.com>

On Fri, Jan 30, 2009 at 12:04 PM, Chris Fields wrote:
> On Jan 30, 2009, at 12:30 PM, Heikki Lehvaslaiho wrote:
>
>> FYI
>>
>> A few years ago there were interesting articles about perl and new
>> CPAN modules popping up here and there. Recently, only chromatic has
>> continued writing to http://www.perl.com/.
>>
>> The article below is a return to norm.
>>
>> Healthcheck: Perl
>> The Perl Future by Piers Cawley
>>
>> http://www.heise-online.co.uk/open/Healthcheck-Perl-The-Perl-Future--/features/112388/0
>>
>> -Heikki
>
> Note the healthy dose of perl6 (and a side mention of Moose) within the
> above! Well worth reading.
>
> chris

Also noteworthy, there was an interview last December with Larry Wall and
another article on Perl 6, Slashdotted here (along with the obligate language
flamewar):

Larry Wall Talks Perl, Culture, and Community (12/14/08)
http://developers.slashdot.org/article.pl?sid=08%2F12%2F14%2F1523236

Steve

From shameer at ncbs.res.in Tue Feb 3 03:07:47 2009
From: shameer at ncbs.res.in (K. Shameer)
Date: Tue, 3 Feb 2009 13:37:47 +0530 (IST)
Subject: [Bioperl-l] What is happing to perl language
In-Reply-To: <8f200b4c0902022300u72ff71e0we6d08342f6414da6@mail.gmail.com>
References: <8f200b4c0902022300u72ff71e0we6d08342f6414da6@mail.gmail.com>
Message-ID: <59406.192.168.1.1.1233648467.squirrel@mail.ncbs.res.in>

In case any of you missed this one:

Jonathan Rockway Blog:
"http://blog.jrock.us/articles/You%20are%20missing%20the%20point%20of%20Perl.pod"

--
K. Shameer

From David.Messina at sbc.su.se Tue Feb 3 03:51:36 2009
From: David.Messina at sbc.su.se (Dave Messina)
Date: Tue, 3 Feb 2009 09:51:36 +0100
Subject: [Bioperl-l] little wiki bug?
In-Reply-To: <18DF7D20DFEC044098A1062202F5FFF32162A376C6@exchsth.agresearch.co.nz>
References: <18DF7D20DFEC044098A1062202F5FFF32162A374FE@exchsth.agresearch.co.nz> <628aabb70902021419r691ec2acqa959a2367d7f06ae@mail.gmail.com> <49879315.2070103@open-bio.org> <18DF7D20DFEC044098A1062202F5FFF32162A376C6@exchsth.agresearch.co.nz>
Message-ID: <628aabb70902030051i1b0b519eiacb34d864c1f2c83@mail.gmail.com>

Thanks for tackling this, Mauricio.

> Syntax highlighting seems even more screwed up now, just
> take a look to the Graphics HOWTO:
>
> http://bioperl.org/wiki/HOWTO:Graphics

Yikes. It doesn't seem to be recognizing the tags -- stuff in between them is
getting interpreted as regular wiki tags.

> Are we sure this newer version fixes things? :)

I typed Mark's test case into the demo box on the GeSHi homepage
( http://qbnz.com/highlighter/) and it worked correctly there. Not
definitive, of course, but I thought it was worth a shot.
:) Dave From cjfields at illinois.edu Tue Feb 3 08:02:04 2009 From: cjfields at illinois.edu (Chris Fields) Date: Tue, 3 Feb 2009 07:02:04 -0600 Subject: [Bioperl-l] What is happing to perl language In-Reply-To: <8f200b4c0902022300u72ff71e0we6d08342f6414da6@mail.gmail.com> References: <8f200b4c0902022300u72ff71e0we6d08342f6414da6@mail.gmail.com> Message-ID: <24E004CC-EB2B-4C06-8883-D0B968D28A44@illinois.edu> On Feb 3, 2009, at 1:00 AM, Steve Chervitz wrote: > On Fri, Jan 30, 2009 at 12:04 PM, Chris Fields > wrote: >> On Jan 30, 2009, at 12:30 PM, Heikki Lehvaslaiho wrote: >> >>> FYI >>> >>> A few years ago there were interesting articles about perl and new >>> CPAN modules poping up here there. Recently, only chromatic has >>> continued writing to http://www.perl.com/. >>> >>> The article below is a return to norm. >>> >>> Healthcheck: Perl >>> The Perl Future by Piers Cawley >>> >>> http://www.heise-online.co.uk/open/Healthcheck-Perl-The-Perl-Future--/features/112388/0 >>> >>> -Heikki >> >> Note the healthy dose of perl6 (and a side mention of Moose) within >> the >> above! Well worth reading. >> >> chris > > Also noteworthy, there was an interview last December with Larry Wall > and another article on Perl 6, Slashdotted here (along with the > obligate language flamewar): > > Larry Wall Talks Perl, Culture, and Community (12/14/08) > http://developers.slashdot.org/article.pl?sid=08%2F12%2F14%2F1523236 > > Steve In case no one saw it: http://www.bioperl.org/wiki/User:Cjfields Larry was here talking about Perl6/Parrot and his Perl6 parser (along with a few others). Here are the talks from the conference: http://www.bioperl.org/wiki/User:Cjfields chris From maj at fortinbras.us Tue Feb 3 08:28:53 2009 From: maj at fortinbras.us (Mark A. Jensen) Date: Tue, 3 Feb 2009 08:28:53 -0500 Subject: [Bioperl-l] little wiki bug? 
In-Reply-To: <628aabb70902030051i1b0b519eiacb34d864c1f2c83@mail.gmail.com> References: <18DF7D20DFEC044098A1062202F5FFF32162A374FE@exchsth.agresearch.co.nz> <628aabb70902021419r691ec2acqa959a2367d7f06ae@mail.gmail.com> <49879315.2070103@open-bio.org> <18DF7D20DFEC044098A1062202F5FFF32162A376C6@exchsth.agresearch.co.nz> <628aabb70902030051i1b0b519eiacb34d864c1f2c83@mail.gmail.com> Message-ID: Thanks, all. Clearly, there is no such thing as a "little" bug to this crowd! The highlighting at the place where I noticed it (http://www.bioperl.org/wiki/Getting_all_k-mer_combinations_of_residues#version_5_.28Chris_Fields.29) (in the middle of a very cool scrap by Chris,btw) is now fine. onward-MAJ ----- Original Message ----- From: Dave Messina To: Smithies, Russell Cc: Mauricio Herrera Cuadra ; Mark A. Jensen ; BioPerl List Sent: Tuesday, February 03, 2009 3:51 AM Subject: Re: [Bioperl-l] little wiki bug? Thanks for tackling this, Mauricio. > Syntax highlighting seems even more screwed up now, just > take a look to the Graphics HOWTO: > > http://bioperl.org/wiki/HOWTO:Graphics Yikes. It doesn't seem to be recognizing the tags -- stuff in between them is getting interpreted as regular wiki tags. > Are we sure this newer version fixes things? :) I typed Mark's test case into the demo box on the GeSHi homepage (http://qbnz.com/highlighter/) and it worked correctly there. Not definitive, of course, but I thought it was worth a shot. :) Dave From heikki.lehvaslaiho at gmail.com Tue Feb 3 09:18:31 2009 From: heikki.lehvaslaiho at gmail.com (Heikki Lehvaslaiho) Date: Tue, 3 Feb 2009 16:18:31 +0200 Subject: [Bioperl-l] What is happing to perl language In-Reply-To: <24E004CC-EB2B-4C06-8883-D0B968D28A44@illinois.edu> References: <8f200b4c0902022300u72ff71e0we6d08342f6414da6@mail.gmail.com> <24E004CC-EB2B-4C06-8883-D0B968D28A44@illinois.edu> Message-ID: Chris, Rubbing shoulders with celebs! Good for you. :) The second link should probably be somewhere else than to your wiki page. 
-Heikki 2009/2/3 Chris Fields : > > On Feb 3, 2009, at 1:00 AM, Steve Chervitz wrote: > >> On Fri, Jan 30, 2009 at 12:04 PM, Chris Fields >> wrote: >>> >>> On Jan 30, 2009, at 12:30 PM, Heikki Lehvaslaiho wrote: >>> >>>> FYI >>>> >>>> A few years ago there were interesting articles about perl and new >>>> CPAN modules poping up here there. Recently, only chromatic has >>>> continued writing to http://www.perl.com/. >>>> >>>> The article below is a return to norm. >>>> >>>> Healthcheck: Perl >>>> The Perl Future by Piers Cawley >>>> >>>> >>>> http://www.heise-online.co.uk/open/Healthcheck-Perl-The-Perl-Future--/features/112388/0 >>>> >>>> -Heikki >>> >>> Note the healthy dose of perl6 (and a side mention of Moose) within the >>> above! Well worth reading. >>> >>> chris >> >> Also noteworthy, there was an interview last December with Larry Wall >> and another article on Perl 6, Slashdotted here (along with the >> obligate language flamewar): >> >> Larry Wall Talks Perl, Culture, and Community (12/14/08) >> http://developers.slashdot.org/article.pl?sid=08%2F12%2F14%2F1523236 >> >> Steve > > In case no one saw it: > > http://www.bioperl.org/wiki/User:Cjfields > > Larry was here talking about Perl6/Parrot and his Perl6 parser (along with a > few others). Here are the talks from the conference: > > http://www.bioperl.org/wiki/User:Cjfields > > chris > > -- -Heikki Heikki Lehvaslaiho - heikki lehvaslaiho gmail com http://kapkaupunki.blogspot.com/ From cjfields at illinois.edu Tue Feb 3 10:04:47 2009 From: cjfields at illinois.edu (Chris Fields) Date: Tue, 3 Feb 2009 09:04:47 -0600 Subject: [Bioperl-l] What is happing to perl language In-Reply-To: References: <8f200b4c0902022300u72ff71e0we6d08342f6414da6@mail.gmail.com> <24E004CC-EB2B-4C06-8883-D0B968D28A44@illinois.edu> Message-ID: <547F873C-07C8-4911-AE89-19B7601F837B@illinois.edu> On Feb 3, 2009, at 8:18 AM, Heikki Lehvaslaiho wrote: > Chris, > > Rubbing shoulders with celebs! Good for you. 
:) > > The second link should probably be somewhere else than to your wiki > page. > > -Heikki Cut-and-paste artifact (sent too quickly, was late for the bus). The following link has TimToady in person talking about Perl6: http://www.acm.uiuc.edu/conference/2008/videos chris From Kevin.Clancy at invitrogen.com Tue Feb 3 17:45:57 2009 From: Kevin.Clancy at invitrogen.com (Clancy, Kevin) Date: Tue, 3 Feb 2009 14:45:57 -0800 Subject: [Bioperl-l] problems accessing svn repositories Message-ID: <28813B71732ED64A83348116D27A1A9A02AE50FC@CBD01EXCMBX01.ads.invitrogen.net> Hi Folks I am having some problems checking out a copy of the various bioperl repositories. When I use the command line specified in the wiki, I get the following message: svn: can't connect to host 'code.open-bio.org': Connection timed out. I can see the repositories when there is a URI. I do have svn installed on my computer. Is there a problem with the repository? Thanks kevin Kevin Clancy, PhD Senior Scientist Informatics & eCommerce Life Technologies 5791 Van Allen Way Carlsbad CA 92008 phone: 760 268 8356 cell: 240 417 8604 email: kevin.clancy at invitrogen.com From maj at fortinbras.us Tue Feb 3 19:06:12 2009 From: maj at fortinbras.us (Mark A. Jensen) Date: Tue, 3 Feb 2009 19:06:12 -0500 Subject: [Bioperl-l] true inserts into a SimpleAlign Message-ID: Folks- I was desiring to make a true insertion of a LocatableSeq into a SimpleAlign, rather than just adding it to the end as $aln->add_seq($lseq) would do. This: $aln->add_seq($lseq,$order) currently stomps on whatever is stored at $order in the index. The patch at the bottom of the post allows a true insertion (if something exists at index location $order, the minimum number of entries from that point on are moved down to make room.) Might be useful for others. All t/Align/SimpleAlign.t tests pass. 
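The insertion behaviour described above (shift only the run of consecutive entries that starts at the target position, then store the new name) can be sketched on a plain hash. This is a minimal illustration, not the patch itself and not Bio::SimpleAlign code: the helper name `insert_at` and the sample data are assumptions made for the example.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bio::SimpleAlign): insert $name at
# integer position $order in a hash that maps positions to names.
sub insert_at {
    my ($ordh, $order, $name) = @_;
    if (exists $ordh->{$order}) {
        # Find the end of the run of consecutive occupied positions
        # that starts at $order.
        my $last = $order;
        $last++ while exists $ordh->{$last + 1};
        # Shift that run down by one, starting from the far end so
        # that no entry is overwritten before it has been moved.
        for (my $i = $last; $i >= $order; $i--) {
            $ordh->{$i + 1} = $ordh->{$i};
        }
    }
    $ordh->{$order} = $name;
    return $ordh;
}

# Sample data (made up): position 3 is free, so only positions 1 and 2 move.
my %order = (0 => 'seqA', 1 => 'seqB', 2 => 'seqC', 4 => 'seqE');
insert_at(\%order, 1, 'new');
print join(', ', map { "$_=$order{$_}" } sort { $a <=> $b } keys %order), "\n";
```

Because the gap at position 3 absorbs the shift, the entry at position 4 stays where it is, which is the "minimum number of entries" behaviour the post describes.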
Lemme know-

Mark

---

Index: SimpleAlign.pm
===================================================================
--- SimpleAlign.pm (revision 15496)
+++ SimpleAlign.pm (working copy)
@@ -304,8 +304,22 @@
     else {
         $self->debug( "Assigning $name to $order\n");

-        $self->{'_order'}->{$order} = $name;
+        my $ordh = $self->{'_order'};
+        if ($ordh->{$order}) {
+            # make space to insert
+            # $c->() returns (in reverse order) the first subsequence
+            # of consecutive integers; i.e., $c->(1,2,3,5,6,7) returns
+            # (3,2,1), and $c->(2,4,5) returns (2).
+            my $c;
+            $c = sub { return (($_[1]-$_[0] == 1) ? ($c->(@_[1..$#_]),$_[0]) : $_[0]); };
+            map {
+                $ordh->{$_+1} = $ordh->{$_}
+            } $c->(sort {$a <=> $b} grep {$_ >= $order} keys %{$ordh});

+        }
+        $ordh->{$order} = $name;
+
+
         unless( exists( $self->{'_start_end_lists'}->{$id})) {
             $self->{'_start_end_lists'}->{$id} = [];
         }

From jason at bioperl.org Tue Feb 3 19:36:40 2009
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 3 Feb 2009 16:36:40 -0800
Subject: [Bioperl-l] problems accessing svn repositories
In-Reply-To: <28813B71732ED64A83348116D27A1A9A02AE50FC@CBD01EXCMBX01.ads.invitrogen.net>
References: <28813B71732ED64A83348116D27A1A9A02AE50FC@CBD01EXCMBX01.ads.invitrogen.net>
Message-ID: <235DAF7F-1600-4943-BFE9-759C01F669C9@bioperl.org>

Kevin -

Anonymous checkout seems to work fine for me - is it possible there is a
firewall port being blocked on your end for outgoing connections?

http://www.robgonda.com/blog/index.cfm/2005/7/8/SVN-port--firewall

We should maybe see if we can get the http interface to SVN running as
well on code.open-bio.org

-jason

On Feb 3, 2009, at 2:45 PM, Clancy, Kevin wrote:

> Hi Folks
> I am having some problems checking out a copy of the various bioperl
> repositories. When I use the command line specified in the wiki, I get
> the following message:
> svn: can't connect to host 'code.open-bio.org': Connection timed out.
> I can see the repositories when there is a URI.
I do have svn > installed > on my computer. > Is there a problem with the repository? > Thanks > kevin > > Kevin Clancy, PhD > Senior Scientist > Informatics & eCommerce > Life Technologies > 5791 Van Allen Way > Carlsbad CA 92008 > phone: 760 268 8356 > cell: 240 417 8604 > email: kevin.clancy at invitrogen.com > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Jason Stajich jason at bioperl.org From stefan.kirov at bms.com Tue Feb 3 19:27:37 2009 From: stefan.kirov at bms.com (Stefan Kirov) Date: Tue, 03 Feb 2009 19:27:37 -0500 Subject: [Bioperl-l] problems accessing svn repositories In-Reply-To: <28813B71732ED64A83348116D27A1A9A02AE50FC@CBD01EXCMBX01.ads.invitrogen.net> References: <28813B71732ED64A83348116D27A1A9A02AE50FC@CBD01EXCMBX01.ads.invitrogen.net> Message-ID: <4988E0F9.2030501@bms.com> Firewall? Clancy, Kevin wrote: > Hi Folks > I am having some problems checking out a copy of the various bioperl > repositories. When I use the command line specified in the wiki, I get > the following message: > svn: can't connect to host 'code.open-bio.org': Connection timed out. > I can see the repositories when there is a URI. I do have svn installed > on my computer. > Is there a problem with the repository? 
> Thanks > kevin > > Kevin Clancy, PhD > Senior Scientist > Informatics & eCommerce > Life Technologies > 5791 Van Allen Way > Carlsbad CA 92008 > phone: 760 268 8356 > cell: 240 417 8604 > email: kevin.clancy at invitrogen.com > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > From cjfields at illinois.edu Tue Feb 3 20:31:08 2009 From: cjfields at illinois.edu (Chris Fields) Date: Tue, 3 Feb 2009 19:31:08 -0600 Subject: [Bioperl-l] true inserts into a SimpleAlign In-Reply-To: References: Message-ID: If it passes all tests (not just SimpleAlign.t) I don't see a problem. BTW passed for me, but I didn't run everything. chris On Feb 3, 2009, at 6:06 PM, Mark A. Jensen wrote: > Folks- > > I was desiring to make a true insertion of a LocatableSeq into a > SimpleAlign, > rather than just adding it to the end as > $aln->add_seq($lseq) > would do. This: > $aln->add_seq($lseq,$order) > currently stomps on whatever is stored at $order in the index. The > patch at > the bottom of the post allows a true insertion (if something exists > at index > location $order, the minimum number of entries from that point on > are moved > down to make room.) > > Might be useful for others. All t/Align/SimpleAlign.t tests pass. > Lemme know- > > Mark > > --- > > Index: SimpleAlign.pm > =================================================================== > --- SimpleAlign.pm (revision 15496) > +++ SimpleAlign.pm (working copy) > @@ -304,8 +304,22 @@ > else { > $self->debug( "Assigning $name to $order\n"); > > - $self->{'_order'}->{$order} = $name; > + my $ordh = $self->{'_order'}; > + if ($ordh->{$order}) { > + # make space to insert > + # $c->() returns (in reverse order) the first subsequence > + # of consecutive integers; i.e., $c->(1,2,3,5,6,7) returns > + # (3,2,1), and $c->(2,4,5) returns (2). > + my $c; > + $c = sub { return (($_[1]-$_[0] == 1) ? 
($c->(@_[1..$#_]), > $_[0]) : $_[0]); }; > + map { > + $ordh->{$_+1} = $ordh->{$_} > + } $c->(sort {$a <=> $b} grep {$_ >= $order} keys %{$ordh}); > > + } > + $ordh->{$order} = $name; > + > + > unless( exists( $self->{'_start_end_lists'}->{$id})) { > $self->{'_start_end_lists'}->{$id} = []; > } > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From Kevin.Clancy at invitrogen.com Tue Feb 3 19:44:46 2009 From: Kevin.Clancy at invitrogen.com (Clancy, Kevin) Date: Tue, 3 Feb 2009 16:44:46 -0800 Subject: [Bioperl-l] problems accessing svn repositories In-Reply-To: <235DAF7F-1600-4943-BFE9-759C01F669C9@bioperl.org> References: <28813B71732ED64A83348116D27A1A9A02AE50FC@CBD01EXCMBX01.ads.invitrogen.net> <235DAF7F-1600-4943-BFE9-759C01F669C9@bioperl.org> Message-ID: <28813B71732ED64A83348116D27A1A9A02AE5174@CBD01EXCMBX01.ads.invitrogen.net> Hi Jason Thanks for the suggestion. IT thinks not however I'm going to try this at home and see if I can get to svn via my laptop. kevin Kevin Clancy Informatics & eCommerce Life Technologies phone: 760 268 8356 cell: 240 417 8604 email: kevin.clancy at invitrogen.com -----Original Message----- From: Jason Stajich [mailto:jason.stajich at gmail.com] On Behalf Of Jason Stajich Sent: Tuesday, February 03, 2009 4:37 PM To: Clancy, Kevin; Chris Dagdigian Cc: Bioperl-l at lists.open-bio.org Subject: Re: [Bioperl-l] problems accessing svn repositories Kevin - Anonymous checkout seems to work fine for me - is it possible there is a firewall port being blocked on your end for outgoing connections? http://www.robgonda.com/blog/index.cfm/2005/7/8/SVN-port--firewall We should maybe see if we can get the http interface to SVN running as well on code.open-bio.org -jason On Feb 3, 2009, at 2:45 PM, Clancy, Kevin wrote: > Hi Folks > I am having some problems checking out a copy of the various bioperl > repositories. 
When I use the command line specified in the wiki, I get > the following message: > svn: can't connect to host 'code.open-bio.org': Connection timed out. > I can see the repositories when there is a URI. I do have svn > installed > on my computer. > Is there a problem with the repository? > Thanks > kevin > > Kevin Clancy, PhD > Senior Scientist > Informatics & eCommerce > Life Technologies > 5791 Van Allen Way > Carlsbad CA 92008 > phone: 760 268 8356 > cell: 240 417 8604 > email: kevin.clancy at invitrogen.com > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Jason Stajich jason at bioperl.org From davila at ioc.fiocruz.br Wed Feb 4 11:43:15 2009 From: davila at ioc.fiocruz.br (Alberto Davila) Date: Wed, 04 Feb 2009 14:43:15 -0200 Subject: [Bioperl-l] Bioperl examples to extract CDS from genbank files Message-ID: <4989C5A3.2040206@ioc.fiocruz.br> Hi fellows, I am looking for this example: examples/seq/extract_cds.pl listed at this page: http://www.bioperl.org/wiki/Bioperl_scripts but the link appears to be broken: http://code.open-bio.org/svnweb/index.cgi/bioperl/view/bioperl-live/trunk/examples/seq/extract_cds.pl I browsed the bioperl SVN repository (http://code.open-bio.org/svnweb/index.cgi/bioperl/browse/bioperl-live/trunk/examples) but was unable to find it... Could you offer any help to find it ? Thanks, Alberto From jason at bioperl.org Wed Feb 4 12:42:41 2009 From: jason at bioperl.org (Jason Stajich) Date: Wed, 4 Feb 2009 09:42:41 -0800 Subject: [Bioperl-l] how to input files named by variables to SearchIO? In-Reply-To: <517072a20902040921t1e27dd82q6c177c47b6afca46@mail.gmail.com> References: <517072a20902040921t1e27dd82q6c177c47b6afca46@mail.gmail.com> Message-ID: Please ask your questions on the mailing list in the future. 
You are using single quotes

> =>'/home/evolution/temp/$id1.out');

instead of double quotes which you need to interpolate a variable

> =>"/home/evolution/temp/$id1.out");

On Feb 4, 2009, at 9:21 AM, kevin fan wrote:

> Dear Dr. Jason,
>
> i am annotating 11000 ESTs using stand along blastx against the NCBI
> non-redundant protein database. I put the blast results seperately
> since it
> will be too large to see the raw results if i put all results into
> one file.
> the blast results are named as the sequences name. my question is
> how to
> input those files into to SearchIO?
>
> here are some of my codes:
>
>
> system("$formatdb -i /home/evolution/temp3/$id1.txt -p F");#$id1 is
> variable
> for sequences.
>
> system("$blastx -i /home/evolution/temp/$id1.txt -d
> /home/evolution/blast-2.2.19/data/nr -p blastx -o
> /home/evolution/temp/$id1.out");
>
> my $in = new Bio::SearchIO(-format => 'blast',-file
> =>'/home/evolution/temp/$id1.out');
>
> the first two commands run well on my machine, but when i try to
> iuput the
> file in SearchIO. it can not open the file named by variables.
>
> shaohua fan

Jason Stajich
jason at bioperl.org

From sdavis2 at mail.nih.gov Thu Feb 5 07:39:50 2009
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Thu, 5 Feb 2009 07:39:50 -0500
Subject: [Bioperl-l] [JOB] National Cancer Institute, Bethesda, MD
Message-ID: <264855a00902050439h2319fd18n63f055e248c028a7@mail.gmail.com>

I hope I can be forgiven for the [JOB] spam....

The Genetics Branch at the National Cancer Institute in Bethesda, MD, USA, is currently offering training positions within our bioinformatics and computational biology group. Our group is embedded in a laboratory environment, offering close collaboration with bench scientists and a steady supply of novel and interesting data.
The laboratory uses multiple high-throughput genomics technologies including microarray assays for gene expression, copy number, SNPs, DNA methylation, and miRNA as well as second-generation sequencing technologies and has access to Solexa, ABI Solid, and 454 technologies. The candidate should have knowledge and skills related to genomics or molecular biology; biostatistics or other advanced math discipline; programming languages such as perl, python, java, R, or C++ as well as experience with analysis of one or more high-throughput genomic technologies. Feel free to share this with interested individuals. If you are interested in applying, please forward a CV. Thanks, Sean -- Sean Davis, MD, PhD Genetics Branch Center for Cancer Research National Cancer Institute National Institutes of Health 37 Convent Drive Building 37, Room 6138 Bethesda, MD 20892 Phone: 301-435-2652 Fax: 301-402-3241 -- From dan.bolser at gmail.com Thu Feb 5 09:13:13 2009 From: dan.bolser at gmail.com (Dan Bolser) Date: Thu, 5 Feb 2009 14:13:13 +0000 Subject: [Bioperl-l] blastxml !defined($hsp->significance) Message-ID: <2c8757af0902050613j3d1214b3g20a69fec20b59901@mail.gmail.com> I don't know if I'm doing something wrong, but $hsp->significance seems to be undef when the evalue is 0 (parsing blastxml) with the latest version of the parser. The value should of course be 0 and not undef. Does anyone else see this (is it a known problem)? Cheers, Dan. From cjfields at illinois.edu Thu Feb 5 11:36:29 2009 From: cjfields at illinois.edu (Chris Fields) Date: Thu, 5 Feb 2009 10:36:29 -0600 Subject: [Bioperl-l] blastxml !defined($hsp->significance) In-Reply-To: <2c8757af0902050613j3d1214b3g20a69fec20b59901@mail.gmail.com> References: <2c8757af0902050613j3d1214b3g20a69fec20b59901@mail.gmail.com> Message-ID: <1B2604FE-152A-4881-9A90-73260B6A9354@illinois.edu> Will check it out; my guess is there is a checking issue looking for simple bool, not defined. 
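The difference between a boolean test and a definedness test, which is the guess made above, can be shown in a few lines. This is only a sketch under stated assumptions: the two accessor functions and the `%hsp` hash are hypothetical and are not the actual Bio::Search::HSP::GenericHSP code.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a parsed HSP whose e-value is exactly 0.
my %hsp = (evalue => 0);

# Truthiness check: 0 is false in Perl, so a stored 0 is treated as missing.
sub significance_bool {
    my ($h) = @_;
    return $h->{evalue} ? $h->{evalue} : undef;
}

# Definedness check: 0 is defined, so it is returned unchanged.
sub significance_defined {
    my ($h) = @_;
    return defined($h->{evalue}) ? $h->{evalue} : undef;
}

my $from_bool    = significance_bool(\%hsp);
my $from_defined = significance_defined(\%hsp);

print "bool check:    ", (defined($from_bool)    ? $from_bool    : 'undef'), "\n";
print "defined check: ", (defined($from_defined) ? $from_defined : 'undef'), "\n";
```

With an e-value of 0 the first accessor reports undef and the second reports 0, which matches the symptom in the original report.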
chris On Feb 5, 2009, at 8:13 AM, Dan Bolser wrote: > I don't know if I'm doing something wrong, but $hsp->significance > seems to be undef when the evalue is 0 (parsing blastxml) with the > latest version of the parser. The value should of course be 0 and not > undef. > > Does anyone else see this (is it a known problem)? > > > Cheers, > Dan. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From manni122 at hotmail.com Wed Feb 4 09:47:27 2009 From: manni122 at hotmail.com (manni122) Date: Wed, 4 Feb 2009 06:47:27 -0800 (PST) Subject: [Bioperl-l] dpAlign - pairwise sequence alignment - en masse Message-ID: <21831861.post@talk.nabble.com> Hope I am right here. I need help in coding the right routine. I have two files with a lot of sequences in. I want to pairwise align any Sequence in file A with any sequence in file B. My first loop is not working even the next command is given. This loop reads just the first sequence in file A and aligns with all sequences in file B which means the second loop is working. I appreciate any help out of this... 
At the moment I have:

use Bio::Tools::dpAlign;
use Bio::SeqIO;
use Bio::SimpleAlign;
use Bio::AlignIO;
use Bio::PrimarySeqI;

$factory = new Bio::Tools::dpAlign(-match => 3,
                                   -mismatch => -1,
                                   -gap => 3,
                                   -ext => 1,
                                   -alg => Bio::Tools::dpAlign::DPALIGN_LOCAL_MILLER_MYERS);


$seqA = Bio::SeqIO->new(-file => "A.fas", -format => "fasta");
$seqB = Bio::SeqIO->new(-file => "B.fas", -format => "fasta");
my $alnout = Bio::AlignIO->new(-format => "clustalw",
                               -file => ">out.fas");

#reads the first file in and should go through all sequences
while( my $seq1 = $seqA->next_seq ) {

    #reads the second file in and goes through all sequences
    while (my $seq2 = $seqB->next_seq) {

        $factory = Bio::Tools::dpAlign->new(-alg => 1);

        $out = $factory->pairwise_alignment($seq1,$seq2);

        $alnout->write_aln($out);

    }
    next;
}

--
View this message in context: http://www.nabble.com/dpAlign---pairwise-sequence-alignment---en-masse-tp21831861p21831861.html
Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.

From cjfields at illinois.edu Thu Feb 5 13:25:41 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Thu, 5 Feb 2009 12:25:41 -0600
Subject: [Bioperl-l] blastxml !defined($hsp->significance)
In-Reply-To: <2c8757af0902050613j3d1214b3g20a69fec20b59901@mail.gmail.com>
References: <2c8757af0902050613j3d1214b3g20a69fec20b59901@mail.gmail.com>
Message-ID:

Dan,

This has been fixed in bioperl-live. As mentioned it was a problem with significance() not checking defined-ness (it was merely checking bool).

The fix is available via svn, you only need one file (Bio/Search/HSP/GenericHSP.pm). It should be showing up via anon svn in the next hour or two. It has also been merged over to the 1.6 branch and will appear in the next point release.

chris

On Feb 5, 2009, at 8:13 AM, Dan Bolser wrote:

> I don't know if I'm doing something wrong, but $hsp->significance
> seems to be undef when the evalue is 0 (parsing blastxml) with the
> latest version of the parser.
The value should of course be 0 and not > undef. > > Does anyone else see this (is it a known problem)? > > > Cheers, > Dan. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Thu Feb 5 13:52:59 2009 From: cjfields at illinois.edu (Chris Fields) Date: Thu, 5 Feb 2009 12:52:59 -0600 Subject: [Bioperl-l] Staden, BioLib and BioPerl Message-ID: <6B987B24-A455-4AD5-9890-E0FB8AACE43D@illinois.edu> I have been corresponding off-list with Pjotr Prins, who is heading up the BioLib initiative (http://biolib.open-bio.org). He has managed to get SWIG-based perl wrappers set up for the Staden io_lib libraries that compile on Mac/Linux and possibly other platforms; for those who want to test it out you can get the code (using git) here: http://github.com/pjotrp/biolib/tree/master For the Perl-based modules, is there a specific namespace we should use for future development, or should we start up a bioperl-biolib? Maybe Bio::SeqIO::staden::biolib? chris From maj at fortinbras.us Thu Feb 5 14:14:17 2009 From: maj at fortinbras.us (Mark A. Jensen) Date: Thu, 5 Feb 2009 14:14:17 -0500 Subject: [Bioperl-l] dpAlign - pairwise sequence alignment - en masse In-Reply-To: <21831861.post@talk.nabble.com> References: <21831861.post@talk.nabble.com> Message-ID: <587A4A49102E41AC847961ACD32F902F@NewLife> Manni122- You run out of sequences in $seqB on the first time through. I would do- my @seqB; while ( push @seqB, $seqB->next_seq ) { 1; } # or something while ( my $seq1 = $seqA->next_seq ) { foreach $seq2 (@seqB) { .... } } I'd probably drop the next; statement, too. cheers MAJ ----- Original Message ----- From: "manni122" To: Sent: Wednesday, February 04, 2009 9:47 AM Subject: [Bioperl-l] dpAlign - pairwise sequence alignment - en masse > > Hope I am right here. > I need help in coding the right routine. I have two files with a lot of > sequences in. 
I want to pairwise align any Sequence in file A with any > sequence in file B. My first loop is not working even the next command is > given. This loop reads just the first sequence in file A and aligns with all > sequences in file B which means the second loop is working. I appreciate any > help out of this... > > At the moment I have: > > use Bio::Tools::dpAlign; > use Bio::SeqIO; > use Bio::SimpleAlign; > use Bio::AlignIO; > use Bio::PrimarySeqI; > > $factory = new Bio::Tools::dpAlign(-match => 3, > -mismatch => -1, > -gap => 3, > -ext => 1, > -alg => > Bio::Tools::dpAlign::DPALIGN_LOCAL_MILLER_MYERS); > > > $seqA = Bio::SeqIO->new(-file => "A.fas", -format => "fasta"); > $seqB = Bio::SeqIO->new(-file => "B.fas", -format => "fasta"); > my $alnout = Bio::AlignIO->new(-format => "clustalw", > > -file => ">out.fas"); > > #reads the first file in and should go through all sequences > while( my $seq1 = $seqA->next_seq ) { > > #reads the second file in and goes through all sequences > while (my $seq2 = $seqB->next_seq) { > > $factory = Bio::Tools::dpAlign->new(-alg => 1); > > $out = $factory->pairwise_alignment($seq1,$seq2); > > $alnout->write_aln($out); > > } > next; > } > > -- > View this message in context: > http://www.nabble.com/dpAlign---pairwise-sequence-alignment---en-masse-tp21831861p21831861.html > Sent from the Perl - Bioperl-L mailing list archive at Nabble.com. > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > From maj at fortinbras.us Fri Feb 6 00:49:12 2009 From: maj at fortinbras.us (Mark A. Jensen) Date: Fri, 6 Feb 2009 00:49:12 -0500 Subject: [Bioperl-l] Unwise elimination of nodes in B:T:Node::remove_Descendent? Message-ID: Hi All, This is a pedantic one, but it addresses a relatively deep (or capricious?) choice made in the Node.pm code. Please bear with me. 
Background-

I was trolling the archive for good scraps (dumpster-diving, if you will), and came upon this candidate:

"branch length score - total length of the spanning subtree"
http://lists.open-bio.org/pipermail/bioperl-l/2008-March/027279.html

Briefly, the question was, how do you get the total branch length of a subtree, and the solutions discussed involved using splice() to edit the tree, either by explicitly choosing the subtree nodes (-keep_id) or removing the other nodes (-remove_id), then performing total_branch_length() on the edited tree.

Looking over the code, I saw two bugs. One was in the ultimate "solution" in the thread. Deeper than that is what I think is a problem in B:T:Node::remove_Descendent.

The question-asker, Daniel G., wrote the following: (using the example tree I have painstakingly rendered below, only to be destroyed by my proportional font):

 A     B     C     D     E
 |5    |5    |4    |4    |
  \   /       \   /     /
    x           y      /
    |2          |1    /
     \         /     /
      \      _/     / 10
       \    /      /
         z        /
         |3      /
          \     /
           ROOT

> Daniel Gerlach wrote:
> Thanks for the quick answer. I tried:
>
> use Bio::TreeIO;
> my $treeio = Bio::TreeIO->new(-format => 'newick',
> -fh => \*DATA);
> my $tree = $treeio->next_tree;
> print $tree->total_branch_length,"\n";
> $tree->splice(-keep_id => [A,B,E]);
> print $tree->total_branch_length,"\n";
>
> __DATA__
> (((A:5,B:5)x:2,(C:4,D:4)y:1)z:3,E:10);
>
> Which gives me the message "MSG: After splicing, the original root was
> removed but there are multiple candidates for the new root!" however the
> root E was not removed.
>
> If I do it the complementary way by splicing out all unwanted nodes -
> splice(-remove_id => [C,D]) - I get what I want:
>
> 34
> 25

The first problem was that the "complementary approaches" didn't give the same answer, and one threw an error. This is the bug in the code above. If you look at the tree, the nodes he really wants to keep are the leaves [A,B,E], *plus* the internal nodes [x,y,z]; that is...
$tree->splice(-keep_id=>[A,B,E,x,'y',z])
$tree->total_branch_length

doesn't throw and returns 25, which is correct.

The second problem is that removing [C, D] also gives him 25, which is what he wanted, but is not correct. If the node 'y' remains in the tree after removing C and D, then the branch length should be 26 (z-y branch). The problem arises in this code at the end of remove_Descendent:

    # remove unecessary nodes if we have removed the part |||Node.pm LINE 272
    # which branches.
    my $a1 = $self->ancestor;
    if( $a1 ) {
        my $bl = $self->branch_length || 0;
        my @d = $self->each_Descendent;
        if (scalar @d == 1) {
            $d[0]->branch_length($bl + ($d[0]->branch_length || 0));
            $a1->add_Descendent($d[0]);
            $a1->remove_Descendent($self);
        }
    }

When node C is removed from the example tree, the node 'y' is removed by this code apparently as a convenience. But this node is obviously not 'unnecessary', any more than the root of a bifurcating tree is unnecessary, which also is of order 2 and not 1 or 3 like good self-respecting nodes are.

When I comment out the above remove_Descendent fragment, the following code

use Bio::TreeIO;
my $treeio = Bio::TreeIO->new(-format => 'newick',
                              -fh => \*DATA);
my $tree = $treeio->next_tree;
my $keep_tree = $tree->_clone;
my $rmv_tree = $tree->_clone;
print $tree->total_branch_length,"\n";
$keep_tree->splice(-keep_id => [A,B,x, z,r,E]);
print $keep_tree->total_branch_length,"\n";
$rmv_tree->splice(-remove_id => [C,D]);
print $rmv_tree->total_branch_length,"\n";
$rmv_tree->splice(-remove_id => ['y']);
print $rmv_tree->total_branch_length,"\n";
__DATA__
(((A:5,B:5)x:2,(C:4,D:4)y:1)z:3,E:10)r;

returns

34 # original tree [A,B,C,D,E,x,'y',z,r]
25 # keep desired nodes [A,B,E,x,z,r]
26 # remove [C,D]
25 # remove undesired nodes [C,D,'y']

which is correct.

Question is then, is the removal of "unnecessary nodes" relied upon?
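The three totals quoted above can be checked by hand from the branch lengths in the Newick string. The snippet below is plain arithmetic on a hash, offered as a sanity check only; it does not call Bio::Tree or splice(), and the `total` helper is a name made up for this illustration.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Length of the branch above each node, read off
# (((A:5,B:5)x:2,(C:4,D:4)y:1)z:3,E:10)r; the root r has no branch.
my %branch = (A => 5, B => 5, C => 4, D => 4, E => 10, x => 2, y => 1, z => 3);

# Hypothetical helper: sum of all branch lengths except those of the
# nodes named in the argument list.
sub total {
    my %skip = map { $_ => 1 } @_;
    return sum(map { $branch{$_} } grep { !$skip{$_} } keys %branch);
}

print total(),            "\n";   # whole tree
print total(qw(C D)),     "\n";   # C and D removed, y still present
print total(qw(C D y)),   "\n";   # C, D and y removed
```

The one-unit difference between the last two totals is the z-y branch, which is exactly the branch that disappears when remove_Descendent collapses node 'y'.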
I will run the tests, but my thinking is that even if we get some fails, those should be dealt with by removing order 2 nodes by special request (looks like contract_linear_paths() is the thing to use for this). cheers and thanks, MAJ From heikki.lehvaslaiho at gmail.com Fri Feb 6 00:51:07 2009 From: heikki.lehvaslaiho at gmail.com (Heikki Lehvaslaiho) Date: Fri, 6 Feb 2009 07:51:07 +0200 Subject: [Bioperl-l] Staden, BioLib and BioPerl In-Reply-To: <6B987B24-A455-4AD5-9890-E0FB8AACE43D@illinois.edu> References: <6B987B24-A455-4AD5-9890-E0FB8AACE43D@illinois.edu> Message-ID: This is potentially a very useful addition. Maybe this is a chance to get more usage and testing for bioperl-ext. I'd suggest BioLib goes in there. Bio::SeqIO::staden::biolib sounds good for the Staden IO lib. The naming should follow BioPerl conventions and future components from BioLib will then be placed where they fit logically in BioPerl namespace. -Heikki 2009/2/5 Chris Fields : > I have been corresponding off-list with Pjotr Prins, who is heading up the > BioLib initiative (http://biolib.open-bio.org). He has managed to get > SWIG-based perl wrappers set up for the Staden io_lib libraries that compile > on Mac/Linux and possibly other platforms; for those who want to test it out > you can get the code (using git) here: > > http://github.com/pjotrp/biolib/tree/master > > For the Perl-based modules, is there a specific namespace we should use for > future development, or should we start up a bioperl-biolib? Maybe > Bio::SeqIO::staden::biolib? > > chris > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- -Heikki Heikki Lehvaslaiho - heikki lehvaslaiho gmail com http://kapkaupunki.blogspot.com/ From maj at fortinbras.us Fri Feb 6 01:07:46 2009 From: maj at fortinbras.us (Mark A. 
Jensen) Date: Fri, 6 Feb 2009 01:07:46 -0500 Subject: [Bioperl-l] Unwise elimination of nodes inB:T:Node::remove_Descendent? In-Reply-To: References: Message-ID: <1627B3D3B9AD497CB9B7ACF0E1DE9487@NewLife> CORRECTION: please patch the previous post... ... > code above. If you look at the tree, the nodes he really wants to keep are - the leaves [A,B,E], *plus* the internal nodes [x,y,z]; that is... + the leaves [A,B,E], *plus* the internal nodes [x,,z]; that is... > - $tree->splice(-keep_id=>[A,B,E,x,'y',z]) + $tree->splice(-keep_id=>[A,B,E,x,,z]) > $tree->total_branch_length > > doesn't throw and returns 25, which is correct. sorry, it's late ... MAJ From heikki.lehvaslaiho at gmail.com Fri Feb 6 01:37:02 2009 From: heikki.lehvaslaiho at gmail.com (Heikki Lehvaslaiho) Date: Fri, 6 Feb 2009 08:37:02 +0200 Subject: [Bioperl-l] Unwise elimination of nodes inB:T:Node::remove_Descendent? In-Reply-To: <1627B3D3B9AD497CB9B7ACF0E1DE9487@NewLife> References: <1627B3D3B9AD497CB9B7ACF0E1DE9487@NewLife> Message-ID: Mark, You have not obviously committed any fixes so it is a bit difficult to make sure how things work, but I would say that if your code works to give the results at the end of your mail, do commit it. The logic of things seems right: If you are keeping nodes, the code will do the right thing. If you are manually removing nodes by naming them, you better know what you are doing. I would not worry about keeping backward compatibility when the code has a clear error. One solution that might highlight the issues in splicing to users would be to add more verbosely named methods. E.g. splice_subtrees(@leafnodeids) # input is leaf nodes, only. Removes also "unnecessary" internal nodes -Heikki 2009/2/6 Mark A. Jensen : > CORRECTION: please patch the previous post... > > ... >> >> code above. If you look at the tree, the nodes he really wants to keep are > > - the leaves [A,B,E], *plus* the internal nodes [x,y,z]; that is... 
> + the leaves [A,B,E], *plus* the internal nodes [x,,z]; that is... >> > - $tree->splice(-keep_id=>[A,B,E,x,'y',z]) > + $tree->splice(-keep_id=>[A,B,E,x,,z]) >> >> $tree->total_branch_length >> >> doesn't throw and returns 25, which is correct. > > sorry, it's late ... MAJ > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- -Heikki Heikki Lehvaslaiho - heikki lehvaslaiho gmail com http://kapkaupunki.blogspot.com/ From cjfields at illinois.edu Fri Feb 6 08:02:56 2009 From: cjfields at illinois.edu (Chris Fields) Date: Fri, 6 Feb 2009 07:02:56 -0600 Subject: [Bioperl-l] Fwd: BioPerl and BioLib References: <20090206093842.GA22844@thebird.nl> Message-ID: <94E35D1C-D751-4EC2-AE78-6A8B4BE69904@illinois.edu> I'll forward this onto the list. Something that links into EMBOSS would be nice, or even SeqAn. SeqAn may be a bit of challenge as it is C++ and uses templates: http://www.seqan.de/ BTW, the NCBI toolset has a perl API but there are several problems with it not least being it's precompiled only for Linux: ftp://ftp.ncbi.nih.gov/toolbox/ncbi_tools%2B%2B/2008/Mar_17_2008/Perl/ -c Begin forwarded message: > From: Pjotr Prins > Date: February 6, 2009 3:38:42 AM CST > To: Chris Fields > Subject: BioPerl > > Hi Chris, > > While we are at it, and in the flow, I would like to add another > library to biolib. What do you suggest is a need for? Maybe you can > post a question on the BioPerl mailing list? Anything that will give > some real users. Emboss Nucleus or AJAX? Or some part of the NCBI > toolset? > > Pj. From maj at fortinbras.us Fri Feb 6 07:45:56 2009 From: maj at fortinbras.us (Mark A. Jensen) Date: Fri, 6 Feb 2009 07:45:56 -0500 Subject: [Bioperl-l] Unwise elimination of nodesinB:T:Node::remove_Descendent? In-Reply-To: References: <1627B3D3B9AD497CB9B7ACF0E1DE9487@NewLife> Message-ID: Ok, Heikki- Thanks; I'll go ahead with the commit to Node.pm. 
I agree too with the idea of convenience methods that DWIM for users--your suggestion in particular is a natural. MAJ ----- Original Message ----- From: "Heikki Lehvaslaiho" To: "Mark A. Jensen" Cc: Sent: Friday, February 06, 2009 1:37 AM Subject: Re: [Bioperl-l] Unwise elimination of nodesinB:T:Node::remove_Descendent? > Mark, > > You have not obviously committed any fixes so it is a bit difficult to > make sure how things work, but I would say that if your code works to > give the results at the end of your mail, do commit it. > > The logic of things seems right: If you are keeping nodes, the code > will do the right thing. If you are manually removing nodes by naming > them, you better know what you are doing. > > I would not worry about keeping backward compatibility when the code > has a clear error. > > One solution that might highlight the issues in splicing to users > would be to add more verbosely named methods. E.g. > > splice_subtrees(@leafnodeids) # input is leaf nodes, only. Removes > also "unnecessary" internal nodes > > -Heikki > > > 2009/2/6 Mark A. Jensen : >> CORRECTION: please patch the previous post... >> >> ... >>> >>> code above. If you look at the tree, the nodes he really wants to keep are >> >> - the leaves [A,B,E], *plus* the internal nodes [x,y,z]; that is... >> + the leaves [A,B,E], *plus* the internal nodes [x,,z]; that is... >>> >> - $tree->splice(-keep_id=>[A,B,E,x,'y',z]) >> + $tree->splice(-keep_id=>[A,B,E,x,,z]) >>> >>> $tree->total_branch_length >>> >>> doesn't throw and returns 25, which is correct. >> >> sorry, it's late ... 
MAJ >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > > > > -- > -Heikki > Heikki Lehvaslaiho - heikki lehvaslaiho gmail com > http://kapkaupunki.blogspot.com/ > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > From chad.a.davis at gmail.com Fri Feb 6 08:50:31 2009 From: chad.a.davis at gmail.com (Chad Davis) Date: Fri, 6 Feb 2009 14:50:31 +0100 Subject: [Bioperl-l] Fwd: BioPerl and BioLib In-Reply-To: <94E35D1C-D751-4EC2-AE78-6A8B4BE69904@illinois.edu> References: <20090206093842.GA22844@thebird.nl> <94E35D1C-D751-4EC2-AE78-6A8B4BE69904@illinois.edu> Message-ID: If you're looking for requests, how about something for structural bioinformatics: http://www.bioinf.uni-sb.de/OK/BALL/ ... though BALL might be as difficult as SeqAn, as it's also C++/templated. Though, I believe it already provides Python bindings (via Swig?). Maybe that makes it easier ... Chad On Fri, Feb 6, 2009 at 14:02, Chris Fields wrote: > I'll forward this onto the list. Something that links into EMBOSS would be > nice, or even SeqAn. SeqAn may be a bit of challenge as it is C++ and uses > templates: > > http://www.seqan.de/ > > BTW, the NCBI toolset has a perl API but there are several problems with it > not least being it's precompiled only for Linux: > > ftp://ftp.ncbi.nih.gov/toolbox/ncbi_tools%2B%2B/2008/Mar_17_2008/Perl/ > > -c > > Begin forwarded message: > > From: Pjotr Prins >> Date: February 6, 2009 3:38:42 AM CST >> To: Chris Fields >> Subject: BioPerl >> >> Hi Chris, >> >> While we are at it, and in the flow, I would like to add another >> library to biolib. What do you suggest is a need for? Maybe you can >> post a question on the BioPerl mailing list? Anything that will give >> some real users. Emboss Nucleus or AJAX? 
Or some part of the NCBI >> toolset? >> >> Pj. >> > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From hlapp at gmx.net Fri Feb 6 09:21:31 2009 From: hlapp at gmx.net (Hilmar Lapp) Date: Fri, 6 Feb 2009 09:21:31 -0500 Subject: [Bioperl-l] Unwise elimination of nodes in B:T:Node::remove_Descendent? In-Reply-To: References: Message-ID: On Feb 6, 2009, at 12:49 AM, Mark A. Jensen wrote: > A B C D E > |5 |5 |4 |4 | > \ / \ / / > x y / > |2 |1 / > \ / / > \ _/ / 10 > \ / / > z / > |3 / > \ / > ROOT > >> [...] >> __DATA__ >> (((A:5,B:5)x:2,(C:4,D:4)y:1)z:3,E:10); >> [...] > > The first problem was that the "complementary approaches" didn't give > the same answer, and one threw an error. This is the bug in the > code above. If you look at the tree, the nodes he really wants to > keep are > the leaves [A,B,E], *plus* the internal nodes [x,y,z]; that is... > > $tree->splice(-keep_id=>[A,B,E,x,'y',z]) > $tree->total_branch_length > > doesn't throw and returns 25, which is correct. If you replace 'y' with the root in the above, yes. Otherwise no. > The second problem is that removing [C, D] also gives him 25, which is > what he wanted, but is not correct. Correct. > [...]The problem arises in this code at the end of remove_Descendent: > > # remove unecessary nodes if we have removed the part |||Node.pm > LINE 272 > # which branches. > my $a1 = $self->ancestor; > if( $a1 ) { > my $bl = $self->branch_length || 0; > my @d = $self->each_Descendent; > if (scalar @d == 1) { > $d[0]->branch_length($bl + ($d[0]->branch_length || 0)); > $a1->add_Descendent($d[0]); > $a1->remove_Descendent($self); > } > } > > When node C is removed from the example tree, the node 'y' is removed > by this code apparently as a convenience. I guess I'm confused by the above code snippet. 
If @d are the descendants of $a1, then I don't understand what the purpose of adding $d[0] as a descendant of $a1 is (after altering its branch length). Furthermore, if $a1 is the ancestor of $self, and $a1 has only one descendant, wouldn't that mean that $d[0] == $self? What am I missing? I also think it's a bad idea to simplify the tree (or collapse internal nodes of degree 1) as an implicit operation. It should be explicit (and I believe there is a method simplify() or something similar, isn't there? Ah - I see you quote contract_linear_paths()). Furthermore, I think it's also a bad idea to remove descendant leaf nodes if you remove an internal node. What if you really just wanted to remove the internal node because, for example, its branching point isn't well supported? So removing node 'y' should make z a node of degree 3, but not remove C and D unless you ask to remove the subtree beginning at 'y'. -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From maj at fortinbras.us Fri Feb 6 09:34:03 2009 From: maj at fortinbras.us (Mark A. Jensen) Date: Fri, 6 Feb 2009 09:34:03 -0500 Subject: [Bioperl-l] Unwise elimination of nodes inB:T:Node::remove_Descendent? In-Reply-To: References: Message-ID: Hilmar-- >> doesn't throw and returns 25, which is correct. > > If you replace 'y' with the root in the above, yes. Otherwise no. > Absolutely-- up the thread you'll see the late-night correction- > I guess I'm confused by the above code snippet. If @d are the descendants of > $a1, then I don't understand what the purpose of adding $d[0] as a descendant > of $a1 is (after altering its branch length). Furthermore, if $a1 is the > ancestor of $self, and $a1 has only one descendant, wouldn't that mean that > $d[0] == $self? 
Yes-- this snippet is actually what was present in remove_Descendent, and what I proposed to delete -- sorry if that was confusing. The commit was simply the deletion of this snippet. > Furthermore, I think it's also a bad idea to remove descendant leaf nodes if > you remove an internal node. What if you really just wanted to remove the > internal node because, for example, its branching point isn't well supported? > So removing node 'y' should make z a node of degree 3, but not remove C and D > unless you ask to remove the subtree beginning at 'y'. Absolutely. Node deletion and pruning must be different operations. I'm new to the code, but Sendu's splice() seems to take care only to snip out requested nodes. It was the goodness in splice() that sent me to remove_Descendent() thanks! MAJ ----- Original Message ----- From: "Hilmar Lapp" To: "Mark A. Jensen" Cc: Sent: Friday, February 06, 2009 9:21 AM Subject: Re: [Bioperl-l] Unwise elimination of nodes inB:T:Node::remove_Descendent? > > On Feb 6, 2009, at 12:49 AM, Mark A. Jensen wrote: > >> A B C D E >> |5 |5 |4 |4 | >> \ / \ / / >> x y / >> |2 |1 / >> \ / / >> \ _/ / 10 >> \ / / >> z / >> |3 / >> \ / >> ROOT >> >>> [...] >>> __DATA__ >>> (((A:5,B:5)x:2,(C:4,D:4)y:1)z:3,E:10); >>> [...] >> >> The first problem was that the "complementary approaches" didn't give >> the same answer, and one threw an error. This is the bug in the >> code above. If you look at the tree, the nodes he really wants to keep are >> the leaves [A,B,E], *plus* the internal nodes [x,y,z]; that is... >> >> $tree->splice(-keep_id=>[A,B,E,x,'y',z]) >> $tree->total_branch_length >> >> doesn't throw and returns 25, which is correct. > > If you replace 'y' with the root in the above, yes. Otherwise no. > >> The second problem is that removing [C, D] also gives him 25, which is >> what he wanted, but is not correct. > > Correct. 
> >> [...]The problem arises in this code at the end of remove_Descendent: >> >> # remove unecessary nodes if we have removed the part |||Node.pm LINE 272 >> # which branches. >> my $a1 = $self->ancestor; >> if( $a1 ) { >> my $bl = $self->branch_length || 0; >> my @d = $self->each_Descendent; >> if (scalar @d == 1) { >> $d[0]->branch_length($bl + ($d[0]->branch_length || 0)); >> $a1->add_Descendent($d[0]); >> $a1->remove_Descendent($self); >> } >> } >> >> When node C is removed from the example tree, the node 'y' is removed >> by this code apparently as a convenience. > > I guess I'm confused by the above code snippet. If @d are the descendants of > $a1, then I don't understand what the purpose of adding $d[0] as a descendant > of $a1 is (after altering its branch length). Furthermore, if $a1 is the > ancestor of $self, and $a1 has only one descendant, wouldn't that mean that > $d[0] == $self? > > What am I missing? > > I also think it's a bad idea to simplify the tree (or collapse internal nodes > of degree 1) as an implicit operation. It should be explicit (and I believe > there is a method simplify() or something similar, isn't there? Ah - I see > you quote contract_linear_paths()). > > Furthermore, I think it's also a bad idea to remove descendant leaf nodes if > you remove an internal node. What if you really just wanted to remove the > internal node because, for example, its branching point isn't well supported? > So removing node 'y' should make z a node of degree 3, but not remove C and D > unless you ask to remove the subtree beginning at 'y'. 
> > -hilmar > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > From cjfields at illinois.edu Fri Feb 6 09:46:16 2009 From: cjfields at illinois.edu (Chris Fields) Date: Fri, 6 Feb 2009 08:46:16 -0600 Subject: [Bioperl-l] Unwise elimination of nodes in B:T:Node::remove_Descendent? In-Reply-To: References: Message-ID: <92BB44BA-CFF6-4D6A-A2B6-2BD68E45ABA8@illinois.edu> On Feb 6, 2009, at 8:21 AM, Hilmar Lapp wrote: > > On Feb 6, 2009, at 12:49 AM, Mark A. Jensen wrote: > >> A B C D E >> |5 |5 |4 |4 | >> \ / \ / / >> x y / >> |2 |1 / >> \ / / >> \ _/ / 10 >> \ / / >> z / >> |3 / >> \ / >> ROOT >> >>> [...] >>> __DATA__ >>> (((A:5,B:5)x:2,(C:4,D:4)y:1)z:3,E:10); >>> [...] >> >> The first problem was that the "complementary approaches" didn't give >> the same answer, and one threw an error. This is the bug in the >> code above. If you look at the tree, the nodes he really wants to >> keep are >> the leaves [A,B,E], *plus* the internal nodes [x,y,z]; that is... >> >> $tree->splice(-keep_id=>[A,B,E,x,'y',z]) >> $tree->total_branch_length >> >> doesn't throw and returns 25, which is correct. > > If you replace 'y' with the root in the above, yes. Otherwise no. > >> The second problem is that removing [C, D] also gives him 25, which >> is >> what he wanted, but is not correct. > > Correct. > >> [...]The problem arises in this code at the end of remove_Descendent: >> >> # remove unecessary nodes if we have removed the part |||Node.pm >> LINE 272 >> # which branches. 
>>    my $a1 = $self->ancestor;
>>    if( $a1 ) {
>>        my $bl = $self->branch_length || 0;
>>        my @d = $self->each_Descendent;
>>        if (scalar @d == 1) {
>>            $d[0]->branch_length($bl + ($d[0]->branch_length || 0));
>>            $a1->add_Descendent($d[0]);
>>            $a1->remove_Descendent($self);
>>        }
>>    }
>>
>> When node C is removed from the example tree, the node 'y' is removed
>> by this code apparently as a convenience.
>
> I guess I'm confused by the above code snippet. If @d are the
> descendants of $a1, then I don't understand what the purpose of
> adding $d[0] as a descendant of $a1 is (after altering its branch
> length). Furthermore, if $a1 is the ancestor of $self, and $a1 has
> only one descendant, wouldn't that mean that $d[0] == $self?
>
> What am I missing?
>
> I also think it's a bad idea to simplify the tree (or collapse
> internal nodes of degree 1) as an implicit operation. It should be
> explicit (and I believe there is a method simplify() or something
> similar, isn't there? Ah - I see you quote contract_linear_paths()).
>
> Furthermore, I think it's also a bad idea to remove descendant leaf
> nodes if you remove an internal node. What if you really just wanted
> to remove the internal node because, for example, its branching
> point isn't well supported? So removing node 'y' should make z a
> node of degree 3, but not remove C and D unless you ask to remove
> the subtree beginning at 'y'.
>
> -hilmar

I suppose the best way to deal with some of these questions (and
ensure Node/Tree is acting as expected) is to come up with several
vetted test cases indicating what we expect the proper behavior to be
for remove_Descendent(), contract_linear_paths(), and any other
problematic Node/Tree/TreeFunctionsI methods. In fact, I highly
recommend any code changes like this add tests to the test suite
demonstrating the issue.
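
As a starting point, a test along these lines might capture the example
from this thread. It is only a sketch, not a vetted test: the expected
totals (34, 25, 26, 25) are the ones Mark reported with the collapsing
snippet removed, and they have not been re-run against any particular
BioPerl release. The helper fresh_tree() is hypothetical; splice() and
total_branch_length() are the methods already used above.

```perl
use strict;
use warnings;
use Test::More;

# Example tree from this thread, with the root labelled 'r'.
my $newick = '(((A:5,B:5)x:2,(C:4,D:4)y:1)z:3,E:10)r;';

# Hypothetical helper: parse a fresh copy of the tree for each case,
# so that one splice() cannot affect the next test.
sub fresh_tree {
    open my $fh, '<', \$newick
        or die "cannot open in-memory newick string: $!";
    return Bio::TreeIO->new(-format => 'newick', -fh => $fh)->next_tree;
}

SKIP: {
    # Skip rather than fail where BioPerl is not installed.
    skip 'Bio::TreeIO (BioPerl) not available', 4
        unless eval { require Bio::TreeIO; 1 };

    is( fresh_tree()->total_branch_length, 34, 'original tree' );

    # Keeping the leaves plus the internal nodes on their paths.
    my $keep = fresh_tree();
    $keep->splice(-keep_id => [qw(A B E x z r)]);
    is( $keep->total_branch_length, 25, 'keep A,B,E,x,z,r' );

    # Removing only the leaves should leave the z-y branch in place.
    my $rmv = fresh_tree();
    $rmv->splice(-remove_id => [qw(C D)]);
    is( $rmv->total_branch_length, 26, 'remove C,D keeps node y' );

    # Removing y explicitly gives the complementary result.
    $rmv->splice(-remove_id => ['y']);
    is( $rmv->total_branch_length, 25, 'remove C,D and then y' );
}

done_testing();
```

The third case is the one that should distinguish the old implicit
collapse (which gave 25) from the proposed behaviour.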
Possibly related to all this is a fairly significant lingering bug dealing with Bio::Tree::TreeFunctionsI::reroot() (http://bugzilla.open-bio.org/show_bug.cgi?id=2456 ). Any takers? chris From hlapp at gmx.net Fri Feb 6 09:54:03 2009 From: hlapp at gmx.net (Hilmar Lapp) Date: Fri, 6 Feb 2009 09:54:03 -0500 Subject: [Bioperl-l] Unwise elimination of nodes in B:T:Node::remove_Descendent? In-Reply-To: <92BB44BA-CFF6-4D6A-A2B6-2BD68E45ABA8@illinois.edu> References: <92BB44BA-CFF6-4D6A-A2B6-2BD68E45ABA8@illinois.edu> Message-ID: <569F05AD-1D2B-47A8-9312-81CFCAB41A0D@gmx.net> On Feb 6, 2009, at 9:46 AM, Chris Fields wrote: > I suppose the best way to deal with some of these questions (and > ensure Node/Tree is acting as expected) is to come up with several > vetted test cases indicating what we expect the proper behavior to be Well put, and I totally agree. -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at illinois.edu Fri Feb 6 09:55:53 2009 From: cjfields at illinois.edu (Chris Fields) Date: Fri, 6 Feb 2009 08:55:53 -0600 Subject: [Bioperl-l] Fwd: BioPerl and BioLib In-Reply-To: References: <20090206093842.GA22844@thebird.nl> <94E35D1C-D751-4EC2-AE78-6A8B4BE69904@illinois.edu> Message-ID: swig does handle C++ templates but they are definitely trickier to work around. I have tried a little XS-based wrapping of seqan and it is definitely not for the faint of heart (I gave it up, just not enough tuits). chris On Feb 6, 2009, at 7:50 AM, Chad Davis wrote: > If you're looking for requests, how about something for structural > bioinformatics: > > http://www.bioinf.uni-sb.de/OK/BALL/ > > ... though BALL might be as difficult as SeqAn, as it's also C++/ > templated. > Though, I believe it already provides Python bindings (via Swig?). > Maybe > that makes it easier ... 
> > Chad > > On Fri, Feb 6, 2009 at 14:02, Chris Fields > wrote: > >> I'll forward this onto the list. Something that links into EMBOSS >> would be >> nice, or even SeqAn. SeqAn may be a bit of challenge as it is C++ >> and uses >> templates: >> >> http://www.seqan.de/ >> >> BTW, the NCBI toolset has a perl API but there are several problems >> with it >> not least being it's precompiled only for Linux: >> >> ftp://ftp.ncbi.nih.gov/toolbox/ncbi_tools%2B%2B/2008/Mar_17_2008/ >> Perl/ >> >> -c >> >> Begin forwarded message: >> >> From: Pjotr Prins >>> Date: February 6, 2009 3:38:42 AM CST >>> To: Chris Fields >>> Subject: BioPerl >>> >>> Hi Chris, >>> >>> While we are at it, and in the flow, I would like to add another >>> library to biolib. What do you suggest is a need for? Maybe you can >>> post a question on the BioPerl mailing list? Anything that will give >>> some real users. Emboss Nucleus or AJAX? Or some part of the NCBI >>> toolset? >>> >>> Pj. >>> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From maj at fortinbras.us Fri Feb 6 09:59:16 2009 From: maj at fortinbras.us (Mark A. Jensen) Date: Fri, 6 Feb 2009 09:59:16 -0500 Subject: [Bioperl-l] Unwise elimination of nodes inB:T:Node::remove_Descendent? 
In-Reply-To: <92BB44BA-CFF6-4D6A-A2B6-2BD68E45ABA8@illinois.edu> References: <92BB44BA-CFF6-4D6A-A2B6-2BD68E45ABA8@illinois.edu> Message-ID: <2EF8144E065A45808E8EECACF51124A4@NewLife> > I suppose the best way to deal with some of these questions (and ensure > Node/Tree is acting as expected) is to come up with several vetted test cases > indicating what we expect the proper behavior to be for remove_Descendant(), > contract_linear_paths(), and any other problematic Node/Tree/TreeFunctionI > methods. In fact, I highly recommend any code changes like this add tests to > the test suite demonstrating the issue. I can work the example of the thread into a test, adding some of the points brought in by Hilmar- > > Possibly related to all this is a fairly significant lingering bug dealing > with Bio::Tree::TreeFunctionsI::reroot() > (http://bugzilla.open-bio.org/show_bug.cgi?id=2456 ). Any takers? I take this one, if I have those privileges ( it is a privilege to serve, isn't it?)... > > chris MAJ ----- Original Message ----- From: "Chris Fields" To: "Hilmar Lapp" Cc: ; "Mark A. Jensen" Sent: Friday, February 06, 2009 9:46 AM Subject: Re: [Bioperl-l] Unwise elimination of nodes inB:T:Node::remove_Descendent? > > On Feb 6, 2009, at 8:21 AM, Hilmar Lapp wrote: > >> >> On Feb 6, 2009, at 12:49 AM, Mark A. Jensen wrote: >> >>> A B C D E >>> |5 |5 |4 |4 | >>> \ / \ / / >>> x y / >>> |2 |1 / >>> \ / / >>> \ _/ / 10 >>> \ / / >>> z / >>> |3 / >>> \ / >>> ROOT >>> >>>> [...] >>>> __DATA__ >>>> (((A:5,B:5)x:2,(C:4,D:4)y:1)z:3,E:10); >>>> [...] >>> >>> The first problem was that the "complementary approaches" didn't give >>> the same answer, and one threw an error. This is the bug in the >>> code above. If you look at the tree, the nodes he really wants to keep are >>> the leaves [A,B,E], *plus* the internal nodes [x,y,z]; that is... >>> >>> $tree->splice(-keep_id=>[A,B,E,x,'y',z]) >>> $tree->total_branch_length >>> >>> doesn't throw and returns 25, which is correct. 
>> >> If you replace 'y' with the root in the above, yes. Otherwise no. >> >>> The second problem is that removing [C, D] also gives him 25, which is >>> what he wanted, but is not correct. >> >> Correct. >> >>> [...]The problem arises in this code at the end of remove_Descendent: >>> >>> # remove unecessary nodes if we have removed the part |||Node.pm LINE 272 >>> # which branches. >>> my $a1 = $self->ancestor; >>> if( $a1 ) { >>> my $bl = $self->branch_length || 0; >>> my @d = $self->each_Descendent; >>> if (scalar @d == 1) { >>> $d[0]->branch_length($bl + ($d[0]->branch_length || 0)); >>> $a1->add_Descendent($d[0]); >>> $a1->remove_Descendent($self); >>> } >>> } >>> >>> When node C is removed from the example tree, the node 'y' is removed >>> by this code apparently as a convenience. >> >> I guess I'm confused by the above code snippet. If @d are the descendants of >> $a1, then I don't understand what the purpose of adding $d[0] as a >> descendant of $a1 is (after altering its branch length). Furthermore, if $a1 >> is the ancestor of $self, and $a1 has only one descendant, wouldn't that >> mean that $d[0] == $self? >> >> What am I missing? >> >> I also think it's a bad idea to simplify the tree (or collapse internal >> nodes of degree 1) as an implicit operation. It should be explicit (and I >> believe there is a method simplify() or something similar, isn't there? Ah - >> I see you quote contract_linear_paths()). >> >> Furthermore, I think it's also a bad idea to remove descendant leaf nodes if >> you remove an internal node. What if you really just wanted to remove the >> internal node because, for example, its branching point isn't well >> supported? So removing node 'y' should make z a node of degree 3, but not >> remove C and D unless you ask to remove the subtree beginning at 'y'. 
>>
>> -hilmar
>
> I suppose the best way to deal with some of these questions (and ensure
> Node/Tree is acting as expected) is to come up with several vetted test cases
> indicating what we expect the proper behavior to be for remove_Descendant(),
> contract_linear_paths(), and any other problematic Node/Tree/TreeFunctionI
> methods. In fact, I highly recommend any code changes like this add tests to
> the test suite demonstrating the issue.
>
> Possibly related to all this is a fairly significant lingering bug dealing
> with Bio::Tree::TreeFunctionsI::reroot()
> (http://bugzilla.open-bio.org/show_bug.cgi?id=2456 ). Any takers?
>
> chris
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>

From mmuratet at hudsonalpha.org Fri Feb 6 11:03:01 2009
From: mmuratet at hudsonalpha.org (Michael Muratet)
Date: Fri, 6 Feb 2009 10:03:01 -0600
Subject: [Bioperl-l] load_seqdatabase.pl memory requirements unusually large
Message-ID: <00C78AD9-02E4-4295-93E0-521DAE453842@hudsonalpha.org>

Greetings

I have used bioperl-db and load_seqdatabase.pl many times in the past
and it's worked pretty much out of the box.

I have been trying to load fasta files from hg18. For chr1, the
virtual and resident memory quickly builds to 14 GB or so, then starts
using up the 2GB swap until it's full and then the system hangs. The
system is an 8-core Dell with 16GB of physical memory. chr1.fa is
~242MB. All of the disk storage is network mounted on an EMC system
which (I am told) has a proprietary version of something that's
NFS-like.

I loaded chrM (~17kb) and load_seqdatabase grew to over 4 GB before it
completed.

I am using MySQL 5.0.51a-community, DBI 1.607, perl 5.85, bioperl
1.6.0, bioperl-db 1.006900.

I have the innodb engine enabled in MySQL and the buffers and caches
set for a 'large' system.
I had some errors during the bioperl-db install:

Test Summary Report
-------------------
t/01dbadaptor.t (Wstat: 256 Tests: 23 Failed: 1)
  Failed test:  23
  Non-zero exit status: 1
t/10ensembl.t   (Wstat: 65280 Tests: 5 Failed: 0)
  Non-zero exit status: 255
  Parse errors: Bad plan.  You planned 18 tests but ran 5.
t/11locuslink.t (Wstat: 65280 Tests: 7 Failed: 0)
  Non-zero exit status: 255
  Parse errors: Bad plan.  You planned 113 tests but ran 7.
t/15cluster.t   (Wstat: 65280 Tests: 7 Failed: 0)
  Non-zero exit status: 255
  Parse errors: Bad plan.  You planned 162 tests but ran 7.
Files=16, Tests=1205, 21 wallclock secs ( 0.51 usr  0.12 sys + 14.90 cusr  2.69 csys = 18.22 CPU)
Result: FAIL
Failed 4/16 test programs. 1/1205 subtests failed.

The most recent bioperl-db documentation says that a workable version
may be possible after some errors and so I went ahead with the install.

The mailing list archive has some discussion about throughput but
nothing really about filling up memory.

Can anyone offer any clues about what's going on or where to start
looking?

Thanks

Mike

From hlapp at gmx.net Fri Feb 6 14:34:16 2009
From: hlapp at gmx.net (Hilmar Lapp)
Date: Fri, 6 Feb 2009 14:34:16 -0500
Subject: [Bioperl-l] load_seqdatabase.pl memory requirements unusually large
In-Reply-To: <00C78AD9-02E4-4295-93E0-521DAE453842@hudsonalpha.org>
References: <00C78AD9-02E4-4295-93E0-521DAE453842@hudsonalpha.org>
Message-ID: <821B9510-247B-4CA6-AC22-E0420698DD5F@gmx.net>

Something seems to cause Perl to be crapping out. If it were a
programmatic exception you would see the message and the trace.

Could you run these tests by themselves:

    ./Build test --test-files t/11locuslink.t

If that doesn't reveal the error, add a verbose=1 argument.

Let us know what you find.

-hilmar

On Feb 6, 2009, at 11:03 AM, Michael Muratet wrote:

> Greetings
>
> I have use bioperl-db and load_seqdatabase.pl many times in the past
> and it's worked pretty much out of the box.
> > I have been trying to load fasta files from hg18. For chr1, the > virtual and resident memory quickly builds to 14 GB or so, then > starts using up the 2GB swap until it's full and then the system > hangs. The system is an 8-core Dell with 16GB of physical memory. > chr1.fa is ~242MB. All of the disk storage is network mounted on an > EMC system which (I am told) has a proprietary version of something > that's NFS-like. > > I loaded chrM (~17kb) and load_seqdatabase grew to over 4 GB before > it completed. > > I am using MySQL 5.0.51a-community, DBI 1.607, perl 5.85, bioperl > 1.6.0, bioperl-db 1.006900. > > I have the innodb engine enabled in MySQL and the buffers and caches > set for a 'large' system. > > I had some errors during the bioperl-db install: > > Test Summary Report > ------------------- > t/01dbadaptor.t (Wstat: 256 Tests: 23 Failed: 1) > Failed test: 23 > Non-zero exit status: 1 > t/10ensembl.t (Wstat: 65280 Tests: 5 Failed: 0) > Non-zero exit status: 255 > Parse errors: Bad plan. You planned 18 tests but ran 5. > t/11locuslink.t (Wstat: 65280 Tests: 7 Failed: 0) > Non-zero exit status: 255 > Parse errors: Bad plan. You planned 113 tests but ran 7. > t/15cluster.t (Wstat: 65280 Tests: 7 Failed: 0) > Non-zero exit status: 255 > Parse errors: Bad plan. You planned 162 tests but ran 7. > Files=16, Tests=1205, 21 wallclock secs ( 0.51 usr 0.12 sys + 14.90 > cusr 2.69 csys = 18.22 CPU) > Result: FAIL > Failed 4/16 test programs. 1/1205 subtests failed. > > The most recent bioperl-db documentation says that a workable > version may be possible after some errors and so I went ahead with > the install. > > The mailing list archive has some discussion about throughput but > nothing really about filling up memory. > > Can anyone offer any clues about what's going on or where to start > looking? 
> > Thanks > > Mike > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From florent.angly at gmail.com Fri Feb 6 15:04:42 2009 From: florent.angly at gmail.com (Florent Angly) Date: Fri, 06 Feb 2009 12:04:42 -0800 Subject: [Bioperl-l] load_seqdatabase.pl memory requirements unusually large In-Reply-To: <821B9510-247B-4CA6-AC22-E0420698DD5F@gmx.net> References: <00C78AD9-02E4-4295-93E0-521DAE453842@hudsonalpha.org> <821B9510-247B-4CA6-AC22-E0420698DD5F@gmx.net> Message-ID: <498C97DA.4070804@gmail.com> Out of the blue, I'm going to ask: could it be some memory leak due to a circular reference? Florent Hilmar Lapp wrote: > Something seems to cause Perl to be crapping out. It it were a > programmatic exception you would see the message and the trace. > > Could you run these tests by themselves: > > ./Build test --test-files t/11locuslink.t > > If that doesn't reveal the error, add a verbose=1 argument. > > Let us know what you find. > > -hilmar > > On Feb 6, 2009, at 11:03 AM, Michael Muratet wrote: > >> Greetings >> >> I have use bioperl-db and load_seqdatabase.pl many times in the past >> and it's worked pretty much out of the box. >> >> I have been trying to load fasta files from hg18. For chr1, the >> virtual and resident memory quickly builds to 14 GB or so, then >> starts using up the 2GB swap until it's full and then the system >> hangs. The system is an 8-core Dell with 16GB of physical memory. >> chr1.fa is ~242MB. All of the disk storage is network mounted on an >> EMC system which (I am told) has a proprietary version of something >> that's NFS-like. >> >> I loaded chrM (~17kb) and load_seqdatabase grew to over 4 GB before >> it completed. 
>> >> I am using MySQL 5.0.51a-community, DBI 1.607, perl 5.85, bioperl >> 1.6.0, bioperl-db 1.006900. >> >> I have the innodb engine enabled in MySQL and the buffers and caches >> set for a 'large' system. >> >> I had some errors during the bioperl-db install: >> >> Test Summary Report >> ------------------- >> t/01dbadaptor.t (Wstat: 256 Tests: 23 Failed: 1) >> Failed test: 23 >> Non-zero exit status: 1 >> t/10ensembl.t (Wstat: 65280 Tests: 5 Failed: 0) >> Non-zero exit status: 255 >> Parse errors: Bad plan. You planned 18 tests but ran 5. >> t/11locuslink.t (Wstat: 65280 Tests: 7 Failed: 0) >> Non-zero exit status: 255 >> Parse errors: Bad plan. You planned 113 tests but ran 7. >> t/15cluster.t (Wstat: 65280 Tests: 7 Failed: 0) >> Non-zero exit status: 255 >> Parse errors: Bad plan. You planned 162 tests but ran 7. >> Files=16, Tests=1205, 21 wallclock secs ( 0.51 usr 0.12 sys + 14.90 >> cusr 2.69 csys = 18.22 CPU) >> Result: FAIL >> Failed 4/16 test programs. 1/1205 subtests failed. >> >> The most recent bioperl-db documentation says that a workable version >> may be possible after some errors and so I went ahead with the install. >> >> The mailing list archive has some discussion about throughput but >> nothing really about filling up memory. >> >> Can anyone offer any clues about what's going on or where to start >> looking? 
>> >> Thanks >> >> Mike >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > From hlapp at gmx.net Fri Feb 6 15:10:27 2009 From: hlapp at gmx.net (Hilmar Lapp) Date: Fri, 6 Feb 2009 15:10:27 -0500 Subject: [Bioperl-l] load_seqdatabase.pl memory requirements unusually large In-Reply-To: <498C97DA.4070804@gmail.com> References: <00C78AD9-02E4-4295-93E0-521DAE453842@hudsonalpha.org> <821B9510-247B-4CA6-AC22-E0420698DD5F@gmx.net> <498C97DA.4070804@gmail.com> Message-ID: <842543C6-06A5-451B-9749-D5379D85AC87@gmx.net> It could be, but that doesn't explain the test failures. They need not have the same cause, but they could, and the tests failing the way they do is certainly not right. -hilmar On Feb 6, 2009, at 3:04 PM, Florent Angly wrote: > Out of the blue, I'm going to ask: could it be some memory leak due > to a circular reference? > Florent > > Hilmar Lapp wrote: >> Something seems to cause Perl to be crapping out. It it were a >> programmatic exception you would see the message and the trace. >> >> Could you run these tests by themselves: >> >> ./Build test --test-files t/11locuslink.t >> >> If that doesn't reveal the error, add a verbose=1 argument. >> >> Let us know what you find. >> >> -hilmar >> >> On Feb 6, 2009, at 11:03 AM, Michael Muratet wrote: >> >>> Greetings >>> >>> I have use bioperl-db and load_seqdatabase.pl many times in the >>> past and it's worked pretty much out of the box. >>> >>> I have been trying to load fasta files from hg18. For chr1, the >>> virtual and resident memory quickly builds to 14 GB or so, then >>> starts using up the 2GB swap until it's full and then the system >>> hangs. The system is an 8-core Dell with 16GB of physical memory. >>> chr1.fa is ~242MB. All of the disk storage is network mounted on >>> an EMC system which (I am told) has a proprietary version of >>> something that's NFS-like. 
>>> >>> I loaded chrM (~17kb) and load_seqdatabase grew to over 4 GB >>> before it completed. >>> >>> I am using MySQL 5.0.51a-community, DBI 1.607, perl 5.85, bioperl >>> 1.6.0, bioperl-db 1.006900. >>> >>> I have the innodb engine enabled in MySQL and the buffers and >>> caches set for a 'large' system. >>> >>> I had some errors during the bioperl-db install: >>> >>> Test Summary Report >>> ------------------- >>> t/01dbadaptor.t (Wstat: 256 Tests: 23 Failed: 1) >>> Failed test: 23 >>> Non-zero exit status: 1 >>> t/10ensembl.t (Wstat: 65280 Tests: 5 Failed: 0) >>> Non-zero exit status: 255 >>> Parse errors: Bad plan. You planned 18 tests but ran 5. >>> t/11locuslink.t (Wstat: 65280 Tests: 7 Failed: 0) >>> Non-zero exit status: 255 >>> Parse errors: Bad plan. You planned 113 tests but ran 7. >>> t/15cluster.t (Wstat: 65280 Tests: 7 Failed: 0) >>> Non-zero exit status: 255 >>> Parse errors: Bad plan. You planned 162 tests but ran 7. >>> Files=16, Tests=1205, 21 wallclock secs ( 0.51 usr 0.12 sys + >>> 14.90 cusr 2.69 csys = 18.22 CPU) >>> Result: FAIL >>> Failed 4/16 test programs. 1/1205 subtests failed. >>> >>> The most recent bioperl-db documentation says that a workable >>> version may be possible after some errors and so I went ahead with >>> the install. >>> >>> The mailing list archive has some discussion about throughput but >>> nothing really about filling up memory. >>> >>> Can anyone offer any clues about what's going on or where to start >>> looking? 
>>> >>> Thanks >>> >>> Mike >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at illinois.edu Fri Feb 6 16:11:18 2009 From: cjfields at illinois.edu (Chris Fields) Date: Fri, 6 Feb 2009 15:11:18 -0600 Subject: [Bioperl-l] Unwise elimination of nodes inB:T:Node::remove_Descendent? In-Reply-To: <2EF8144E065A45808E8EECACF51124A4@NewLife> References: <92BB44BA-CFF6-4D6A-A2B6-2BD68E45ABA8@illinois.edu> <2EF8144E065A45808E8EECACF51124A4@NewLife> Message-ID: <17337564-0D4C-437C-BB82-6337D967F246@illinois.edu> On Feb 6, 2009, at 8:59 AM, Mark A. Jensen wrote: >> I suppose the best way to deal with some of these questions (and >> ensure Node/Tree is acting as expected) is to come up with several >> vetted test cases indicating what we expect the proper behavior to >> be for remove_Descendant(), contract_linear_paths(), and any >> other problematic Node/Tree/TreeFunctionI methods. In fact, I >> highly recommend any code changes like this add tests to the test >> suite demonstrating the issue. > > I can work the example of the thread into a test, adding some > of the points brought in by Hilmar- Any other areas of worry? >> Possibly related to all this is a fairly significant lingering bug >> dealing with Bio::Tree::TreeFunctionsI::reroot() (http://bugzilla.open-bio.org/show_bug.cgi?id=2456 >> ). Any takers? > > I take this one, if I have those privileges ( it is a privilege to > serve, isn't it?)... Cool, thanks Mark! 
-c From cjfields at illinois.edu Fri Feb 6 16:26:40 2009 From: cjfields at illinois.edu (Chris Fields) Date: Fri, 6 Feb 2009 15:26:40 -0600 Subject: [Bioperl-l] load_seqdatabase.pl memory requirements unusually large In-Reply-To: <842543C6-06A5-451B-9749-D5379D85AC87@gmx.net> References: <00C78AD9-02E4-4295-93E0-521DAE453842@hudsonalpha.org> <821B9510-247B-4CA6-AC22-E0420698DD5F@gmx.net> <498C97DA.4070804@gmail.com> <842543C6-06A5-451B-9749-D5379D85AC87@gmx.net> Message-ID: <5E3BC16E-4B54-4811-A4BE-02C8CB0A21EF@illinois.edu> There was a known memory issue with Bio::Species (), but I think that may be resolved (http://bugzilla.open-bio.org/show_bug.cgi?id=2594). The odd thing about the failed in 01dbadaptor.t is it appears to be related to rolling back changes. Could that have something to do with using a loaded database for tests? I thought tests were supposed to be run with a clean database (with the biosql schema but nothing else). chris On Feb 6, 2009, at 2:10 PM, Hilmar Lapp wrote: > It could be, but that doesn't explain the test failures. They need > not have the same cause, but they could, and the tests failing the > way they do is certainly not right. > > -hilmar > > On Feb 6, 2009, at 3:04 PM, Florent Angly wrote: > >> Out of the blue, I'm going to ask: could it be some memory leak due >> to a circular reference? >> Florent >> >> Hilmar Lapp wrote: >>> Something seems to cause Perl to be crapping out. It it were a >>> programmatic exception you would see the message and the trace. >>> >>> Could you run these tests by themselves: >>> >>> ./Build test --test-files t/11locuslink.t >>> >>> If that doesn't reveal the error, add a verbose=1 argument. >>> >>> Let us know what you find. >>> >>> -hilmar >>> >>> On Feb 6, 2009, at 11:03 AM, Michael Muratet wrote: >>> >>>> Greetings >>>> >>>> I have use bioperl-db and load_seqdatabase.pl many times in the >>>> past and it's worked pretty much out of the box. >>>> >>>> I have been trying to load fasta files from hg18. 
For chr1, the >>>> virtual and resident memory quickly builds to 14 GB or so, then >>>> starts using up the 2GB swap until it's full and then the system >>>> hangs. The system is an 8-core Dell with 16GB of physical memory. >>>> chr1.fa is ~242MB. All of the disk storage is network mounted on >>>> an EMC system which (I am told) has a proprietary version of >>>> something that's NFS-like. >>>> >>>> I loaded chrM (~17kb) and load_seqdatabase grew to over 4 GB >>>> before it completed. >>>> >>>> I am using MySQL 5.0.51a-community, DBI 1.607, perl 5.85, bioperl >>>> 1.6.0, bioperl-db 1.006900. >>>> >>>> I have the innodb engine enabled in MySQL and the buffers and >>>> caches set for a 'large' system. >>>> >>>> I had some errors during the bioperl-db install: >>>> >>>> Test Summary Report >>>> ------------------- >>>> t/01dbadaptor.t (Wstat: 256 Tests: 23 Failed: 1) >>>> Failed test: 23 >>>> Non-zero exit status: 1 >>>> t/10ensembl.t (Wstat: 65280 Tests: 5 Failed: 0) >>>> Non-zero exit status: 255 >>>> Parse errors: Bad plan. You planned 18 tests but ran 5. >>>> t/11locuslink.t (Wstat: 65280 Tests: 7 Failed: 0) >>>> Non-zero exit status: 255 >>>> Parse errors: Bad plan. You planned 113 tests but ran 7. >>>> t/15cluster.t (Wstat: 65280 Tests: 7 Failed: 0) >>>> Non-zero exit status: 255 >>>> Parse errors: Bad plan. You planned 162 tests but ran 7. >>>> Files=16, Tests=1205, 21 wallclock secs ( 0.51 usr 0.12 sys + >>>> 14.90 cusr 2.69 csys = 18.22 CPU) >>>> Result: FAIL >>>> Failed 4/16 test programs. 1/1205 subtests failed. >>>> >>>> The most recent bioperl-db documentation says that a workable >>>> version may be possible after some errors and so I went ahead >>>> with the install. >>>> >>>> The mailing list archive has some discussion about throughput but >>>> nothing really about filling up memory. >>>> >>>> Can anyone offer any clues about what's going on or where to >>>> start looking? 
>>>> >>>> Thanks >>>> >>>> Mike >>>> _______________________________________________ >>>> Bioperl-l mailing list >>>> Bioperl-l at lists.open-bio.org >>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From maj at fortinbras.us Fri Feb 6 20:13:17 2009 From: maj at fortinbras.us (Mark A. Jensen) Date: Fri, 6 Feb 2009 20:13:17 -0500 Subject: [Bioperl-l] Unwise elimination of nodes inB:T:Node::remove_Descendent? In-Reply-To: <17337564-0D4C-437C-BB82-6337D967F246@illinois.edu> References: <92BB44BA-CFF6-4D6A-A2B6-2BD68E45ABA8@illinois.edu> <2EF8144E065A45808E8EECACF51124A4@NewLife> <17337564-0D4C-437C-BB82-6337D967F246@illinois.edu> Message-ID: <25E8B6CF45F145FDA0548979D0D9C231@NewLife> Interested parties please have a look at fixes --- http://bugzilla.open-bio.org/show_bug.cgi?id=2456 cheers- MAJ ----- Original Message ----- From: "Chris Fields" To: "Mark A. Jensen" Cc: "Hilmar Lapp" ; Sent: Friday, February 06, 2009 4:11 PM Subject: Re: [Bioperl-l] Unwise elimination of nodes inB:T:Node::remove_Descendent? > > On Feb 6, 2009, at 8:59 AM, Mark A. Jensen wrote: > >>> I suppose the best way to deal with some of these questions (and ensure >>> Node/Tree is acting as expected) is to come up with several vetted test >>> cases indicating what we expect the proper behavior to be for >>> remove_Descendant(), contract_linear_paths(), and any other problematic >>> Node/Tree/TreeFunctionI methods. In fact, I highly recommend any code >>> changes like this add tests to the test suite demonstrating the issue. 
>> >> I can work the example of the thread into a test, adding some >> of the points brought in by Hilmar- > > Any other areas of worry? > >>> Possibly related to all this is a fairly significant lingering bug dealing >>> with Bio::Tree::TreeFunctionsI::reroot() >>> (http://bugzilla.open-bio.org/show_bug.cgi?id=2456 ). Any takers? >> >> I take this one, if I have those privileges ( it is a privilege to serve, >> isn't it?)... > > Cool, thanks Mark! > > -c > > From cjfields at illinois.edu Fri Feb 6 23:09:22 2009 From: cjfields at illinois.edu (Chris Fields) Date: Fri, 6 Feb 2009 22:09:22 -0600 Subject: [Bioperl-l] Unwise elimination of nodes inB:T:Node::remove_Descendent? In-Reply-To: <25E8B6CF45F145FDA0548979D0D9C231@NewLife> References: <92BB44BA-CFF6-4D6A-A2B6-2BD68E45ABA8@illinois.edu> <2EF8144E065A45808E8EECACF51124A4@NewLife> <17337564-0D4C-437C-BB82-6337D967F246@illinois.edu> <25E8B6CF45F145FDA0548979D0D9C231@NewLife> Message-ID: <38157EB0-359D-47DC-9214-348134FA3220@illinois.edu> Mark, Saw some errors pop up when running Tree tests (see the attachment on the bug report). They may be due to bad test data and not your patch so it'll need further investigating; a few appear to be the same test data using in various TreeIO formats. chris On Feb 6, 2009, at 7:13 PM, Mark A. Jensen wrote: > Interested parties please have a look at fixes --- > http://bugzilla.open-bio.org/show_bug.cgi?id=2456 > cheers- > MAJ > ----- Original Message ----- From: "Chris Fields" > > To: "Mark A. Jensen" > Cc: "Hilmar Lapp" ; > Sent: Friday, February 06, 2009 4:11 PM > Subject: Re: [Bioperl-l] Unwise elimination of nodes > inB:T:Node::remove_Descendent? > > >> >> On Feb 6, 2009, at 8:59 AM, Mark A. 
Jensen wrote: >> >>>> I suppose the best way to deal with some of these questions >>>> (and ensure Node/Tree is acting as expected) is to come up with >>>> several vetted test cases indicating what we expect the proper >>>> behavior to be for remove_Descendant(), >>>> contract_linear_paths(), and any other problematic Node/Tree/ >>>> TreeFunctionI methods. In fact, I highly recommend any code >>>> changes like this add tests to the test suite demonstrating the >>>> issue. >>> >>> I can work the example of the thread into a test, adding some >>> of the points brought in by Hilmar- >> >> Any other areas of worry? >> >>>> Possibly related to all this is a fairly significant lingering >>>> bug dealing with Bio::Tree::TreeFunctionsI::reroot() (http://bugzilla.open-bio.org/show_bug.cgi?id=2456 >>>> ). Any takers? >>> >>> I take this one, if I have those privileges ( it is a privilege >>> to serve, isn't it?)... >> >> Cool, thanks Mark! >> >> -c >> > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From maj at fortinbras.us Fri Feb 6 23:54:02 2009 From: maj at fortinbras.us (Mark A. Jensen) Date: Fri, 6 Feb 2009 23:54:02 -0500 Subject: [Bioperl-l] Unwise elimination of nodes inB:T:Node::remove_Descendent? In-Reply-To: <38157EB0-359D-47DC-9214-348134FA3220@illinois.edu> References: <92BB44BA-CFF6-4D6A-A2B6-2BD68E45ABA8@illinois.edu> <2EF8144E065A45808E8EECACF51124A4@NewLife> <17337564-0D4C-437C-BB82-6337D967F246@illinois.edu> <25E8B6CF45F145FDA0548979D0D9C231@NewLife> <38157EB0-359D-47DC-9214-348134FA3220@illinois.edu> Message-ID: <77B8F3C781994A829C196955886F1CA5@NewLife> cheers Chris-- I'll check it out- MAJ ----- Original Message ----- From: "Chris Fields" To: "Mark A. Jensen" Cc: Sent: Friday, February 06, 2009 11:09 PM Subject: Re: [Bioperl-l] Unwise elimination of nodes inB:T:Node::remove_Descendent? 
> Mark, > > Saw some errors pop up when running Tree tests (see the attachment on the bug > report). They may be due to bad test data and not your patch so it'll need > further investigating; a few appear to be the same test data using in various > TreeIO formats. > > chris > > On Feb 6, 2009, at 7:13 PM, Mark A. Jensen wrote: > >> Interested parties please have a look at fixes --- >> http://bugzilla.open-bio.org/show_bug.cgi?id=2456 >> cheers- >> MAJ >> ----- Original Message ----- From: "Chris Fields" > > >> To: "Mark A. Jensen" >> Cc: "Hilmar Lapp" ; >> Sent: Friday, February 06, 2009 4:11 PM >> Subject: Re: [Bioperl-l] Unwise elimination of nodes >> inB:T:Node::remove_Descendent? >> >> >>> >>> On Feb 6, 2009, at 8:59 AM, Mark A. Jensen wrote: >>> >>>>> I suppose the best way to deal with some of these questions (and ensure >>>>> Node/Tree is acting as expected) is to come up with several vetted test >>>>> cases indicating what we expect the proper behavior to be for >>>>> remove_Descendant(), contract_linear_paths(), and any other problematic >>>>> Node/Tree/ TreeFunctionI methods. In fact, I highly recommend any code >>>>> changes like this add tests to the test suite demonstrating the issue. >>>> >>>> I can work the example of the thread into a test, adding some >>>> of the points brought in by Hilmar- >>> >>> Any other areas of worry? >>> >>>>> Possibly related to all this is a fairly significant lingering bug >>>>> dealing with Bio::Tree::TreeFunctionsI::reroot() >>>>> (http://bugzilla.open-bio.org/show_bug.cgi?id=2456 ). Any takers? >>>> >>>> I take this one, if I have those privileges ( it is a privilege to serve, >>>> isn't it?)... >>> >>> Cool, thanks Mark! 
>>>
>>> -c
>>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>
>

From dylankrishnan at gmail.com Sat Feb 7 11:10:25 2009
From: dylankrishnan at gmail.com (Dylan Krishnan)
Date: Sat, 7 Feb 2009 10:10:25 -0600
Subject: [Bioperl-l] calculate the frequency of occurrence of the most commonly observed amino acid at each position of multiple sequence alignment
Message-ID:

I am new to perl but this is something I am seeking to do either through a bioperl module or just perl. Apparently, this is quite "straightforward using PERL," but I beg to differ.

Any assistance regarding this matter would be greatly appreciated.

Thanks!

-dylan

From maj at fortinbras.us Sat Feb 7 11:25:43 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Sat, 7 Feb 2009 11:25:43 -0500
Subject: [Bioperl-l] calculate the frequency of occurrence of the most commonly observed amino acid at each position of multiple sequence alignment
In-Reply-To:
References:
Message-ID:

Dylan,

This is an extremely good exercise for anyone learning Perl to do bioinformatics. When you have done many exercises like this, you will see what people mean when they say it is very straightforward.

Here are some hints:

Use the "entropy" scrap at http://www.bioperl.org/wiki/Site_entropy_in_an_alignment . You will convert the function entropy_by_column() into the function you need. Replace the line

    $ent{$col} = entropy(values %res);

with a line you will write using the "hash key at max value" scrap, found here: http://www.bioperl.org/wiki/Hash_key_at_the_max_value .

Happy coding!
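
For anyone following the hints, the replacement step might look roughly like the sketch below. It is an illustration only and not the wiki scrap itself: the sub name most_common_residue and the sample counts are invented here, and it assumes the residue counts for one column are already in a hash such as the %res mentioned above.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl or the wiki scraps): given a
# hash of residue => count for one alignment column, return the most
# common residue and its relative frequency in that column.
sub most_common_residue {
    my (%count) = @_;
    my ($best, $total) = (undef, 0);
    # Walk the keys in sorted order so ties are broken the same way every run.
    for my $res (sort keys %count) {
        $total += $count{$res};
        $best = $res if !defined $best || $count{$res} > $count{$best};
    }
    return unless $total;    # empty column: nothing to report
    return ($best, $count{$best} / $total);
}

# Made-up counts for a single column of a 10-sequence alignment.
my %res = (A => 6, G => 3, '-' => 1);
my ($residue, $freq) = most_common_residue(%res);
printf "%s %.2f\n", $residue, $freq;
```

Inside a per-column loop, the call would stand where entropy(values %res) stood, storing the residue and its frequency per column instead of the entropy. Whether gap characters should count toward the total is a choice the sketch leaves open; here they do.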
Mark

----- Original Message -----
From: "Dylan Krishnan"
To:
Sent: Saturday, February 07, 2009 11:10 AM
Subject: [Bioperl-l] calculate the frequency of occurrence of the mostcommonly observed amino acid at each position of multiplesequence alignment

>I am new to perl but this is somethign I am seeking to do either through a
> bioperl module or just perl. Apparently, this is quite "straightforward
> using PERL," but I beg to differ.
>
> Any assistance regarding this matter would be greatly appreciated.
>
> Thanks!
>
> -dylan
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>

From dylankrishnan at gmail.com Sat Feb 7 11:43:02 2009
From: dylankrishnan at gmail.com (Dylan Krishnan)
Date: Sat, 7 Feb 2009 10:43:02 -0600
Subject: [Bioperl-l] calculate the frequency of occurrence of the most commonly observed amino acid at each position of multiple sequence alignment
In-Reply-To:
References:
Message-ID:

thanks mark!

the authors' other approach is to load the alignment into a MS Excel worksheet and use the "autofilter" procedure to count the occurrences of any residue position of the alignment. the claim is "that excel is useful for this purpose." sounds reasonable for 10 alignments but not 2000!

again, many thanks.

-dylan

On Sat, Feb 7, 2009 at 10:25 AM, Mark A. Jensen wrote:

> Dylan,
>
> This is an extremely good exercise for anyone learning Perl to do
> bioinformatics.
> When you have done many exercises like this, you will see what people mean
> when they say it is very straightforward.
>
> Here are some hints:
>
> Use the "entropy" scrap at
> http://www.bioperl.org/wiki/Site_entropy_in_an_alignment .
> You will convert the function entropy_by_column() into the function you
> need.
> Replace the line > > $ent{$col} = entropy(values %res); > > with a line you will write using the "hash key at max value" scrap, found > here: http://www.bioperl.org/wiki/Hash_key_at_the_max_value . > > Happy coding! > Mark > > ----- Original Message ----- From: "Dylan Krishnan" < > dylankrishnan at gmail.com> > To: > Sent: Saturday, February 07, 2009 11:10 AM > Subject: [Bioperl-l] calculate the frequency of occurrence of the > mostcommonly observed amino acid at each position of multiplesequence > alignment > > > I am new to perl but this is somethign I am seeking to do either through a >> bioperl module or just perl. Apparently, this is quite "straightforward >> using PERL," but I beg to differ. >> >> Any assistance regarding this matter would be greatly appreciated. >> >> Thanks! >> >> -dylan >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> >> > From maj at fortinbras.us Sat Feb 7 11:46:25 2009 From: maj at fortinbras.us (Mark A. Jensen) Date: Sat, 7 Feb 2009 11:46:25 -0500 Subject: [Bioperl-l] calculate the frequency of occurrence of the mostcommonly observed amino acid at each position of multiplesequence alignment In-Reply-To: References: Message-ID: Yikes! Sounds like the authors need to pump some Perl too! cheers MAJ ----- Original Message ----- From: Dylan Krishnan To: Mark A. Jensen Cc: bioperl-l at lists.open-bio.org Sent: Saturday, February 07, 2009 11:43 AM Subject: Re: [Bioperl-l] calculate the frequency of occurrence of the mostcommonly observed amino acid at each position of multiplesequence alignment thanks mark! the authors other approach is to load the alignment into a MS Excel worksheet and use the "autofilter" procedure to count the occurrences of any residue position of the alignment. the claim is "that excel is uselful for this purpose."sounds reasonable for 10 alignments but not 2000! again, many thanks. 
-dylan On Sat, Feb 7, 2009 at 10:25 AM, Mark A. Jensen wrote: Dylan, This is an extremely good exercise for anyone learning Perl to do bioinformatics. When you have done many exercises like this, you will see what people mean when they say it is very straightforward. Here are some hints: Use the "entropy" scrap at http://www.bioperl.org/wiki/Site_entropy_in_an_alignment . You will convert the function entropy_by_column() into the function you need. Replace the line $ent{$col} = entropy(values %res); with a line you will write using the "hash key at max value" scrap, found here: http://www.bioperl.org/wiki/Hash_key_at_the_max_value . Happy coding! Mark ----- Original Message ----- From: "Dylan Krishnan" To: Sent: Saturday, February 07, 2009 11:10 AM Subject: [Bioperl-l] calculate the frequency of occurrence of the mostcommonly observed amino acid at each position of multiplesequence alignment I am new to perl but this is somethign I am seeking to do either through a bioperl module or just perl. Apparently, this is quite "straightforward using PERL," but I beg to differ. Any assistance regarding this matter would be greatly appreciated. Thanks! -dylan _______________________________________________ Bioperl-l mailing list Bioperl-l at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/bioperl-l From maj at fortinbras.us Sat Feb 7 11:56:30 2009 From: maj at fortinbras.us (Mark A. Jensen) Date: Sat, 7 Feb 2009 11:56:30 -0500 Subject: [Bioperl-l] calculate the frequency of occurrence of the mostcommonly observed amino acid at each position of multiplesequence alignment In-Reply-To: References: Message-ID: <3530AFC06DD34C0B8925AA2E5F0B6A5E@NewLife> Dylan- It's worth mentioning that the BioPerl method is very overhead-heavy; all the objects make it easy to just write a few lines, but probably won't be the absolute fastest way to do what you want. 
Another path to follow would be

    # your seqs are plain strings in the array @seqs, and are aligned and same length
    my $len = length($seqs[0]);
    my @residue_counts;
    foreach (0..$len-1) {
        my %h = ();    # fresh hash of residue => count for this column
        foreach my $seq (@seqs) {
            $h{ substr($seq, $_, 1) }++;
        }
        push @residue_counts, \%h;
    }

Now, for each elt in @residue_counts (each elt is a reference to a hash), look for the key that has the maximum hash value. The snippet above is also worth working through for the educational value, esp. w/r to using hashes, which (IMHO) are one of the absolutely coolest things about Perl.

cheers-
MAJ

----- Original Message -----
From: Dylan Krishnan
To: Mark A. Jensen
Cc: bioperl-l at lists.open-bio.org
Sent: Saturday, February 07, 2009 11:43 AM
Subject: Re: [Bioperl-l] calculate the frequency of occurrence of the mostcommonly observed amino acid at each position of multiplesequence alignment

thanks mark!

the authors other approach is to load the alignment into a MS Excel worksheet and use the "autofilter" procedure to count the occurrences of any residue position of the alignment. the claim is "that excel is uselful for this purpose."sounds reasonable for 10 alignments but not 2000!

again, many thanks.

-dylan

On Sat, Feb 7, 2009 at 10:25 AM, Mark A. Jensen wrote:

Dylan,

This is an extremely good exercise for anyone learning Perl to do bioinformatics. When you have done many exercises like this, you will see what people mean when they say it is very straightforward.

Here are some hints:

Use the "entropy" scrap at http://www.bioperl.org/wiki/Site_entropy_in_an_alignment . You will convert the function entropy_by_column() into the function you need. Replace the line

    $ent{$col} = entropy(values %res);

with a line you will write using the "hash key at max value" scrap, found here: http://www.bioperl.org/wiki/Hash_key_at_the_max_value .

Happy coding!
Mark ----- Original Message ----- From: "Dylan Krishnan" To: Sent: Saturday, February 07, 2009 11:10 AM Subject: [Bioperl-l] calculate the frequency of occurrence of the mostcommonly observed amino acid at each position of multiplesequence alignment I am new to perl but this is somethign I am seeking to do either through a bioperl module or just perl. Apparently, this is quite "straightforward using PERL," but I beg to differ. Any assistance regarding this matter would be greatly appreciated. Thanks! -dylan _______________________________________________ Bioperl-l mailing list Bioperl-l at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/bioperl-l From maj at fortinbras.us Sat Feb 7 15:39:20 2009 From: maj at fortinbras.us (Mark A. Jensen) Date: Sat, 7 Feb 2009 15:39:20 -0500 Subject: [Bioperl-l] Unwise elimination of nodes inB:T:Node::remove_Descendent? In-Reply-To: <38157EB0-359D-47DC-9214-348134FA3220@illinois.edu> References: <92BB44BA-CFF6-4D6A-A2B6-2BD68E45ABA8@illinois.edu> <2EF8144E065A45808E8EECACF51124A4@NewLife> <17337564-0D4C-437C-BB82-6337D967F246@illinois.edu> <25E8B6CF45F145FDA0548979D0D9C231@NewLife> <38157EB0-359D-47DC-9214-348134FA3220@illinois.edu> Message-ID: Ok- some modified tests and editorial analysis up under Bug #2456- cheers MAJ ----- Original Message ----- From: "Chris Fields" To: "Mark A. Jensen" Cc: Sent: Friday, February 06, 2009 11:09 PM Subject: Re: [Bioperl-l] Unwise elimination of nodes inB:T:Node::remove_Descendent? > Mark, > > Saw some errors pop up when running Tree tests (see the attachment on the bug > report). They may be due to bad test data and not your patch so it'll need > further investigating; a few appear to be the same test data using in various > TreeIO formats. > > chris > > On Feb 6, 2009, at 7:13 PM, Mark A. 
Jensen wrote: > >> Interested parties please have a look at fixes --- >> http://bugzilla.open-bio.org/show_bug.cgi?id=2456 >> cheers- >> MAJ >> ----- Original Message ----- From: "Chris Fields" > > >> To: "Mark A. Jensen" >> Cc: "Hilmar Lapp" ; >> Sent: Friday, February 06, 2009 4:11 PM >> Subject: Re: [Bioperl-l] Unwise elimination of nodes >> inB:T:Node::remove_Descendent? >> >> >>> >>> On Feb 6, 2009, at 8:59 AM, Mark A. Jensen wrote: >>> >>>>> I suppose the best way to deal with some of these questions (and ensure >>>>> Node/Tree is acting as expected) is to come up with several vetted test >>>>> cases indicating what we expect the proper behavior to be for >>>>> remove_Descendant(), contract_linear_paths(), and any other problematic >>>>> Node/Tree/ TreeFunctionI methods. In fact, I highly recommend any code >>>>> changes like this add tests to the test suite demonstrating the issue. >>>> >>>> I can work the example of the thread into a test, adding some >>>> of the points brought in by Hilmar- >>> >>> Any other areas of worry? >>> >>>>> Possibly related to all this is a fairly significant lingering bug >>>>> dealing with Bio::Tree::TreeFunctionsI::reroot() >>>>> (http://bugzilla.open-bio.org/show_bug.cgi?id=2456 ). Any takers? >>>> >>>> I take this one, if I have those privileges ( it is a privilege to serve, >>>> isn't it?)... >>> >>> Cool, thanks Mark! 
>>>
>>> -c
>>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>
>

From dylankrishnan at gmail.com Sat Feb 7 15:51:39 2009
From: dylankrishnan at gmail.com (Dylan Krishnan)
Date: Sat, 7 Feb 2009 14:51:39 -0600
Subject: [Bioperl-l] calculate the frequency of occurrence of the most commonly observed amino acid at each position of multiple sequence alignment
In-Reply-To: <7F4F092DD155448792D493E4A23B43DA@NewLife>
References: <3530AFC06DD34C0B8925AA2E5F0B6A5E@NewLife> <7F4F092DD155448792D493E4A23B43DA@NewLife>
Message-ID:

Thanks Mark!

I'm still working on this - as a newbie, I'm still digesting your suggestions - here is what I think I want to do for a multiple sequence alignment -

1. find the total number of residues, n, in the alignment
2. find the total number of a specific residue, x, in an alignment
3. find the total number of times a residue, x, appears at a specific site
4. total number of sequences in an alignment.

I initially thought about writing a single script to generate all these parameters but now think four separate (read: unsophisticated and utterly reductionist) scripts will do...

I think your suggestions will clearly help me on this quest!

-dylan

On Sat, Feb 7, 2009 at 2:36 PM, Mark A. Jensen wrote:

> oops-bugs in that. Try
>
> my $len = length($seqs[0]);
>> my @residue_counts;
>> my %h;
>> foreach (0..$len-1) {
>>   %h = ();
>>   foreach $seq (@seqs) {
>>     $h{ substr($seq, $_, 1) }++;
>>   }
>>   push @residue_counts, {%h};
>> }
>>
>
>
> ----- Original Message ----- From: "Mark A.
Jensen" > To: "Dylan Krishnan" > Cc: > Sent: Saturday, February 07, 2009 11:56 AM > Subject: Re: [Bioperl-l] calculate the frequency of occurrence of > themostcommonly observed amino acid at each position ofmultiplesequence > alignment > > > > Dylan- It's worth mentioning that the BioPerl method is very >> overhead-heavy; all >> the objects make it easy to just write a few lines, but probably won't be >> the absolute >> fastest way to do what you want. Another path to follow would be >> >> # your seqs are plain strings in the array @seqs, and are aligned and same >> length >> my $len = length($seqs[0]); >> my @residue_counts; >> foreach (0..$len-1) { >> my %h = (); >> foreach $seq (@seqs) { >> $h{ substr($seq, $_, 1) }++; >> } >> push @residue_counts, \%h; >> } >> >> Now, for each elt in @residue_counts (each elt is a reference to a hash), >> look for the >> key that has the maximum hash value. The snippet above is also worth >> working >> through for the educational value, esp. w/r to using hashes, which (IMHO) >> are one of >> the absolutely coolest thing about Perl. >> >> cheers- MAJ >> ----- Original Message ----- From: Dylan Krishnan >> To: Mark A. Jensen >> Cc: bioperl-l at lists.open-bio.org >> Sent: Saturday, February 07, 2009 11:43 AM >> Subject: Re: [Bioperl-l] calculate the frequency of occurrence of the >> mostcommonly observed amino acid at each position of multiplesequence >> alignment >> >> >> thanks mark! >> >> the authors other approach is to load the alignment into a MS Excel >> worksheet and use the "autofilter" procedure to count the occurrences of any >> residue position of the alignment. the claim is "that excel is uselful for >> this purpose."sounds reasonable for 10 alignments but not 2000! >> >> again, many thanks. >> >> >> -dylan >> >> On Sat, Feb 7, 2009 at 10:25 AM, Mark A. Jensen >> wrote: >> >> Dylan, >> >> This is an extremely good exercise for anyone learning Perl to do >> bioinformatics. 
>> When you have done many exercises like this, you will see what people >> mean >> when they say it is very straightforward. >> >> Here are some hints: >> >> Use the "entropy" scrap at >> http://www.bioperl.org/wiki/Site_entropy_in_an_alignment . >> You will convert the function entropy_by_column() into the function you >> need. >> Replace the line >> >> $ent{$col} = entropy(values %res); >> >> with a line you will write using the "hash key at max value" scrap, >> found >> here: http://www.bioperl.org/wiki/Hash_key_at_the_max_value . >> >> Happy coding! >> Mark >> >> ----- Original Message ----- From: "Dylan Krishnan" < >> dylankrishnan at gmail.com> >> To: >> Sent: Saturday, February 07, 2009 11:10 AM >> Subject: [Bioperl-l] calculate the frequency of occurrence of the >> mostcommonly observed amino acid at each position of multiplesequence >> alignment >> >> >> >> I am new to perl but this is somethign I am seeking to do either >> through a >> bioperl module or just perl. Apparently, this is quite >> "straightforward >> using PERL," but I beg to differ. >> >> Any assistance regarding this matter would be greatly appreciated. >> >> Thanks! >> >> -dylan >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> >> >> >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> >> > From maj at fortinbras.us Sat Feb 7 15:36:38 2009 From: maj at fortinbras.us (Mark A. 
Jensen) Date: Sat, 7 Feb 2009 15:36:38 -0500 Subject: [Bioperl-l] calculate the frequency of occurrence of themostcommonly observed amino acid at each position ofmultiplesequence alignment In-Reply-To: <3530AFC06DD34C0B8925AA2E5F0B6A5E@NewLife> References: <3530AFC06DD34C0B8925AA2E5F0B6A5E@NewLife> Message-ID: <7F4F092DD155448792D493E4A23B43DA@NewLife> oops-bugs in that. Try > my $len = length($seqs[0]); > my @residue_counts; > my %h; > foreach (0..$len-1) { > %h = (); > foreach $seq (@seqs) { > $h{ substr($seq, $_, 1) }++; > } > push @residue_counts, {%h}; > } ----- Original Message ----- From: "Mark A. Jensen" To: "Dylan Krishnan" Cc: Sent: Saturday, February 07, 2009 11:56 AM Subject: Re: [Bioperl-l] calculate the frequency of occurrence of themostcommonly observed amino acid at each position ofmultiplesequence alignment > Dylan- It's worth mentioning that the BioPerl method is very overhead-heavy; > all > the objects make it easy to just write a few lines, but probably won't be the > absolute > fastest way to do what you want. Another path to follow would be > > # your seqs are plain strings in the array @seqs, and are aligned and same > length > my $len = length($seqs[0]); > my @residue_counts; > foreach (0..$len-1) { > my %h = (); > foreach $seq (@seqs) { > $h{ substr($seq, $_, 1) }++; > } > push @residue_counts, \%h; > } > > Now, for each elt in @residue_counts (each elt is a reference to a hash), look > for the > key that has the maximum hash value. The snippet above is also worth working > through for the educational value, esp. w/r to using hashes, which (IMHO) are > one of > the absolutely coolest thing about Perl. > > cheers- MAJ > ----- Original Message ----- > From: Dylan Krishnan > To: Mark A. 
Jensen > Cc: bioperl-l at lists.open-bio.org > Sent: Saturday, February 07, 2009 11:43 AM > Subject: Re: [Bioperl-l] calculate the frequency of occurrence of the > mostcommonly observed amino acid at each position of multiplesequence > alignment > > > thanks mark! > > the authors other approach is to load the alignment into a MS Excel worksheet > and use the "autofilter" procedure to count the occurrences of any residue > position of the alignment. the claim is "that excel is uselful for this > purpose."sounds reasonable for 10 alignments but not 2000! > > again, many thanks. > > > -dylan > > On Sat, Feb 7, 2009 at 10:25 AM, Mark A. Jensen wrote: > > Dylan, > > This is an extremely good exercise for anyone learning Perl to do > bioinformatics. > When you have done many exercises like this, you will see what people mean > when they say it is very straightforward. > > Here are some hints: > > Use the "entropy" scrap at > http://www.bioperl.org/wiki/Site_entropy_in_an_alignment . > You will convert the function entropy_by_column() into the function you > need. > Replace the line > > $ent{$col} = entropy(values %res); > > with a line you will write using the "hash key at max value" scrap, found > here: http://www.bioperl.org/wiki/Hash_key_at_the_max_value . > > Happy coding! > Mark > > ----- Original Message ----- From: "Dylan Krishnan" > > To: > Sent: Saturday, February 07, 2009 11:10 AM > Subject: [Bioperl-l] calculate the frequency of occurrence of the > mostcommonly observed amino acid at each position of multiplesequence > alignment > > > > I am new to perl but this is somethign I am seeking to do either through > a > bioperl module or just perl. Apparently, this is quite "straightforward > using PERL," but I beg to differ. > > Any assistance regarding this matter would be greatly appreciated. > > Thanks! 
> > -dylan > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > From hlapp at gmx.net Sat Feb 7 16:04:13 2009 From: hlapp at gmx.net (Hilmar Lapp) Date: Sat, 7 Feb 2009 16:04:13 -0500 Subject: [Bioperl-l] load_seqdatabase.pl memory requirements unusually large In-Reply-To: <5E3BC16E-4B54-4811-A4BE-02C8CB0A21EF@illinois.edu> References: <00C78AD9-02E4-4295-93E0-521DAE453842@hudsonalpha.org> <821B9510-247B-4CA6-AC22-E0420698DD5F@gmx.net> <498C97DA.4070804@gmail.com> <842543C6-06A5-451B-9749-D5379D85AC87@gmx.net> <5E3BC16E-4B54-4811-A4BE-02C8CB0A21EF@illinois.edu> Message-ID: <68B96832-9CD5-4BC2-9587-BA13491A89EE@gmx.net> On Feb 6, 2009, at 4:26 PM, Chris Fields wrote: > Could that have something to do with using a loaded database for > tests? I thought tests were supposed to be run with a clean > database (with the biosql schema but nothing else). Yes. But the errors resulting from using a database that has stuff left over from previous test runs (for example because transactions and rollback aren't supported) ought to result in failing tests, not in premature termination of the test scripts, which is what is so odd. -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From hlapp at gmx.net Sat Feb 7 16:12:55 2009 From: hlapp at gmx.net (Hilmar Lapp) Date: Sat, 7 Feb 2009 16:12:55 -0500 Subject: [Bioperl-l] Unwise elimination of nodes inB:T:Node::remove_Descendent? 
In-Reply-To: References: <92BB44BA-CFF6-4D6A-A2B6-2BD68E45ABA8@illinois.edu> <2EF8144E065A45808E8EECACF51124A4@NewLife> <17337564-0D4C-437C-BB82-6337D967F246@illinois.edu> <25E8B6CF45F145FDA0548979D0D9C231@NewLife> <38157EB0-359D-47DC-9214-348134FA3220@illinois.edu> Message-ID: <1C84598A-B560-4947-B156-0FA7B9F4F662@gmx.net> On Feb 7, 2009, at 3:39 PM, Mark A. Jensen wrote: > editorial analysis up under Bug #2456 Would you mind copying that here? This list is probably best for discussing desired behavior and related issues, if that's what the editorial analysis is about. -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From maj at fortinbras.us Sat Feb 7 16:32:27 2009 From: maj at fortinbras.us (Mark A. Jensen) Date: Sat, 7 Feb 2009 16:32:27 -0500 Subject: [Bioperl-l] Unwise elimination of nodes inB:T:Node::remove_Descendent? In-Reply-To: <1C84598A-B560-4947-B156-0FA7B9F4F662@gmx.net> References: <92BB44BA-CFF6-4D6A-A2B6-2BD68E45ABA8@illinois.edu> <2EF8144E065A45808E8EECACF51124A4@NewLife> <17337564-0D4C-437C-BB82-6337D967F246@illinois.edu> <25E8B6CF45F145FDA0548979D0D9C231@NewLife> <38157EB0-359D-47DC-9214-348134FA3220@illinois.edu> <1C84598A-B560-4947-B156-0FA7B9F4F662@gmx.net> Message-ID: <398D32E9F3DB4155AAB493F05B7A4F7F@NewLife> np- (note that no commits have occurred yet on this.) MAJ comments under Bug #2456 **** I've identified (and I think fixed) the problems. I will paste in my reworked reroot() with my comments and editorializing to myself here (these won't appear in any commits) to indicate the issues. I've also written a new helper function for B:T:Node called create_node_on_branch(), which figures in the fix below. I'll describe it briefly in a separate comment. 
The clue that broke this case was that reroot()'ing on a leaf (as Stephanie did) gave the wrong answer (according to close visual inspection of the tree as rendered in FigTree), while reroot()'ing on an internal node gave the right answer.

The main issue in the original reroot was the unnecessary creation of a new node at the midpoint of the new root's ancestral branch, which got left in the tree after rerooting. The original author felt that rerooting on a leaf was a special case, but the case wasn't handled correctly, so that an extra branch got created in the tree, and a spurious root that was not the desired new root (i.e., not the leaf requested by the user) was created.

[see the code at http://bugzilla.open-bio.org/show_bug.cgi?id=2456#c3]

***

create_node_on_branch() separates the issue of rerooting from the issue of deciding whether a new root should be created in the middle of a branch. These issues seemed to be conflated in the original reroot. Suppose now one wants to create a new root for a tree between the nodes A -> B at the midpoint between A and B. The code now would be:

my $newroot = $tree->find_node('B')->create_node_on_branch(-FRACTION=>0.5, -ANNOT=>{-id=>'midpt',-desc=>'new midpoint root'});
$tree->reroot($newroot);

Before you would (probably) have to do this 'by hand'--

my $nodeB = $tree->find_node('B');
my $nodeA = $tree->find_node('A');
# new node takes half of B's branch length
my $newnode = Bio::Tree::Node->new(-branch_length=>0.5*$nodeB->branch_length);
$newnode->ancestor($nodeA);
$newnode->add_Descendent($nodeB);
# detach B from its old parent A
$nodeA->remove_Descendent($nodeB);
$nodeB->branch_length(0.5*$nodeB->branch_length);
# then
$tree->reroot($newnode);
# and hope for the best...

***

Going thru Tree.t-- I see there is a real conflation of rerooting and root *creation*. The current (old) implementation creates an extra node and makes it the root, which isn't what I would expect -- if you reroot on a node, you want that very node to be the root.
If you want a new root in my scheme, you create that node on the branch where you want and then reroot to that node. I think this is more natural, but maybe not for everyone. Anyway, I'm modifying the test file and will throw up a patch later. *** The mods to the other tests are mainly related to a miscounting of nodes for a set of related trees; get_nodes is right and the test is wrong, I believe. I've uploaded pdfs of FigTree renderings of the trees in question for others to doublecheck. *** May I take a moment here to say that FigTree rocks. *** FigTree is at http://tree.bio.ed.ac.uk/software/figtree/ , btw. ----- Original Message ----- From: "Hilmar Lapp" To: "Mark A. Jensen" Cc: "Chris Fields" ; Sent: Saturday, February 07, 2009 4:12 PM Subject: Re: [Bioperl-l] Unwise elimination of nodes inB:T:Node::remove_Descendent? > > On Feb 7, 2009, at 3:39 PM, Mark A. Jensen wrote: > >> editorial analysis up under Bug #2456 > > > Would you mind copying that here? This list is probably best for discussing > desired behavior and related issues, if that's what the editorial analysis is > about. > > -hilmar > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > From alexl at users.sourceforge.net Sun Feb 8 02:36:18 2009 From: alexl at users.sourceforge.net (Alex Lancaster) Date: Sun, 08 Feb 2009 00:36:18 -0700 Subject: [Bioperl-l] [ANNOUNCEMENT] Alpha 1.6 releases of BioPerl-run, BioPerl-db, BioPerl-network In-Reply-To: <151FBF26-C5FE-4388-875A-CC678B447318@illinois.edu> (Chris Fields's message of "Wed\, 28 Jan 2009 11\:14\:22 -0600") References: <151FBF26-C5FE-4388-875A-CC678B447318@illinois.edu> Message-ID: >>>>> "CF" == Chris Fields writes: CF> All, CF> I would like to announce that the first alpha releases for BioPerl- CF> run, BioPerl-db, and BioPerl-network are available. 
These are CF> designated as 1.005009_001, with a requirement for BioPerl 1.6 and CF> higher (1.006000). [...] CF> The archives can be downloaded from here: CF> BioPerl-run: CF> http://bioperl.org/DIST/BioPerl-run-1.5.9_1.tar.bz2 CF> http://bioperl.org/DIST/BioPerl-run-1.5.9_1.tar.gz CF> http://bioperl.org/DIST/BioPerl-run-1.5.9_1.zip There seems to be some duplicated code that either should have been removed from either BioPerl or BioPerl-run, the files: /usr/lib/perl5/vendor_perl/5.10.0/Bio/ConfigData.pm /usr/share/man/man3/Bio::ConfigData.3pm.gz are currently installed by *both* BioPerl and BioPerl-run. See the downstream Fedora bug for more details: http://bugzilla.redhat.com/show_bug.cgi?id=484495 This causes the package set to fail to install. Should ConfigData be in BioPerl or BioPerl-run? Thanks, Alex From cjfields at illinois.edu Sun Feb 8 10:18:34 2009 From: cjfields at illinois.edu (Chris Fields) Date: Sun, 8 Feb 2009 09:18:34 -0600 Subject: [Bioperl-l] [ANNOUNCEMENT] Alpha 1.6 releases of BioPerl-run, BioPerl-db, BioPerl-network In-Reply-To: References: <151FBF26-C5FE-4388-875A-CC678B447318@illinois.edu> Message-ID: Alex, I don't see Bio::ConfigData with the last BioPerl core release on CPAN (1.6.0) or any of the BioPerl-*. Could you send me the actual file in question? chris On Feb 8, 2009, at 1:36 AM, Alex Lancaster wrote: >>>>>> "CF" == Chris Fields writes: > > CF> All, > CF> I would like to announce that the first alpha releases for > BioPerl- > CF> run, BioPerl-db, and BioPerl-network are available. These are > CF> designated as 1.005009_001, with a requirement for BioPerl 1.6 and > CF> higher (1.006000). > > [...] 
>
> CF> The archives can be downloaded from here:
>
> CF> BioPerl-run:
>
> CF> http://bioperl.org/DIST/BioPerl-run-1.5.9_1.tar.bz2
> CF> http://bioperl.org/DIST/BioPerl-run-1.5.9_1.tar.gz
> CF> http://bioperl.org/DIST/BioPerl-run-1.5.9_1.zip
>
> There seems to be some duplicated code that either should have been
> removed from either BioPerl or BioPerl-run, the files:
>
> /usr/lib/perl5/vendor_perl/5.10.0/Bio/ConfigData.pm
> /usr/share/man/man3/Bio::ConfigData.3pm.gz
>
> are currently installed by *both* BioPerl and BioPerl-run. See the
> downstream Fedora bug for more details:
>
> http://bugzilla.redhat.com/show_bug.cgi?id=484495
>
> This causes the package set to fail to install. Should ConfigData be
> in BioPerl or BioPerl-run?
>
> Thanks,
> Alex
>

From markus.liebscher at gmx.de Sun Feb 8 10:30:37 2009
From: markus.liebscher at gmx.de (manni122)
Date: Sun, 8 Feb 2009 07:30:37 -0800 (PST)
Subject: [Bioperl-l] Alignment optimization using degenerate code
Message-ID: <21868936.post@talk.nabble.com>

I have a programming problem that, once solved, might be worth adding to the BioPerl bundle. I want to align DNA sequences pairwise with as few gaps as possible. To achieve this I want to run a first round of alignment, then compare the aligned pair and search for opposing codons that can each be exchanged for another one coding for the same amino acid. This should increase their identity score. Does anyone have a good idea how to start? I'd appreciate any help with this!

Markus

--
View this message in context: http://www.nabble.com/Alignment-optimization-using-degenerate-code-tp21868936p21868936.html
Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.
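
The codon comparison described in the message above can be sketched in plain Perl. This is only an illustration under stated assumptions: the two sequences are already aligned, in frame and gap-free, and the standard genetic code applies. The sub name synonymous_mismatches and the example sequences are made up for this sketch; nothing here is part of BioPerl.

```perl
use strict;
use warnings;

# Standard genetic code, built from the conventional T, C, A, G ordering.
my @bases = qw(T C A G);
my $aas   = 'FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG';
my %codon_to_aa;
my $i = 0;
for my $b1 (@bases) {
    for my $b2 (@bases) {
        for my $b3 (@bases) {
            $codon_to_aa{"$b1$b2$b3"} = substr($aas, $i++, 1);
        }
    }
}

# Hypothetical helper: report codon positions where two aligned, in-frame
# sequences differ at the DNA level but encode the same amino acid.
sub synonymous_mismatches {
    my ($s1, $s2) = @_;
    die "sequences must have equal length\n" unless length($s1) == length($s2);
    my @hits;
    for (my $pos = 0; $pos + 3 <= length($s1); $pos += 3) {
        my $c1 = uc substr($s1, $pos, 3);
        my $c2 = uc substr($s2, $pos, 3);
        next if $c1 eq $c2;                        # identical codons
        my ($a1, $a2) = @codon_to_aa{$c1, $c2};
        next unless defined $a1 && defined $a2;    # gaps or ambiguity codes
        next unless $a1 eq $a2;                    # non-synonymous difference
        push @hits, { codon_index => $pos / 3, codons => [$c1, $c2], aa => $a1 };
    }
    return @hits;
}

my @hits = synonymous_mismatches('ATGGCTAAATTA', 'ATGGCCAAGCTG');
printf "codon %d: %s vs %s (both %s)\n",
    $_->{codon_index}, @{ $_->{codons} }, $_->{aa} for @hits;
```

The sketch only reports where synonymous differences occur; it does not alter either sequence and makes no claim that doing so would be biologically meaningful.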
From David.Messina at sbc.su.se Sun Feb 8 11:24:11 2009 From: David.Messina at sbc.su.se (Dave Messina) Date: Sun, 8 Feb 2009 17:24:11 +0100 Subject: [Bioperl-l] Alignment optimization using degenerate code In-Reply-To: <21868936.post@talk.nabble.com> References: <21868936.post@talk.nabble.com> Message-ID: <628aabb70902080824w72d897e9n7627d222383152dc@mail.gmail.com> Pairwise alignment algorithms, when run with typical parameters, typically minimize gaps anyway. I'm puzzled by your idea of changing the codons. If you're aligning DNA, you can't just change the sequence to improve the alignment. This sounds like a homework problem. I suggest you read up on the basics of pairwise sequence alignment algorithms and maybe even try doing the alignments by hand. That should lead to an understanding of how gaps are minimized. Dave From alexl at users.sourceforge.net Sun Feb 8 11:41:48 2009 From: alexl at users.sourceforge.net (Alex Lancaster) Date: Sun, 08 Feb 2009 09:41:48 -0700 Subject: [Bioperl-l] [ANNOUNCEMENT] Alpha 1.6 releases of BioPerl-run, BioPerl-db, BioPerl-network In-Reply-To: (Chris Fields's message of "Sun\, 8 Feb 2009 09\:18\:34 -0600") References: <151FBF26-C5FE-4388-875A-CC678B447318@illinois.edu> Message-ID: >>>>> "CF" == Chris Fields writes: CF> Alex, CF> I don't see Bio::ConfigData with the last BioPerl core release on CPAN CF> (1.6.0) or any of the BioPerl-*. Could you send me the actual file in CF> question? You are right that it doesn't appear to be contained in the source, but looking at the build log it appears to be dynamically generated in both packages. 
1) In the case of bioperl: Writing config notes to blib/lib/Bio/ConfigData.pm and later: Manifying blib/lib/Bio/ConfigData.pm ->blib/libdoc/Bio::ConfigData.3pm later still: Installing /builddir/build/BUILDROOT/perl-bioperl-1.6.0-1.fc11.noarch/usr/lib/perl5/vendor_perl/5.10.0/Bio/ConfigData.pm See full log here: http://kojipkgs.fedoraproject.org/packages/perl-bioperl/1.6.0/1.fc11/data/logs/noarch/build.log 2) Similar generation of the ConfigData is done in bioperl-run: http://kojipkgs.fedoraproject.org/packages/perl-bioperl-run/1.5.9/0.1.1.fc11/data/logs/noarch/build.log Alex From cjfields at illinois.edu Sun Feb 8 13:13:29 2009 From: cjfields at illinois.edu (Chris Fields) Date: Sun, 8 Feb 2009 12:13:29 -0600 Subject: [Bioperl-l] [ANNOUNCEMENT] Alpha 1.6 releases of BioPerl-run, BioPerl-db, BioPerl-network In-Reply-To: References: <151FBF26-C5FE-4388-875A-CC678B447318@illinois.edu> Message-ID: <647DE1BD-3A3C-4BC2-BC23-5E0C9DA67A57@illinois.edu> Alex, Odd. From what I'm reading that file shouldn't be added to the distribution or installed. It's generated on the fly by Module::Build for build and configuration only (triggered when using the 'features' option). Makes me wonder if this is an issue with Module::Build or our derived Bio::Root::Build, It does appear this is also popping up as a possible issue with Module::Build itself (included along with perl 5.10.0): http://www.nntp.perl.org/group/perl.module.build/2009/01/msg1809.html The obvious fix is to have it ignored upon installation. I'll try to get a fix up for the next alphas of BioPerl-run/db/network and the next point release of BioPerl. chris On Feb 8, 2009, at 10:41 AM, Alex Lancaster wrote: >>>>>> "CF" == Chris Fields writes: > > CF> Alex, > CF> I don't see Bio::ConfigData with the last BioPerl core release > on CPAN > CF> (1.6.0) or any of the BioPerl-*. Could you send me the actual > file in > CF> question? 
> > You are right that it doesn't appear to be contained in the source, > but looking at the build log it appears to be dynamically generated in > both packages. > > 1) In the case of bioperl: > > Writing config notes to blib/lib/Bio/ConfigData.pm > > and later: > > Manifying blib/lib/Bio/ConfigData.pm ->blib/libdoc/Bio::ConfigData.3pm > > later still: > > Installing /builddir/build/BUILDROOT/perl- > bioperl-1.6.0-1.fc11.noarch/usr/lib/perl5/vendor_perl/5.10.0/Bio/ > ConfigData.pm > > See full log here: > > http://kojipkgs.fedoraproject.org/packages/perl-bioperl/1.6.0/1.fc11/data/logs/noarch/build.log > > 2) Similar generation of the ConfigData is done in bioperl-run: > > http://kojipkgs.fedoraproject.org/packages/perl-bioperl-run/1.5.9/0.1.1.fc11/data/logs/noarch/build.log > > Alex > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From maj at fortinbras.us Sun Feb 8 14:12:16 2009 From: maj at fortinbras.us (Mark A. Jensen) Date: Sun, 8 Feb 2009 14:12:16 -0500 Subject: [Bioperl-l] [ANNOUNCEMENT] Alpha 1.6 releases of BioPerl-run, BioPerl-db, BioPerl-network In-Reply-To: <647DE1BD-3A3C-4BC2-BC23-5E0C9DA67A57@illinois.edu> References: <151FBF26-C5FE-4388-875A-CC678B447318@illinois.edu> <647DE1BD-3A3C-4BC2-BC23-5E0C9DA67A57@illinois.edu> Message-ID: <16E50308F79B4A0292DCC67D3058417A@NewLife> As I understand it, Foo::Bar::ConfigData is generated by M:B to provide persistent access to the configuration parameters after (perhaps long after) installation, via a script config_data that comes with M:B. Is there a collision between Bio::ConfigData created by the main distribution and Bio::ConfigData created by bioperl-run? 
See the docs at http://search.cpan.org/~ewilhelm/Module-Build-0.31012/lib/Module/Build/Authoring.pod#SAVING_CONFIGURATION_INFORMATION ----- Original Message ----- From: "Chris Fields" To: "Alex Lancaster" Cc: Sent: Sunday, February 08, 2009 1:13 PM Subject: Re: [Bioperl-l] [ANNOUNCEMENT] Alpha 1.6 releases of BioPerl-run,BioPerl-db, BioPerl-network > Alex, > > Odd. From what I'm reading that file shouldn't be added to the distribution > or installed. It's generated on the fly by Module::Build for build and > configuration only (triggered when using the 'features' option). Makes me > wonder if this is an issue with Module::Build or our derived > Bio::Root::Build, It does appear this is also popping up as a possible issue > with Module::Build itself (included along with perl 5.10.0): > > http://www.nntp.perl.org/group/perl.module.build/2009/01/msg1809.html > > The obvious fix is to have it ignored upon installation. I'll try to get a > fix up for the next alphas of BioPerl-run/db/network and the next point > release of BioPerl. > > chris > > On Feb 8, 2009, at 10:41 AM, Alex Lancaster wrote: > >>>>>>> "CF" == Chris Fields writes: >> >> CF> Alex, >> CF> I don't see Bio::ConfigData with the last BioPerl core release on CPAN >> CF> (1.6.0) or any of the BioPerl-*. Could you send me the actual file in >> CF> question? >> >> You are right that it doesn't appear to be contained in the source, >> but looking at the build log it appears to be dynamically generated in >> both packages. 
>> >> 1) In the case of bioperl: >> >> Writing config notes to blib/lib/Bio/ConfigData.pm >> >> and later: >> >> Manifying blib/lib/Bio/ConfigData.pm ->blib/libdoc/Bio::ConfigData.3pm >> >> later still: >> >> Installing /builddir/build/BUILDROOT/perl- >> bioperl-1.6.0-1.fc11.noarch/usr/lib/perl5/vendor_perl/5.10.0/Bio/ >> ConfigData.pm >> >> See full log here: >> >> http://kojipkgs.fedoraproject.org/packages/perl-bioperl/1.6.0/1.fc11/data/logs/noarch/build.log >> >> 2) Similar generation of the ConfigData is done in bioperl-run: >> >> http://kojipkgs.fedoraproject.org/packages/perl-bioperl-run/1.5.9/0.1.1.fc11/data/logs/noarch/build.log >> >> Alex >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > From cjfields at illinois.edu Sun Feb 8 17:41:23 2009 From: cjfields at illinois.edu (Chris Fields) Date: Sun, 8 Feb 2009 16:41:23 -0600 Subject: [Bioperl-l] [ANNOUNCEMENT] Alpha 1.6 releases of BioPerl-run, BioPerl-db, BioPerl-network In-Reply-To: <16E50308F79B4A0292DCC67D3058417A@NewLife> References: <151FBF26-C5FE-4388-875A-CC678B447318@illinois.edu> <647DE1BD-3A3C-4BC2-BC23-5E0C9DA67A57@illinois.edu> <16E50308F79B4A0292DCC67D3058417A@NewLife> Message-ID: Mark, Yes, saw that. It is a 'feature' of Module::Build that snuck by us, at least by me. 
We can take advantage of that at some future point, but at this time I think we should attempt turning off installing Bio::ConfigData completely, at least until we do the following: (1) determine whether we want to call it 'Bio::ConfigData' (I vote 'no'; I don't like anything installed by default to our core namespace when it should probably be in Bio::Root, such as Bio::Root::ConfigData) (2) determine whether we want bioperl core and the subdistributions to use the same or different ConfigData modules (I vote 'same', no need to get overly complicated at this point) (3) describe exactly what does and does not belong in any permanent BioPerl config file (I think they should only be relevant to the local build/test/ installation and shouldn't include possibly volatile DB configuration settings) Right now it just houses the feature-related settings used within all the Build.PL. Based on their current naming ('BioDBSeqFeature_mysql', 'BioDBSeqFeature_BDB') I don't trust those to always contain the same key/value pairing, so we need to probably standardize these for the next bioperl minor release (they're potentially too volatile and could be consolidated in a meaningful way). Anyway, until then I wouldn't rely on ConfigData's existence or the specific namespace Bio::ConfigData; I'll probably add it to MANIFEST.SKIP. chris On Feb 8, 2009, at 1:12 PM, Mark A. Jensen wrote: > As I understand it, Foo::Bar::ConfigData is generated by M:B to > provide persistent access to > the configuration parameters after (perhaps long after) > installation, via a script > config_data that comes with M:B. Is there a collision between > Bio::ConfigData created by > the main distribution and Bio::ConfigData created by bioperl-run? 
> See the docs at > http://search.cpan.org/~ewilhelm/Module-Build-0.31012/lib/Module/Build/Authoring.pod#SAVING_CONFIGURATION_INFORMATION > > ----- Original Message ----- From: "Chris Fields" > > To: "Alex Lancaster" > Cc: > Sent: Sunday, February 08, 2009 1:13 PM > Subject: Re: [Bioperl-l] [ANNOUNCEMENT] Alpha 1.6 releases of > BioPerl-run,BioPerl-db, BioPerl-network > > >> Alex, >> >> Odd. From what I'm reading that file shouldn't be added to the >> distribution or installed. It's generated on the fly by >> Module::Build for build and configuration only (triggered when >> using the 'features' option). Makes me wonder if this is an issue >> with Module::Build or our derived Bio::Root::Build, It does >> appear this is also popping up as a possible issue with >> Module::Build itself (included along with perl 5.10.0): >> >> http://www.nntp.perl.org/group/perl.module.build/2009/01/msg1809.html >> >> The obvious fix is to have it ignored upon installation. I'll try >> to get a fix up for the next alphas of BioPerl-run/db/network and >> the next point release of BioPerl. >> >> chris >> >> On Feb 8, 2009, at 10:41 AM, Alex Lancaster wrote: >> >>>>>>>> "CF" == Chris Fields writes: >>> >>> CF> Alex, >>> CF> I don't see Bio::ConfigData with the last BioPerl core >>> release on CPAN >>> CF> (1.6.0) or any of the BioPerl-*. Could you send me the >>> actual file in >>> CF> question? >>> >>> You are right that it doesn't appear to be contained in the source, >>> but looking at the build log it appears to be dynamically >>> generated in >>> both packages. >>> >>> 1) In the case of bioperl: >>> >>> Writing config notes to blib/lib/Bio/ConfigData.pm >>> >>> and later: >>> >>> Manifying blib/lib/Bio/ConfigData.pm ->blib/libdoc/Bio::ConfigData. 
>>> 3pm >>> >>> later still: >>> >>> Installing /builddir/build/BUILDROOT/perl- >>> bioperl-1.6.0-1.fc11.noarch/usr/lib/perl5/vendor_perl/5.10.0/Bio/ >>> ConfigData.pm >>> >>> See full log here: >>> >>> http://kojipkgs.fedoraproject.org/packages/perl-bioperl/1.6.0/1.fc11/data/logs/noarch/build.log >>> >>> 2) Similar generation of the ConfigData is done in bioperl-run: >>> >>> http://kojipkgs.fedoraproject.org/packages/perl-bioperl-run/1.5.9/0.1.1.fc11/data/logs/noarch/build.log >>> >>> Alex >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From maj at fortinbras.us Sun Feb 8 19:17:22 2009 From: maj at fortinbras.us (Mark A. Jensen) Date: Sun, 8 Feb 2009 19:17:22 -0500 Subject: [Bioperl-l] [ANNOUNCEMENT] Alpha 1.6 releases of BioPerl-run, BioPerl-db, BioPerl-network In-Reply-To: References: <151FBF26-C5FE-4388-875A-CC678B447318@illinois.edu> <647DE1BD-3A3C-4BC2-BC23-5E0C9DA67A57@illinois.edu> <16E50308F79B4A0292DCC67D3058417A@NewLife> Message-ID: <37D2144A5FDF4826B0211B7A56EE987D@NewLife> All your points sound reasonable; certainly without a hard look it would be most expedient just to zap the module. I would say that 1) it would be pretty useful to have all the BioPerl config data in one place, and 2) M:B's solution is a way to do this that seems pretty convenient for the installation of a single module, or a single distribution, but might not scale well to BioPerl's multi-distribution complexity, at least not without some help. If there are any Module::Build hackers listening, hope they'll chime in. 
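
For readers unfamiliar with the mechanism under discussion: a Module::Build-generated ConfigData module exposes class methods such as config($name), feature($name), config_names and feature_names. The sketch below uses a hypothetical stand-in package (My::ConfigData, with made-up values) so that it runs without any BioPerl installation; the two feature names are the ones mentioned earlier in this thread, and their values here are assumptions.

```perl
use strict;
use warnings;

# Hypothetical stand-in mimicking the accessors of a generated ConfigData
# module. The keys and values are invented for illustration only.
package My::ConfigData;

my %config   = ( example_key => 'example_value' );
my %features = ( BioDBSeqFeature_mysql => 0, BioDBSeqFeature_BDB => 1 );

sub config        { my ($class, $key) = @_; return $config{$key} }
sub feature       { my ($class, $key) = @_; return $features{$key} }
sub config_names  { return sort keys %config }
sub feature_names { return sort keys %features }

package main;

# Query the stored settings the way client code would query the real module.
for my $name (My::ConfigData->feature_names) {
    printf "%-24s %s\n", $name,
        My::ConfigData->feature($name) ? 'enabled' : 'disabled';
}
for my $name (My::ConfigData->config_names) {
    printf "%-24s %s\n", $name, My::ConfigData->config($name);
}
```

Because every distribution would query the same accessors, a single shared module (whatever its final namespace) would keep client code identical across BioPerl core and the sub-distributions.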
If BP built its own config module, we could divide the "static" and "dynamic" config information explicitly according to whatever is decided, and provide that information programmatically; I think I can imagine scenarios (say a failover) in which one might want to check what the current db settings are in a nice standardized way and then decide what to do next. Being vague here intentionally, as I'm just getting to understand the machinery and techniques of configs and distributions. (Could the issue benefit from a wiki page?) MAJ ----- Original Message ----- From: "Chris Fields" To: "Mark A. Jensen" Cc: "Alex Lancaster" ; Sent: Sunday, February 08, 2009 5:41 PM Subject: Re: [Bioperl-l] [ANNOUNCEMENT] Alpha 1.6 releases of BioPerl-run, BioPerl-db, BioPerl-network > Mark, > > Yes, saw that. It is a 'feature' of Module::Build that snuck by us, at least > by me. We can take advantage of that at some future point, but at this time > I think we should attempt turning off installing Bio::ConfigData completely, > at least until we do the following: > > (1) determine whether we want to call it 'Bio::ConfigData' > (I vote 'no'; I don't like anything installed by default to our core > namespace when it should probably be in Bio::Root, such as > Bio::Root::ConfigData) > > (2) determine whether we want bioperl core and the subdistributions to use > the same or different ConfigData modules > (I vote 'same', no need to get overly complicated at this point) > > (3) describe exactly what does and does not belong in any permanent BioPerl > config file > (I think they should only be relevant to the local build/test/ installation > and shouldn't include possibly volatile DB configuration settings) > > Right now it just houses the feature-related settings used within all the > Build.PL.
Based on their current naming ('BioDBSeqFeature_mysql', > 'BioDBSeqFeature_BDB') I don't trust those to always contain the same > key/value pairing, so we need to probably standardize these for the next > bioperl minor release (they're potentially too volatile and could be > consolidated in a meaningful way). > > Anyway, until then I wouldn't rely on ConfigData's existence or the specific > namespace Bio::ConfigData; I'll probably add it to MANIFEST.SKIP. > > chris > > On Feb 8, 2009, at 1:12 PM, Mark A. Jensen wrote: > >> As I understand it, Foo::Bar::ConfigData is generated by M:B to provide >> persistent access to >> the configuration parameters after (perhaps long after) installation, via a >> script >> config_data that comes with M:B. Is there a collision between >> Bio::ConfigData created by >> the main distribution and Bio::ConfigData created by bioperl-run? >> See the docs at >> http://search.cpan.org/~ewilhelm/Module-Build-0.31012/lib/Module/Build/Authoring.pod#SAVING_CONFIGURATION_INFORMATION >> >> ----- Original Message ----- From: "Chris Fields" > > >> To: "Alex Lancaster" >> Cc: >> Sent: Sunday, February 08, 2009 1:13 PM >> Subject: Re: [Bioperl-l] [ANNOUNCEMENT] Alpha 1.6 releases of >> BioPerl-run,BioPerl-db, BioPerl-network >> >> >>> Alex, >>> >>> Odd. From what I'm reading that file shouldn't be added to the >>> distribution or installed. It's generated on the fly by Module::Build for >>> build and configuration only (triggered when using the 'features' option). >>> Makes me wonder if this is an issue with Module::Build or our derived >>> Bio::Root::Build, It does appear this is also popping up as a possible >>> issue with Module::Build itself (included along with perl 5.10.0): >>> >>> http://www.nntp.perl.org/group/perl.module.build/2009/01/msg1809.html >>> >>> The obvious fix is to have it ignored upon installation. I'll try to get >>> a fix up for the next alphas of BioPerl-run/db/network and the next point >>> release of BioPerl. 
>>> >>> chris >>> >>> On Feb 8, 2009, at 10:41 AM, Alex Lancaster wrote: >>> >>>>>>>>> "CF" == Chris Fields writes: >>>> >>>> CF> Alex, >>>> CF> I don't see Bio::ConfigData with the last BioPerl core release on >>>> CPAN >>>> CF> (1.6.0) or any of the BioPerl-*. Could you send me the actual file >>>> in >>>> CF> question? >>>> >>>> You are right that it doesn't appear to be contained in the source, >>>> but looking at the build log it appears to be dynamically generated in >>>> both packages. >>>> >>>> 1) In the case of bioperl: >>>> >>>> Writing config notes to blib/lib/Bio/ConfigData.pm >>>> >>>> and later: >>>> >>>> Manifying blib/lib/Bio/ConfigData.pm ->blib/libdoc/Bio::ConfigData. 3pm >>>> >>>> later still: >>>> >>>> Installing /builddir/build/BUILDROOT/perl- >>>> bioperl-1.6.0-1.fc11.noarch/usr/lib/perl5/vendor_perl/5.10.0/Bio/ >>>> ConfigData.pm >>>> >>>> See full log here: >>>> >>>> http://kojipkgs.fedoraproject.org/packages/perl-bioperl/1.6.0/1.fc11/data/logs/noarch/build.log >>>> >>>> 2) Similar generation of the ConfigData is done in bioperl-run: >>>> >>>> http://kojipkgs.fedoraproject.org/packages/perl-bioperl-run/1.5.9/0.1.1.fc11/data/logs/noarch/build.log >>>> >>>> Alex >>>> _______________________________________________ >>>> Bioperl-l mailing list >>>> Bioperl-l at lists.open-bio.org >>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > From cjfields at illinois.edu Sun Feb 8 22:35:28 2009 From: cjfields at illinois.edu (Chris Fields) Date: Sun, 8 Feb 2009 21:35:28 -0600 Subject: [Bioperl-l] [ANNOUNCEMENT] Alpha 1.6 releases of BioPerl-run, BioPerl-db, BioPerl-network In-Reply-To: 
<37D2144A5FDF4826B0211B7A56EE987D@NewLife> References: <151FBF26-C5FE-4388-875A-CC678B447318@illinois.edu> <647DE1BD-3A3C-4BC2-BC23-5E0C9DA67A57@illinois.edu> <16E50308F79B4A0292DCC67D3058417A@NewLife> <37D2144A5FDF4826B0211B7A56EE987D@NewLife> Message-ID: On Feb 8, 2009, at 6:17 PM, Mark A. Jensen wrote: > All your points sound reasonable; certainly without a hard look it > would > be most expedient just to zap the module. I would say that 1) it > would be > pretty useful to have all the BioPerl config data in one place, and > 2) M:B's solution is a way to do this that seems pretty convenient > for the installation of a single module, or a single distribution, but > might not scale well to BioPerl's multi-distribution > complexity, at least not without some help. If there are any > Module::Build > hackers listening, hope they'll chime in. I believe one can call (and modify, if permissions allow) the ConfigData file, so it's feasible to just modify the already-installed > If BP built its own config > module, we could divide the "static" and "dynamic" config information > explicitly according to whatever is decided, and provide that > infomation > programmatically; I think I can imagine scenarios (say a failover) > that > one might want to check what the current db settings are in a nice > standardized > way and then decide what to next. Being vague here intentionally, as > I'm > just getting to understand the machinery and techniques of configs > and distributions. > > (Could the issue benefit from a wiki page?) Yes. I will work on getting the current build system going and then go from there. It'll need some more discussion so I'll start up a new thread then. Alex, I'll work on getting the file removed for the next BioPerl-run/ db/network alpha (and it'll be removed from core for 1.6.1). Should be out in the next day or two. 
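As a rough illustration of the programmatic access discussed in this thread, here is a minimal sketch that lists the features recorded in a Module::Build-generated ConfigData module. It assumes the module exposes the `feature_names` and `feature` class methods described in the Module::Build documentation; the helper name `describe_features` is made up for this example, and `Bio::ConfigData` is only queried if it happens to be installed.

```perl
use strict;
use warnings;

# Hypothetical helper: return "name => enabled|disabled" lines for every
# feature recorded in a Module::Build-generated ConfigData module, or an
# empty list when that module is not installed.
sub describe_features {
    my ($module) = @_;
    die "not a module name: $module\n"
        unless $module =~ /\A\w+(?:::\w+)*\z/;

    # String eval so the module name can be supplied at run time.
    return () unless eval "require $module; 1";

    my @names = sort { $a cmp $b } $module->feature_names;
    return map {
        sprintf '%s => %s', $_, $module->feature($_) ? 'enabled' : 'disabled'
    } @names;
}

# Prints nothing if Bio::ConfigData is absent, which is the expected
# situation once the module is no longer installed.
print "$_\n" for describe_features('Bio::ConfigData');
```

Because the lookup is wrapped in `eval`, code written this way would keep working whether or not the file is removed from future releases, which matches the advice not to rely on its existence.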
-c

From gaospecial at gmail.com Mon Feb 9 04:01:35 2009
From: gaospecial at gmail.com (spring gao)
Date: Mon, 9 Feb 2009 17:01:35 +0800
Subject: [Bioperl-l] bp_biblio.pl doesn't work
Message-ID: <5b2271350902090101n2887644ah84512a324496a4c5@mail.gmail.com>

Hi, all,

I have a question about the usage of bp_biblio.pl. When I execute the bp_biblio.pl example, it returns the following errors. Does anyone know how to fix this problem? Thank you!

Information: bp_biblio.pl -v 1.006

Shell OUTPUT:

bp_biblio.pl - -find java -attrs abstract -find perl
Looking for 'java' in attributes 'abstract'...
------------- EXCEPTION -------------
MSG: --- SOAP FAULT ---
soapenv:Server.userException embl.ebi.BibShare.BQSException: An empty query. It may happen because of using non-existing attributes.
STACK Bio::DB::Biblio::soap::__ANON__ /usr/local/share/perl/5.8.8/Bio/DB/Biblio/soap.pm:109
STACK SOAP::Lite::call /usr/share/perl5/SOAP/Lite.pm:3412
STACK SOAP::Lite::__ANON__ /usr/share/perl5/SOAP/Lite.pm:3377
STACK Bio::DB::Biblio::soap::find /usr/local/share/perl/5.8.8/Bio/DB/Biblio/soap.pm:393
STACK main::_find /usr/local/bin/bp_biblio.pl:214
STACK toplevel /usr/local/bin/bp_biblio.pl:107
-------------------------------------

From lengjingmao at gmail.com Mon Feb 9 06:57:26 2009
From: lengjingmao at gmail.com (kevin fan)
Date: Mon, 9 Feb 2009 12:57:26 +0100
Subject: [Bioperl-l] met a problem when run Bio::Tools::Run::Phylo::PAML::Yn00
Message-ID: <517072a20902090357j429da547l873b1841841460e4@mail.gmail.com>

Dear all,

I am using Bio::Tools::Run::Phylo::PAML::Yn00 to calculate the Ka/Ks ratio of pairwise sequences.
bioperl version is 1.6, and PAML version is 4.2 when i run the Bio::Tools::Run::Phylo::PAML::Yn00 , it warns: -------------------- WARNING --------------------- MSG: There was an error - see error_string for the program output --------------------------------------------------- ------------- EXCEPTION: Bio::Root::NotImplemented ------------- MSG: Unknown format of PAML output did not see seqtype STACK: Error::throw STACK: Bio::Root::Root::throw /usr/local/share/perl/5.10.0/Bio/Root/Root.pm:359 STACK: Bio::Tools::Phylo::PAML::_parse_summary /usr/local/share/perl/5.10.0/Bio/Tools/Phylo/PAML.pm:441 STACK: Bio::Tools::Phylo::PAML::next_result /usr/local/share/perl/5.10.0/Bio/Tools/Phylo/PAML.pm:257 STACK: kaks.pl:10 ---------------------------------------------------------------- From David.Messina at sbc.su.se Mon Feb 9 07:38:32 2009 From: David.Messina at sbc.su.se (Dave Messina) Date: Mon, 9 Feb 2009 13:38:32 +0100 Subject: [Bioperl-l] met a problem when run Bio::Tools::Run::Phylo::PAML::Yn00 In-Reply-To: <517072a20902090357j429da547l873b1841841460e4@mail.gmail.com> References: <517072a20902090357j429da547l873b1841841460e4@mail.gmail.com> Message-ID: <628aabb70902090438qcb0f1c3t9dc643b2715c4f4c@mail.gmail.com> Hi Kevin, Could you enter this on bugzilla, our bug tracker, and please attach your script and a small input file so we can try to reproduce the error? Thanks, Dave From cjfields at illinois.edu Mon Feb 9 08:11:32 2009 From: cjfields at illinois.edu (Chris Fields) Date: Mon, 9 Feb 2009 07:11:32 -0600 Subject: [Bioperl-l] met a problem when run Bio::Tools::Run::Phylo::PAML::Yn00 In-Reply-To: <517072a20902090357j429da547l873b1841841460e4@mail.gmail.com> References: <517072a20902090357j429da547l873b1841841460e4@mail.gmail.com> Message-ID: <79F8333F-7EE7-4719-85D8-D4A2498B838A@illinois.edu> We do not adequately support PAML v4.0 and up at this time. Much of this is due to PAML's constantly shifting output (every release seems to break our parsers). 
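Until the parser catches up with newer PAML releases, a script can at least fail gracefully rather than die with the stack trace shown above. The following is a minimal sketch, assuming the `alignment`, `run`, and `error_string` methods of Bio::Tools::Run::Phylo::PAML::Yn00 and the `get_seqs` and `get_MLmatrix` methods of the result object behave as described in the module documentation. The helper name `yn00_kaks` is made up for this example, and the alignment file is whatever you pass on the command line.

```perl
use strict;
use warnings;

# Hypothetical helper: run yn00 on one alignment file and collect the
# pairwise estimates. Returns (\@rows, undef) on success, or
# (undef, $message) on any failure, including PAML output that the
# BioPerl parser does not recognise.
sub yn00_kaks {
    my ($aln_file, $format) = @_;
    my @rows;
    my $ok = eval {
        require Bio::AlignIO;
        require Bio::Tools::Run::Phylo::PAML::Yn00;

        my $aln = Bio::AlignIO->new(-file => $aln_file, -format => $format)->next_aln
            or die "no alignment could be read from $aln_file\n";

        my $yn = Bio::Tools::Run::Phylo::PAML::Yn00->new();
        $yn->alignment($aln);
        my ($rc, $parser) = $yn->run();
        die 'yn00 failed: ' . ($yn->error_string || 'no details') . "\n"
            unless $rc && $parser;

        # Parser exceptions (such as "did not see seqtype") surface here.
        while (my $result = $parser->next_result) {
            my @otus   = $result->get_seqs;
            my $matrix = $result->get_MLmatrix;
            for my $i (0 .. $#otus - 1) {
                for my $j ($i + 1 .. $#otus) {
                    my $cell = $matrix->[$i][$j] or next;
                    push @rows, {
                        pair  => join(' vs ', map { $_->display_id } @otus[$i, $j]),
                        dN    => $cell->{dN},
                        dS    => $cell->{dS},
                        omega => $cell->{omega},
                    };
                }
            }
        }
        1;
    };
    return (undef, "$@" || 'unknown error') unless $ok;
    return (\@rows, undef);
}

# Usage: perl kaks.pl alignment.aln [format]
if (@ARGV) {
    my ($rows, $err) = yn00_kaks($ARGV[0], $ARGV[1] || 'clustalw');
    if (defined $err) {
        warn "Could not get Ka/Ks estimates: $err";
    }
    else {
        for my $row (@$rows) {
            print join("\t", map { defined $_ ? $_ : 'NA' }
                       @{$row}{qw(pair dN dS omega)}), "\n";
        }
    }
}
```

This does not make PAML 4.x output parseable; it only turns the exception into an error message the caller can act on, for example by falling back to an older PAML binary.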
chris On Feb 9, 2009, at 5:57 AM, kevin fan wrote: > Dear all, > i am using Bio::Tools::Run::Phylo::PAML::Yn00 to calculate Ka/Ks > ratio of > pairwise sequences. > bioperl version is 1.6, and PAML version is 4.2 > > when i run the Bio::Tools::Run::Phylo::PAML::Yn00 , it warns: > > -------------------- WARNING --------------------- > MSG: There was an error - see error_string for the program output > --------------------------------------------------- > > ------------- EXCEPTION: Bio::Root::NotImplemented ------------- > MSG: Unknown format of PAML output did not see seqtype > STACK: Error::throw > STACK: Bio::Root::Root::throw > /usr/local/share/perl/5.10.0/Bio/Root/Root.pm:359 > STACK: Bio::Tools::Phylo::PAML::_parse_summary > /usr/local/share/perl/5.10.0/Bio/Tools/Phylo/PAML.pm:441 > STACK: Bio::Tools::Phylo::PAML::next_result > /usr/local/share/perl/5.10.0/Bio/Tools/Phylo/PAML.pm:257 > STACK: kaks.pl:10 > ---------------------------------------------------------------- > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From David.Messina at sbc.su.se Mon Feb 9 08:36:32 2009 From: David.Messina at sbc.su.se (Dave Messina) Date: Mon, 9 Feb 2009 14:36:32 +0100 Subject: [Bioperl-l] met a problem when run Bio::Tools::Run::Phylo::PAML::Yn00 In-Reply-To: <517072a20902090503t77e3e73bh759d31c657a6cb07@mail.gmail.com> References: <517072a20902090357j429da547l873b1841841460e4@mail.gmail.com> <628aabb70902090438qcb0f1c3t9dc643b2715c4f4c@mail.gmail.com> <517072a20902090503t77e3e73bh759d31c657a6cb07@mail.gmail.com> Message-ID: <628aabb70902090536t778e2ad5me230157e953c81e1@mail.gmail.com> Hi Kevin, i have pasted my problem on the bugzilla. > > Great. Could you also attach part of your input file, Pool1_c100Pool2_c496.aln (or all of it if it's small)? Or if you'd prefer, just a simple test input file that can reproduce the problem? 
Thanks, Dave PS - please remember to 'Reply All' so that all of the email conversation is kept on the list so everyone can follow along and we can record it in the archives. From lengjingmao at gmail.com Mon Feb 9 08:39:46 2009 From: lengjingmao at gmail.com (kevin fan) Date: Mon, 9 Feb 2009 14:39:46 +0100 Subject: [Bioperl-l] met a problem when run Bio::Tools::Run::Phylo::PAML::Yn00 In-Reply-To: <79F8333F-7EE7-4719-85D8-D4A2498B838A@illinois.edu> References: <517072a20902090357j429da547l873b1841841460e4@mail.gmail.com> <79F8333F-7EE7-4719-85D8-D4A2498B838A@illinois.edu> Message-ID: <517072a20902090539l71682be7q2e83c4582922a4d9@mail.gmail.com> hi, chris thank you for the reply. i will use PAML 3.15 instead cheers, kevin 2009/2/9 Chris Fields > We do not adequately support PAML v4.0 and up at this time. Much of this > is due to PAML's constantly shifting output (every release seems to break > our parsers). > > chris > > > On Feb 9, 2009, at 5:57 AM, kevin fan wrote: > > Dear all, >> i am using Bio::Tools::Run::Phylo::PAML::Yn00 to calculate Ka/Ks ratio of >> pairwise sequences. 
>> bioperl version is 1.6, and PAML version is 4.2 >> >> when i run the Bio::Tools::Run::Phylo::PAML::Yn00 , it warns: >> >> -------------------- WARNING --------------------- >> MSG: There was an error - see error_string for the program output >> --------------------------------------------------- >> >> ------------- EXCEPTION: Bio::Root::NotImplemented ------------- >> MSG: Unknown format of PAML output did not see seqtype >> STACK: Error::throw >> STACK: Bio::Root::Root::throw >> /usr/local/share/perl/5.10.0/Bio/Root/Root.pm:359 >> STACK: Bio::Tools::Phylo::PAML::_parse_summary >> /usr/local/share/perl/5.10.0/Bio/Tools/Phylo/PAML.pm:441 >> STACK: Bio::Tools::Phylo::PAML::next_result >> /usr/local/share/perl/5.10.0/Bio/Tools/Phylo/PAML.pm:257 >> STACK: kaks.pl:10 >> ---------------------------------------------------------------- >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > > From cjfields at illinois.edu Mon Feb 9 08:45:59 2009 From: cjfields at illinois.edu (Chris Fields) Date: Mon, 9 Feb 2009 07:45:59 -0600 Subject: [Bioperl-l] met a problem when run Bio::Tools::Run::Phylo::PAML::Yn00 In-Reply-To: <517072a20902090539l71682be7q2e83c4582922a4d9@mail.gmail.com> References: <517072a20902090357j429da547l873b1841841460e4@mail.gmail.com> <79F8333F-7EE7-4719-85D8-D4A2498B838A@illinois.edu> <517072a20902090539l71682be7q2e83c4582922a4d9@mail.gmail.com> Message-ID: <63C69C7C-2822-434D-A75C-652DD3D54118@illinois.edu> kevin, As Dave mentioned, go ahead and attach any output from PAML 4.2 to the bug report. I want to make it a priority to get the PAMLM parser working for the 1.6 release series (if not 1.6.1, maybe 1.6.2). chris On Feb 9, 2009, at 7:39 AM, kevin fan wrote: > > hi, chris > thank you for the reply. i will use PAML 3.15 instead > > cheers, > kevin > 2009/2/9 Chris Fields > We do not adequately support PAML v4.0 and up at this time. 
Much of > this is due to PAML's constantly shifting output (every release > seems to break our parsers). > > chris > > > On Feb 9, 2009, at 5:57 AM, kevin fan wrote: > > Dear all, > i am using Bio::Tools::Run::Phylo::PAML::Yn00 to calculate Ka/Ks > ratio of > pairwise sequences. > bioperl version is 1.6, and PAML version is 4.2 > > when i run the Bio::Tools::Run::Phylo::PAML::Yn00 , it warns: > > -------------------- WARNING --------------------- > MSG: There was an error - see error_string for the program output > --------------------------------------------------- > > ------------- EXCEPTION: Bio::Root::NotImplemented ------------- > MSG: Unknown format of PAML output did not see seqtype > STACK: Error::throw > STACK: Bio::Root::Root::throw > /usr/local/share/perl/5.10.0/Bio/Root/Root.pm:359 > STACK: Bio::Tools::Phylo::PAML::_parse_summary > /usr/local/share/perl/5.10.0/Bio/Tools/Phylo/PAML.pm:441 > STACK: Bio::Tools::Phylo::PAML::next_result > /usr/local/share/perl/5.10.0/Bio/Tools/Phylo/PAML.pm:257 > STACK: kaks.pl:10 > ---------------------------------------------------------------- > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > From cjfields at illinois.edu Mon Feb 9 08:50:56 2009 From: cjfields at illinois.edu (Chris Fields) Date: Mon, 9 Feb 2009 07:50:56 -0600 Subject: [Bioperl-l] met a problem when run Bio::Tools::Run::Phylo::PAML::Yn00 In-Reply-To: <63C69C7C-2822-434D-A75C-652DD3D54118@illinois.edu> References: <517072a20902090357j429da547l873b1841841460e4@mail.gmail.com> <79F8333F-7EE7-4719-85D8-D4A2498B838A@illinois.edu> <517072a20902090539l71682be7q2e83c4582922a4d9@mail.gmail.com> <63C69C7C-2822-434D-A75C-652DD3D54118@illinois.edu> Message-ID: On Feb 9, 2009, at 7:45 AM, Chris Fields wrote: > kevin, > > As Dave mentioned, go ahead and attach any output from PAML 4.2 to > the bug report. 
I want to make it a priority to get the PAMLM > parser working for the 1.6 release series (if not 1.6.1, maybe 1.6.2). > > chris PAML, not 'PAMLM' (fat fingers). -c From lengjingmao at gmail.com Mon Feb 9 09:40:16 2009 From: lengjingmao at gmail.com (kevin fan) Date: Mon, 9 Feb 2009 15:40:16 +0100 Subject: [Bioperl-l] met a problem when run Bio::Tools::Run::Phylo::PAML::Yn00 In-Reply-To: <63C69C7C-2822-434D-A75C-652DD3D54118@illinois.edu> References: <517072a20902090357j429da547l873b1841841460e4@mail.gmail.com> <79F8333F-7EE7-4719-85D8-D4A2498B838A@illinois.edu> <517072a20902090539l71682be7q2e83c4582922a4d9@mail.gmail.com> <63C69C7C-2822-434D-A75C-652DD3D54118@illinois.edu> Message-ID: <517072a20902090640p65d513bej403a43fa40da629a@mail.gmail.com> hi, Chris and Dave, i will try to get permission from the boss. since the sequences i mentioned are from a new species. kevin. 2009/2/9 Chris Fields > kevin, > > As Dave mentioned, go ahead and attach any output from PAML 4.2 to the bug > report. I want to make it a priority to get the PAMLM parser working for > the 1.6 release series (if not 1.6.1, maybe 1.6.2). > > chris > > > On Feb 9, 2009, at 7:39 AM, kevin fan wrote: > > >> hi, chris >> thank you for the reply. i will use PAML 3.15 instead >> >> cheers, >> kevin >> 2009/2/9 Chris Fields >> We do not adequately support PAML v4.0 and up at this time. Much of this >> is due to PAML's constantly shifting output (every release seems to break >> our parsers). >> >> chris >> >> >> On Feb 9, 2009, at 5:57 AM, kevin fan wrote: >> >> Dear all, >> i am using Bio::Tools::Run::Phylo::PAML::Yn00 to calculate Ka/Ks ratio of >> pairwise sequences. 
>> bioperl version is 1.6, and PAML version is 4.2 >> >> when i run the Bio::Tools::Run::Phylo::PAML::Yn00 , it warns: >> >> -------------------- WARNING --------------------- >> MSG: There was an error - see error_string for the program output >> --------------------------------------------------- >> >> ------------- EXCEPTION: Bio::Root::NotImplemented ------------- >> MSG: Unknown format of PAML output did not see seqtype >> STACK: Error::throw >> STACK: Bio::Root::Root::throw >> /usr/local/share/perl/5.10.0/Bio/Root/Root.pm:359 >> STACK: Bio::Tools::Phylo::PAML::_parse_summary >> /usr/local/share/perl/5.10.0/Bio/Tools/Phylo/PAML.pm:441 >> STACK: Bio::Tools::Phylo::PAML::next_result >> /usr/local/share/perl/5.10.0/Bio/Tools/Phylo/PAML.pm:257 >> STACK: kaks.pl:10 >> ---------------------------------------------------------------- >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> >> > From maj at fortinbras.us Mon Feb 9 09:54:44 2009 From: maj at fortinbras.us (Mark A. Jensen) Date: Mon, 9 Feb 2009 09:54:44 -0500 Subject: [Bioperl-l] met a problem when runBio::Tools::Run::Phylo::PAML::Yn00 In-Reply-To: <517072a20902090640p65d513bej403a43fa40da629a@mail.gmail.com> References: <517072a20902090357j429da547l873b1841841460e4@mail.gmail.com><79F8333F-7EE7-4719-85D8-D4A2498B838A@illinois.edu><517072a20902090539l71682be7q2e83c4582922a4d9@mail.gmail.com><63C69C7C-2822-434D-A75C-652DD3D54118@illinois.edu> <517072a20902090640p65d513bej403a43fa40da629a@mail.gmail.com> Message-ID: <4C522485444B4B8DAF83D0E4E0A684FB@NewLife> Hi Kevin- maybe you can avoid the proprietary data problem-- can you reproduce the error using the example data that comes with the PAML distribution? 
That would be great, since it looks like our tests are based on (earlier version's) output from those examples, and we could do a direct comparison to what BioPerl thinks is right- cheers MAJ ----- Original Message ----- From: "kevin fan" To: "Chris Fields" Cc: Sent: Monday, February 09, 2009 9:40 AM Subject: Re: [Bioperl-l] met a problem when runBio::Tools::Run::Phylo::PAML::Yn00 > hi, Chris and Dave, > > i will try to get permission from the boss. since the sequences i mentioned > are from a new species. > > kevin. > > 2009/2/9 Chris Fields > >> kevin, >> >> As Dave mentioned, go ahead and attach any output from PAML 4.2 to the bug >> report. I want to make it a priority to get the PAMLM parser working for >> the 1.6 release series (if not 1.6.1, maybe 1.6.2). >> >> chris >> >> >> On Feb 9, 2009, at 7:39 AM, kevin fan wrote: >> >> >>> hi, chris >>> thank you for the reply. i will use PAML 3.15 instead >>> >>> cheers, >>> kevin >>> 2009/2/9 Chris Fields >>> We do not adequately support PAML v4.0 and up at this time. Much of this >>> is due to PAML's constantly shifting output (every release seems to break >>> our parsers). >>> >>> chris >>> >>> >>> On Feb 9, 2009, at 5:57 AM, kevin fan wrote: >>> >>> Dear all, >>> i am using Bio::Tools::Run::Phylo::PAML::Yn00 to calculate Ka/Ks ratio of >>> pairwise sequences. 
>>> bioperl version is 1.6, and PAML version is 4.2 >>> >>> when i run the Bio::Tools::Run::Phylo::PAML::Yn00 , it warns: >>> >>> -------------------- WARNING --------------------- >>> MSG: There was an error - see error_string for the program output >>> --------------------------------------------------- >>> >>> ------------- EXCEPTION: Bio::Root::NotImplemented ------------- >>> MSG: Unknown format of PAML output did not see seqtype >>> STACK: Error::throw >>> STACK: Bio::Root::Root::throw >>> /usr/local/share/perl/5.10.0/Bio/Root/Root.pm:359 >>> STACK: Bio::Tools::Phylo::PAML::_parse_summary >>> /usr/local/share/perl/5.10.0/Bio/Tools/Phylo/PAML.pm:441 >>> STACK: Bio::Tools::Phylo::PAML::next_result >>> /usr/local/share/perl/5.10.0/Bio/Tools/Phylo/PAML.pm:257 >>> STACK: kaks.pl:10 >>> ---------------------------------------------------------------- >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >>> >>> >> > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > From MEC at stowers.org Mon Feb 9 09:50:46 2009 From: MEC at stowers.org (Cook, Malcolm) Date: Mon, 9 Feb 2009 08:50:46 -0600 Subject: [Bioperl-l] SeqIO-based parser for Vector NTI sequence files In-Reply-To: <48FE4051.2010700@accelrys.com> References: <48FE4051.2010700@accelrys.com> Message-ID: Scott, What do you expect to extract from the COMMENT lines? 
Malcolm Cook Database Applications Manager - Bioinformatics Stowers Institute for Medical Research - Kansas City, Missouri -----Original Message----- From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Scott Markel Sent: Tuesday, October 21, 2008 3:49 PM To: bioperl-ml Cc: smarkel at accelrys.com Subject: [Bioperl-l] SeqIO-based parser for Vector NTI sequence files I'm looking for a BioPerl-related solution to parsing Vector NTI sequence files. The genbank.pm parser will work, but it doesn't parse the COMMENT lines beyond grabbing the simple string value, so it misses all of the added information in those lines. If you know of any existing code, I'd be interesting in hearing about it. I checked BioPerl, BioJava, and EMBOSS documentation. I also checked the Invitrogen web site. Scott -- Scott Markel, Ph.D. Principal Bioinformatics Architect email: smarkel at accelrys.com Accelrys (SciTegic R&D) mobile: +1 858 205 3653 10188 Telesis Court, Suite 100 voice: +1 858 799 5603 San Diego, CA 92121 fax: +1 858 799 5222 USA web: http://www.accelrys.com http://www.linkedin.com/in/smarkel Board of Directors: International Society for Computational Biology Co-chair: ISCB Publications Committee Associate Editor: PLoS Computational Biology Editorial Board: Briefings in Bioinformatics _______________________________________________ Bioperl-l mailing list Bioperl-l at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/bioperl-l From jacoby at purdue.edu Mon Feb 9 11:54:59 2009 From: jacoby at purdue.edu (Dave Jacoby) Date: Mon, 09 Feb 2009 11:54:59 -0500 Subject: [Bioperl-l] Wanting to inject date into a SeqIO object Message-ID: <49905FE3.6080601@purdue.edu> I'm working with a database full of transposable elements. We input things in FASTA format, and save the date of the upload into the database separately. 
When we want to display in GCG format, specifically, Bio::SeqIO::gcg can't find a date in the object and uses the current date. Using a line out of the x2y.pl example from the wiki, we would like to go from this: while (my $inseq = $seq_in->next_seq) { $seq_out->write_seq($inseq); } to while (my $inseq = $seq_in->next_seq) { $inseq->add_date($DATE_FROM_DB) ; $seq_out->write_seq($inseq); } I have looked through the modules in Bio::SeqIO and I fail to understand how to do such a thing. Can anyone help me? -- Dave Jacoby Address: WSLR S049 Purdue Genomics Core Mail: jacoby at purdue.edu Jabber: jacoby at jabber.org Phone: hah! From maj at fortinbras.us Mon Feb 9 12:19:37 2009 From: maj at fortinbras.us (Mark A. Jensen) Date: Mon, 9 Feb 2009 12:19:37 -0500 Subject: [Bioperl-l] Wanting to inject date into a SeqIO object In-Reply-To: <49905FE3.6080601@purdue.edu> References: <49905FE3.6080601@purdue.edu> Message-ID: Hi Dave, One way to do it- You could tack the date onto the 'description' field: $inseq->desc( $inseq->desc . " {date: $DATE_FROM_DB}" ); And later do ($our_date) = ($inseq->desc =~ /{date: ([^}]+)}/); to retrieve it. If it screws up downstream processing, get the date first as above and then do my $desc = $inseq->desc; $desc =~ s/{date:.*?}//; $inseq->desc($desc); (or something like that...) to reset to "factory description". cheers MAJ ----- Original Message ----- From: "Dave Jacoby" To: "BioPerl-L" Sent: Monday, February 09, 2009 11:54 AM Subject: [Bioperl-l] Wanting to inject date into a SeqIO object > I'm working with a database full of transposable elements. We input > things in FASTA format, and save the date of the upload into the > database separately. When we want to display in GCG format, > specifically, Bio::SeqIO::gcg can't find a date in the object and uses > the current date. 
Using a line out of the x2y.pl example from the wiki, > we would like to go from this: > > while (my $inseq = $seq_in->next_seq) { > $seq_out->write_seq($inseq); > } > > to > while (my $inseq = $seq_in->next_seq) { > $inseq->add_date($DATE_FROM_DB) ; > $seq_out->write_seq($inseq); > } > > I have looked through the modules in Bio::SeqIO and I fail to understand > how to do such a thing. Can anyone help me? > > -- > Dave Jacoby Address: WSLR S049 > Purdue Genomics Core Mail: jacoby at purdue.edu > Jabber: jacoby at jabber.org > Phone: hah! > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > From hlapp at gmx.net Mon Feb 9 12:45:23 2009 From: hlapp at gmx.net (Hilmar Lapp) Date: Mon, 9 Feb 2009 12:45:23 -0500 Subject: [Bioperl-l] Wanting to inject date into a SeqIO object In-Reply-To: <49905FE3.6080601@purdue.edu> References: <49905FE3.6080601@purdue.edu> Message-ID: Hi Dave, $seq->add_date() is the right call, but not all output formats support dates. I don't remember exactly about the gcg format, but I do know that UniProt can have multiple dates, if that's what you want. To clarify, did you actually try your second version below and found it not to work in the sense that the date did not show up in the output file? -hilmar On Feb 9, 2009, at 11:54 AM, Dave Jacoby wrote: > I'm working with a database full of transposable elements. We input > things in FASTA format, and save the date of the upload into the > database separately. When we want to display in GCG format, > specifically, Bio::SeqIO::gcg can't find a date in the object and > uses the current date. 
Using a line out of the x2y.pl example from > the wiki, > we would like to go from this: > > while (my $inseq = $seq_in->next_seq) { > $seq_out->write_seq($inseq); > } > > to > while (my $inseq = $seq_in->next_seq) { > $inseq->add_date($DATE_FROM_DB) ; > $seq_out->write_seq($inseq); > } > > I have looked through the modules in Bio::SeqIO and I fail to > understand how to do such a thing. Can anyone help me? > > -- > Dave Jacoby Address: WSLR S049 > Purdue Genomics Core Mail: jacoby at purdue.edu > Jabber: jacoby at jabber.org > Phone: hah! > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at illinois.edu Mon Feb 9 12:49:03 2009 From: cjfields at illinois.edu (Chris Fields) Date: Mon, 9 Feb 2009 11:49:03 -0600 Subject: [Bioperl-l] SeqIO-based parser for Vector NTI sequence files In-Reply-To: References: <48FE4051.2010700@accelrys.com> Message-ID: <7E86F833-3488-4FC7-81FC-FB482F00C2CD@illinois.edu> I think the best short-term thing may be to wrap the genbank.pm parser and simply reparse/rework the relevant Bio::Annotation::Comment instance containing the COMMENT data. Long-term, I would like to have an XML-like parser that just takes the data and passes it in to a handler (so you could customize what happens to data, create objects, load databases, etc). Along these lines I've been (very slowly) reworking GenBank/EMBL/UniProt parsing so it generically parses data and passes it on to a relevant handler instance (in this case it just generates a Bio::Seq::RichSeq as the regular parser does). It still needs a bit more work, though, particularly the internals.
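The short-term approach suggested above (parse with the genbank format, then rework the comment annotation) could start from something like the sketch below. It assumes only that the genbank parser stores COMMENT data as Bio::Annotation::Comment objects under the 'comment' tag, retrievable with `get_Annotations`. The helper name `genbank_comments` is made up for this example, and how the Vector NTI comment text should be split into fields is left open, since that layout is not described in this thread.

```perl
use strict;
use warnings;

# Hypothetical helper: return (\@comment_texts, undef) for a GenBank-style
# file, or (undef, $message) if BioPerl is unavailable or the file cannot
# be read. Each element is the raw text of one COMMENT annotation.
sub genbank_comments {
    my ($file) = @_;
    my @texts;
    my $ok = eval {
        require Bio::SeqIO;
        my $in = Bio::SeqIO->new(-file => $file, -format => 'genbank');
        while (my $seq = $in->next_seq) {
            for my $comment ($seq->annotation->get_Annotations('comment')) {
                push @texts, $comment->text;
            }
        }
        1;
    };
    return (undef, "$@" || 'unknown error') unless $ok;
    return (\@texts, undef);
}

# Usage: perl comments.pl vector.gb
if (@ARGV) {
    my ($texts, $err) = genbank_comments($ARGV[0]);
    if (defined $err) {
        warn "Could not read comments: $err";
    }
    else {
        print "$_\n---\n" for @$texts;
    }
}
```

Any Vector NTI specific parsing would then operate on the returned strings, keeping the stock genbank parser untouched.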
if you want to test them out the modules are in the last 1.6.0 release as Bio::SeqIO::gbdriver/embldriver/swissdriver. chris On Feb 9, 2009, at 8:50 AM, Cook, Malcolm wrote: > Scott, > > What do you expect to extract from the COMMENT lines? > > > Malcolm Cook > Database Applications Manager - Bioinformatics > Stowers Institute for Medical Research - Kansas City, Missouri > > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-bounces at lists.open-bio.org > ] On Behalf Of Scott Markel > Sent: Tuesday, October 21, 2008 3:49 PM > To: bioperl-ml > Cc: smarkel at accelrys.com > Subject: [Bioperl-l] SeqIO-based parser for Vector NTI sequence files > > I'm looking for a BioPerl-related solution to parsing Vector NTI > sequence files. The genbank.pm parser will work, but it doesn't > parse the COMMENT lines beyond grabbing the simple string value, so > it misses all of the added information in those lines. > > If you know of any existing code, I'd be interesting in hearing > about it. I checked BioPerl, BioJava, and EMBOSS documentation. > I also checked the Invitrogen web site. > > Scott > > -- > Scott Markel, Ph.D. 
> Principal Bioinformatics Architect email: smarkel at accelrys.com > Accelrys (SciTegic R&D) mobile: +1 858 205 3653 > 10188 Telesis Court, Suite 100 voice: +1 858 799 5603 > San Diego, CA 92121 fax: +1 858 799 5222 > USA web: http://www.accelrys.com > > http://www.linkedin.com/in/smarkel > Board of Directors: International Society for Computational Biology > Co-chair: ISCB Publications Committee > Associate Editor: PLoS Computational Biology Editorial Board: > Briefings in Bioinformatics > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Mon Feb 9 13:02:17 2009 From: cjfields at illinois.edu (Chris Fields) Date: Mon, 9 Feb 2009 12:02:17 -0600 Subject: [Bioperl-l] Wanting to inject date into a SeqIO object In-Reply-To: References: <49905FE3.6080601@purdue.edu> Message-ID: <3A437E16-8D69-4642-8BD8-F8C78245D723@illinois.edu> A bit off topic, but I've wondered about this for a while (never brought it up): for data like dates/timestamps, would it be convenient to have a Bio::AnnotationI that correctly deals with such information (maybe incorporates DateTime)? We could then have add_date() and other relevant methods just grab the proper date information. chris On Feb 9, 2009, at 11:45 AM, Hilmar Lapp wrote: > Hi Dave, > > $seq->add_date() is the right call, but not all output formats > support dates. I don't remember exactly about the gcg format, but I > do know that UniProt can have multiple dates, if that's what you want. > > To clarify, did you actually try your second version below and found > it not to work in the sense that the date did not show up in the > output file? 
> > -hilmar > > On Feb 9, 2009, at 11:54 AM, Dave Jacoby wrote: > >> I'm working with a database full of transposable elements. We input >> things in FASTA format, and save the date of the upload into the >> database separately. When we want to display in GCG format, >> specifically, Bio::SeqIO::gcg can't find a date in the object and >> uses the current date. Using a line out of the x2y.pl example from >> the wiki, >> we would like to go from this: >> >> while (my $inseq = $seq_in->next_seq) { >> $seq_out->write_seq($inseq); >> } >> >> to >> while (my $inseq = $seq_in->next_seq) { >> $inseq->add_date($DATE_FROM_DB) ; >> $seq_out->write_seq($inseq); >> } >> >> I have looked through the modules in Bio::SeqIO and I fail to >> understand how to do such a thing. Can anyone help me? >> >> -- >> Dave Jacoby Address: WSLR S049 >> Purdue Genomics Core Mail: jacoby at purdue.edu >> Jabber: jacoby at jabber.org >> Phone: hah! >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From jacoby at purdue.edu Mon Feb 9 12:59:04 2009 From: jacoby at purdue.edu (Dave Jacoby) Date: Mon, 09 Feb 2009 12:59:04 -0500 Subject: [Bioperl-l] Wanting to inject date into a SeqIO object In-Reply-To: References: <49905FE3.6080601@purdue.edu> Message-ID: <49906EE8.6020706@purdue.edu> Hilmar Lapp wrote: > Hi Dave, > > $seq->add_date() is the right call, but not all output formats support > dates. I don't remember exactly about the gcg format, but I do know that > UniProt can have multiple dates, if that's what you want. 
> > To clarify, did you actually try your second version below and found it > not to work in the sense that the date did not show up in the output file? No, I tried that second version and found that it did not work, in the sense that Perl could make no sense of add_date(): Can't locate method "add_date" via package "Bio::Seq" > -hilmar > On Feb 9, 2009, at 11:54 AM, Dave Jacoby wrote: > >> I'm working with a database full of transposable elements. We input >> things in FASTA format, and save the date of the upload into the >> database separately. When we want to display in GCG format, >> specifically, Bio::SeqIO::gcg can't find a date in the object and uses >> the current date. Using a line out of the x2y.pl example from the wiki, >> we would like to go from this: >> >> while (my $inseq = $seq_in->next_seq) { >> $seq_out->write_seq($inseq); >> } >> >> to >> while (my $inseq = $seq_in->next_seq) { >> $inseq->add_date($DATE_FROM_DB) ; >> $seq_out->write_seq($inseq); >> } >> >> I have looked through the modules in Bio::SeqIO and I fail to >> understand how to do such a thing. Can anyone help me? >> >> -- >> Dave Jacoby Address: WSLR S049 >> Purdue Genomics Core Mail: jacoby at purdue.edu >> Jabber: jacoby at jabber.org >> Phone: hah! >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- Dave Jacoby Address: WSLR S049 Purdue Genomics Core Mail: jacoby at purdue.edu Jabber: jacoby at jabber.org Phone: hah! From cjfields at illinois.edu Mon Feb 9 13:26:50 2009 From: cjfields at illinois.edu (Chris Fields) Date: Mon, 9 Feb 2009 12:26:50 -0600 Subject: [Bioperl-l] Wanting to inject date into a SeqIO object In-Reply-To: <49906EE8.6020706@purdue.edu> References: <49905FE3.6080601@purdue.edu> <49906EE8.6020706@purdue.edu> Message-ID: add_date() belongs to Bio::Seq::RichSeqI (it is not in Bio::Seq).
You could rebless a Bio::Seq instance into Bio::Seq::RichSeq: bless $seq, 'Bio::Seq::RichSeq'; The best way to do this, though, would be to have a Builder object create the proper class in the first place. chris On Feb 9, 2009, at 11:59 AM, Dave Jacoby wrote: > Hilmar Lapp wrote: >> Hi Dave, >> $seq->add_date() is the right call, but not all output formats >> support dates. I don't remember exactly about the gcg format, but I >> do know that UniProt can have multiple dates, if that's what you >> want. >> To clarify, did you actually try your second version below and >> found it not to work in the sense that the date did not show up in >> the output file? > > No, I tried that second version and found it not to work in the > sense that it made no sense of add_date(). > > Can't locate method "add_date" via package "Bio::Seq" > -hilmar > >> On Feb 9, 2009, at 11:54 AM, Dave Jacoby wrote: >>> I'm working with a database full of transposable elements. We >>> input things in FASTA format, and save the date of the upload into >>> the database separately. When we want to display in GCG format, >>> specifically, Bio::SeqIO::gcg can't find a date in the object and >>> uses the current date. Using a line out of the x2y.pl example from >>> the wiki, >>> we would like to go from this: >>> >>> while (my $inseq = $seq_in->next_seq) { >>> $seq_out->write_seq($inseq); >>> } >>> >>> to >>> while (my $inseq = $seq_in->next_seq) { >>> $inseq->add_date($DATE_FROM_DB) ; >>> $seq_out->write_seq($inseq); >>> } >>> >>> I have looked through the modules in Bio::SeqIO and I fail to >>> understand how to do such a thing. Can anyone help me? >>> >>> -- >>> Dave Jacoby Address: WSLR S049 >>> Purdue Genomics Core Mail: jacoby at purdue.edu >>> Jabber: jacoby at jabber.org >>> Phone: hah!
>>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > -- > Dave Jacoby Address: WSLR S049 > Purdue Genomics Core Mail: jacoby at purdue.edu > Jabber: jacoby at jabber.org > Phone: hah! > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From SMarkel at accelrys.com Mon Feb 9 13:34:51 2009 From: SMarkel at accelrys.com (Scott Markel) Date: Mon, 9 Feb 2009 13:34:51 -0500 Subject: [Bioperl-l] SeqIO-based parser for Vector NTI sequence files In-Reply-To: References: <48FE4051.2010700@accelrys.com> Message-ID: <1F1240778FB0AF46B4E5A72C44D2C7471D6534E6@exch1-hi.accelrys.net> Malcolm, It looks like Vector NTI puts features into COMMENT lines rather than leveraging the DDBJ/EMBL/GenBank Feature table syntax. I'd like to treat these features the same way I treat other features, hence my interest in parsing them. My only example file is from a customer so the following snippets have been tweaked a bit. My replacements are in angle brackets: <...>. COMMENT wrote: .COMMENT This file is created by Vector NTI http://www.informaxinc.com/ COMMENT ORIGDB|GenBank COMMENT VNTDATE|| COMMENT VNTDBDATE|| COMMENT LSOWNER| COMMENT VNTNAME|| COMMENT VNTAUTHORNAME|| COMMENT VNTREPLTYPE| COMMENT VNTEXTCHREPL|Animal/Other Eukaryotic COMMENT Vector_NTI_Display_Data_(Do_Not_Edit!) 
COMMENT (SXF COMMENT (CGexDoc "" 0 7616 COMMENT (CDBMol 0 0 1 1 1 0 0 0 0 "" "" 0 0 0 0 (CObList) (CObList) (CObList) COMMENT (CObList) -1 "") COMMENT (CDocSetData 1 1 0 1 0 1 "MAIN" 1 1 1 1 1 0 1 1 1 0 10 10 4294967295 50 0 COMMENT 1 0 (CHomObj 1 0 0 3 100) (CWordArray 23) (CWordArray) COMMENT (CStringList ) COMMENT (CStringList ) (CStringList ) COMMENT (CObList COMMENT #0=(COligo COMMENT "Tm: 52.1C Length: 16mer GC: 56.3%" 0 (CStringList) 0) COMMENT #1=(COligo COMMENT "Tm: 56.8C Length: 18mer GC: 61.1%" 0 (CStringList) 0) There are also some hierarchical sections. COMMENT (CObList) (CObList) (CObList) COMMENT (CTextView 0 COMMENT #120=(CGroupPar (CParagraph 0 (0 0) 1 2 0 0 180) COMMENT (CObjectList COMMENT #121=(CRefLinePar COMMENT (CLinePar (CParagraph 0 (0 0) 0 2 0 1 233) 2) 5 COMMENT "" 0 4) COMMENT #122=(CFolderPar COMMENT (CGroupPar (CParagraph 1 (0 0) 1 1 0 0 178) COMMENT (CObjectList COMMENT #123=(CLinePar (CParagraph 0 (0 0) 1 2 1 0 180) COMMENT 1) COMMENT #124=(CLinePar (CParagraph 0 (0 0) 1 2 1 0 180) COMMENT 1) Scott > -----Original Message----- > From: Cook, Malcolm [mailto:MEC at stowers.org] > Sent: Monday, 09 February 2009 6:51 AM > To: Scott Markel; 'bioperl-ml' > Subject: RE: [Bioperl-l] SeqIO-based parser for Vector NTI sequence files > > Scott, > > What do you expect to extract from the COMMENT lines? > > > Malcolm Cook > Database Applications Manager - Bioinformatics > Stowers Institute for Medical Research - Kansas City, Missouri > > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > bounces at lists.open-bio.org] On Behalf Of Scott Markel > Sent: Tuesday, October 21, 2008 3:49 PM > To: bioperl-ml > Cc: smarkel at accelrys.com > Subject: [Bioperl-l] SeqIO-based parser for Vector NTI sequence files > > I'm looking for a BioPerl-related solution to parsing Vector NTI sequence > files. 
The genbank.pm parser will work, but it doesn't parse the COMMENT > lines beyond grabbing the simple string value, so it misses all of the > added information in those lines. > > If you know of any existing code, I'd be interested in hearing about it. > I checked BioPerl, BioJava, and EMBOSS documentation. > I also checked the Invitrogen web site. > > Scott > > -- > Scott Markel, Ph.D. > Principal Bioinformatics Architect email: smarkel at accelrys.com > Accelrys (SciTegic R&D) mobile: +1 858 205 3653 > 10188 Telesis Court, Suite 100 voice: +1 858 799 5603 > San Diego, CA 92121 fax: +1 858 799 5222 > USA web: http://www.accelrys.com > > http://www.linkedin.com/in/smarkel > Board of Directors: International Society for Computational Biology > Co-chair: ISCB Publications Committee > Associate Editor: PLoS Computational Biology Editorial Board: Briefings in > Bioinformatics _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From MEC at stowers.org Mon Feb 9 14:32:03 2009 From: MEC at stowers.org (Cook, Malcolm) Date: Mon, 9 Feb 2009 13:32:03 -0600 Subject: [Bioperl-l] SeqIO-based parser for Vector NTI sequence files In-Reply-To: <1F1240778FB0AF46B4E5A72C44D2C7471D6534E6@exch1-hi.accelrys.net> References: <48FE4051.2010700@accelrys.com> <1F1240778FB0AF46B4E5A72C44D2C7471D6534E6@exch1-hi.accelrys.net> Message-ID: Hi Scott, It is my understanding that the Informax developers used the COMMENT lines to encode molecule-level attribute-value pairs, e.g. your VNTDATE|| Some dialect of LISP had a role in the product's early days. The value of the attribute 'Vector_NTI_Display_Data_(Do_Not_Edit!)' is a LISP S-expression whose CAR is 'SXF' (ever programmed in LISP?). It should be easily parseable by any LISP interpreter or anything that can balance parens.
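As a rough illustration of "anything that can balance parens", here is a minimal S-expression reader in plain Perl. It is a sketch under assumptions, not a Vector NTI parser: the function names tokenize_sexp and parse_sexp are hypothetical, the sample text is abbreviated from the snippets posted in this thread, and labels of the form #N= are kept as ordinary atoms rather than resolved.

```perl
use strict;
use warnings;

# Split the text into tokens: a paren, a double-quoted string,
# or a run of any other non-space characters (this includes "#0=").
sub tokenize_sexp {
    my ($text) = @_;
    return $text =~ /\(|\)|"(?:[^"\\]|\\.)*"|[^\s()"]+/g;
}

# Recursive-descent reader: returns a string for an atom and an
# array reference for a parenthesised list. Consumes the token array.
sub parse_sexp {
    my ($tokens) = @_;
    die "unexpected end of input\n" unless @$tokens;
    my $tok = shift @$tokens;
    die "unbalanced ')'\n" if $tok eq ')';
    return $tok unless $tok eq '(';
    my @list;
    while (1) {
        die "missing ')'\n" unless @$tokens;
        last if $tokens->[0] eq ')';
        push @list, parse_sexp($tokens);
    }
    shift @$tokens;    # consume the closing paren
    return \@list;
}

# Abbreviated sample; the COMMENT tags are stripped before parsing.
my $text = <<'END';
COMMENT     (SXF
COMMENT      (CObList
COMMENT       #0=(COligo
COMMENT        "Tm: 52.1C Length: 16mer GC: 56.3%" 0 (CStringList) 0)))
END
$text =~ s/^COMMENT\s*//mg;

my @tokens = tokenize_sexp($text);
my $tree   = parse_sexp(\@tokens);
print "head of expression: $tree->[0]\n";
```

The result is a nested array structure that can be walked to pull out, say, the COligo description strings; what the numeric fields mean is not something this sketch can tell you.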
The only 'rub' is the #NNN= tokens, which appear to be some form of forward reference used by the LISP serializer to allow for internal references. I would be very surprised if you could learn from ABI (was Invitrogen, was Informax) anything about the internal structure of this. I once asked about 5 years ago. I am/was a supported user. It was arcane historical knowledge even then. I would also be very surprised if there was anything in it that was meaningful, unless the creator of the molecule in VNTI was using some very odd convention. I used to know which dialect of LISP was being used by them and might track it down if it were important to you.... If you are after the oligo descriptive text in the COMMENT, it is likely that the oligos ALSO have a GenBank feature associated with them. But you probably already looked.... What are you really after...? One thing you should know (if you don't already) that might bear on your underlying problem (whatever it may be): Vector NTI can open GFF files and display them as an external analysis. The elements of the analysis can then be promoted interactively by the user to features (GenBank style). Malcolm > -----Original Message----- > From: Scott Markel [mailto:SMarkel at accelrys.com] > Sent: Monday, February 09, 2009 12:35 PM > To: Cook, Malcolm; 'bioperl-ml' > Subject: RE: [Bioperl-l] SeqIO-based parser for Vector NTI > sequence files > > Malcolm, > > It looks like Vector NTI puts features into COMMENT lines > rather than leveraging the DDBJ/EMBL/GenBank Feature table > syntax. I'd like to treat these features the same way I > treat other features, hence my interest in parsing them. > > My only example file is from a customer so the following > snippets have been tweaked a bit. My replacements are in > angle brackets: <...>.
> > COMMENT wrote: > > > .COMMENT This file is created by Vector NTI > http://www.informaxinc.com/ > COMMENT ORIGDB|GenBank > COMMENT VNTDATE|| > COMMENT VNTDBDATE|| > COMMENT LSOWNER| > COMMENT VNTNAME|| > COMMENT VNTAUTHORNAME|| > COMMENT VNTREPLTYPE| > COMMENT VNTEXTCHREPL|Animal/Other Eukaryotic > COMMENT Vector_NTI_Display_Data_(Do_Not_Edit!) > COMMENT (SXF > COMMENT (CGexDoc "" 0 7616 > COMMENT (CDBMol 0 0 1 1 1 0 0 0 0 "" "" 0 0 0 0 > (CObList) (CObList) (CObList) > COMMENT (CObList) -1 "") > COMMENT (CDocSetData 1 1 0 1 0 1 "MAIN" 1 1 1 1 1 0 1 1 > 1 0 10 10 4294967295 50 0 > COMMENT 1 0 (CHomObj 1 0 0 3 100) (CWordArray 23) (CWordArray) > COMMENT (CStringList ) > COMMENT (CStringList ) > (CStringList ) > COMMENT (CObList > COMMENT #0=(COligo > COMMENT "Tm: 52.1C Length: 16mer GC: 56.3%" 0 > (CStringList) 0) > COMMENT #1=(COligo > COMMENT "Tm: 56.8C Length: 18mer GC: 61.1%" 0 > (CStringList) 0) > > There are also some hierarchical sections. > > COMMENT (CObList) (CObList) (CObList) > COMMENT (CTextView 0 > COMMENT #120=(CGroupPar (CParagraph 0 (0 0) 1 2 0 0 180) > COMMENT (CObjectList > COMMENT #121=(CRefLinePar > COMMENT (CLinePar (CParagraph 0 (0 0) 0 2 > 0 1 233) 2) 5 > COMMENT "" 0 4) > COMMENT #122=(CFolderPar > COMMENT (CGroupPar (CParagraph 1 (0 0) 1 > 1 0 0 178) > COMMENT (CObjectList > COMMENT #123=(CLinePar (CParagraph 0 (0 > 0) 1 2 1 0 180) > COMMENT 1) > COMMENT #124=(CLinePar (CParagraph 0 (0 > 0) 1 2 1 0 180) > COMMENT 1) > > Scott > > > -----Original Message----- > > From: Cook, Malcolm [mailto:MEC at stowers.org] > > Sent: Monday, 09 February 2009 6:51 AM > > To: Scott Markel; 'bioperl-ml' > > Subject: RE: [Bioperl-l] SeqIO-based parser for Vector NTI sequence > > files > > > > Scott, > > > > What do you expect to extract from the COMMENT lines? 
> > > > > > Malcolm Cook > > Database Applications Manager - Bioinformatics Stowers > Institute for > > Medical Research - Kansas City, Missouri > > > > -----Original Message----- > > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > > bounces at lists.open-bio.org] On Behalf Of Scott Markel > > Sent: Tuesday, October 21, 2008 3:49 PM > > To: bioperl-ml > > Cc: smarkel at accelrys.com > > Subject: [Bioperl-l] SeqIO-based parser for Vector NTI > sequence files > > > > I'm looking for a BioPerl-related solution to parsing Vector NTI > > sequence files. The genbank.pm parser will work, but it > doesn't parse > > the COMMENT lines beyond grabbing the simple string value, so it > > misses all of the added information in those lines. > > > > If you know of any existing code, I'd be interesting in > hearing about it. > > I checked BioPerl, BioJava, and EMBOSS documentation. > > I also checked the Invitrogen web site. > > > > Scott > > > > -- > > Scott Markel, Ph.D. > > Principal Bioinformatics Architect email: smarkel at accelrys.com > > Accelrys (SciTegic R&D) mobile: +1 858 205 3653 > > 10188 Telesis Court, Suite 100 voice: +1 858 799 5603 > > San Diego, CA 92121 fax: +1 858 799 5222 > > USA web: http://www.accelrys.com > > > > http://www.linkedin.com/in/smarkel > > Board of Directors: International Society for Computational Biology > > Co-chair: ISCB Publications Committee > > Associate Editor: PLoS Computational Biology Editorial Board: > > Briefings in Bioinformatics > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From SMarkel at accelrys.com Mon Feb 9 14:46:12 2009 From: SMarkel at accelrys.com (Scott Markel) Date: Mon, 9 Feb 2009 14:46:12 -0500 Subject: [Bioperl-l] SeqIO-based parser for Vector NTI sequence files In-Reply-To: References: <48FE4051.2010700@accelrys.com> 
<1F1240778FB0AF46B4E5A72C44D2C7471D6534E6@exch1-hi.accelrys.net> Message-ID: <1F1240778FB0AF46B4E5A72C44D2C7471D6535DE@exch1-hi.accelrys.net> Malcolm, Thank you for your follow-up email. > What are you really after...? Our Pipeline Pilot-based Sequence Analysis Collection can read a variety of sequence file formats, largely thanks to the BioPerl parsers. One of my customers would like us to also be able to read Vector NTI files. They annotate sequences in Vector NTI and want to use these sequences, with features, in our product the same way they can work with other sequences. Since the Vector NTI file format is nominally GenBank format, I can "read" the file, but I miss the annotations that the customer added. Hence my interest in parsing these additional lines. > One thing you should know (if you don't) that might bear on your > underlying problem (whatever it may be....). Vector NTI can open GFF > files and display them as an extenral analysis. The element of the > analysis can then be promoted interactively by the user to be features > (genbank style). This is good to know. Maybe the solution to my problem is as simple as having the user appropriately promote the features they create. I haven't used Vector NTI in ages so I'm not familiar with any of the save/export options. Scott > -----Original Message----- > From: Cook, Malcolm [mailto:MEC at stowers.org] > Sent: Monday, 09 February 2009 11:32 AM > To: Scott Markel; 'bioperl-ml' > Subject: RE: [Bioperl-l] SeqIO-based parser for Vector NTI sequence files > > Hi Scott, > > It is my understanding that Informax developer used the COMMENT line to > encode molecular level attribute-value pairs. > > i.e. your VNTDATE|| > > Some dialect of LISP had a role in the products back-days. The attribute > 'Vector_NTI_Display_Data_(Do_Not_Edit!)' is a LISP S-expression whose CAR > is 'SXF' (ever program LISP?). > > It should be easily 'parseable' by any LISP interpreter or anything that > can balance parens. 
> > The only 'rub' is the #NNN= tokens which appear to be some form of forward > reference used by the lisp serializer to allow for internal references. > > I would be very surprised if you could learn from ABI (was Invitrogen, was > Informax) anything about the internal structure of this. I once asked > about 5 years ago. I am/was a supported user. It was arcane historical > knowledge even then. > > I would also be very surprised if there was anything in it that was > meaningful, unless the creator of the molecule in VNTI was using some very > odd convention. > > I used to know which dialect of LISP was being used by them and might > track it down if it were important to you.... > > If you are after the oligo descriptive text in the COMMENT, it is likely > that the oligos ALSO have a genbank feature associated with them. But you > probably already looked.... > > What are you really after...? > > One thing you should know (if you don't) that might bear on your > underlying problem (whatever it may be....). Vector NTI can open GFF > files and display them as an extenral analysis. The element of the > analysis can then be promoted interactively by the user to be features > (genbank style). > > Malcolm > > > > -----Original Message----- > > From: Scott Markel [mailto:SMarkel at accelrys.com] > > Sent: Monday, February 09, 2009 12:35 PM > > To: Cook, Malcolm; 'bioperl-ml' > > Subject: RE: [Bioperl-l] SeqIO-based parser for Vector NTI > > sequence files > > > > Malcolm, > > > > It looks like Vector NTI puts features into COMMENT lines > > rather than leveraging the DDBJ/EMBL/GenBank Feature table > > syntax. I'd like to treat these features the same way I > > treat other features, hence my interest in parsing them. > > > > My only example file is from a customer so the following > > snippets have been tweaked a bit. My replacements are in > > angle brackets: <...>. 
> > > > COMMENT wrote: > > > > > > .COMMENT This file is created by Vector NTI > > http://www.informaxinc.com/ > > COMMENT ORIGDB|GenBank > > COMMENT VNTDATE|| > > COMMENT VNTDBDATE|| > > COMMENT LSOWNER| > > COMMENT VNTNAME|| > > COMMENT VNTAUTHORNAME|| > > COMMENT VNTREPLTYPE| > > COMMENT VNTEXTCHREPL|Animal/Other Eukaryotic > > COMMENT Vector_NTI_Display_Data_(Do_Not_Edit!) > > COMMENT (SXF > > COMMENT (CGexDoc "" 0 7616 > > COMMENT (CDBMol 0 0 1 1 1 0 0 0 0 "" "" 0 0 0 0 > > (CObList) (CObList) (CObList) > > COMMENT (CObList) -1 "") > > COMMENT (CDocSetData 1 1 0 1 0 1 "MAIN" 1 1 1 1 1 0 1 1 > > 1 0 10 10 4294967295 50 0 > > COMMENT 1 0 (CHomObj 1 0 0 3 100) (CWordArray 23) (CWordArray) > > COMMENT (CStringList ) > > COMMENT (CStringList ) > > (CStringList ) > > COMMENT (CObList > > COMMENT #0=(COligo > > COMMENT "Tm: 52.1C Length: 16mer GC: 56.3%" 0 > > (CStringList) 0) > > COMMENT #1=(COligo > > COMMENT "Tm: 56.8C Length: 18mer GC: 61.1%" 0 > > (CStringList) 0) > > > > There are also some hierarchical sections. > > > > COMMENT (CObList) (CObList) (CObList) > > COMMENT (CTextView 0 > > COMMENT #120=(CGroupPar (CParagraph 0 (0 0) 1 2 0 0 180) > > COMMENT (CObjectList > > COMMENT #121=(CRefLinePar > > COMMENT (CLinePar (CParagraph 0 (0 0) 0 2 > > 0 1 233) 2) 5 > > COMMENT "" 0 4) > > COMMENT #122=(CFolderPar > > COMMENT (CGroupPar (CParagraph 1 (0 0) 1 > > 1 0 0 178) > > COMMENT (CObjectList > > COMMENT #123=(CLinePar (CParagraph 0 (0 > > 0) 1 2 1 0 180) > > COMMENT 1) > > COMMENT #124=(CLinePar (CParagraph 0 (0 > > 0) 1 2 1 0 180) > > COMMENT 1) > > > > Scott > > > > > -----Original Message----- > > > From: Cook, Malcolm [mailto:MEC at stowers.org] > > > Sent: Monday, 09 February 2009 6:51 AM > > > To: Scott Markel; 'bioperl-ml' > > > Subject: RE: [Bioperl-l] SeqIO-based parser for Vector NTI sequence > > > files > > > > > > Scott, > > > > > > What do you expect to extract from the COMMENT lines? 
> > > > > > > > > Malcolm Cook > > > Database Applications Manager - Bioinformatics Stowers > > Institute for > > > Medical Research - Kansas City, Missouri > > > > > > -----Original Message----- > > > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > > > bounces at lists.open-bio.org] On Behalf Of Scott Markel > > > Sent: Tuesday, October 21, 2008 3:49 PM > > > To: bioperl-ml > > > Cc: smarkel at accelrys.com > > > Subject: [Bioperl-l] SeqIO-based parser for Vector NTI > > sequence files > > > > > > I'm looking for a BioPerl-related solution to parsing Vector NTI > > > sequence files. The genbank.pm parser will work, but it > > doesn't parse > > > the COMMENT lines beyond grabbing the simple string value, so it > > > misses all of the added information in those lines. > > > > > > If you know of any existing code, I'd be interesting in > > hearing about it. > > > I checked BioPerl, BioJava, and EMBOSS documentation. > > > I also checked the Invitrogen web site. > > > > > > Scott > > > > > > -- > > > Scott Markel, Ph.D. 
> > > Principal Bioinformatics Architect email: smarkel at accelrys.com > > > Accelrys (SciTegic R&D) mobile: +1 858 205 3653 > > > 10188 Telesis Court, Suite 100 voice: +1 858 799 5603 > > > San Diego, CA 92121 fax: +1 858 799 5222 > > > USA web: http://www.accelrys.com > > > > > > http://www.linkedin.com/in/smarkel > > > Board of Directors: International Society for Computational Biology > > > Co-chair: ISCB Publications Committee > > > Associate Editor: PLoS Computational Biology Editorial Board: > > > Briefings in Bioinformatics > > > _______________________________________________ > > > Bioperl-l mailing list > > > Bioperl-l at lists.open-bio.org > > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > From mmuratet at hudsonalpha.org Mon Feb 9 15:36:24 2009 From: mmuratet at hudsonalpha.org (Michael Muratet) Date: Mon, 9 Feb 2009 14:36:24 -0600 Subject: [Bioperl-l] load_seqdatabase.pl memory requirements unusually large In-Reply-To: <821B9510-247B-4CA6-AC22-E0420698DD5F@gmx.net> References: <00C78AD9-02E4-4295-93E0-521DAE453842@hudsonalpha.org> <821B9510-247B-4CA6-AC22-E0420698DD5F@gmx.net> Message-ID: <7D70CEFA-1A2E-48A8-BFC0-31A24E02622E@hudsonalpha.org> On Feb 6, 2009, at 1:34 PM, Hilmar Lapp wrote: > Something seems to cause Perl to be crapping out. It it were a > programmatic exception you would see the message and the trace. > > Could you run these tests by themselves: > > ./Build test --test-files t/11locuslink.t > > If that doesn't reveal the error, add a verbose=1 argument. > > Let us know what you find. 
Here's the result: [root at srv-cf1 bioperl-db]# ../Build test --test-files t/11locuslink.t bash: ../Build: No such file or directory [root at srv-cf1 bioperl-db]# ./Build test --test-files t/11locuslink.t Copying scripts/biosql/terms/importrelation.pl -> blib/script/ importrelation.pl blib/script/importrelation.pl -> blib/script/bp_importrelation.pl Copying scripts/biosql/merge-unique-ann.pl -> blib/script/merge-unique- ann.pl blib/script/merge-unique-ann.pl -> blib/script/bp_merge-unique-ann.pl Copying scripts/biosql/update-on-new-date.pl -> blib/script/update-on- new-date.pl blib/script/update-on-new-date.pl -> blib/script/bp_update-on-new- date.pl Copying scripts/biosql/terms/add-term-annot.pl -> blib/script/add-term- annot.pl Deleting blib/script/add-term-annot.pl.bak blib/script/add-term-annot.pl -> blib/script/bp_add-term-annot.pl Copying scripts/corba/caching_corba_server.pl -> blib/script/ caching_corba_server.pl Deleting blib/script/caching_corba_server.pl.bak blib/script/caching_corba_server.pl -> blib/script/ bp_caching_corba_server.pl Copying scripts/biosql/load_ontology.pl -> blib/script/load_ontology.pl Deleting blib/script/load_ontology.pl.bak blib/script/load_ontology.pl -> blib/script/bp_load_ontology.pl Copying scripts/biosql/load_seqdatabase.pl -> blib/script/ load_seqdatabase.pl Deleting blib/script/load_seqdatabase.pl.bak blib/script/load_seqdatabase.pl -> blib/script/bp_load_seqdatabase.pl Copying scripts/biosql/terms/interpro2go.pl -> blib/script/ interpro2go.pl blib/script/interpro2go.pl -> blib/script/bp_interpro2go.pl Copying scripts/biosql/clean_ontology.pl -> blib/script/ clean_ontology.pl blib/script/clean_ontology.pl -> blib/script/bp_clean_ontology.pl Copying scripts/corba/test_bioenv.pl -> blib/script/test_bioenv.pl Deleting blib/script/test_bioenv.pl.bak blib/script/test_bioenv.pl -> blib/script/bp_test_bioenv.pl Copying scripts/biosql/update-on-new-version.pl -> blib/script/update- on-new-version.pl 
blib/script/update-on-new-version.pl -> blib/script/bp_update-on-new- version.pl Copying scripts/corba/bioenv_server.pl -> blib/script/bioenv_server.pl Deleting blib/script/bioenv_server.pl.bak blib/script/bioenv_server.pl -> blib/script/bp_bioenv_server.pl Copying scripts/biosql/bioentry2flat.pl -> blib/script/bioentry2flat.pl Deleting blib/script/bioentry2flat.pl.bak blib/script/bioentry2flat.pl -> blib/script/bp_bioentry2flat.pl Copying scripts/biosql/load_interpro.pl -> blib/script/load_interpro.pl blib/script/load_interpro.pl -> blib/script/bp_load_interpro.pl Copying scripts/biosql/cgi-bin/getentry.pl -> blib/script/getentry.pl Deleting blib/script/getentry.pl.bak blib/script/getentry.pl -> blib/script/bp_getentry.pl Copying scripts/biosql/del-assocs-sql.pl -> blib/script/del-assocs- sql.pl blib/script/del-assocs-sql.pl -> blib/script/bp_del-assocs-sql.pl Copying scripts/biosql/freshen-annot.pl -> blib/script/freshen-annot.pl blib/script/freshen-annot.pl -> blib/script/bp_freshen-annot.pl t/11locuslink.t....1/113 ------------- EXCEPTION: Bio::Root::Exception ------------- MSG: Unique key query in Bio::DB::BioSQL::SpeciesAdaptor returned 2 rows instead of 1. 
Query was [name_class="scientific name",binomial="Homo sapiens"] STACK: Error::throw STACK: Bio::Root::Root::throw /usr/lib/perl5/site_perl/5.8.5/Bio/Root/ Root.pm:357 STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::_find_by_unique_key / root/mmroot/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:965 STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::find_by_unique_key / root/mmroot/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:860 STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create /root/mmroot/ bioperl-db/blib/lib/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:182 STACK: Bio::DB::Persistent::PersistentObject::create /root/mmroot/ bioperl-db/blib/lib/Bio/DB/Persistent/PersistentObject.pm:244 STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create /root/mmroot/ bioperl-db/blib/lib/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:169 STACK: Bio::DB::Persistent::PersistentObject::create /root/mmroot/ bioperl-db/blib/lib/Bio/DB/Persistent/PersistentObject.pm:244 STACK: t/11locuslink.t:30 ----------------------------------------------------------- # Looks like you planned 113 tests but ran 7. # Looks like your test exited with 255 just after 7. t/11locuslink.t.... Dubious, test returned 255 (wstat 65280, 0xff00) Failed 106/113 subtests Test Summary Report ------------------- t/11locuslink.t (Wstat: 65280 Tests: 7 Failed: 0) Non-zero exit status: 255 Parse errors: Bad plan. You planned 113 tests but ran 7. Files=1, Tests=7, 2 wallclock secs ( 0.01 usr 0.00 sys + 0.40 cusr 0.06 csys = 0.47 CPU) Result: FAIL Failed 1/1 test programs. 0/7 subtests failed. Thanks Mike > > > -hilmar > > On Feb 6, 2009, at 11:03 AM, Michael Muratet wrote: > >> Greetings >> >> I have use bioperl-db and load_seqdatabase.pl many times in the >> past and it's worked pretty much out of the box. >> >> I have been trying to load fasta files from hg18. 
For chr1, the >> virtual and resident memory quickly builds to 14 GB or so, then >> starts using up the 2GB swap until it's full and then the system >> hangs. The system is an 8-core Dell with 16GB of physical memory. >> chr1.fa is ~242MB. All of the disk storage is network mounted on an >> EMC system which (I am told) has a proprietary version of something >> that's NFS-like. >> >> I loaded chrM (~17kb) and load_seqdatabase grew to over 4 GB before >> it completed. >> >> I am using MySQL 5.0.51a-community, DBI 1.607, perl 5.85, bioperl >> 1.6.0, bioperl-db 1.006900. >> >> I have the innodb engine enabled in MySQL and the buffers and >> caches set for a 'large' system. >> >> I had some errors during the bioperl-db install: >> >> Test Summary Report >> ------------------- >> t/01dbadaptor.t (Wstat: 256 Tests: 23 Failed: 1) >> Failed test: 23 >> Non-zero exit status: 1 >> t/10ensembl.t (Wstat: 65280 Tests: 5 Failed: 0) >> Non-zero exit status: 255 >> Parse errors: Bad plan. You planned 18 tests but ran 5. >> t/11locuslink.t (Wstat: 65280 Tests: 7 Failed: 0) >> Non-zero exit status: 255 >> Parse errors: Bad plan. You planned 113 tests but ran 7. >> t/15cluster.t (Wstat: 65280 Tests: 7 Failed: 0) >> Non-zero exit status: 255 >> Parse errors: Bad plan. You planned 162 tests but ran 7. >> Files=16, Tests=1205, 21 wallclock secs ( 0.51 usr 0.12 sys + >> 14.90 cusr 2.69 csys = 18.22 CPU) >> Result: FAIL >> Failed 4/16 test programs. 1/1205 subtests failed. >> >> The most recent bioperl-db documentation says that a workable >> version may be possible after some errors and so I went ahead with >> the install. >> >> The mailing list archive has some discussion about throughput but >> nothing really about filling up memory. >> >> Can anyone offer any clues about what's going on or where to start >> looking? 
>> >> Thanks >> >> Mike >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > Michael Muratet mmuratet at hudsonalpha.org From hlapp at gmx.net Mon Feb 9 16:01:58 2009 From: hlapp at gmx.net (Hilmar Lapp) Date: Mon, 9 Feb 2009 16:01:58 -0500 Subject: [Bioperl-l] load_seqdatabase.pl memory requirements unusually large In-Reply-To: <7D70CEFA-1A2E-48A8-BFC0-31A24E02622E@hudsonalpha.org> References: <00C78AD9-02E4-4295-93E0-521DAE453842@hudsonalpha.org> <821B9510-247B-4CA6-AC22-E0420698DD5F@gmx.net> <7D70CEFA-1A2E-48A8-BFC0-31A24E02622E@hudsonalpha.org> Message-ID: <183A1C3B-8E07-452E-AB59-E5501D8143AD@gmx.net> Thanks Mike. Apparently in your version of Perl you don't see the stack traces unless you run them individually. Based on the output below I think that Chris was on the right track when he suspected lack of transaction support. To be sure, could you run ./Build test --test-files t/01dbadaptor.t The last test is for transaction support. If the above prints something like "your RDBMS does not have transactions enabled" and a failure of test #23, then that is the problem. In this case, which RDBMS are you using, and if it is MySQL, did you enable Innodb fully in the configuration file (you need to enable it *and* give the location of the data files)? Once you fixed that, you'll need to empty your database. -hilma On Feb 9, 2009, at 3:36 PM, Michael Muratet wrote: > > On Feb 6, 2009, at 1:34 PM, Hilmar Lapp wrote: > >> Something seems to cause Perl to be crapping out. It it were a >> programmatic exception you would see the message and the trace. 
>> >> Could you run these tests by themselves: >> >> ./Build test --test-files t/11locuslink.t >> >> If that doesn't reveal the error, add a verbose=1 argument. >> >> Let us know what you find. > > Here's the result: > > [root at srv-cf1 bioperl-db]# ../Build test --test-files t/11locuslink.t > bash: ../Build: No such file or directory > [root at srv-cf1 bioperl-db]# ./Build test --test-files t/11locuslink.t > Copying scripts/biosql/terms/importrelation.pl -> blib/script/ > importrelation.pl > blib/script/importrelation.pl -> blib/script/bp_importrelation.pl > Copying scripts/biosql/merge-unique-ann.pl -> blib/script/merge- > unique-ann.pl > blib/script/merge-unique-ann.pl -> blib/script/bp_merge-unique-ann.pl > Copying scripts/biosql/update-on-new-date.pl -> blib/script/update- > on-new-date.pl > blib/script/update-on-new-date.pl -> blib/script/bp_update-on-new- > date.pl > Copying scripts/biosql/terms/add-term-annot.pl -> blib/script/add- > term-annot.pl > Deleting blib/script/add-term-annot.pl.bak > blib/script/add-term-annot.pl -> blib/script/bp_add-term-annot.pl > Copying scripts/corba/caching_corba_server.pl -> blib/script/ > caching_corba_server.pl > Deleting blib/script/caching_corba_server.pl.bak > blib/script/caching_corba_server.pl -> blib/script/ > bp_caching_corba_server.pl > Copying scripts/biosql/load_ontology.pl -> blib/script/ > load_ontology.pl > Deleting blib/script/load_ontology.pl.bak > blib/script/load_ontology.pl -> blib/script/bp_load_ontology.pl > Copying scripts/biosql/load_seqdatabase.pl -> blib/script/ > load_seqdatabase.pl > Deleting blib/script/load_seqdatabase.pl.bak > blib/script/load_seqdatabase.pl -> blib/script/bp_load_seqdatabase.pl > Copying scripts/biosql/terms/interpro2go.pl -> blib/script/ > interpro2go.pl > blib/script/interpro2go.pl -> blib/script/bp_interpro2go.pl > Copying scripts/biosql/clean_ontology.pl -> blib/script/ > clean_ontology.pl > blib/script/clean_ontology.pl -> blib/script/bp_clean_ontology.pl > Copying 
scripts/corba/test_bioenv.pl -> blib/script/test_bioenv.pl > Deleting blib/script/test_bioenv.pl.bak > blib/script/test_bioenv.pl -> blib/script/bp_test_bioenv.pl > Copying scripts/biosql/update-on-new-version.pl -> blib/script/ > update-on-new-version.pl > blib/script/update-on-new-version.pl -> blib/script/bp_update-on-new- > version.pl > Copying scripts/corba/bioenv_server.pl -> blib/script/bioenv_server.pl > Deleting blib/script/bioenv_server.pl.bak > blib/script/bioenv_server.pl -> blib/script/bp_bioenv_server.pl > Copying scripts/biosql/bioentry2flat.pl -> blib/script/ > bioentry2flat.pl > Deleting blib/script/bioentry2flat.pl.bak > blib/script/bioentry2flat.pl -> blib/script/bp_bioentry2flat.pl > Copying scripts/biosql/load_interpro.pl -> blib/script/ > load_interpro.pl > blib/script/load_interpro.pl -> blib/script/bp_load_interpro.pl > Copying scripts/biosql/cgi-bin/getentry.pl -> blib/script/getentry.pl > Deleting blib/script/getentry.pl.bak > blib/script/getentry.pl -> blib/script/bp_getentry.pl > Copying scripts/biosql/del-assocs-sql.pl -> blib/script/del-assocs- > sql.pl > blib/script/del-assocs-sql.pl -> blib/script/bp_del-assocs-sql.pl > Copying scripts/biosql/freshen-annot.pl -> blib/script/freshen- > annot.pl > blib/script/freshen-annot.pl -> blib/script/bp_freshen-annot.pl > t/11locuslink.t....1/113 > ------------- EXCEPTION: Bio::Root::Exception ------------- > MSG: Unique key query in Bio::DB::BioSQL::SpeciesAdaptor returned 2 > rows instead of 1. 
Query was [name_class="scientific > name",binomial="Homo sapiens"] > STACK: Error::throw > STACK: Bio::Root::Root::throw /usr/lib/perl5/site_perl/5.8.5/Bio/ > Root/Root.pm:357 > STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::_find_by_unique_key / > root/mmroot/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:965 > STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::find_by_unique_key / > root/mmroot/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:860 > STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create /root/mmroot/ > bioperl-db/blib/lib/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:182 > STACK: Bio::DB::Persistent::PersistentObject::create /root/mmroot/ > bioperl-db/blib/lib/Bio/DB/Persistent/PersistentObject.pm:244 > STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create /root/mmroot/ > bioperl-db/blib/lib/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:169 > STACK: Bio::DB::Persistent::PersistentObject::create /root/mmroot/ > bioperl-db/blib/lib/Bio/DB/Persistent/PersistentObject.pm:244 > STACK: t/11locuslink.t:30 > ----------------------------------------------------------- > # Looks like you planned 113 tests but ran 7. > # Looks like your test exited with 255 just after 7. > t/11locuslink.t.... Dubious, test returned 255 (wstat 65280, 0xff00) > Failed 106/113 subtests > > Test Summary Report > ------------------- > t/11locuslink.t (Wstat: 65280 Tests: 7 Failed: 0) > Non-zero exit status: 255 > Parse errors: Bad plan. You planned 113 tests but ran 7. > Files=1, Tests=7, 2 wallclock secs ( 0.01 usr 0.00 sys + 0.40 > cusr 0.06 csys = 0.47 CPU) > Result: FAIL > Failed 1/1 test programs. 0/7 subtests failed. > > Thanks > > Mike > >> >> >> -hilmar >> >> On Feb 6, 2009, at 11:03 AM, Michael Muratet wrote: >> >>> Greetings >>> >>> I have use bioperl-db and load_seqdatabase.pl many times in the >>> past and it's worked pretty much out of the box. >>> >>> I have been trying to load fasta files from hg18. 
For chr1, the >>> virtual and resident memory quickly builds to 14 GB or so, then >>> starts using up the 2GB swap until it's full and then the system >>> hangs. The system is an 8-core Dell with 16GB of physical memory. >>> chr1.fa is ~242MB. All of the disk storage is network mounted on >>> an EMC system which (I am told) has a proprietary version of >>> something that's NFS-like. >>> >>> I loaded chrM (~17kb) and load_seqdatabase grew to over 4 GB >>> before it completed. >>> >>> I am using MySQL 5.0.51a-community, DBI 1.607, perl 5.85, bioperl >>> 1.6.0, bioperl-db 1.006900. >>> >>> I have the innodb engine enabled in MySQL and the buffers and >>> caches set for a 'large' system. >>> >>> I had some errors during the bioperl-db install: >>> >>> Test Summary Report >>> ------------------- >>> t/01dbadaptor.t (Wstat: 256 Tests: 23 Failed: 1) >>> Failed test: 23 >>> Non-zero exit status: 1 >>> t/10ensembl.t (Wstat: 65280 Tests: 5 Failed: 0) >>> Non-zero exit status: 255 >>> Parse errors: Bad plan. You planned 18 tests but ran 5. >>> t/11locuslink.t (Wstat: 65280 Tests: 7 Failed: 0) >>> Non-zero exit status: 255 >>> Parse errors: Bad plan. You planned 113 tests but ran 7. >>> t/15cluster.t (Wstat: 65280 Tests: 7 Failed: 0) >>> Non-zero exit status: 255 >>> Parse errors: Bad plan. You planned 162 tests but ran 7. >>> Files=16, Tests=1205, 21 wallclock secs ( 0.51 usr 0.12 sys + >>> 14.90 cusr 2.69 csys = 18.22 CPU) >>> Result: FAIL >>> Failed 4/16 test programs. 1/1205 subtests failed. >>> >>> The most recent bioperl-db documentation says that a workable >>> version may be possible after some errors and so I went ahead with >>> the install. >>> >>> The mailing list archive has some discussion about throughput but >>> nothing really about filling up memory. >>> >>> Can anyone offer any clues about what's going on or where to start >>> looking? 
>>> >>> Thanks >>> >>> Mike >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> -- >> =========================================================== >> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : >> =========================================================== >> >> >> > > Michael Muratet > mmuratet at hudsonalpha.org > > -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From sviiya at gmail.com Tue Feb 10 02:28:34 2009 From: sviiya at gmail.com (Sviya) Date: Tue, 10 Feb 2009 12:58:34 +0530 Subject: [Bioperl-l] Indexing CDS file Message-ID: Hello,I am trying to index cds files. The following script is run on cds_ann_hum_01_r99.dat obtained from EMBL. It is able to read and print the ids and sequences, but strangely it is unable to find any of the ids! Could you please explain the mistake I am committing? I have tried variously but it doesn't work. --- --- #!/usr/bin/perl -w use strict; use Bio::SeqIO; use Bio::Index::Fasta; my $file = "cds100.dat"; my $idx_file = "cds100.idx"; unlink($idx_file); # delete the file, if exists my $in_seq = Bio::SeqIO->new( -file => "< $file", -format => "embl"); while ( my $seq = $in_seq->next_seq) { print $seq->id, "\n"; } my $inx = Bio::Index::Fasta->new( -filename => $idx_file, -write_flag => 1); $inx->make_index($file); # creates the index file print "Done indexing!\n"; my $out = Bio::SeqIO->new( '-format' => 'fasta'); $inx = Bio::Index::Fasta->new('-filename' => $idx_file); my $query_seq=$inx->fetch('EAL24405SV'); if ( ! defined($query_seq) ) { print "Sequence not found!\n"; exit 1; } else { print $query_seq; } exit 0; --- --- Output is: --- --- EAL24405; EAL24406; . . . EAL24407; EAL24408; Done indexing! Sequence not found! 
--- --- TIA, Sviya From David.Messina at sbc.su.se Tue Feb 10 07:37:32 2009 From: David.Messina at sbc.su.se (Dave Messina) Date: Tue, 10 Feb 2009 13:37:32 +0100 Subject: [Bioperl-l] Indexing CDS file In-Reply-To: References: Message-ID: <628aabb70902100437o186c1f34i4da76715ba6b3306@mail.gmail.com> Hi Sviya, You've almost got it. You need to use Bio::Index::EMBL if you're going to be indexing EMBL-format files. I modified your code to do so, and it worked for me (attached below). Please try it and report back if you have any problems. One thing: note that there aren't any 'AC' lines in that file you're indexing, so I saw errors like this: --------------------- WARNING --------------------- MSG: For id [EAL24318;] in embl flat file, got no accession number. Storing id index anyway --------------------------------------------------- I did see accession-like identifiers on the 'PA' lines, however. Does anybody know if this is a change in EMBL format that we need to adapt B::I::EMBL to? Dave ----------------------------CODE BEGINS----------------------------- #!/usr/bin/perl -w use strict; use Bio::SeqIO; use Bio::Index::EMBL; my $file = "cds100.dat"; my $idx_file = "cds100.idx"; unlink($idx_file); # delete the file, if exists my $in_seq = Bio::SeqIO->new( -file => "< $file", -format => "embl"); while ( my $seq = $in_seq->next_seq) { print $seq->id, "\n"; } my $inx = Bio::Index::EMBL->new(-filename => $idx_file, -write_flag => 'WRITE'); my $retval = $inx->make_index($file,); # creates the index file print "Done indexing!\n"; my $out = Bio::SeqIO->new( '-format' => 'fasta', '-fh' => \*STDOUT); my $query_seq = $inx->fetch('EAL24309;'); if ( ! 
defined($query_seq) ) { print "Sequence not found!\n"; exit 1; } else { $out->write_seq($query_seq); } exit 0; -------------------------CODE ENDS-------------------------------- From markus.liebscher at gmx.de Tue Feb 10 09:40:49 2009 From: markus.liebscher at gmx.de (manni122) Date: Tue, 10 Feb 2009 06:40:49 -0800 (PST) Subject: [Bioperl-l] Cannot read in alignment data with Bio::AlignIO Message-ID: <21935058.post@talk.nabble.com> Hi, I am trying to read in a file with multiple pairwise alignments. Some IDs appear frequently. So if I am using this code below I get the error message: --- MSG: Replacing one sequence xxx --- Is there a way to read the data even with those similar names? Regards, manni122. use Bio::AlignIO; my $in = Bio::AlignIO->new(-file => "align.fas" , -format => 'fasta'); my $aln = $in->next_result(); foreach my $seqObj ($aln->each_seq) { print $seqObj->display_id, "\n"; } -- View this message in context: http://www.nabble.com/Cannot-read-in-alignment-data-with-Bio%3A%3AAlignIO-tp21935058p21935058.html Sent from the Perl - Bioperl-L mailing list archive at Nabble.com. From cjfields at illinois.edu Tue Feb 10 10:27:23 2009 From: cjfields at illinois.edu (Chris Fields) Date: Tue, 10 Feb 2009 09:27:23 -0600 Subject: [Bioperl-l] Cannot read in alignment data with Bio::AlignIO In-Reply-To: <21935058.post@talk.nabble.com> References: <21935058.post@talk.nabble.com> Message-ID: On Feb 10, 2009, at 8:40 AM, manni122 wrote: > > Hi, > I am trying to read in a file with multiple pairwise alignments. > Some IDs > appear frequently. So if I am using this code below I get the error > message: > --- MSG: Replacing one sequence xxx --- > Is there a way to read the data even with those similar names? > Regards, manni122. [...] The NSE (Name.version/start-end) is used to distinguish the sequences from one another, so if each sequence has one or more unique accession/ version/start/end there should be no replacement (and no warning). 
If you think about it that's a feature. Any single sequence that appears in an alignment more than once is either (1) matching multiple regions (i.e. repeats, motifs, etc) so the location varies, or (2) the sequence was modified so the version changes (the last one is fairly new). Beyond that one has to question the logic of including multiple copies of exactly the same sequence record in a multiple alignment, so unless additional information distinguishing the potential duplicates is provided we assume unintentional (and erroneous) duplication and punt. Weighing the options I would rather have the warning indicating a problem than nothing at all. If you absolutely need duplicates (I am curious as to why) I suggest changing the version number: use Bio::LocatableSeq; use Bio::SimpleAlign; use Bio::AlignIO; my $aln = Bio::SimpleAlign->new(); my $out = Bio::AlignIO->new(-format => 'clustalw'); for my $v (1..10) { my $ls = Bio::LocatableSeq->new(-id => 'ABCD1234', -version => $v, -alphabet => 'dna', -seq => '--atg---gta--'); $aln->add_seq($ls); } $out->write_aln($aln); # output below ___DATA___ CLUSTAL W(1.81) multiple sequence alignment ABCD1234.1/1-6 --atg---gta-- ABCD1234.2/1-6 --atg---gta-- ABCD1234.3/1-6 --atg---gta-- ABCD1234.4/1-6 --atg---gta-- ABCD1234.5/1-6 --atg---gta-- ABCD1234.6/1-6 --atg---gta-- ABCD1234.7/1-6 --atg---gta-- ABCD1234.8/1-6 --atg---gta-- ABCD1234.9/1-6 --atg---gta-- ABCD1234.10/1-6 --atg---gta-- *** *** chris From cjfields at illinois.edu Wed Feb 11 00:28:43 2009 From: cjfields at illinois.edu (Chris Fields) Date: Tue, 10 Feb 2009 23:28:43 -0600 Subject: [Bioperl-l] [ANNOUNCEMENT] Second alpha 1.6 releases of BioPerl-run, BioPerl-db, BioPerl-network Message-ID: <909E8705-7010-4625-ADB8-B10465CEA9EB@illinois.edu> All, I would like to announce that the second alpha releases for BioPerl- run, BioPerl-db, and BioPerl-network are now available. These are designated as 1.005009_002, with a requirement for BioPerl 1.6 and higher (1.006000). 
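For readers unsure how the dotted release names relate to the six-decimal form quoted above, here is a minimal sketch using Perl's core `version` module. The variable names are illustrative only and are not taken from any BioPerl build script; the sketch just shows the general three-digits-per-component mapping and that the alpha designation sorts below 1.6.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use version;

# Dotted release names, parsed as version objects.
my $core_required = version->parse('v1.6.0');
my $next_point    = version->parse('v1.6.1');

# numify() renders each component after the first as three decimal digits.
print "v1.6.0 as a decimal: ", $core_required->numify, "\n";
print "v1.6.1 as a decimal: ", $next_point->numify,    "\n";

# The alpha designation from the announcement, compared against 1.6.
my $alpha = version->parse('1.005009_002');
print "alpha sorts before 1.6: ",
    ( $alpha < $core_required ? "yes" : "no" ), "\n";
```

Comparing version objects rather than raw strings avoids surprises when a release number contains an underscore.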
TODO:

1) All subdistributions need updates to the Changes, BUGS, AUTHORS, etc. files. Please make updates as needed to the main trunk (I'll merge them over to the relevant branch).

FIXED:

1) BioPerl-run TCoffee tests are now passing with latest TCoffee version (Charles Plessy and myself)
2) Bio::ConfigData (config module installed via Module::Build) should now be skipped over until we can decide what to do with it (Alex Lancaster). This will also be incorporated into trunk.
3) Several module doc fixes made (thanks to Dave Messina and Charles Plessy for reporting)
4) Doc updates (BIO!)
5) bioperl-db now handles undefined variables correctly (Johann Pellet and Hilmar Lapp)
6) GeneMark tests now work for older gene models (Mark Johnson)
7) Build.PL should now die with a warning if the proper BioPerl core version isn't installed

The archives can be downloaded from here:

BioPerl-run:
http://bioperl.org/DIST/BioPerl-run-1.5.9_2.tar.bz2
http://bioperl.org/DIST/BioPerl-run-1.5.9_2.tar.gz
http://bioperl.org/DIST/BioPerl-run-1.5.9_2.zip

BioPerl-db:
http://bioperl.org/DIST/BioPerl-db-1.5.9_2.tar.bz2
http://bioperl.org/DIST/BioPerl-db-1.5.9_2.tar.gz
http://bioperl.org/DIST/BioPerl-db-1.5.9_2.zip

BioPerl-network:
http://bioperl.org/DIST/BioPerl-network-1.5.9_2.tar.bz2
http://bioperl.org/DIST/BioPerl-network-1.5.9_2.tar.gz
http://bioperl.org/DIST/BioPerl-network-1.5.9_2.zip

SIGNATURES:
http://bioperl.org/DIST/SIGNATURES.md5

I will likely release the final 1.6 releases for these distributions in the next week unless serious issues pop up. The next point release (1.6.1, all distributions) should land sometime in mid-April.

Cheers!
chris

From heikki.lehvaslaiho at gmail.com Wed Feb 11 02:57:12 2009
From: heikki.lehvaslaiho at gmail.com (Heikki Lehvaslaiho)
Date: Wed, 11 Feb 2009 09:57:12 +0200
Subject: [Bioperl-l] Indexing CDS file
In-Reply-To: <628aabb70902100437o186c1f34i4da76715ba6b3306@mail.gmail.com>
References: <628aabb70902100437o186c1f34i4da76715ba6b3306@mail.gmail.com>
Message-ID:

Dave,

The PA line is not mentioned in http://www.ebi.ac.uk/embl/Documentation/User_manual/usrman.html so I assume it is a local modification (or an error). If you have the location of the file with PA lines on the EMBL FTP server, why not report this strange behaviour at http://www.ebi.ac.uk/support/

-Heikki

2009/2/10 Dave Messina :
> I did see accession-like identifiers on the 'PA' lines, however. Does
> anybody know if this is a change in EMBL format that we need to adapt
> B::I::EMBL to?

From David.Messina at sbc.su.se Wed Feb 11 05:29:41 2009
From: David.Messina at sbc.su.se (Dave Messina)
Date: Wed, 11 Feb 2009 11:29:41 +0100
Subject: [Bioperl-l] Indexing CDS file
In-Reply-To:
References: <628aabb70902100437o186c1f34i4da76715ba6b3306@mail.gmail.com>
Message-ID: <628aabb70902110229h783c2f9ch177284611eae4b46@mail.gmail.com>

Thanks, Heikki.

I took a closer look at the EBI ftp site where Sviya and I got the file, and in their README (ftp://ftp.ebi.ac.uk/pub/databases/embl/cds/README.txt) it says:

PA line - contains the accession.version of the "parent" EMBL entry (entry where the CDS is annotated)

So, unfortunately they've decided that a CDS record, which has no accession of its own, doesn't get its parent's accession number, but gets to refer to its parent's accession number via the PA line.

Furthermore, there's an

OX line - contains the NCBI taxid for the organism; taxonomic data are taken from the parent EMBL entries

which is also not part of the formal spec (although this one is a more worthwhile addition, IMO).

Sooooo, I think we'll need to add support for these.
'PA' seems easy enough -- the EMBL parser can look for it if there isn't an 'AC' line. As for 'OX', is there a standard slot for a taxonID in a RichSeq SeqFeature table? Coming from a Genbank record or a vanilla EMBL record, this is normally encoded as primary tag: source tag: db_xref value: taxon:9606 right? Should do the same if we're coming from an EMBL entry, even though it's not actually in the feature table? Dave From avilella at gmail.com Wed Feb 11 07:34:29 2009 From: avilella at gmail.com (Albert Vilella) Date: Wed, 11 Feb 2009 12:34:29 +0000 Subject: [Bioperl-l] supporting hmmer3 in bioperl-run and bioperl-live Message-ID: <358f4d650902110434m2033b25k173ca4c45744a5d8@mail.gmail.com> Hi, Is there anyone interested in updating the support for running and parsing results from HMMER3 in bioperl? Cheers, Albert. From heikki.lehvaslaiho at gmail.com Wed Feb 11 07:44:08 2009 From: heikki.lehvaslaiho at gmail.com (Heikki Lehvaslaiho) Date: Wed, 11 Feb 2009 14:44:08 +0200 Subject: [Bioperl-l] Indexing CDS file In-Reply-To: <628aabb70902110229h783c2f9ch177284611eae4b46@mail.gmail.com> References: <628aabb70902100437o186c1f34i4da76715ba6b3306@mail.gmail.com> <628aabb70902110229h783c2f9ch177284611eae4b46@mail.gmail.com> Message-ID: Dave, Looks good. Are you going to do the changes in to the EMBL parser? -Heikki 2009/2/11 Dave Messina : > Thanks, Heikki. > > I took a closer look at the EBI ftp site where Sviya and I got the file, and > in their README (ftp://ftp.ebi.ac.uk/pub/databases/embl/cds/README.txt) it > says: > > PA line - contains the accession.version of the "parent" EMBL entry > (entry where the CDS is annotated) > > > So, unfortunately they've decided that a CDS record, which has no accession > of its own, doesn't get its parent's accession number, but gets to refer to > its parent's accession number via the PA line. 
> > Furthermore, there's an > > OX line - contains the NCBI taxid for the organism; taxonomic data are taken > from the parent EMBL entries > > which is also not part of the the formal spec. (although this one is a more > worthwhile addition, IMO) > > Sooooo, I think we'll need to add support for these. > > 'PA' seems easy enough -- the EMBL parser can look for it if there isn't an > 'AC' line. > > As for 'OX', is there a standard slot for a taxonID in a RichSeq SeqFeature > table? Coming from a Genbank record or a vanilla EMBL record, this is > normally encoded as > > primary tag: source > tag: db_xref > value: taxon:9606 > > right? > > Should do the same if we're coming from an EMBL entry, even though it's not > actually in the feature table? > > > Dave > > -- -Heikki Heikki Lehvaslaiho - heikki lehvaslaiho gmail com Sent from: Johannesburg Gauteng South Africa. From cjfields at illinois.edu Wed Feb 11 08:22:30 2009 From: cjfields at illinois.edu (Chris Fields) Date: Wed, 11 Feb 2009 07:22:30 -0600 Subject: [Bioperl-l] supporting hmmer3 in bioperl-run and bioperl-live In-Reply-To: <358f4d650902110434m2033b25k173ca4c45744a5d8@mail.gmail.com> References: <358f4d650902110434m2033b25k173ca4c45744a5d8@mail.gmail.com> Message-ID: Interest, yes. Tuits are another thing... HMMER3 is still in early alpha IIRC, but I did get an Infernal parser working pre-1.0 release so... chris On Feb 11, 2009, at 6:34 AM, Albert Vilella wrote: > Hi, > > Is there anyone interested in updating the support for running and > parsing > results from HMMER3 in bioperl? > > Cheers, > > Albert. 
> _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Wed Feb 11 08:24:30 2009 From: cjfields at illinois.edu (Chris Fields) Date: Wed, 11 Feb 2009 07:24:30 -0600 Subject: [Bioperl-l] Indexing CDS file In-Reply-To: References: <628aabb70902100437o186c1f34i4da76715ba6b3306@mail.gmail.com> <628aabb70902110229h783c2f9ch177284611eae4b46@mail.gmail.com> Message-ID: <41E657AB-07E4-422B-9AB2-5FFD6DFC20C2@illinois.edu> I'm guessing that line would be similar to DBSOURCE in GenPept files. Could probably use Bio::Annotation::DBLink or Bio::Annotation::Target for it (if it corresponds to a particular subset of the sequence). chris On Feb 11, 2009, at 6:44 AM, Heikki Lehvaslaiho wrote: > Dave, > > Looks good. Are you going to do the changes in to the EMBL parser? > > -Heikki > > 2009/2/11 Dave Messina : >> Thanks, Heikki. >> >> I took a closer look at the EBI ftp site where Sviya and I got the >> file, and >> in their README (ftp://ftp.ebi.ac.uk/pub/databases/embl/cds/README.txt >> ) it >> says: >> >> PA line - contains the accession.version of the "parent" EMBL entry >> (entry where the CDS is annotated) >> >> >> So, unfortunately they've decided that a CDS record, which has no >> accession >> of its own, doesn't get its parent's accession number, but gets to >> refer to >> its parent's accession number via the PA line. >> >> Furthermore, there's an >> >> OX line - contains the NCBI taxid for the organism; taxonomic data >> are taken >> from the parent EMBL entries >> >> which is also not part of the the formal spec. (although this one >> is a more >> worthwhile addition, IMO) >> >> Sooooo, I think we'll need to add support for these. >> >> 'PA' seems easy enough -- the EMBL parser can look for it if there >> isn't an >> 'AC' line. >> >> As for 'OX', is there a standard slot for a taxonID in a RichSeq >> SeqFeature >> table? 
Coming from a Genbank record or a vanilla EMBL record, this is >> normally encoded as >> >> primary tag: source >> tag: db_xref >> value: taxon:9606 >> >> right? >> >> Should do the same if we're coming from an EMBL entry, even though >> it's not >> actually in the feature table? >> >> >> Dave >> >> > > > > -- > -Heikki > Heikki Lehvaslaiho - heikki lehvaslaiho gmail com > Sent from: Johannesburg Gauteng South Africa. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From David.Messina at sbc.su.se Wed Feb 11 08:39:55 2009 From: David.Messina at sbc.su.se (Dave Messina) Date: Wed, 11 Feb 2009 14:39:55 +0100 Subject: [Bioperl-l] Indexing CDS file In-Reply-To: <41E657AB-07E4-422B-9AB2-5FFD6DFC20C2@illinois.edu> References: <628aabb70902100437o186c1f34i4da76715ba6b3306@mail.gmail.com> <628aabb70902110229h783c2f9ch177284611eae4b46@mail.gmail.com> <41E657AB-07E4-422B-9AB2-5FFD6DFC20C2@illinois.edu> Message-ID: <628aabb70902110539x5c2c14q945bd370f085c090@mail.gmail.com> [Chris] > Could probably use Bio::Annotation::DBLink or Bio::Annotation::Target for > it (if it corresponds to a particular subset of the sequence). > Okay, will do. I took a look at the DBSOURCE handling code in SeqIO::genbank and will model it after that. > [Heikki] > >> Dave, >> >> Looks good. Are you going to do the changes in to the EMBL parser? >> > Thanks. Yes, I'll do it, probably later today. D From David.Messina at sbc.su.se Wed Feb 11 08:48:52 2009 From: David.Messina at sbc.su.se (Dave Messina) Date: Wed, 11 Feb 2009 14:48:52 +0100 Subject: [Bioperl-l] supporting hmmer3 in bioperl-run and bioperl-live In-Reply-To: References: <358f4d650902110434m2033b25k173ca4c45744a5d8@mail.gmail.com> Message-ID: <628aabb70902110548n384472aeq39787b47dbb346f7@mail.gmail.com> HMMER3 is still in early alpha IIRC, but I did get an Infernal parser > working pre-1.0 release so... 
I am somewhat interested in doing this, but I'm a little reluctant to do so yet because formats and options are a moving target.

HMMER's author Sean Eddy wrote on his blog:

> The core of H3's functionality seems stable to me, but all the stuff that
> *you* see - the applications, the command line options, the i/o formats -
> is deliberately still prototypical and fluid.

So personally I'm inclined to wait, but don't let that stop anyone else from jumping ahead.

D

From cjfields at illinois.edu Wed Feb 11 09:05:46 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 11 Feb 2009 08:05:46 -0600
Subject: [Bioperl-l] supporting hmmer3 in bioperl-run and bioperl-live
In-Reply-To: <628aabb70902110548n384472aeq39787b47dbb346f7@mail.gmail.com>
References: <358f4d650902110434m2033b25k173ca4c45744a5d8@mail.gmail.com> <628aabb70902110548n384472aeq39787b47dbb346f7@mail.gmail.com>
Message-ID:

On Feb 11, 2009, at 7:48 AM, Dave Messina wrote:

> HMMER3 is still in early alpha IIRC, but I did get an Infernal
> parser working pre-1.0 release so...
>
> I am somewhat interested in doing this, but I'm a little reluctant
> to do so yet because formats and options are a moving target.
>
> HMMER's author Sean Eddy wrote on his blog:
> The core of H3's functionality seems stable to me, but all the stuff
> that you see - the applications, the command line options, the i/o
> formats - is deliberately still prototypical and fluid.
>
> So personally I'm inclined to wait, but don't let that stop anyone
> else from jumping ahead.
>
> D

If someone pursues this, I suggest keeping a separate HMMER3 set of modules. Either use something like Bio::SearchIO::hmmer3, Bio::Tools::Run::Hmmer3, or separate out the HMMER2 code from HMMER3 (wrap them in their own distinct 'plugin' and load them on the fly within Bio::SearchIO::hmmer/Bio::Tools::Run::Hmmer).
There will likely be significant differences in output (as indicated by Sean Eddy's blog) and in program parameters/options. Keeping the code separate also keeps the HMMER2 code fast (no additional regex/parameter checks for HMMER3-specific output) and makes deprecating the HMMER2 code easier a few years or so down the road.

chris

From brunovecchi at yahoo.com.ar Wed Feb 11 16:02:09 2009
From: brunovecchi at yahoo.com.ar (Bruno)
Date: Wed, 11 Feb 2009 13:02:09 -0800 (PST)
Subject: [Bioperl-l] Protein families
Message-ID: <974162.66959.qm@web110506.mail.gq1.yahoo.com>

Hello everyone,

This question is somewhat unrelated to Bioperl technical issues, but I hope I can get some answers. What would be a sane way to determine whether a sequence is part of a family? Since that is too broad a question, I'll restrict it:

- It doesn't have to use online services.
- It has to be scriptable.
- It has to rely only on the amino acid sequence (i.e., no experimental evidence, including 3D structure).
- If possible, it should be fast.
- For extra points, it should be simple (or complicated, but have a ready-to-use library).

The context is this: I want to perform some GA randomization on a protein sequence to optimize for an arbitrary target function (for instance, increase occurrence of certain types of proteolytic enzymes), but I also want to minimize the chance of losing the protein's original function. So I thought that I'd need some sort of quantitative measure of how close the sequence is to belonging to the original's family.

The simplest way that I can think of for doing this is to first build a profile for the family, based on a multiple sequence alignment; then to align each random sequence against the profile and calculate an e-value. But since I don't know much about these things, I really can't judge whether it makes sense or is completely wrong.
Using Bio::Tools::HMM sounded fine, but unfortunately it doesn't offer a method for calculating the probability of an observation sequence, given the profile.

What would you suggest? Thanks in advance!

PS: If there is a more appropriate mailing list for this sort of question, please don't hesitate to educate me.

Bruno.

From jason at bioperl.org Wed Feb 11 16:16:52 2009
From: jason at bioperl.org (Jason Stajich)
Date: Wed, 11 Feb 2009 13:16:52 -0800
Subject: [Bioperl-l] Protein families
In-Reply-To: <974162.66959.qm@web110506.mail.gq1.yahoo.com>
References: <974162.66959.qm@web110506.mail.gq1.yahoo.com>
Message-ID:

> The simplest way that I can think of for doing this is to first
> build a profile for the family, based on a multiple sequence
> alignment; then to align each random sequence against the profile and
> calculate an e-value. But since I don't know much about these things, I
> really can't judge whether it makes sense or is completely wrong.
> Using Bio::Tools::HMM sounded fine, but unfortunately it doesn't offer
> a method for calculating the probability of an observation sequence,
> given the profile.

I'm not entirely sure about the whole problem you describe, but if you are using HMMER for this, "hmmsearch" does give you an e-value for the similarity of a sequence to the profile. You need to run hmmcalibrate beforehand, though; this is covered in the HMMER user manual.

> What would you suggest? Thanks in advance!
>
> PS: If there is a more appropriate mailing list for this sort of
> question, please don't hesitate to educate me.
>
> Bruno.

Jason Stajich
jason at bioperl.org

From manni122 at hotmail.com Fri Feb 6 04:09:18 2009
From: manni122 at hotmail.com (manni122)
Date: Fri, 6 Feb 2009 01:09:18 -0800 (PST)
Subject: [Bioperl-l] Alignment optimization using degenerate code
Message-ID: <21868936.post@talk.nabble.com>

I have a programming problem that, once solved, might be worth adding to the BioPerl bundle. I want to pairwise align DNA sequences with as few gaps as possible. To achieve this I want to run a first round of alignment, then compare the aligned pair and search for opposite codons that can each be exchanged for another codon coding for the corresponding amino acid. This should increase their identity score. Does anyone have a good starting point for this?

Appreciate any help with this!

Markus

--
View this message in context: http://www.nabble.com/Alignment-optimization-using-degenerate-code-tp21868936p21868936.html
Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.

From suedejohnny at hotmail.com Sun Feb 8 14:25:04 2009
From: suedejohnny at hotmail.com (Julio514)
Date: Sun, 8 Feb 2009 11:25:04 -0800 (PST)
Subject: [Bioperl-l] Plse help! Probs trying with SeqIO in web service (web form)!!!
Message-ID: <21902369.post@talk.nabble.com>

Hello everyone,

I've been using bioperl for more than one year now. Recently, I started a project of establishing a web service that accepts as input one or more FASTA sequences. I tested my script offline with .fa files directly on my HD to make sure everything was fine. And it was... After that, I modified the script with some CGI lines to make it compatible with web forms. (btw I still am a noob in web services:)).
Anyway, the fasta input sequences seems to cause a prob I never encountered before and I am clueless... Here it is: Error in tempdir() using \\\\XXXXXXXXXX: Could not create directory \\\\6RzqdL7zU9: Invalid argument at C:/perl/site/lib/Bio/Root/IO.pm line 744, Anyone ever saw that ? I made sure that my $ENV{TMPDIR} was NOT read-only... Cheers, -- View this message in context: http://www.nabble.com/Plse-help%21-Probs-trying-with-SeqIO-in-web-service-%28web-form%29%21%21%21-tp21902369p21902369.html Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.
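
One way to narrow an error like this down is to ask Perl, from inside the CGI process itself, which temp directory it actually sees and whether it can write there. What follows is only a minimal sketch using core modules; scratch_dir and the seqio_ prefix are made-up names for illustration, and none of this is BioPerl's own tempdir logic.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);

# Hypothetical helper: return a fresh scratch directory under $base,
# falling back to File::Spec's idea of the temp dir when $base is empty.
sub scratch_dir {
    my ($base) = @_;
    $base = File::Spec->tmpdir unless defined $base && length $base;

    # Fail early with a readable message instead of a cryptic tempdir() error.
    die "Temp base '$base' is not a writable directory\n"
        unless -d $base && -w $base;

    # CLEANUP => 1 removes the directory when the program exits.
    return tempdir('seqio_XXXXXX', DIR => $base, CLEANUP => 1);
}

my $dir = scratch_dir($ENV{TMPDIR});
print "Scratch directory: $dir\n";
```

Running the same few lines once from the command line and once through the web form should show whether the two environments disagree about the temp location.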
From vecchi.b at gmail.com Wed Feb 11 16:37:52 2009 From: vecchi.b at gmail.com (Bruno Vecchi) Date: Wed, 11 Feb 2009 19:37:52 -0200 Subject: [Bioperl-l] Protein families In-Reply-To: References: <974162.66959.qm@web110506.mail.gq1.yahoo.com> Message-ID: <1a0c1b750902111337uccadc89y5a0bfb6268d97ff7@mail.gmail.com> Thanks! Coincidentally, after submitting my question I found about HMMER. It seems to be the right tool, and It also has a BioPerl module to parse its results. 2009/2/11 Jason Stajich > >> The simplest way that I can think of for doing this is to first >> build a profile for the family, based on a multiple sequence >> alignment; then to align each random sequence against the profile and >> calculate an e-value. But since I don't know much about this things, I >> really can't judge whether it makes sense or is completely wrong. >> Using Bio::Tools::HMM sounded fine, but unfortunately it doesn't offer >> a method for calculating the probability of an observation sequence, >> given the profile. >> >> I'm not entirely sure about the whole problem you describe but if you are > using HMMER for this - "hmmsearch" does give you e-value of similarity of > sequence to the profile - you need to do the hmmcalibrate beforehand though > - this is covered in HMMER user manual. > > > What would you suggest? Thanks in advance! >> >> PS: If there is a more appropriate mailing list for this sort of >> questions, please don't hesitate to educate me. >> >> Bruno. >> >> >> >> Yahoo! 
Cocina >> Recetas pr?cticas y comida saludable >> http://ar.mujer.yahoo.com/cocina/ >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > > Jason Stajich > jason at bioperl.org > > > > From bosborne11 at verizon.net Wed Feb 11 16:28:40 2009 From: bosborne11 at verizon.net (Brian Osborne) Date: Wed, 11 Feb 2009 16:28:40 -0500 Subject: [Bioperl-l] bp_biblio.pl doesn't work In-Reply-To: <5b2271350902090101n2887644ah84512a324496a4c5@mail.gmail.com> References: <5b2271350902090101n2887644ah84512a324496a4c5@mail.gmail.com> Message-ID: Spring, A bit late here, sorry about that. I am not certain that this SOAP server is still completely operational. I would use NCBI eutils instead, so take a look at examples/biblio-eutils-example.pl and see if it meets your needs. Brian O. On Feb 9, 2009, at 4:01 AM, spring gao wrote: > Hi, all, > > > I have some questions about the usage of bp_biblio.pl . > when execute bp_biblio.pl example, return the following Errors. > who knows how to repair this problem? Thank you! > > Information: > bp_biblio.pl -v > 1.006 > > Shell OUTPUT: > bp_biblio.pl - -find java -attrs abstract -find perl > Looking for 'java' in attributes 'abstract'... > ------------- EXCEPTION ------------- > MSG: --- SOAP FAULT --- > soapenv:Server.userException embl.ebi.BibShare.BQSException: An > empty query. > It may happen because of using non-existing attributes. 
> STACK Bio::DB::Biblio::soap::__ANON__ > /usr/local/share/perl/5.8.8/Bio/DB/Biblio/soap.pm:109 > STACK SOAP::Lite::call /usr/share/perl5/SOAP/Lite.pm:3412 > STACK SOAP::Lite::__ANON__ /usr/share/perl5/SOAP/Lite.pm:3377 > STACK Bio::DB::Biblio::soap::find /usr/local/share/perl/5.8.8/Bio/DB/ > Biblio/ > soap.pm:393 > STACK main::_find /usr/local/bin/bp_biblio.pl:214 > STACK toplevel /usr/local/bin/bp_biblio.pl:107 > ------------------------------------- > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From David.Messina at sbc.su.se Wed Feb 11 18:24:21 2009 From: David.Messina at sbc.su.se (Dave Messina) Date: Thu, 12 Feb 2009 00:24:21 +0100 Subject: [Bioperl-l] Indexing CDS file In-Reply-To: <628aabb70902110539x5c2c14q945bd370f085c090@mail.gmail.com> References: <628aabb70902100437o186c1f34i4da76715ba6b3306@mail.gmail.com> <628aabb70902110229h783c2f9ch177284611eae4b46@mail.gmail.com> <41E657AB-07E4-422B-9AB2-5FFD6DFC20C2@illinois.edu> <628aabb70902110539x5c2c14q945bd370f085c090@mail.gmail.com> Message-ID: <628aabb70902111524k504911ffwc4782d2eabf4b04f@mail.gmail.com> I just committed support for these weirdo CDS records, along with some tests. One unexpected issue cropped up: I had to make a change in _read_EMBL_Species to have it pushback() a line after slurping all of the 'OS' and 'OC' lines, since the slurping loop was improperly consuming the next line after the final 'OC' line. I don't think anybody noticed before because vanilla EMBL records have a spacer 'XX' line as the next line, whereas the CDS records have the 'OX' line. Tests pass, but please let me know if you see any issues. 
Dave From cjfields at illinois.edu Wed Feb 11 19:41:12 2009 From: cjfields at illinois.edu (Chris Fields) Date: Wed, 11 Feb 2009 18:41:12 -0600 Subject: [Bioperl-l] Indexing CDS file In-Reply-To: <628aabb70902111524k504911ffwc4782d2eabf4b04f@mail.gmail.com> References: <628aabb70902100437o186c1f34i4da76715ba6b3306@mail.gmail.com> <628aabb70902110229h783c2f9ch177284611eae4b46@mail.gmail.com> <41E657AB-07E4-422B-9AB2-5FFD6DFC20C2@illinois.edu> <628aabb70902110539x5c2c14q945bd370f085c090@mail.gmail.com> <628aabb70902111524k504911ffwc4782d2eabf4b04f@mail.gmail.com> Message-ID: I'll likely merge it over tonight to branch. Thanks Dave! chris On Feb 11, 2009, at 5:24 PM, Dave Messina wrote: > I just committed support for these weirdo CDS records, along with some > tests. > One unexpected issue cropped up: > I had to make a change in _read_EMBL_Species to have it pushback() a > line > after slurping all of the 'OS' and 'OC' lines, since the slurping > loop was > improperly consuming the next line after the final 'OC' line. > > I don't think anybody noticed before because vanilla EMBL records > have a > spacer 'XX' line as the next line, whereas the CDS records have the > 'OX' > line. > > Tests pass, but please let me know if you see any issues. > > > Dave > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From chrysain at gmail.com Wed Feb 11 21:13:29 2009 From: chrysain at gmail.com (Chrysanthi A.) Date: Thu, 12 Feb 2009 02:13:29 +0000 Subject: [Bioperl-l] problem parsing a newick format Message-ID: <66b602900902111813oef7bb23k45be3c97e4fcefe6@mail.gmail.com> Is the code below correct?? Why it does not print anything??? 
use strict;

use Bio::TreeIO;

my $input = new Bio::TreeIO(-fh =>"nexusCytochrome7R.newick",
                            -format => "newick");

while(my $tree = $input->next_tree){
    for my $node(grep{!$_->is_Leaf}$tree->get_nodes){
        next if !$node->ancestor;
        print "Node:", $node->id, "length:", $node->branch_length, " ";
        for my $child($node->get_Descendents){
            print "child:", $child->id, "", $child->branch_length, " ";
        }
        print "\n";
    }
}

Any ideas? I want to read a tree and mainly get the duplication events.
Could someone help me?

Thanks a lot,

Chrysanthi

From maj at fortinbras.us Wed Feb 11 21:24:13 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 11 Feb 2009 21:24:13 -0500
Subject: [Bioperl-l] problem parsing a newick format
In-Reply-To: <66b602900902111813oef7bb23k45be3c97e4fcefe6@mail.gmail.com>
References: <66b602900902111813oef7bb23k45be3c97e4fcefe6@mail.gmail.com>
Message-ID: <8468E65F03F84EC09E9EB152B78AB71C@NewLife>

C- I think you maybe want

my $input = new Bio::TreeIO(-file =>"nexusCytochrome7R.newick",
                            -format => "newick");

and not

> my $input = new Bio::TreeIO(-fh =>"nexusCytochrome7R.newick",
> -format => "newick");

?

Mark

----- Original Message -----
From: "Chrysanthi A."
To: "BioPerl List"
Sent: Wednesday, February 11, 2009 9:13 PM
Subject: [Bioperl-l] problem parsing a newick format

> Is the code below correct?? Why it does not print anything???
> use strict;
>
> use Bio::TreeIO;
>
>
> my $input = new Bio::TreeIO(-fh =>"nexusCytochrome7R.newick",
> -format => "newick");
>
> while(my $tree = $input->next_tree){
> for my $node(grep{!$_->is_Leaf}$tree->get_nodes){
> next if !$node->ancestor;
> print "Node:", $node->id, "length:", $node->branch_length, " ";
> for my $child($node->get_Descendents){
> print "child:", $child->id, "", $child->branch_length, " ";
> }
> print "\n";
> }
> }
>
> Any ideas? I want to read a tree and mainly get the duplication events.
> Could someone help me?
> > Thanks a lot, > > Chrysanthi > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > From maj at fortinbras.us Wed Feb 11 21:37:04 2009 From: maj at fortinbras.us (Mark A. Jensen) Date: Wed, 11 Feb 2009 21:37:04 -0500 Subject: [Bioperl-l] Plse help! Probs with SeqIO in web service (web form)!!! In-Reply-To: <21902369.post@talk.nabble.com> References: <21902369.post@talk.nabble.com> Message-ID: <179144D306034A38BB92DE80910F9225@NewLife> The 'Invalid argument' appears to be thrown in DOS when there is not enuf space in the partition. In any case, this is pretty clearly an OS issue- Don't forget too that the web server prob doesn't have the same permissions when running the script as the user, even when running it in the same place. MAJ ----- Original Message ----- From: "Julio514" To: Sent: Sunday, February 08, 2009 2:26 PM Subject: [Bioperl-l] Plse help! Probs with SeqIO in web service (web form)!!! > > Hello everyone, > > I've been using bioperl for more than one year now. Recently, I started a > project of establishing a web service that accept for input one or many > fasta sequences. I tested my script offline with .fa files directly on my HD > to make sure everything was fine. And it was... After that, I modified the > script with some CGI lines to make it compatible with web forms. (btw I > still am a noob in web services:)). Anyway, the fasta input sequences seems > to cause a prob I never encountered before and I am clueless... Here it is: > > Error in tempdir() using \\\\XXXXXXXXXX: Could not create directory > \\\\6RzqdL7zU9: Invalid argument at C:/perl/site/lib/Bio/Root/IO.pm line > 744, > > Anyone ever saw that ? > I made sure that my $ENV{TMPDIR} was NOT read-only... 
> > Cheers, > -- > View this message in context: > http://www.nabble.com/Plse-help%21-Probs-with-SeqIO-in-web-service-%28web-form%29%21%21%21-tp21902369p21902369.html > Sent from the Perl - Bioperl-L mailing list archive at Nabble.com. > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > From chrysain at gmail.com Thu Feb 12 05:42:57 2009 From: chrysain at gmail.com (Chrysanthi A.) Date: Thu, 12 Feb 2009 10:42:57 +0000 Subject: [Bioperl-l] problem parsing a newick format In-Reply-To: <8468E65F03F84EC09E9EB152B78AB71C@NewLife> References: <66b602900902111813oef7bb23k45be3c97e4fcefe6@mail.gmail.com> <8468E65F03F84EC09E9EB152B78AB71C@NewLife> Message-ID: <66b602900902120242t69fc0d9cp442a23f308721c19@mail.gmail.com> I tried also that, but it does not work.. It does not give me any error message.. It seems that the code is correct, but It does not print anything..Why??? Thanks, Chrysanthi. 2009/2/12 Mark A. Jensen > C- I think you maybe want > > my $input = new Bio::TreeIO(-file =>"nexusCytochrome7R.newick", > -format => "newick"); > > and not > > my $input = new Bio::TreeIO(-fh =>"nexusCytochrome7R.newick", >> -format => "newick"); >> > > ? > > Mark > > ----- Original Message ----- From: "Chrysanthi A." > To: "BioPerl List" > Sent: Wednesday, February 11, 2009 9:13 PM > Subject: [Bioperl-l] problem parsing a newick format > > > Is the code below correct?? Why it does not print anything??? 
>> use strict;
>>
>> use Bio::TreeIO;
>>
>>
>> my $input = new Bio::TreeIO(-fh =>"nexusCytochrome7R.newick",
>> -format => "newick");
>>
>> while(my $tree = $input->next_tree){
>> for my $node(grep{!$_->is_Leaf}$tree->get_nodes){
>> next if !$node->ancestor;
>> print "Node:", $node->id, "length:", $node->branch_length, " ";
>> for my $child($node->get_Descendents){
>> print "child:", $child->id, "", $child->branch_length, " ";
>> }
>> print "\n";
>> }
>> }
>>
>> Any ideas? I want to read a tree and mainly get the duplication events.
>> Could someone help me?
>>
>> Thanks a lot,
>>
>> Chrysanthi
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>>
>>

From maj at fortinbras.us Thu Feb 12 07:39:36 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 12 Feb 2009 07:39:36 -0500
Subject: [Bioperl-l] problem parsing a newick format
In-Reply-To: <66b602900902120242t69fc0d9cp442a23f308721c19@mail.gmail.com>
References: <66b602900902111813oef7bb23k45be3c97e4fcefe6@mail.gmail.com> <8468E65F03F84EC09E9EB152B78AB71C@NewLife> <66b602900902120242t69fc0d9cp442a23f308721c19@mail.gmail.com>
Message-ID: <6E6237ABF3C64B6DB6980966665A4D88@NewLife>

Chrysanthi-

Do the trees in your test file end with a semicolon? When I do

use Bio::TreeIO;
$inp = Bio::TreeIO->new(-fh=>\*DATA, -format=>'newick');
$tre=$inp->next_tree;
__END__
(A:1,(B:2,C:3))

$tre is empty, but when

use Bio::TreeIO;
$inp = Bio::TreeIO->new(-fh=>\*DATA, -format=>'newick');
$tre=$inp->next_tree;
__END__
(A:1,(B:2,C:3));

$tre contains the tree.

If this is the problem, it sounds like a bug to me-
Mark

----- Original Message -----
From: Chrysanthi A.
To: Mark A. Jensen
Cc: BioPerl List
Sent: Thursday, February 12, 2009 5:42 AM
Subject: Re: [Bioperl-l] problem parsing a newick format

I tried also that, but it does not work.. It does not give me any error message..
It seems that the code is correct, but It does not print anything..Why??? Thanks, Chrysanthi. 2009/2/12 Mark A. Jensen C- I think you maybe want my $input = new Bio::TreeIO(-file =>"nexusCytochrome7R.newick", -format => "newick"); and not my $input = new Bio::TreeIO(-fh =>"nexusCytochrome7R.newick", -format => "newick"); ? Mark ----- Original Message ----- From: "Chrysanthi A." To: "BioPerl List" Sent: Wednesday, February 11, 2009 9:13 PM Subject: [Bioperl-l] problem parsing a newick format Is the code below correct?? Why it does not print anything??? use strict; use Bio::TreeIO; my $input = new Bio::TreeIO(-fh =>"nexusCytochrome7R.newick", -format => "newick"); while(my $tree = $input->next_tree){ for my $node(grep{!$_->is_Leaf}$tree->get_nodes){ next if !$node->ancestor; print "Node:", $node->id, "length:", $node->branch_length, " "; for my $child($node->get_Descendents){ print "child:", $child->id, "", $child->branch_length, " "; } print "\n"; } } Any ideas? I want to read a tree and mainly get the duplication events. Could someone help me? Thanks a lot, Chrysanthi _______________________________________________ Bioperl-l mailing list Bioperl-l at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/bioperl-l From maj at fortinbras.us Thu Feb 12 08:03:10 2009 From: maj at fortinbras.us (Mark A. Jensen) Date: Thu, 12 Feb 2009 08:03:10 -0500 Subject: [Bioperl-l] problem parsing a newick format In-Reply-To: <66b602900902120447l6bcd0d41s8738e8844d8047b5@mail.gmail.com> References: <66b602900902111813oef7bb23k45be3c97e4fcefe6@mail.gmail.com> <8468E65F03F84EC09E9EB152B78AB71C@NewLife> <66b602900902120242t69fc0d9cp442a23f308721c19@mail.gmail.com> <6E6237ABF3C64B6DB6980966665A4D88@NewLife> <66b602900902120447l6bcd0d41s8738e8844d8047b5@mail.gmail.com> Message-ID: No problem, Chyrsanthi-- Jason-- may I loosen up the parser a bit on this? MAJ ----- Original Message ----- From: Chrysanthi A. To: Mark A. 
Jensen Sent: Thursday, February 12, 2009 7:47 AM Subject: Re: [Bioperl-l] problem parsing a newick format Yes, that was the problem.. Now its working perfect!!!! Thanks a lot! Chrysanthi 2009/2/12 Mark A. Jensen Chrysanthi- Do the trees in your test file end with a semicolon? When I do use Bio::TreeIO; $inp = Bio::TreeIO->new(-fh=>\*DATA, -format=>'newick); $tre=$inp->next_tree; __END__ (A:1,(B:2,C:3)) $tre is empty, but when use Bio::TreeIO; $inp = Bio::TreeIO->new(-fh=>\*DATA, -format=>'newick); $tre=$inp->next_tree; __END__ (A:1,(B:2,C:3)); $tre contains the tree. If this is the problem, it sounds like a bug to me- Mark ----- Original Message ----- From: Chrysanthi A. To: Mark A. Jensen Cc: BioPerl List Sent: Thursday, February 12, 2009 5:42 AM Subject: Re: [Bioperl-l] problem parsing a newick format I tried also that, but it does not work.. It does not give me any error message.. It seems that the code is correct, but It does not print anything..Why??? Thanks, Chrysanthi. 2009/2/12 Mark A. Jensen C- I think you maybe want my $input = new Bio::TreeIO(-file =>"nexusCytochrome7R.newick", -format => "newick"); and not my $input = new Bio::TreeIO(-fh =>"nexusCytochrome7R.newick", -format => "newick"); ? Mark ----- Original Message ----- From: "Chrysanthi A." To: "BioPerl List" Sent: Wednesday, February 11, 2009 9:13 PM Subject: [Bioperl-l] problem parsing a newick format Is the code below correct?? Why it does not print anything??? use strict; use Bio::TreeIO; my $input = new Bio::TreeIO(-fh =>"nexusCytochrome7R.newick", -format => "newick"); while(my $tree = $input->next_tree){ for my $node(grep{!$_->is_Leaf}$tree->get_nodes){ next if !$node->ancestor; print "Node:", $node->id, "length:", $node->branch_length, " "; for my $child($node->get_Descendents){ print "child:", $child->id, "", $child->branch_length, " "; } print "\n"; } } Any ideas? I want to read a tree and mainly get the duplication events. Could someone help me? 
Thanks a lot, Chrysanthi _______________________________________________ Bioperl-l mailing list Bioperl-l at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/bioperl-l From maj at fortinbras.us Thu Feb 12 08:12:57 2009 From: maj at fortinbras.us (Mark A. Jensen) Date: Thu, 12 Feb 2009 08:12:57 -0500 Subject: [Bioperl-l] problem parsing a newick format In-Reply-To: References: <66b602900902111813oef7bb23k45be3c97e4fcefe6@mail.gmail.com><8468E65F03F84EC09E9EB152B78AB71C@NewLife><66b602900902120242t69fc0d9cp442a23f308721c19@mail.gmail.com><6E6237ABF3C64B6DB6980966665A4D88@NewLife><66b602900902120447l6bcd0d41s8738e8844d8047b5@mail.gmail.com> Message-ID: <08A832C24C6047BE80358C0F9D06CE6D@NewLife> Hold on-- I see that if you wanted to parse long trees over multiple lines, you need the end character ; However, since correctly formed trees naturally end, is it reasonable to end them automatically? ----- Original Message ----- From: "Mark A. Jensen" To: "Chrysanthi A." Cc: "bioPerl List" Sent: Thursday, February 12, 2009 8:03 AM Subject: Re: [Bioperl-l] problem parsing a newick format > No problem, Chyrsanthi-- > > Jason-- may I loosen up the parser a bit on this? > MAJ > ----- Original Message ----- > From: Chrysanthi A. > To: Mark A. Jensen > Sent: Thursday, February 12, 2009 7:47 AM > Subject: Re: [Bioperl-l] problem parsing a newick format > > > Yes, that was the problem.. Now its working perfect!!!! Thanks a lot! > > Chrysanthi > > > 2009/2/12 Mark A. Jensen > > Chrysanthi- > Do the trees in your test file end with a semicolon? When I do > > use Bio::TreeIO; > $inp = Bio::TreeIO->new(-fh=>\*DATA, -format=>'newick); > $tre=$inp->next_tree; > __END__ > (A:1,(B:2,C:3)) > > $tre is empty, but when > > use Bio::TreeIO; > $inp = Bio::TreeIO->new(-fh=>\*DATA, -format=>'newick); > $tre=$inp->next_tree; > __END__ > (A:1,(B:2,C:3)); > > $tre contains the tree. 
> > If this is the problem, it sounds like a bug to me- > Mark > ----- Original Message ----- > From: Chrysanthi A. > To: Mark A. Jensen > Cc: BioPerl List > Sent: Thursday, February 12, 2009 5:42 AM > Subject: Re: [Bioperl-l] problem parsing a newick format > > > I tried also that, but it does not work.. It does not give me any error > message.. It > seems that the code is correct, but It does not print anything..Why??? > > Thanks, > > Chrysanthi. > > > > > 2009/2/12 Mark A. Jensen > > C- I think you maybe want > > my $input = new Bio::TreeIO(-file =>"nexusCytochrome7R.newick", > -format => "newick"); > > and not > > > my $input = new Bio::TreeIO(-fh =>"nexusCytochrome7R.newick", > -format => "newick"); > > > ? > > Mark > > ----- Original Message ----- From: "Chrysanthi A." > To: "BioPerl List" > Sent: Wednesday, February 11, 2009 9:13 PM > Subject: [Bioperl-l] problem parsing a newick format > > > > Is the code below correct?? Why it does not print anything??? > use strict; > > use Bio::TreeIO; > > > my $input = new Bio::TreeIO(-fh =>"nexusCytochrome7R.newick", > -format => "newick"); > > while(my $tree = $input->next_tree){ > for my $node(grep{!$_->is_Leaf}$tree->get_nodes){ > next if !$node->ancestor; > print "Node:", $node->id, "length:", $node->branch_length, " "; > for my $child($node->get_Descendents){ > print "child:", $child->id, "", $child->branch_length, " "; > } > print "\n"; > } > } > > Any ideas? I want to read a tree and mainly get the duplication > events. > Could someone help me? 
> > Thanks a lot, > > Chrysanthi > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > From David.Messina at sbc.su.se Thu Feb 12 08:41:14 2009 From: David.Messina at sbc.su.se (Dave Messina) Date: Thu, 12 Feb 2009 14:41:14 +0100 Subject: [Bioperl-l] problem parsing a newick format In-Reply-To: References: <66b602900902111813oef7bb23k45be3c97e4fcefe6@mail.gmail.com> <8468E65F03F84EC09E9EB152B78AB71C@NewLife> <66b602900902120242t69fc0d9cp442a23f308721c19@mail.gmail.com> <6E6237ABF3C64B6DB6980966665A4D88@NewLife> <66b602900902120447l6bcd0d41s8738e8844d8047b5@mail.gmail.com> Message-ID: <628aabb70902120541g81846c5sc9a751f0e42a96ac@mail.gmail.com> By the way, if I read the thread correctly, this my $input = new Bio::TreeIO(-fh =>"nexusCytochrome7R.newick", > -format => "newick"); > did not throw an error, because the file nexusCytochrome7R.newick is not a filehandle. Presumably it should, no? Dave From maj at fortinbras.us Thu Feb 12 08:44:47 2009 From: maj at fortinbras.us (Mark A. Jensen) Date: Thu, 12 Feb 2009 08:44:47 -0500 Subject: [Bioperl-l] problem parsing a newick format In-Reply-To: <628aabb70902120541g81846c5sc9a751f0e42a96ac@mail.gmail.com> References: <66b602900902111813oef7bb23k45be3c97e4fcefe6@mail.gmail.com> <8468E65F03F84EC09E9EB152B78AB71C@NewLife> <66b602900902120242t69fc0d9cp442a23f308721c19@mail.gmail.com> <6E6237ABF3C64B6DB6980966665A4D88@NewLife> <66b602900902120447l6bcd0d41s8738e8844d8047b5@mail.gmail.com> <628aabb70902120541g81846c5sc9a751f0e42a96ac@mail.gmail.com> Message-ID: Aye- I was surprised it didn't croak. ----- Original Message ----- From: Dave Messina To: Mark A. Jensen Cc: Chrysanthi A. 
; bioPerl List Sent: Thursday, February 12, 2009 8:41 AM Subject: Re: [Bioperl-l] problem parsing a newick format By the way, if I read the thread correctly, this my $input = new Bio::TreeIO(-fh =>"nexusCytochrome7R.newick", -format => "newick"); did not throw an error, because the file nexusCytochrome7R.newick is not a filehandle. Presumably it should, no? Dave From hlapp at gmx.net Thu Feb 12 09:53:48 2009 From: hlapp at gmx.net (Hilmar Lapp) Date: Thu, 12 Feb 2009 09:53:48 -0500 Subject: [Bioperl-l] problem parsing a newick format In-Reply-To: <628aabb70902120541g81846c5sc9a751f0e42a96ac@mail.gmail.com> References: <66b602900902111813oef7bb23k45be3c97e4fcefe6@mail.gmail.com> <8468E65F03F84EC09E9EB152B78AB71C@NewLife> <66b602900902120242t69fc0d9cp442a23f308721c19@mail.gmail.com> <6E6237ABF3C64B6DB6980966665A4D88@NewLife> <66b602900902120447l6bcd0d41s8738e8844d8047b5@mail.gmail.com> <628aabb70902120541g81846c5sc9a751f0e42a96ac@mail.gmail.com> Message-ID: <31619FA4-34A0-491D-B127-6E85E5B6A3AD@gmx.net> On Feb 12, 2009, at 8:41 AM, Dave Messina wrote: > By the way, if I read the thread correctly, this > > my $input = new Bio::TreeIO(-fh =>"nexusCytochrome7R.newick", >> -format => "newick"); >> > > did not throw an error, because the file nexusCytochrome7R.newick is > not a > filehandle. > > Presumably it should, no? I guess yes. 
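
Until the constructor complains on its own, a caller-side guard can catch a filename passed where a handle was meant. This is only a sketch with core Perl; handle_or_die is a hypothetical helper, not part of BioPerl, and it assumes Scalar::Util's openhandle is an acceptable test for "is this really an open filehandle".

```perl
use strict;
use warnings;
use Scalar::Util qw(openhandle);

# Hypothetical helper: return the argument if it is an open filehandle,
# otherwise die with a hint that -file was probably intended.
sub handle_or_die {
    my ($fh) = @_;
    return $fh if defined $fh && defined openhandle($fh);
    my $shown = defined $fh ? "'$fh'" : 'undef';
    die "$shown is not an open filehandle; pass a filename via -file instead\n";
}

# A plain string such as a filename is rejected ...
my $ok = eval { handle_or_die("nexusCytochrome7R.newick"); 1 };
print "filename rejected: $@" unless $ok;

# ... while a real, open handle passes through unchanged.
my $fh = handle_or_die(\*STDOUT);
print {$fh} "STDOUT accepted\n";
```

The checked value could then be handed to the -fh argument, so a string slips through only if the guard is skipped.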
-- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From hlapp at gmx.net Thu Feb 12 09:58:09 2009 From: hlapp at gmx.net (Hilmar Lapp) Date: Thu, 12 Feb 2009 09:58:09 -0500 Subject: [Bioperl-l] problem parsing a newick format In-Reply-To: References: <66b602900902111813oef7bb23k45be3c97e4fcefe6@mail.gmail.com> <8468E65F03F84EC09E9EB152B78AB71C@NewLife> <66b602900902120242t69fc0d9cp442a23f308721c19@mail.gmail.com> <6E6237ABF3C64B6DB6980966665A4D88@NewLife> <66b602900902120447l6bcd0d41s8738e8844d8047b5@mail.gmail.com> Message-ID: <85C2CDFC-BFC9-4ED3-A0FC-21AB7F78ED45@gmx.net> Note that the terminal semi-colon is part of the format spec. I've been bitten by this a few times in other programs - it's quite common that programs reading newick will throw an error or ignore the tree if it's not terminated by semi-colon. Having said that, along the lines of being strict on what we emit but liberal in what we accept, I guess it can be loosened up. But what if there is more than one tree in the file? -hilmar On Feb 12, 2009, at 8:03 AM, Mark A. Jensen wrote: > No problem, Chyrsanthi-- > > Jason-- may I loosen up the parser a bit on this? > MAJ > ----- Original Message ----- > From: Chrysanthi A. > To: Mark A. Jensen > Sent: Thursday, February 12, 2009 7:47 AM > Subject: Re: [Bioperl-l] problem parsing a newick format > > > Yes, that was the problem.. Now its working perfect!!!! Thanks a lot! > > Chrysanthi > > > 2009/2/12 Mark A. Jensen > > Chrysanthi- > Do the trees in your test file end with a semicolon? When I do > > use Bio::TreeIO; > $inp = Bio::TreeIO->new(-fh=>\*DATA, -format=>'newick); > $tre=$inp->next_tree; > __END__ > (A:1,(B:2,C:3)) > > $tre is empty, but when > > use Bio::TreeIO; > $inp = Bio::TreeIO->new(-fh=>\*DATA, -format=>'newick); > $tre=$inp->next_tree; > __END__ > (A:1,(B:2,C:3)); > > $tre contains the tree. 
> > If this is the problem, it sounds like a bug to me- > Mark > ----- Original Message ----- > From: Chrysanthi A. > To: Mark A. Jensen > Cc: BioPerl List > Sent: Thursday, February 12, 2009 5:42 AM > Subject: Re: [Bioperl-l] problem parsing a newick format > > > I tried also that, but it does not work.. It does not give me > any error message.. It > seems that the code is correct, but It does not print anything..Why??? > > Thanks, > > Chrysanthi. > > > > > 2009/2/12 Mark A. Jensen > > C- I think you maybe want > > my $input = new Bio::TreeIO(-file =>"nexusCytochrome7R.newick", > -format => "newick"); > > and not > > > my $input = new Bio::TreeIO(-fh =>"nexusCytochrome7R.newick", > -format => "newick"); > > > ? > > Mark > > ----- Original Message ----- From: "Chrysanthi A." > > To: "BioPerl List" > Sent: Wednesday, February 11, 2009 9:13 PM > Subject: [Bioperl-l] problem parsing a newick format > > > > Is the code below correct?? Why it does not print anything??? > use strict; > > use Bio::TreeIO; > > > my $input = new Bio::TreeIO(-fh =>"nexusCytochrome7R.newick", > -format => "newick"); > > while(my $tree = $input->next_tree){ > for my $node(grep{!$_->is_Leaf}$tree->get_nodes){ > next if !$node->ancestor; > print "Node:", $node->id, "length:", $node->branch_length, > " "; > for my $child($node->get_Descendents){ > print "child:", $child->id, "", $child->branch_length, " "; > } > print "\n"; > } > } > > Any ideas? I want to read a tree and mainly get the > duplication events. > Could someone help me? 
>
> Thanks a lot,
>
> Chrysanthi
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>
>
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================

From maj at fortinbras.us Thu Feb 12 10:07:39 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 12 Feb 2009 10:07:39 -0500
Subject: [Bioperl-l] problem parsing a newick format
In-Reply-To: <85C2CDFC-BFC9-4ED3-A0FC-21AB7F78ED45@gmx.net>
References: <66b602900902111813oef7bb23k45be3c97e4fcefe6@mail.gmail.com><8468E65F03F84EC09E9EB152B78AB71C@NewLife><66b602900902120242t69fc0d9cp442a23f308721c19@mail.gmail.com><6E6237ABF3C64B6DB6980966665A4D88@NewLife><66b602900902120447l6bcd0d41s8738e8844d8047b5@mail.gmail.com> <85C2CDFC-BFC9-4ED3-A0FC-21AB7F78ED45@gmx.net>
Message-ID: <141183C5F32B4CE08611C98B4D33BF72@NewLife>

I'm pretty close to (what I think is) a solution to this very issue--more soon.

In the meantime--from the looks of Bio::Root::IO, it looks like there is no explicit check of $fh when it is defined and $file is not. (The arg $input is well-checked, maybe $fh got lost in the shuffle)-- I suggest

--- IO.pm (revision 15529)
+++ IO.pm (working copy)
@@ -311,6 +311,9 @@
     $self->throw("Could not open $file: $!");
     $self->file($file);
 }
+ if ( defined($fh) && !ref($fh) ) {
+     $self->throw("file handle $fh doesn't appear to be a handle");
+ }
 $self->_fh($fh) if $fh; # if not provided, defaults to STDIN and STDOUT
 $self->_flush_on_write(defined $flush ? $flush : 1);

MAJ

----- Original Message -----
From: "Hilmar Lapp"
To: "Mark A. Jensen"
Cc: "bioPerl List" ; "Chrysanthi A."
Sent: Thursday, February 12, 2009 9:58 AM Subject: Re: [Bioperl-l] problem parsing a newick format > Note that the terminal semi-colon is part of the format spec. I've been > bitten by this a few times in other programs - it's quite common that > programs reading newick will throw an error or ignore the tree if it's not > terminated by semi-colon. > > Having said that, along the lines of being strict on what we emit but liberal > in what we accept, I guess it can be loosened up. But what if there is more > than one tree in the file? > > -hilmar > > On Feb 12, 2009, at 8:03 AM, Mark A. Jensen wrote: > >> No problem, Chyrsanthi-- >> >> Jason-- may I loosen up the parser a bit on this? >> MAJ >> ----- Original Message ----- >> From: Chrysanthi A. >> To: Mark A. Jensen >> Sent: Thursday, February 12, 2009 7:47 AM >> Subject: Re: [Bioperl-l] problem parsing a newick format >> >> >> Yes, that was the problem.. Now its working perfect!!!! Thanks a lot! >> >> Chrysanthi >> >> >> 2009/2/12 Mark A. Jensen >> >> Chrysanthi- >> Do the trees in your test file end with a semicolon? When I do >> >> use Bio::TreeIO; >> $inp = Bio::TreeIO->new(-fh=>\*DATA, -format=>'newick); >> $tre=$inp->next_tree; >> __END__ >> (A:1,(B:2,C:3)) >> >> $tre is empty, but when >> >> use Bio::TreeIO; >> $inp = Bio::TreeIO->new(-fh=>\*DATA, -format=>'newick); >> $tre=$inp->next_tree; >> __END__ >> (A:1,(B:2,C:3)); >> >> $tre contains the tree. >> >> If this is the problem, it sounds like a bug to me- >> Mark >> ----- Original Message ----- >> From: Chrysanthi A. >> To: Mark A. Jensen >> Cc: BioPerl List >> Sent: Thursday, February 12, 2009 5:42 AM >> Subject: Re: [Bioperl-l] problem parsing a newick format >> >> >> I tried also that, but it does not work.. It does not give me any error >> message.. It >> seems that the code is correct, but It does not print anything..Why??? >> >> Thanks, >> >> Chrysanthi. >> >> >> >> >> 2009/2/12 Mark A. 
Jensen >> >> C- I think you maybe want >> >> my $input = new Bio::TreeIO(-file =>"nexusCytochrome7R.newick", >> -format => "newick"); >> >> and not >> >> >> my $input = new Bio::TreeIO(-fh =>"nexusCytochrome7R.newick", >> -format => "newick"); >> >> >> ? >> >> Mark >> >> ----- Original Message ----- From: "Chrysanthi A." > > >> To: "BioPerl List" >> Sent: Wednesday, February 11, 2009 9:13 PM >> Subject: [Bioperl-l] problem parsing a newick format >> >> >> >> Is the code below correct?? Why it does not print anything??? >> use strict; >> >> use Bio::TreeIO; >> >> >> my $input = new Bio::TreeIO(-fh =>"nexusCytochrome7R.newick", >> -format => "newick"); >> >> while(my $tree = $input->next_tree){ >> for my $node(grep{!$_->is_Leaf}$tree->get_nodes){ >> next if !$node->ancestor; >> print "Node:", $node->id, "length:", $node->branch_length, " "; >> for my $child($node->get_Descendents){ >> print "child:", $child->id, "", $child->branch_length, " "; >> } >> print "\n"; >> } >> } >> >> Any ideas? I want to read a tree and mainly get the duplication >> events. >> Could someone help me? 
>> >> Thanks a lot, >> >> Chrysanthi >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> >> >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > From hlapp at gmx.net Thu Feb 12 10:53:47 2009 From: hlapp at gmx.net (Hilmar Lapp) Date: Thu, 12 Feb 2009 10:53:47 -0500 Subject: [Bioperl-l] problem parsing a newick format In-Reply-To: <141183C5F32B4CE08611C98B4D33BF72@NewLife> References: <66b602900902111813oef7bb23k45be3c97e4fcefe6@mail.gmail.com><8468E65F03F84EC09E9EB152B78AB71C@NewLife><66b602900902120242t69fc0d9cp442a23f308721c19@mail.gmail.com><6E6237ABF3C64B6DB6980966665A4D88@NewLife><66b602900902120447l6bcd0d41s8738e8844d8047b5@mail.gmail.com> <85C2CDFC-BFC9-4ED3-A0FC-21AB7F78ED45@gmx.net> <141183C5F32B4CE08611C98B4D33BF72@NewLife> Message-ID: On Feb 12, 2009, at 10:07 AM, Mark A. Jensen wrote: > + if ( defined($fh) && !ref($fh) ) { > + $self->throw("file handle $fh doesn't appear to be a handle"); > + } You could go beyond that (a hashref would probably not do fine as a filehandle?) and test for it being a symbol. But the ref test alone will catch the most common case, which is passing a file name. 
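For illustration, the stricter test described above (checking not only that the argument is a reference, but that it is the kind of reference that can act as a filehandle) might look like the following sketch. It is a standalone example, not the Bio::Root::IO code: the helper name looks_like_handle is hypothetical, and the blessed() guard is an added assumption so that unblessed references such as a hashref are rejected quietly instead of dying on the method call.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);
use IO::Handle;

# Hypothetical helper (not part of BioPerl): returns 1 if $fh looks
# usable as a filehandle, 0 otherwise.
sub looks_like_handle {
    my ($fh) = @_;
    return 0 unless defined $fh;
    return 0 unless ref $fh;            # plain string, e.g. a file name
    return 1 if ref($fh) eq 'GLOB';     # \*DATA, \*STDIN, or a lexical handle from open()
    return 1 if blessed($fh) && $fh->isa('IO::Handle');   # IO::Handle, IO::File objects
    return 0;                           # hashref, arrayref, unrelated objects
}

# Try the cases discussed in the thread: a file name passed by mistake,
# real handles, and a reference that is not a handle.
for my $case (
    [ 'file name'  => 'nexusCytochrome7R.newick' ],
    [ 'glob ref'   => \*STDIN ],
    [ 'IO::Handle' => IO::Handle->new ],
    [ 'hash ref'   => {} ],
) {
    my ($label, $value) = @$case;
    printf "%-10s %s\n", $label,
        looks_like_handle($value) ? 'handle' : 'not a handle';
}
```

A constructor could call such a helper on its -fh argument and throw when it returns 0, which turns the silent "prints nothing" failure from the original report into an immediate error.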

-hilmar

--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================

From maj at fortinbras.us Thu Feb 12 11:13:52 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 12 Feb 2009 11:13:52 -0500
Subject: [Bioperl-l] problem parsing a newick format
In-Reply-To:
References: <66b602900902111813oef7bb23k45be3c97e4fcefe6@mail.gmail.com><8468E65F03F84EC09E9EB152B78AB71C@NewLife><66b602900902120242t69fc0d9cp442a23f308721c19@mail.gmail.com><6E6237ABF3C64B6DB6980966665A4D88@NewLife><66b602900902120447l6bcd0d41s8738e8844d8047b5@mail.gmail.com><85C2CDFC-BFC9-4ED3-A0FC-21AB7F78ED45@gmx.net><141183C5F32B4CE08611C98B4D33BF72@NewLife>
Message-ID:

right you are- I committed

    if ( defined($fh) && !(ref($fh) && ((ref($fh) eq "GLOB") || $fh->isa('IO::Handle'))) ) {
        $self->throw("file handle $fh doesn't appear to be a handle");
    }

in keeping with style further up in the code at this point-
MAJ

----- Original Message -----
From: "Hilmar Lapp"
To: "Mark A. Jensen"
Cc: "bioPerl List" ; "Chrysanthi A."
Sent: Thursday, February 12, 2009 10:53 AM
Subject: Re: [Bioperl-l] problem parsing a newick format

>
> On Feb 12, 2009, at 10:07 AM, Mark A. Jensen wrote:
>
>> + if ( defined($fh) && !ref($fh) ) {
>> + $self->throw("file handle $fh doesn't appear to be a handle");
>> + }
>
>
> You could go beyond that (a hashref would probably not do fine as a
> filehandle?) and test for it being a symbol.
>
> But the ref test alone will catch the most common case, which is passing a
> file name.
> > -hilmar > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > From cjfields at illinois.edu Thu Feb 12 11:15:09 2009 From: cjfields at illinois.edu (Chris Fields) Date: Thu, 12 Feb 2009 10:15:09 -0600 Subject: [Bioperl-l] problem parsing a newick format In-Reply-To: <85C2CDFC-BFC9-4ED3-A0FC-21AB7F78ED45@gmx.net> References: <66b602900902111813oef7bb23k45be3c97e4fcefe6@mail.gmail.com> <8468E65F03F84EC09E9EB152B78AB71C@NewLife> <66b602900902120242t69fc0d9cp442a23f308721c19@mail.gmail.com> <6E6237ABF3C64B6DB6980966665A4D88@NewLife> <66b602900902120447l6bcd0d41s8738e8844d8047b5@mail.gmail.com> <85C2CDFC-BFC9-4ED3-A0FC-21AB7F78ED45@gmx.net> Message-ID: Then the semicolon is required. The split is along ";\n". Committed a fix for this; it was an event handler issue (just needed a tail check for any remaining data and return it if the tree is defined). chris On Feb 12, 2009, at 8:58 AM, Hilmar Lapp wrote: > Note that the terminal semi-colon is part of the format spec. I've > been bitten by this a few times in other programs - it's quite > common that programs reading newick will throw an error or ignore > the tree if it's not terminated by semi-colon. > > Having said that, along the lines of being strict on what we emit > but liberal in what we accept, I guess it can be loosened up. But > what if there is more than one tree in the file? > > -hilmar > > On Feb 12, 2009, at 8:03 AM, Mark A. Jensen wrote: > >> No problem, Chyrsanthi-- >> >> Jason-- may I loosen up the parser a bit on this? >> MAJ >> ----- Original Message ----- >> From: Chrysanthi A. >> To: Mark A. 
Jensen >> Sent: Thursday, February 12, 2009 7:47 AM >> Subject: Re: [Bioperl-l] problem parsing a newick format >> >> >> Yes, that was the problem.. Now its working perfect!!!! Thanks a lot! >> >> Chrysanthi >> >> >> 2009/2/12 Mark A. Jensen >> >> Chrysanthi- >> Do the trees in your test file end with a semicolon? When I do >> >> use Bio::TreeIO; >> $inp = Bio::TreeIO->new(-fh=>\*DATA, -format=>'newick); >> $tre=$inp->next_tree; >> __END__ >> (A:1,(B:2,C:3)) >> >> $tre is empty, but when >> >> use Bio::TreeIO; >> $inp = Bio::TreeIO->new(-fh=>\*DATA, -format=>'newick); >> $tre=$inp->next_tree; >> __END__ >> (A:1,(B:2,C:3)); >> >> $tre contains the tree. >> >> If this is the problem, it sounds like a bug to me- >> Mark >> ----- Original Message ----- >> From: Chrysanthi A. >> To: Mark A. Jensen >> Cc: BioPerl List >> Sent: Thursday, February 12, 2009 5:42 AM >> Subject: Re: [Bioperl-l] problem parsing a newick format >> >> >> I tried also that, but it does not work.. It does not give me >> any error message.. It >> seems that the code is correct, but It does not print >> anything..Why??? >> >> Thanks, >> >> Chrysanthi. >> >> >> >> >> 2009/2/12 Mark A. Jensen >> >> C- I think you maybe want >> >> my $input = new Bio::TreeIO(-file =>"nexusCytochrome7R.newick", >> -format => "newick"); >> >> and not >> >> >> my $input = new Bio::TreeIO(-fh =>"nexusCytochrome7R.newick", >> -format => "newick"); >> >> >> ? >> >> Mark >> >> ----- Original Message ----- From: "Chrysanthi A." > > >> To: "BioPerl List" >> Sent: Wednesday, February 11, 2009 9:13 PM >> Subject: [Bioperl-l] problem parsing a newick format >> >> >> >> Is the code below correct?? Why it does not print anything??? 
>> use strict;
>>
>> use Bio::TreeIO;
>>
>>
>> my $input = new Bio::TreeIO(-fh =>"nexusCytochrome7R.newick",
>> -format => "newick");
>>
>> while(my $tree = $input->next_tree){
>> for my $node(grep{!$_->is_Leaf}$tree->get_nodes){
>> next if !$node->ancestor;
>> print "Node:", $node->id, "length:", $node->branch_length,
>> " ";
>> for my $child($node->get_Descendents){
>> print "child:", $child->id, "", $child->branch_length, " ";
>> }
>> print "\n";
>> }
>> }
>>
>> Any ideas? I want to read a tree and mainly get the
>> duplication events.
>> Could someone help me?
>>
>> Thanks a lot,
>>
>> Chrysanthi
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>>
>>
>>
>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> --
> ===========================================================
> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
> ===========================================================
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

From suedejohnny at hotmail.com Thu Feb 12 16:26:45 2009
From: suedejohnny at hotmail.com (Julio514)
Date: Thu, 12 Feb 2009 13:26:45 -0800 (PST)
Subject: [Bioperl-l] Plse help! Probs with SeqIO in web service (web form)!!!
In-Reply-To: <179144D306034A38BB92DE80910F9225@NewLife>
References: <21902369.post@talk.nabble.com> <179144D306034A38BB92DE80910F9225@NewLife>
Message-ID: <21985362.post@talk.nabble.com>

Thank you for your reply. I solved the problem by declaring the env variable CLUSTALDIR in the httpd.conf file...

Cheers,

Mark A. Jensen wrote:
>
> The 'Invalid argument' appears to be thrown in DOS when there is not enuf
> space
> in the partition.
In any case, this is pretty clearly an OS issue- Don't > forget > too that > the web server prob doesn't have the same permissions when running the > script > as the user, even when running it in the same place. MAJ > ----- Original Message ----- > From: "Julio514" > To: > Sent: Sunday, February 08, 2009 2:26 PM > Subject: [Bioperl-l] Plse help! Probs with SeqIO in web service (web > form)!!! > > >> >> Hello everyone, >> >> I've been using bioperl for more than one year now. Recently, I started a >> project of establishing a web service that accept for input one or many >> fasta sequences. I tested my script offline with .fa files directly on my >> HD >> to make sure everything was fine. And it was... After that, I modified >> the >> script with some CGI lines to make it compatible with web forms. (btw I >> still am a noob in web services:)). Anyway, the fasta input sequences >> seems >> to cause a prob I never encountered before and I am clueless... Here it >> is: >> >> Error in tempdir() using \\\\XXXXXXXXXX: Could not create directory >> \\\\6RzqdL7zU9: Invalid argument at C:/perl/site/lib/Bio/Root/IO.pm line >> 744, >> >> Anyone ever saw that ? >> I made sure that my $ENV{TMPDIR} was NOT read-only... >> >> Cheers, >> -- >> View this message in context: >> http://www.nabble.com/Plse-help%21-Probs-with-SeqIO-in-web-service-%28web-form%29%21%21%21-tp21902369p21902369.html >> Sent from the Perl - Bioperl-L mailing list archive at Nabble.com. 
>> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- View this message in context: http://www.nabble.com/Plse-help%21-Probs-with-SeqIO-in-web-service-%28web-form%29%21%21%21-tp21902369p21985362.html Sent from the Perl - Bioperl-L mailing list archive at Nabble.com. From hlapp at gmx.net Fri Feb 13 11:53:32 2009 From: hlapp at gmx.net (Hilmar Lapp) Date: Fri, 13 Feb 2009 11:53:32 -0500 Subject: [Bioperl-l] Google Summer of Code: Call for Bio* Volunteers Message-ID: Google is committed to run the Summer of Code program [1] again this year. It will be for the 5th time. In broad strokes, the program funds what you might call remote summer internships for students to contribute to an open-source software project. Participating projects (or umbrella organizations) provide project ideas and supply mentors that guide the work on those. Students apply to a project within the program with specific project ideas, based on those suggested or based on their own idea, get ranked by the mentors of the project, and those accepted into the program get paired up with mentors. Projects are chiefly about programming, the coding period is 3 months (Jun-Aug), and there is no travel required by either student or mentor. The program is global; other than the US trade restrictions that Google is under, there are no restrictions as to where student or mentor reside. The main motivations behind the program are to recruit new contributors to open-source projects, and to produce more open-source code. See the program FAQs [2] for more information. 
I've had the honor of being part of the program for the last two years, administering NESCent's participation as an organization [3] and in 2007 mentoring a student. I have to say I find it the most awesome open-source program since sliced bread (or the invention of BLAST if that means more to you). Despite that and sadly enough, there has been a dearth of participating bioinformatics projects (though some notable ones, such as CytoScape have participated). There have been two Bio* Summer of Code projects under the NESCent umbrella, one in 2007 [4] and one in 2008 [5]. I would be willing to volunteer to take the lead on and administer a full-blown participation of O|B|F as a Bio* umbrella organization, provided 1) at least one Bio* person volunteers to serve as backup administrator, and 2) enough Bio* contributors volunteer to serve as prospective mentors. Mentoring involves participating in creating the page of project ideas (I'd provide template and guidance), corresponding with applicants who have questions, participating in student application ranking, and for primary mentors (those directly assigned to a student) based on empirical evidence at least 5hrs/week of time spent with the student to help him/her get over obstacles or avoid wrong paths. I think almost all mentors would concur that the experience was very gratifying, but as a mentor you will be spending a non-negligible amount of time with the student. I think it is the student-mentor pairing and interaction, not the stipend, that in the end makes the participation for students uniquely productive in terms of learning, and different from simply contributing to the project of choice (which they could always do). For a personal impression for how the program is from a mentor perspective, I'll let Chris Fields speak who was the mentor for the 2008 phyloXML in BioPerl project. 
From a student's perspective, I'll leave it to the 2007 Biojava student Bohyun Lee (blee34-at- mail.gatech.edu) and the 2008 BioPerl student Mira Han (mirhan-at- indiana.edu) to comment (if they are still on the list). So if you think this is a good idea for Bio* to be part of, if you would like to help in some way, if you can see yourself as a mentor, or if you are a lurking would-be student, please let yourself be heard. Email either to the list or to me. Cheers, -hilmar [1] http://code.google.com/soc/2008 [2] http://code.google.com/opensource/gsoc/2009/faqs.html [3] http://hackathon.nescent.org/Phyloinformatics_Summer_of_Code_2007 http://hackathon.nescent.org/Phyloinformatics_Summer_of_Code_2008 [4] http://biojava.org/wiki/BioJava:PhyloSOC07 [5] http://bioperl.org/wiki/PhyloXML_support_in_BioPerl -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From maj at fortinbras.us Fri Feb 13 12:14:36 2009 From: maj at fortinbras.us (Mark A. Jensen) Date: Fri, 13 Feb 2009 12:14:36 -0500 Subject: [Bioperl-l] Google Summer of Code: Call for Bio* Volunteers In-Reply-To: References: Message-ID: <37ABB1B8019D480CAC42C26A00DA97CE@NewLife> If my newbie status is not a barrier, I would be pleased to mentor a student. If it is a barrier, I would be pleased to look at applications or what have you. Mark ----- Original Message ----- From: "Hilmar Lapp" To: "bioPerl List" Sent: Friday, February 13, 2009 11:53 AM Subject: [Bioperl-l] Google Summer of Code: Call for Bio* Volunteers > Google is committed to run the Summer of Code program [1] again this > year. It will be for the 5th time. > > In broad strokes, the program funds what you might call remote summer > internships for students to contribute to an open-source software > project. Participating projects (or umbrella organizations) provide > project ideas and supply mentors that guide the work on those. 
> Students apply to a project within the program with specific project > ideas, based on those suggested or based on their own idea, get ranked > by the mentors of the project, and those accepted into the program get > paired up with mentors. Projects are chiefly about programming, the > coding period is 3 months (Jun-Aug), and there is no travel required > by either student or mentor. The program is global; other than the US > trade restrictions that Google is under, there are no restrictions as > to where student or mentor reside. The main motivations behind the > program are to recruit new contributors to open-source projects, and > to produce more open-source code. See the program FAQs [2] for more > information. > > I've had the honor of being part of the program for the last two > years, administering NESCent's participation as an organization [3] > and in 2007 mentoring a student. I have to say I find it the most > awesome open-source program since sliced bread (or the invention of > BLAST if that means more to you). Despite that and sadly enough, there > has been a dearth of participating bioinformatics projects (though > some notable ones, such as CytoScape have participated). > > There have been two Bio* Summer of Code projects under the NESCent > umbrella, one in 2007 [4] and one in 2008 [5]. I would be willing to > volunteer to take the lead on and administer a full-blown > participation of O|B|F as a Bio* umbrella organization, provided 1) at > least one Bio* person volunteers to serve as backup administrator, and > 2) enough Bio* contributors volunteer to serve as prospective mentors. 
> > Mentoring involves participating in creating the page of project ideas > (I'd provide template and guidance), corresponding with applicants who > have questions, participating in student application ranking, and for > primary mentors (those directly assigned to a student) based on > empirical evidence at least 5hrs/week of time spent with the student > to help him/her get over obstacles or avoid wrong paths. > > I think almost all mentors would concur that the experience was very > gratifying, but as a mentor you will be spending a non-negligible > amount of time with the student. I think it is the student-mentor > pairing and interaction, not the stipend, that in the end makes the > participation for students uniquely productive in terms of learning, > and different from simply contributing to the project of choice (which > they could always do). > > For a personal impression for how the program is from a mentor > perspective, I'll let Chris Fields speak who was the mentor for the > 2008 phyloXML in BioPerl project. From a student's perspective, I'll > leave it to the 2007 Biojava student Bohyun Lee (blee34-at- > mail.gatech.edu) and the 2008 BioPerl student Mira Han (mirhan-at- > indiana.edu) to comment (if they are still on the list). > > So if you think this is a good idea for Bio* to be part of, if you > would like to help in some way, if you can see yourself as a mentor, > or if you are a lurking would-be student, please let yourself be > heard. Email either to the list or to me. 
> > Cheers, > > -hilmar > > [1] http://code.google.com/soc/2008 > > [2] http://code.google.com/opensource/gsoc/2009/faqs.html > > [3] http://hackathon.nescent.org/Phyloinformatics_Summer_of_Code_2007 > http://hackathon.nescent.org/Phyloinformatics_Summer_of_Code_2008 > > [4] http://biojava.org/wiki/BioJava:PhyloSOC07 > > [5] http://bioperl.org/wiki/PhyloXML_support_in_BioPerl > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > From jaudall at gmail.com Fri Feb 13 12:25:22 2009 From: jaudall at gmail.com (Joshua Udall) Date: Fri, 13 Feb 2009 10:25:22 -0700 Subject: [Bioperl-l] Google Summer of Code: Call for Bio* Volunteers In-Reply-To: <37ABB1B8019D480CAC42C26A00DA97CE@NewLife> References: <37ABB1B8019D480CAC42C26A00DA97CE@NewLife> Message-ID: <52cea20c0902130925x6d831303q5144020f06a42638@mail.gmail.com> Ditto here. I would be happy to mentor a student or pitch in some other way. Josh On Fri, Feb 13, 2009 at 10:14 AM, Mark A. Jensen wrote: > If my newbie status is not a barrier, I would be pleased to mentor a > student. If it is a barrier, I would be pleased to look at applications > or what have you. > Mark > ----- Original Message ----- From: "Hilmar Lapp" > To: "bioPerl List" > Sent: Friday, February 13, 2009 11:53 AM > Subject: [Bioperl-l] Google Summer of Code: Call for Bio* Volunteers > > >> Google is committed to run the Summer of Code program [1] again this >> year. It will be for the 5th time. >> >> In broad strokes, the program funds what you might call remote summer >> internships for students to contribute to an open-source software project. 
>> Participating projects (or umbrella organizations) provide project ideas >> and supply mentors that guide the work on those. Students apply to a >> project within the program with specific project ideas, based on those >> suggested or based on their own idea, get ranked by the mentors of the >> project, and those accepted into the program get paired up with mentors. >> Projects are chiefly about programming, the coding period is 3 months >> (Jun-Aug), and there is no travel required by either student or mentor. The >> program is global; other than the US trade restrictions that Google is >> under, there are no restrictions as to where student or mentor reside. The >> main motivations behind the program are to recruit new contributors to >> open-source projects, and to produce more open-source code. See the program >> FAQs [2] for more information. >> >> I've had the honor of being part of the program for the last two years, >> administering NESCent's participation as an organization [3] and in 2007 >> mentoring a student. I have to say I find it the most awesome open-source >> program since sliced bread (or the invention of BLAST if that means more to >> you). Despite that and sadly enough, there has been a dearth of >> participating bioinformatics projects (though some notable ones, such as >> CytoScape have participated). >> >> There have been two Bio* Summer of Code projects under the NESCent >> umbrella, one in 2007 [4] and one in 2008 [5]. I would be willing to >> volunteer to take the lead on and administer a full-blown participation of >> O|B|F as a Bio* umbrella organization, provided 1) at least one Bio* person >> volunteers to serve as backup administrator, and 2) enough Bio* >> contributors volunteer to serve as prospective mentors. 
>> >> Mentoring involves participating in creating the page of project ideas >> (I'd provide template and guidance), corresponding with applicants who >> have questions, participating in student application ranking, and for >> primary mentors (those directly assigned to a student) based on empirical >> evidence at least 5hrs/week of time spent with the student to help him/her >> get over obstacles or avoid wrong paths. >> >> I think almost all mentors would concur that the experience was very >> gratifying, but as a mentor you will be spending a non-negligible amount >> of time with the student. I think it is the student-mentor pairing and >> interaction, not the stipend, that in the end makes the participation for >> students uniquely productive in terms of learning, and different from >> simply contributing to the project of choice (which they could always do). >> >> For a personal impression for how the program is from a mentor >> perspective, I'll let Chris Fields speak who was the mentor for the 2008 >> phyloXML in BioPerl project. From a student's perspective, I'll leave it to >> the 2007 Biojava student Bohyun Lee (blee34-at- mail.gatech.edu) and the >> 2008 BioPerl student Mira Han (mirhan-at- indiana.edu) to comment (if they >> are still on the list). >> >> So if you think this is a good idea for Bio* to be part of, if you would >> like to help in some way, if you can see yourself as a mentor, or if you >> are a lurking would-be student, please let yourself be heard. Email either >> to the list or to me. 
>> >> Cheers, >> >> -hilmar >> >> [1] http://code.google.com/soc/2008 >> >> [2] http://code.google.com/opensource/gsoc/2009/faqs.html >> >> [3] http://hackathon.nescent.org/Phyloinformatics_Summer_of_Code_2007 >> http://hackathon.nescent.org/Phyloinformatics_Summer_of_Code_2008 >> >> [4] http://biojava.org/wiki/BioJava:PhyloSOC07 >> >> [5] http://bioperl.org/wiki/PhyloXML_support_in_BioPerl >> -- >> =========================================================== >> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : >> =========================================================== >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- Joshua Udall Assistant Professor 295 WIDB Plant and Wildlife Science Dept. Brigham Young University Provo, UT 84602 801-422-9307 Fax: 801-422-0008 USA From dalloliogm at gmail.com Fri Feb 13 12:57:31 2009 From: dalloliogm at gmail.com (Giovanni Marco Dall'Olio) Date: Fri, 13 Feb 2009 18:57:31 +0100 Subject: [Bioperl-l] Google Summer of Code: Call for Bio* Volunteers In-Reply-To: References: Message-ID: <5aa3b3570902130957u4bf0790aleec5431a6025661b@mail.gmail.com> The past december I have posted an idea[1] on the open-bio list that could be interesting. It is about creating a common repository of tests and use cases for all the bio.* projects, to make it easier the development of all of them and make them more compatible, easier to compare. For example, if we store all the possible use cases on sequence handling (e.g. which format to parse, which tests, which are the most common things that people need to do with sequences), then it would be easier to coordinate all the bio.* projects and see which are the better way to implement them. 
What do you think? Is it a good idea? Is it feasible?
I don't have much experience in writing use cases, so we will need a good mentor.

Moreover, how much time per day does it take to participate in a Summer of Code program?
Is it full-time work, or can I do it only in the evenings and in my free time?

cheers! :-)

[1] http://lists.open-bio.org/pipermail/open-bio-l/2008-December/000502.html

On Fri, Feb 13, 2009 at 5:53 PM, Hilmar Lapp wrote:
> Google is committed to run the Summer of Code program [1] again this year.
> It will be for the 5th time.
>
> In broad strokes, the program funds what you might call remote summer
> internships for students to contribute to an open-source software project.
> Participating projects (or umbrella organizations) provide project ideas and
> supply mentors that guide the work on those. Students apply to a project
> within the program with specific project ideas, based on those suggested or
> based on their own idea, get ranked by the mentors of the project, and those
> accepted into the program get paired up with mentors. Projects are chiefly
> about programming, the coding period is 3 months (Jun-Aug), and there is no
> travel required by either student or mentor. The program is global; other
> than the US trade restrictions that Google is under, there are no
> restrictions as to where student or mentor reside. The main motivations
> behind the program are to recruit new contributors to open-source projects,
> and to produce more open-source code. See the program FAQs [2] for more
> information.
>
> I've had the honor of being part of the program for the last two years,
> administering NESCent's participation as an organization [3] and in 2007
> mentoring a student. I have to say I find it the most awesome open-source
> program since sliced bread (or the invention of BLAST if that means more to
> you).
Despite that and sadly enough, there has been a dearth of > participating bioinformatics projects (though some notable ones, such as > CytoScape have participated). > > There have been two Bio* Summer of Code projects under the NESCent umbrella, > one in 2007 [4] and one in 2008 [5]. I would be willing to volunteer to take > the lead on and administer a full-blown participation of O|B|F as a Bio* > umbrella organization, provided 1) at least one Bio* person volunteers to > serve as backup administrator, and 2) enough Bio* contributors volunteer to > serve as prospective mentors. > > Mentoring involves participating in creating the page of project ideas (I'd > provide template and guidance), corresponding with applicants who have > questions, participating in student application ranking, and for primary > mentors (those directly assigned to a student) based on empirical evidence > at least 5hrs/week of time spent with the student to help him/her get over > obstacles or avoid wrong paths. > > I think almost all mentors would concur that the experience was very > gratifying, but as a mentor you will be spending a non-negligible amount of > time with the student. I think it is the student-mentor pairing and > interaction, not the stipend, that in the end makes the participation for > students uniquely productive in terms of learning, and different from simply > contributing to the project of choice (which they could always do). > > For a personal impression for how the program is from a mentor perspective, > I'll let Chris Fields speak who was the mentor for the 2008 phyloXML in > BioPerl project. From a student's perspective, I'll leave it to the 2007 > Biojava student Bohyun Lee (blee34-at-mail.gatech.edu) and the 2008 BioPerl > student Mira Han (mirhan-at-indiana.edu) to comment (if they are still on > the list). 
> > So if you think this is a good idea for Bio* to be part of, if you would > like to help in some way, if you can see yourself as a mentor, or if you are > a lurking would-be student, please let yourself be heard. Email either to > the list or to me. > > Cheers, > > -hilmar > > [1] http://code.google.com/soc/2008 > > [2] http://code.google.com/opensource/gsoc/2009/faqs.html > > [3] http://hackathon.nescent.org/Phyloinformatics_Summer_of_Code_2007 > http://hackathon.nescent.org/Phyloinformatics_Summer_of_Code_2008 > > [4] http://biojava.org/wiki/BioJava:PhyloSOC07 > > [5] http://bioperl.org/wiki/PhyloXML_support_in_BioPerl > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- My blog on bioinformatics (now in English): http://bioinfoblog.it From hlapp at gmx.net Fri Feb 13 13:31:18 2009 From: hlapp at gmx.net (Hilmar Lapp) Date: Fri, 13 Feb 2009 13:31:18 -0500 Subject: [Bioperl-l] Google Summer of Code: Call for Bio* Volunteers In-Reply-To: <5aa3b3570902130957u4bf0790aleec5431a6025661b@mail.gmail.com> References: <5aa3b3570902130957u4bf0790aleec5431a6025661b@mail.gmail.com> Message-ID: <33933A1E-1C9E-4036-B856-460B5B094C95@gmx.net> On Feb 13, 2009, at 12:57 PM, Giovanni Marco Dall'Olio wrote: > how much time per day does it take to participate in a Summer of > Code program? > Is it full-time work, or can I do it only in the evenings and in > my free time? That's a great question. In principle (i.e., the official position), being a Summer of Code student is meant to be a full-time occupation for the duration of the coding period. That said, some people manage to work a full-time job on their evenings and weekends.
It also doesn't mean that you can't be doing something else on the side. But empirical experience shows very clearly that if as a student you try to do Summer of Code on the side, at best, i.e., if you're a strong programmer and have the necessary domain background, you'll be on the brink of failure. The great majority of students is less strong, however (and making them stronger is one of the goals of the program). At the end of the day, success isn't measured in hours spent. It's whether you reach your set goals or not. Every student is different in terms of skills and personality they bring to the table, and every project is different, so there is no hard-and-fast rule as to how much time you need to be able to spend. But if it's not going to be your primary activity that takes priority over everything else, then you're not set up to succeed. And setting students up for success is another goal of the program. Does that make sense? -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From hlapp at gmx.net Fri Feb 13 13:38:39 2009 From: hlapp at gmx.net (Hilmar Lapp) Date: Fri, 13 Feb 2009 13:38:39 -0500 Subject: [Bioperl-l] [bip] summer of code In-Reply-To: <4995AFE7.3020503@gmail.com> References: <5aa3b3570902130927o222f0411tfc2728b7c25c944a@mail.gmail.com> <4995AFE7.3020503@gmail.com> Message-ID: <97D828AD-3E4F-4502-A1C9-84985142AC5A@gmx.net> Giovanni Marco Dall'Olio wrote: > they just have sent an email explaining that they have been included > in the google-summer of code program. Just to prevent any misunderstandings, whoever is meant by "they", nobody has been included in the Summer of Code program yet for this year. Organizations who want to participate can start applying March 9. Google expects to announce who has been accepted on March 18. Last year, about 2/3 of applying orgs were not accepted. 
This year the rejection ratio is going to be higher b/c fewer orgs will be accepted. There is no guarantee that we would be accepted if we apply. I'm relatively optimistic though that we'd have a fair chance, but I'm often optimistic. -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at illinois.edu Fri Feb 13 15:04:43 2009 From: cjfields at illinois.edu (Chris Fields) Date: Fri, 13 Feb 2009 14:04:43 -0600 Subject: [Bioperl-l] Google Summer of Code: Call for Bio* Volunteers In-Reply-To: <37ABB1B8019D480CAC42C26A00DA97CE@NewLife> References: <37ABB1B8019D480CAC42C26A00DA97CE@NewLife> Message-ID: <898DCC9A-B8E9-4612-9398-6F1319AB021E@illinois.edu> Hilmar, I second Mark as a mentor. Or would that be 'pushing him over the line?' ;> chris On Feb 13, 2009, at 11:14 AM, Mark A. Jensen wrote: > If my newbie status is not a barrier, I would be pleased to mentor a > student. If it is a barrier, I would be pleased to look at > applications > or what have you. > Mark > ----- Original Message ----- From: "Hilmar Lapp" > To: "bioPerl List" > Sent: Friday, February 13, 2009 11:53 AM > Subject: [Bioperl-l] Google Summer of Code: Call for Bio* Volunteers > > >> Google is committed to run the Summer of Code program [1] again >> this year. It will be for the 5th time. >> In broad strokes, the program funds what you might call remote >> summer internships for students to contribute to an open-source >> software project. Participating projects (or umbrella >> organizations) provide project ideas and supply mentors that guide >> the work on those. Students apply to a project within the program >> with specific project ideas, based on those suggested or based on >> their own idea, get ranked by the mentors of the project, and >> those accepted into the program get paired up with mentors. 
>> [...]
>> Cheers, >> -hilmar >> [1] http://code.google.com/soc/2008 >> [2] http://code.google.com/opensource/gsoc/2009/faqs.html >> [3] http://hackathon.nescent.org/Phyloinformatics_Summer_of_Code_2007 >> http://hackathon.nescent.org/Phyloinformatics_Summer_of_Code_2008 >> [4] http://biojava.org/wiki/BioJava:PhyloSOC07 >> [5] http://bioperl.org/wiki/PhyloXML_support_in_BioPerl >> -- >> =========================================================== >> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : >> =========================================================== >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Fri Feb 13 15:17:29 2009 From: cjfields at illinois.edu (Chris Fields) Date: Fri, 13 Feb 2009 14:17:29 -0600 Subject: [Bioperl-l] Google Summer of Code: Call for Bio* Volunteers In-Reply-To: References: Message-ID: <7C21CA74-9694-4962-9C64-A7D95D06CD53@illinois.edu> On Feb 13, 2009, at 10:53 AM, Hilmar Lapp wrote: > Google is committed to run the Summer of Code program [1] again this > year. It will be for the 5th time. > [....] > > For a personal impression for how the program is from a mentor > perspective, I'll let Chris Fields speak who was the mentor for the > 2008 phyloXML in BioPerl project. From a student's perspective, I'll > leave it to the 2007 Biojava student Bohyun Lee (blee34-at- > mail.gatech.edu) and the 2008 BioPerl student Mira Han (mirhan-at- > indiana.edu) to comment (if they are still on the list). It's a wonderful program to be involved in, and Hilmar did a wonderful job last year organizing everything. 
I encourage anyone interested to apply as a student or take on the role of a mentor; GSoC is a great opportunity both as a mentor and as a student. Not to mention Mira Han also wrote a wonderful bit of code (the phyloxml parser and related modules), and we all got great T-shirts! Hilmar, is there a particular focus on projects this year? I think Google announced earlier than normal to allow a bit more time for abstracts and such. chris > So if you think this is a good idea for Bio* to be part of, if you > would like to help in some way, if you can see yourself as a mentor, > or if you are a lurking would-be student, please let yourself be > heard. Email either to the list or to me. > > Cheers, > > -hilmar > > [1] http://code.google.com/soc/2008 > > [2] http://code.google.com/opensource/gsoc/2009/faqs.html > > [3] http://hackathon.nescent.org/Phyloinformatics_Summer_of_Code_2007 > http://hackathon.nescent.org/Phyloinformatics_Summer_of_Code_2008 > > [4] http://biojava.org/wiki/BioJava:PhyloSOC07 > > [5] http://bioperl.org/wiki/PhyloXML_support_in_BioPerl > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Fri Feb 13 15:20:04 2009 From: cjfields at illinois.edu (Chris Fields) Date: Fri, 13 Feb 2009 14:20:04 -0600 Subject: [Bioperl-l] Google Summer of Code: Call for Bio* Volunteers In-Reply-To: <52cea20c0902130925x6d831303q5144020f06a42638@mail.gmail.com> References: <37ABB1B8019D480CAC42C26A00DA97CE@NewLife> <52cea20c0902130925x6d831303q5144020f06a42638@mail.gmail.com> Message-ID: <0F41DEF6-1A63-4F4E-A31F-E3D515D474BA@illinois.edu> We had co-mentors last year for most projects (though in general there is one primary mentor). 
Not sure if the same will occur for this year. chris On Feb 13, 2009, at 11:25 AM, Joshua Udall wrote: > Ditto here. I would be happy to mentor a student or pitch in some > other way. > > Josh > > On Fri, Feb 13, 2009 at 10:14 AM, Mark A. Jensen > wrote: >> If my newbie status is not a barrier, I would be pleased to mentor a >> student. If it is a barrier, I would be pleased to look at >> applications >> or what have you. >> Mark >> ----- Original Message ----- From: "Hilmar Lapp" >> To: "bioPerl List" >> Sent: Friday, February 13, 2009 11:53 AM >> Subject: [Bioperl-l] Google Summer of Code: Call for Bio* Volunteers >> >> >>> Google is committed to run the Summer of Code program [1] again this >>> year. It will be for the 5th time. >>> >>> In broad strokes, the program funds what you might call remote >>> summer >>> internships for students to contribute to an open-source software >>> project. >>> Participating projects (or umbrella organizations) provide >>> project ideas >>> and supply mentors that guide the work on those. Students apply >>> to a >>> project within the program with specific project ideas, based on >>> those >>> suggested or based on their own idea, get ranked by the mentors >>> of the >>> project, and those accepted into the program get paired up with >>> mentors. >>> Projects are chiefly about programming, the coding period is 3 >>> months >>> (Jun-Aug), and there is no travel required by either student or >>> mentor. The >>> program is global; other than the US trade restrictions that >>> Google is >>> under, there are no restrictions as to where student or mentor >>> reside. The >>> main motivations behind the program are to recruit new >>> contributors to >>> open-source projects, and to produce more open-source code. See >>> the program >>> FAQs [2] for more information. 
>>> [...] > > > > -- > Joshua Udall > Assistant Professor > 295 WIDB > Plant and Wildlife Science Dept.
> Brigham Young University > Provo, UT 84602 > 801-422-9307 > Fax: 801-422-0008 > USA > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From maj at fortinbras.us Fri Feb 13 15:27:44 2009 From: maj at fortinbras.us (Mark A. Jensen) Date: Fri, 13 Feb 2009 15:27:44 -0500 Subject: [Bioperl-l] Google Summer of Code: Call for Bio* Volunteers In-Reply-To: <898DCC9A-B8E9-4612-9398-6F1319AB021E@illinois.edu> References: <37ABB1B8019D480CAC42C26A00DA97CE@NewLife> <898DCC9A-B8E9-4612-9398-6F1319AB021E@illinois.edu> Message-ID: <880D6C4314024669BF76B5EEED94BAA3@NewLife> I think I passed the point of no return when I actually started *reading* Higher-Order Perl..... ----- Original Message ----- From: "Chris Fields" To: "Mark A. Jensen" Cc: "Hilmar Lapp" ; "bioPerl List" Sent: Friday, February 13, 2009 3:04 PM Subject: Re: [Bioperl-l] Google Summer of Code: Call for Bio* Volunteers > Hilmar, > > I second Mark as a mentor. Or would that be 'pushing him over the line?' > ;> > > chris > > On Feb 13, 2009, at 11:14 AM, Mark A. Jensen wrote: > >> If my newbie status is not a barrier, I would be pleased to mentor a >> student. If it is a barrier, I would be pleased to look at applications >> or what have you. >> Mark >> ----- Original Message ----- From: "Hilmar Lapp" >> To: "bioPerl List" >> Sent: Friday, February 13, 2009 11:53 AM >> Subject: [Bioperl-l] Google Summer of Code: Call for Bio* Volunteers >> >> >>> Google is committed to run the Summer of Code program [1] again this year. >>> It will be for the 5th time. >>> In broad strokes, the program funds what you might call remote summer >>> internships for students to contribute to an open-source software project. >>> Participating projects (or umbrella organizations) provide project ideas >>> and supply mentors that guide the work on those. 
>>> [...]
>>> Cheers, >>> -hilmar >>> [1] http://code.google.com/soc/2008 >>> [2] http://code.google.com/opensource/gsoc/2009/faqs.html >>> [3] http://hackathon.nescent.org/Phyloinformatics_Summer_of_Code_2007 >>> http://hackathon.nescent.org/Phyloinformatics_Summer_of_Code_2008 >>> [4] http://biojava.org/wiki/BioJava:PhyloSOC07 >>> [5] http://bioperl.org/wiki/PhyloXML_support_in_BioPerl >>> -- >>> =========================================================== >>> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : >>> =========================================================== >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > From jestill at plantbio.uga.edu Fri Feb 13 15:28:08 2009 From: jestill at plantbio.uga.edu (James Estill) Date: Fri, 13 Feb 2009 15:28:08 -0500 Subject: [Bioperl-l] Google Summer of Code: Call for Bio* Volunteers In-Reply-To: 7C21CA74-9694-4962-9C64-A7D95D06CD53@illinois.edu Message-ID: <20090213202808.987be2da@dogwood.plantbio.uga.edu> I can also say that GSOC is a great experience from the student point of view. I was a student working with Hilmar as my mentor in 2007. -- Jamie Estill -- jestill at uga.edu -- http://jestill.myweb.uga.edu -- http://www.epernicus.com/people/jestill _____ From: Chris Fields [mailto:cjfields at illinois.edu] To: Hilmar Lapp [mailto:hlapp at gmx.net] Cc: bioPerl List [mailto:bioperl-l at lists.open-bio.org] Sent: Fri, 13 Feb 2009 15:17:29 -0500 Subject: Re: [Bioperl-l] Google Summer of Code: Call for Bio* Volunteers On Feb 13, 2009, at 10:53 AM, Hilmar Lapp wrote: > Google is committed to run the Summer of Code program [1] again this > year. It will be for the 5th time. > [....] 
> > Cheers, > > -hilmar > > [1] http://code.google.com/soc/2008 > > [2] http://code.google.com/opensource/gsoc/2009/faqs.html > > [3] http://hackathon.nescent.org/Phyloinformatics_Summer_of_Code_2007 > http://hackathon.nescent.org/Phyloinformatics_Summer_of_Code_2008 > > [4] http://biojava.org/wiki/BioJava:PhyloSOC07 > > [5] http://bioperl.org/wiki/PhyloXML_support_in_BioPerl > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l _______________________________________________ Bioperl-l mailing list Bioperl-l at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/bioperl-l From yzhernand at gmail.com Fri Feb 13 17:31:39 2009 From: yzhernand at gmail.com (Yözen Hernández) Date: Fri, 13 Feb 2009 17:31:39 -0500 Subject: [Bioperl-l] Google Summer of Code: Call for Bio* Volunteers Message-ID: <4825aaae0902131431j7190c3e3o31ba28798e3db9f2@mail.gmail.com> I would be interested in participating as a student. I would love the experience and have the time over summer to contribute. I don't have any ideas of my own at the moment, but I'm sure I'd like the projects mentors will be offering. Yözen From vecchi.b at gmail.com Fri Feb 13 21:45:01 2009 From: vecchi.b at gmail.com (Bruno Vecchi) Date: Sat, 14 Feb 2009 00:45:01 -0200 Subject: [Bioperl-l] Extending Bio::Tools::Run::Clustalw alignment score parser. Message-ID: <1a0c1b750902131845s55d62704ncaaef077b44959d@mail.gmail.com> Hi, I was pleased to notice that the newest bioperl-run release had a Clustalw.pm module with an alignment score parser. However, I noticed that the regex does not account for negative scores which, although unlikely, are possible.
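To make the problem concrete, here is a minimal standalone sketch in plain Perl. It is an illustration only, not code from the module: the helper name `parse_score` and the two sample lines are assumptions made up for this example, and the real parser reads ClustalW output line by line rather than from a hard-coded list. The sketch compares a digits-only capture with one that also allows a leading minus sign.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of Bio::Tools::Run::Alignment::Clustalw):
# returns the score captured from one line of text, or undef on no match.
sub parse_score {
    my ($line, $regex) = @_;
    return $line =~ $regex ? $1 : undef;
}

my $unsigned = qr/Alignment Score (\d+)/;    # digits only
my $signed   = qr/Alignment Score (-?\d+)/;  # optional leading minus sign

# Made-up sample lines, for illustration only.
for my $line ('Alignment Score 4870', 'Alignment Score -1250') {
    my $old = parse_score($line, $unsigned);
    my $new = parse_score($line, $signed);
    printf "%-24s unsigned=%s signed=%s\n",
        $line,
        defined $old ? $old : 'undef',
        defined $new ? $new : 'undef';
}
```

With the digits-only pattern the negative line does not match at all, so a score variable assigned only on a successful match would simply keep whatever value it had before.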
I propose the following change to the Bio/Tools/Run/Alignment/Clustalw.pm module:

- $score = $1 if ($_ =~ /Alignment Score (\d+)/);
+ $score = $1 if ($_ =~ /Alignment Score (-?\d+)/);

This seems to fix the issue. Bruno. From cjfields at illinois.edu Fri Feb 13 22:14:32 2009 From: cjfields at illinois.edu (Chris Fields) Date: Fri, 13 Feb 2009 21:14:32 -0600 Subject: [Bioperl-l] Extending Bio::Tools::Run::Clustalw alignment score parser. In-Reply-To: <1a0c1b750902131845s55d62704ncaaef077b44959d@mail.gmail.com> References: <1a0c1b750902131845s55d62704ncaaef077b44959d@mail.gmail.com> Message-ID: <0D8AFAF6-6F26-4859-9E50-C0A40E844A8B@illinois.edu> I've added that to subversion. Thanks for pointing that out! chris On Feb 13, 2009, at 8:45 PM, Bruno Vecchi wrote: > Hi, > > I was pleased to notice that the newest bioperl-run release had a > Clustalw.pm module with an alignment score parser. However, I noticed > that the regex does not account for negative scores which, although > unlikely, are possible. I propose the following change to the > Bio/Tools/Run/Alignment/Clustalw.pm module:
>
> - $score = $1 if ($_ =~ /Alignment Score (\d+)/);
> + $score = $1 if ($_ =~ /Alignment Score (-?\d+)/);
>
> This seems to fix the issue. > > Bruno. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From miraceti at gmail.com Sat Feb 14 21:41:35 2009 From: miraceti at gmail.com (miraceti) Date: Sat, 14 Feb 2009 21:41:35 -0500 Subject: [Bioperl-l] Google Summer of Code: Call for Bio* Volunteers In-Reply-To: <4825aaae0902131431j7190c3e3o31ba28798e3db9f2@mail.gmail.com> References: <4825aaae0902131431j7190c3e3o31ba28798e3db9f2@mail.gmail.com> Message-ID: As a student, I second that GSOC was a great experience for me. It is an opportunity for people who always wanted to get involved in opensource/bio* but never actually did.
It gives you the first nudge, and then encourages you constantly along the way through the assigned mentors. Chris Fields above was my mentor and he was great. I found that the interaction with Chris and the tightly bound schedule with a limited scope of the project made the process very effective. In a limited amount of time you learn a lot of things and you can keep it up till the end to produce an output that you can call your contribution. It was a very gratifying experience for me. Mira On Fri, Feb 13, 2009 at 5:31 PM, Yözen Hernández wrote: > I would be interested in participating as a student. I would love the > experience and have the time over summer to contribute. I don't have any > ideas of my own at the moment, but I'm sure I'd like the projects mentors > will be offering. > > Yözen > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From johnsonm at gmail.com Sun Feb 15 13:54:20 2009 From: johnsonm at gmail.com (Mark Johnson) Date: Sun, 15 Feb 2009 12:54:20 -0600 Subject: [Bioperl-l] bioperl-db (Bio::DB::BioSQL::Oracle::BasePersistenceAdaptorDriver) LongTruncOk / LongReadLen In-Reply-To: <86AA0A32-C1BA-42AA-AE01-58A3DD3BF6D0@gmx.net> References: <86AA0A32-C1BA-42AA-AE01-58A3DD3BF6D0@gmx.net> Message-ID: On Thu, Jan 29, 2009 at 4:22 PM, Hilmar Lapp wrote: > > On Jan 29, 2009, at 5:03 PM, Mark Johnson wrote: >
>> # execute and fetch
>> $sth->execute();
>> $row = $sth->fetchall_arrayref();
>> return (@$row ? $row->[0]->[0] : undef);
>>
>> ...which is in get_biosequence() in >> Bio::DB::BioSQL::Oracle::BiosequenceAdaptorDriver at around line 257?. >> I don't see RaiseError being set anywhere, so shouldn't there be a >> check here to throw an exception if the execute fails (such as if >> LongTruncOk is 0 and a LOB is > LongReadLen?).
> > > RaiseError is one of the initialization parameters (in Bio::DB::BioDB->new() > call) and is set when the database connection is opened. > > Not checking the return value in the above surely looks like a bug. Would > you mind filing it in bugzilla? > > -hilmar Done. From maj at fortinbras.us Mon Feb 16 13:58:49 2009 From: maj at fortinbras.us (Mark A. Jensen) Date: Mon, 16 Feb 2009 13:58:49 -0500 Subject: [Bioperl-l] Unwise elimination of nodesinB:T:Node::remove_Descendent? In-Reply-To: References: <92BB44BA-CFF6-4D6A-A2B6-2BD68E45ABA8@illinois.edu><2EF8144E065A45808E8EECACF51124A4@NewLife><17337564-0D4C-437C-BB82-6337D967F246@illinois.edu><25E8B6CF45F145FDA0548979D0D9C231@NewLife><38157EB0-359D-47DC-9214-348134FA3220@illinois.edu> Message-ID: <4019875B2D5640BEA0E81177D3FD6C70@NewLife> Chris et al- Any objections to a commit on these mods, plus the .t changes? Just pinging. Mark ----- Original Message ----- From: "Mark A. Jensen" To: "Chris Fields" Cc: Sent: Saturday, February 07, 2009 3:39 PM Subject: Re: [Bioperl-l] Unwise elimination of nodesinB:T:Node::remove_Descendent? > Ok- some modified tests and editorial analysis up under Bug #2456- > cheers MAJ > ----- Original Message ----- > From: "Chris Fields" > To: "Mark A. Jensen" > Cc: > Sent: Friday, February 06, 2009 11:09 PM > Subject: Re: [Bioperl-l] Unwise elimination of nodes > inB:T:Node::remove_Descendent? > > >> Mark, >> >> Saw some errors pop up when running Tree tests (see the attachment on the >> bug report). They may be due to bad test data and not your patch so it'll >> need further investigating; a few appear to be the same test data using in >> various TreeIO formats. >> >> chris >> >> On Feb 6, 2009, at 7:13 PM, Mark A. Jensen wrote: >> >>> Interested parties please have a look at fixes --- >>> http://bugzilla.open-bio.org/show_bug.cgi?id=2456 >>> cheers- >>> MAJ >>> ----- Original Message ----- From: "Chris Fields" >> > >>> To: "Mark A. 
Jensen" >>> Cc: "Hilmar Lapp" ; >>> Sent: Friday, February 06, 2009 4:11 PM >>> Subject: Re: [Bioperl-l] Unwise elimination of nodes >>> inB:T:Node::remove_Descendent? >>> >>> >>>> >>>> On Feb 6, 2009, at 8:59 AM, Mark A. Jensen wrote: >>>> >>>>>> I suppose the best way to deal with some of these questions (and >>>>>> ensure Node/Tree is acting as expected) is to come up with several >>>>>> vetted test cases indicating what we expect the proper behavior to be >>>>>> for remove_Descendant(), contract_linear_paths(), and any other >>>>>> problematic Node/Tree/ TreeFunctionI methods. In fact, I highly >>>>>> recommend any code changes like this add tests to the test suite >>>>>> demonstrating the issue. >>>>> >>>>> I can work the example of the thread into a test, adding some >>>>> of the points brought in by Hilmar- >>>> >>>> Any other areas of worry? >>>> >>>>>> Possibly related to all this is a fairly significant lingering bug >>>>>> dealing with Bio::Tree::TreeFunctionsI::reroot() >>>>>> (http://bugzilla.open-bio.org/show_bug.cgi?id=2456 ). Any takers? >>>>> >>>>> I take this one, if I have those privileges ( it is a privilege to >>>>> serve, isn't it?)... >>>> >>>> Cool, thanks Mark! >>>> >>>> -c >>>> >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> >> > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > From jay at jays.net Tue Feb 17 00:31:09 2009 From: jay at jays.net (Jay Hannah) Date: Mon, 16 Feb 2009 23:31:09 -0600 Subject: [Bioperl-l] wiki HOWTO:Graphics - history gap? Message-ID: Howdy, In May 2008 I added a demonstration of Bio::Graphics::Glyph::dna. Strangely, my additions have disappeared with no trace in the history of the page, nor in the history of the discussion tab... 
http://www.bioperl.org/wiki/HOWTO:Graphics http://www.bioperl.org/wiki/HOWTO_Discussion:Graphics (this history is completely hidden from me) I didn't think Mediawiki erased history? Except perhaps if someone did a rollback? But I assumed rollback was reserved for spam? Perhaps an admin can see things I can't see? (Or was MediaWiki restored from backup or something?) The only trace still on the wiki seems to be the image I uploaded on the same day: http://www.bioperl.org/wiki/Image:Bio_Graphics_Glyph_dna.png Any thoughts? Thanks, j http://www.bioperl.org/wiki/User:Jhannah From hsa_rim at yahoo.co.in Tue Feb 17 04:30:43 2009 From: hsa_rim at yahoo.co.in (shafeeq rim) Date: Tue, 17 Feb 2009 15:00:43 +0530 (IST) Subject: [Bioperl-l] Sequence Character Visualization Message-ID: <76496.48037.qm@web8804.mail.in.yahoo.com> Hi guys, I want to just visualize a sequence in horizontal manner reading from a genome fasta file / genbank file as per the given genomic location.. Suppose I want to visualize a sequence string from 1000 - 1100 nucleotides in horizontal string like manner in terms of an image. Is there any such tool or library that can do it for me... I am not looking for alignments but just linear visualization. Thanks Add more friends to your messenger and enjoy! Go to http://messenger.yahoo.com/invite/ From maj at fortinbras.us Tue Feb 17 07:42:42 2009 From: maj at fortinbras.us (Mark A. 
Jensen) Date: Tue, 17 Feb 2009 07:42:42 -0500 Subject: [Bioperl-l] Sequence Character Visualization In-Reply-To: <76496.48037.qm@web8804.mail.in.yahoo.com> References: <76496.48037.qm@web8804.mail.in.yahoo.com> Message-ID: <5D0908B43ABD4351B3C828FDE0A912BA@NewLife> Hi Shafeeq- Check out http://www.bioperl.org/wiki/Simple_graphical_alignment_overview - might be what you're looking for- Mark ----- Original Message ----- From: "shafeeq rim" To: Sent: Tuesday, February 17, 2009 4:30 AM Subject: [Bioperl-l] Sequence Character Visualization > Hi guys, > > I want to just visualize a sequence in horizontal manner reading from a genome > fasta file / genbank file as per the given genomic location.. > > Suppose I want to visualize a sequence string from 1000 - 1100 nucleotides in > horizontal string like manner in terms of an image. > Is there any such tool or library that can do it for me... I am not looking > for alignments but just linear visualization. > > Thanks > > > Add more friends to your messenger and enjoy! Go to > http://messenger.yahoo.com/invite/ > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > From chrysain at gmail.com Tue Feb 17 08:08:20 2009 From: chrysain at gmail.com (Chrysanthi A.) Date: Tue, 17 Feb 2009 13:08:20 +0000 Subject: [Bioperl-l] get_duplications: any function? Message-ID: <66b602900902170508w68b777dbibf0ea9ba21632c14@mail.gmail.com> Hi, I was wondering if there is a method in order to visualize a tree by parsing a nexus format and also, I would like to identify the duplication events. Are there any functions that will give me the duplication events? 
thanks a lot, Chrysanthi From cjfields at illinois.edu Tue Feb 17 08:29:14 2009 From: cjfields at illinois.edu (Chris Fields) Date: Tue, 17 Feb 2009 07:29:14 -0600 Subject: [Bioperl-l] Sequence Character Visualization In-Reply-To: <5D0908B43ABD4351B3C828FDE0A912BA@NewLife> References: <76496.48037.qm@web8804.mail.in.yahoo.com> <5D0908B43ABD4351B3C828FDE0A912BA@NewLife> Message-ID: <854FCF52-8471-4A41-BBA8-E5710E80722E@illinois.edu> I suggest the Bio::Graphics HOWTO for the graphics part, Bio::Graphics::glyph::dna for the DNA part (if you actually want to see the DNA). Note that Bio::Graphics is no longer in bioperl core but is on CPAN. chris On Feb 17, 2009, at 6:42 AM, Mark A. Jensen wrote: > Hi Shafeeq- > Check out http://www.bioperl.org/wiki/Simple_graphical_alignment_overview > - might be what you're looking for- > Mark > ----- Original Message ----- From: "shafeeq rim" > To: > Sent: Tuesday, February 17, 2009 4:30 AM > Subject: [Bioperl-l] Sequence Character Visualization > > >> Hi guys, >> >> I want to just visualize a sequence in horizontal manner reading >> from a genome fasta file / genbank file as per the given genomic >> location.. >> >> Suppose I want to visualize a sequence string from 1000 - 1100 >> nucleotides in horizontal string like manner in terms of an image. >> Is there any such tool or library that can do it for me... I am not >> looking for alignments but just linear visualization. >> >> Thanks >> >> >> Add more friends to your messenger and enjoy! 
Go to http://messenger.yahoo.com/invite/ >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Tue Feb 17 08:23:12 2009 From: cjfields at illinois.edu (Chris Fields) Date: Tue, 17 Feb 2009 07:23:12 -0600 Subject: [Bioperl-l] wiki HOWTO:Graphics - history gap? In-Reply-To: References: Message-ID: <2B47AC44-4974-4BF6-9009-0E177D3664B4@illinois.edu> Odd. I think rollbacks are noted, so it should show up in the history somewhere. I couldn't find anything indicating a deletion specifically. I suggest re-adding this back to the Graphics page or (better yet) to the Scrapbook: http://www.bioperl.org/wiki/Category:Scrapbook chris On Feb 16, 2009, at 11:31 PM, Jay Hannah wrote: > Howdy, > > In May 2008 I added a demonstration of Bio::Graphics::Glyph::dna. > Strangely, my additions have disappeared with no trace in the > history of the page, nor in the history of the discussion tab... > > http://www.bioperl.org/wiki/HOWTO:Graphics > http://www.bioperl.org/wiki/HOWTO_Discussion:Graphics > (this history is completely hidden from me) > > I didn't think Mediawiki erased history? Except perhaps if someone > did a rollback? But I assumed rollback was reserved for spam? > Perhaps an admin can see things I can't see? (Or was MediaWiki > restored from backup or something?) > > The only trace still on the wiki seems to be the image I uploaded on > the same day: > > http://www.bioperl.org/wiki/Image:Bio_Graphics_Glyph_dna.png > > Any thoughts? 
> > Thanks, > > j > http://www.bioperl.org/wiki/User:Jhannah > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From jay at jays.net Tue Feb 17 08:53:31 2009 From: jay at jays.net (Jay Hannah) Date: Tue, 17 Feb 2009 07:53:31 -0600 Subject: [Bioperl-l] wiki HOWTO:Graphics - history gap? In-Reply-To: <2B47AC44-4974-4BF6-9009-0E177D3664B4@illinois.edu> References: <2B47AC44-4974-4BF6-9009-0E177D3664B4@illinois.edu> Message-ID: <427B865F-6F89-442A-AE35-3647D7D12DDE@jays.net> >> http://www.bioperl.org/wiki/HOWTO_Discussion:Graphics On Feb 17, 2009, at 7:23 AM, Chris Fields wrote: > Odd. I think rollbacks are noted, so it should show up in the > history somewhere. I couldn't find anything indicating a deletion > specifically. > > I suggest re-adding this back to the Graphics page or (better yet) > to the Scrapbook: > > http://www.bioperl.org/wiki/Category:Scrapbook I would, but I don't have a copy. I assumed Mediawiki would have it forever, at least in the page history. I'm a Mediawiki addict across several installs and employment venues, so I thought I pretty much knew how Mediawiki works, but this scenario has me stumped. It wasn't just my contribution, someone else (I don't recall who), had staged a demo or two in the discusssion tab for a couple months. That history isn't visible to me at all. I was sure I must be misremembering all of this, and I actually put it elsewhere, but it's even missing from my contributions history. The only record I have is the image I uploaded the same day is still out there, and my personal blog where I linked straight to that discussion page... which now denies any knowledge of my actions. Very strange. Maybe I need to switch medications. :) No one did any database hackery to reload the HOWTOs from scratch or anything, did they? 
Incidentally, I stumbled into this because I seem to be experiencing an off-by-one bug in my new glyph::dna project this week, so was going to refer back to that "Hello world" version and compare. Thanks, j http://www.bioperl.org/wiki/User:Jhannah From maj at fortinbras.us Tue Feb 17 10:39:09 2009 From: maj at fortinbras.us (Mark A. Jensen) Date: Tue, 17 Feb 2009 10:39:09 -0500 Subject: [Bioperl-l] wiki HOWTO:Graphics - history gap? In-Reply-To: <427B865F-6F89-442A-AE35-3647D7D12DDE@jays.net> References: <2B47AC44-4974-4BF6-9009-0E177D3664B4@illinois.edu> <427B865F-6F89-442A-AE35-3647D7D12DDE@jays.net> Message-ID: <6A4B5D640862475D947F8F33A750E0F7@NewLife> Jay: I found the following on www.archive.org 's "Wayback Machine", dated Feb 08, 2008: http://web.archive.org/web/20080208210835/http://www.bioperl.org/wiki/HOWTO_Discussion:Graphics Also the following archived versions of HOWTO:Graphics are available: http://web.archive.org/web/*/http://www.bioperl.org/wiki/HOWTO:Graphics happy hunting- Mark ----- Original Message ----- From: "Jay Hannah" To: Sent: Tuesday, February 17, 2009 8:53 AM Subject: Re: [Bioperl-l] wiki HOWTO:Graphics - history gap? >>> http://www.bioperl.org/wiki/HOWTO_Discussion:Graphics > > > On Feb 17, 2009, at 7:23 AM, Chris Fields wrote: >> Odd. I think rollbacks are noted, so it should show up in the history >> somewhere. I couldn't find anything indicating a deletion specifically. >> >> I suggest re-adding this back to the Graphics page or (better yet) to the >> Scrapbook: >> >> http://www.bioperl.org/wiki/Category:Scrapbook > > I would, but I don't have a copy. I assumed Mediawiki would have it forever, > at least in the page history. I'm a Mediawiki addict across several installs > and employment venues, so I thought I pretty much knew how Mediawiki works, > but this scenario has me stumped. > > It wasn't just my contribution, someone else (I don't recall who), had staged > a demo or two in the discusssion tab for a couple months. 
That history isn't > visible to me at all. > > I was sure I must be misremembering all of this, and I actually put it > elsewhere, but it's even missing from my contributions history. The only > record I have is the image I uploaded the same day is still out there, and my > personal blog where I linked straight to that discussion page... which now > denies any knowledge of my actions. > > Very strange. Maybe I need to switch medications. :) > > No one did any database hackery to reload the HOWTOs from scratch or > anything, did they? > > Incidentally, I stumbled into this because I seem to be experiencing an > off-by-one bug in my new glyph::dna project this week, so was going to refer > back to that "Hello world" version and compare. > > Thanks, > > j > http://www.bioperl.org/wiki/User:Jhannah > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > From alperyilmaz at gmail.com Tue Feb 17 23:42:31 2009 From: alperyilmaz at gmail.com (Alper Yilmaz) Date: Tue, 17 Feb 2009 23:42:31 -0500 Subject: [Bioperl-l] Bio::DB::GFF with additional information Message-ID: Hi, I am using Bio::DB::GFF so that I can overlay data from different sources. Currently, I'm uploading a GFF file to database whenever I want to add new data. I was wondering is there a way to keep not only start-end location of features but also their "level" data (such as expression level, e-value, peak heght). In other words, can I integrate a non-GFF formatted data with an existing database created by Bio::DB::GFF? 
thanks,

alper

From jason at bioperl.org Wed Feb 18 00:24:50 2009
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 17 Feb 2009 21:24:50 -0800
Subject: [Bioperl-l] Bio::DB::GFF with additional information
In-Reply-To:
References:
Message-ID: <8BE100B7-F0DC-4126-81AA-4ECA69198AA5@bioperl.org>

If it is a single data point you want to include, you can use the score column of the GFF. If you want to store more complicated data, you can add it to the ninth column, like this (GFF3 shown; the format is slightly different if you are using GFF2):

ID=Probe_ax102242;ArrayProbeVersion=2;Experiment1=0.30;Experiment2=0.40

The GBrowse tutorial at the GMOD site should give you some better examples.

-jason

On Feb 17, 2009, at 8:42 PM, Alper Yilmaz wrote:

> Hi,
>
> I am using Bio::DB::GFF so that I can overlay data from different sources.
> Currently, I'm uploading a GFF file to database whenever I want to add new
> data. I was wondering is there a way to keep not only start-end location of
> features but also their "level" data (such as expression level, e-value,
> peak height). In other words, can I integrate a non-GFF formatted data with
> an existing database created by Bio::DB::GFF?
>
> thanks,
>
> alper

Jason Stajich
jason at bioperl.org

From jason at bioperl.org Wed Feb 18 01:26:16 2009
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 17 Feb 2009 22:26:16 -0800
Subject: [Bioperl-l] Bio::DB::GFF with additional information
In-Reply-To:
References: <8BE100B7-F0DC-4126-81AA-4ECA69198AA5@bioperl.org>
Message-ID:

Alper -

When you have a feature you can get (or set) the tags for this type of information.
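As a rough illustration of what these tags hold, here is a minimal plain-Perl sketch (not the Bio::DB::GFF API) that splits a GFF3 ninth-column string, like the one quoted earlier in this thread, into tag/value pairs. The helper name parse_gff3_attributes is made up for this example, and it assumes the simple case where values contain no URL-escaped characters.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): turn a GFF3 ninth-column
# string into a hash of tag => [ values ].
sub parse_gff3_attributes {
    my ($col9) = @_;
    my %tags;
    for my $pair (split /;/, $col9) {
        next unless length $pair;
        my ($tag, $value) = split /=/, $pair, 2;
        next unless defined $value;
        # GFF3 allows several comma-separated values under one tag
        push @{ $tags{$tag} }, split /,/, $value;
    }
    return \%tags;
}

my $attrs = parse_gff3_attributes(
    'ID=Probe_ax102242;ArrayProbeVersion=2;Experiment1=0.30;Experiment2=0.40'
);

# List context, because a tag may carry more than one value
my ($level) = @{ $attrs->{Experiment1} };
print "Experiment1 level: $level\n";
```

Each tag maps to a list of values here, which is the same reason the accessor call that follows is written in list context.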
    # note, list context because this method returns a list
    my ($height) = $f->get_tag_values('Peak_Height');

    # or you can set it, and then use a Bio::DB::GFF store method
    # to update the data in the database
    $f->add_tag_values('Peak_Height',$height);

There are also some Bio::DB::GFF-specific methods for getting and setting tags. I forget what they are off the top of my head, but I encourage you to read the Bio::DB::GFF documentation (perldoc Bio::DB::GFF), or go to the wiki page for this module and follow the links to CPAN and our Pdoc-generated HTML. Lincoln, as usual, provides extensive documentation for his contributed modules.

Eventually you may want to take a look at Bio::DB::SeqFeature::Store, which more properly implements GFF3, but for many purposes Bio::DB::GFF will be speedy and just what you need.

-jason

On Feb 17, 2009, at 10:06 PM, Alper Yilmaz wrote:

> Hi Jason,
>
> Thanks for quick reply. I modified the ninth column but couldn't figure out
> how to extract that information. If I have Peak_Height information in 9th
> column for gene locations gff file, how do I get that particular peak height
> after I select the segment by:
>
> my $segment=$db->segment('gene1');
>
> Same goes for the score column, if I had utilized it, how do I retrieve the
> data in that column?
>
> thanks,
>
> Alper Yilmaz
> Post-doctoral Researcher
> Plant Biotechnology Center
> The Ohio State University
> 1060 Carmack Rd
> Columbus, OH 43210
> (614)688-4954
>
>
> On Wed, Feb 18, 2009 at 12:24 AM, Jason Stajich wrote:
>
>> if it is a single data point you want to include you can use the score
>> column for GFF, if you want to store more complicated data you can just add
>> that to the ninth column, like (if using GFF3 - slightly different format if
>> you are using GFF2)
>>
>> ID=Probe_ax102242;ArrayProbeVersion=2;Experiment1=0.30;Experiment2=0.40
>>
>> The Gbrowse tutorial at GMOD site should give you some better
>> examples.
>> >> -jason >> >> On Feb 17, 2009, at 8:42 PM, Alper Yilmaz wrote: >> >> Hi, >>> >>> I am using Bio::DB::GFF so that I can overlay data from different >>> sources. >>> Currently, I'm uploading a GFF file to database whenever I want to >>> add new >>> data. I was wondering is there a way to keep not only start-end >>> location >>> of >>> features but also their "level" data (such as expression level, e- >>> value, >>> peak heght). In other words, can I integrate a non-GFF formatted >>> data with >>> an existing database created by Bio::DB::GFF? >>> >>> thanks, >>> >>> alper >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >> >> Jason Stajich >> jason at bioperl.org >> >> >> >> Jason Stajich jason at bioperl.org From jay at jays.net Wed Feb 18 08:08:04 2009 From: jay at jays.net (Jay Hannah) Date: Wed, 18 Feb 2009 07:08:04 -0600 Subject: [Bioperl-l] wiki HOWTO:Graphics - history gap? In-Reply-To: <6A4B5D640862475D947F8F33A750E0F7@NewLife> References: <2B47AC44-4974-4BF6-9009-0E177D3664B4@illinois.edu> <427B865F-6F89-442A-AE35-3647D7D12DDE@jays.net> <6A4B5D640862475D947F8F33A750E0F7@NewLife> Message-ID: <1E47BB99-836F-4779-AB1E-0BCBA2D27D6E@jays.net> On Feb 17, 2009, at 9:39 AM, Mark A. Jensen wrote: > I found the following on www.archive.org 's "Wayback Machine", > dated Feb 08, 2008: > > http://web.archive.org/web/20080208210835/http://www.bioperl.org/ > wiki/HOWTO_Discussion:Graphics > > Also the following archived versions of HOWTO:Graphics are available: > > http://web.archive.org/web/*/http://www.bioperl.org/wiki/ > HOWTO:Graphics Thanks for that! It doesn't quite fit this need (May 2008), but it's a cool trick! :) Can a wiki admin please glance at this page and see if they see "6 deleted revisions" or something? http://www.bioperl.org/w/index.php? 
title=HOWTO_Discussion:Graphics&limit=500&action=history For me history lists only the current page. Perhaps a Mediawiki admin will see more? Thanks, Jay Hannah http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah From maj at fortinbras.us Wed Feb 18 08:24:11 2009 From: maj at fortinbras.us (Mark A. Jensen) Date: Wed, 18 Feb 2009 08:24:11 -0500 Subject: [Bioperl-l] wiki HOWTO:Graphics - history gap? In-Reply-To: <1E47BB99-836F-4779-AB1E-0BCBA2D27D6E@jays.net> References: <2B47AC44-4974-4BF6-9009-0E177D3664B4@illinois.edu><427B865F-6F89-442A-AE35-3647D7D12DDE@jays.net><6A4B5D640862475D947F8F33A750E0F7@NewLife> <1E47BB99-836F-4779-AB1E-0BCBA2D27D6E@jays.net> Message-ID: <831E26BC4A3545C591F4B2A092C4EEA5@NewLife> Hey Jay-- I see that Chris moved the HOWTO_Discussion:Graphics to the Scrapbook on 21 Dec 2008. Is this your stuff? http://www.bioperl.org/wiki/Simple_graphical_alignment_overview cheers MAJ ----- Original Message ----- From: "Jay Hannah" To: Sent: Wednesday, February 18, 2009 8:08 AM Subject: Re: [Bioperl-l] wiki HOWTO:Graphics - history gap? > On Feb 17, 2009, at 9:39 AM, Mark A. Jensen wrote: >> I found the following on www.archive.org 's "Wayback Machine", dated Feb 08, >> 2008: >> >> http://web.archive.org/web/20080208210835/http://www.bioperl.org/ >> wiki/HOWTO_Discussion:Graphics >> >> Also the following archived versions of HOWTO:Graphics are available: >> >> http://web.archive.org/web/*/http://www.bioperl.org/wiki/ HOWTO:Graphics > > Thanks for that! It doesn't quite fit this need (May 2008), but it's a cool > trick! :) > > > > Can a wiki admin please glance at this page and see if they see "6 deleted > revisions" or something? > > http://www.bioperl.org/w/index.php? > title=HOWTO_Discussion:Graphics&limit=500&action=history > > For me history lists only the current page. Perhaps a Mediawiki admin will > see more? 
>
> Thanks,
>
> Jay Hannah
> http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah

From cjfields at illinois.edu Wed Feb 18 08:32:52 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 18 Feb 2009 07:32:52 -0600
Subject: [Bioperl-l] wiki HOWTO:Graphics - history gap?
In-Reply-To: <1E47BB99-836F-4779-AB1E-0BCBA2D27D6E@jays.net>
References: <2B47AC44-4974-4BF6-9009-0E177D3664B4@illinois.edu> <427B865F-6F89-442A-AE35-3647D7D12DDE@jays.net> <6A4B5D640862475D947F8F33A750E0F7@NewLife> <1E47BB99-836F-4779-AB1E-0BCBA2D27D6E@jays.net>
Message-ID: <25B4B916-765B-4D50-B34A-CE68D42B4F2A@illinois.edu>

Jay,

Found it! You were right, you have to be admin to undelete pages (there is a very small link for undeletion when logged in). This might have been my fault, since I moved my script over to the scrapbook and deleted that page this past Dec. The page had been recreated since then (thus hiding the history).

I went ahead and moved this one to the scrapbook as well and left the page as is:

http://www.bioperl.org/wiki/Adding_a_DNA_track

chris

On Feb 18, 2009, at 7:08 AM, Jay Hannah wrote:

> On Feb 17, 2009, at 9:39 AM, Mark A. Jensen wrote:
>> I found the following on www.archive.org 's "Wayback Machine",
>> dated Feb 08, 2008:
>>
>> http://web.archive.org/web/20080208210835/http://www.bioperl.org/wiki/HOWTO_Discussion:Graphics
>>
>> Also the following archived versions of HOWTO:Graphics are available:
>>
>> http://web.archive.org/web/*/http://www.bioperl.org/wiki/HOWTO:Graphics
>
> Thanks for that! It doesn't quite fit this need (May 2008), but it's
> a cool trick! :)
>
>
>
> Can a wiki admin please glance at this page and see if they see "6
> deleted revisions" or something?
> > http://www.bioperl.org/w/index.php?title=HOWTO_Discussion:Graphics&limit=500&action=history > > For me history lists only the current page. Perhaps a Mediawiki > admin will see more? > > Thanks, > > Jay Hannah > http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From jay at jays.net Wed Feb 18 08:40:38 2009 From: jay at jays.net (Jay Hannah) Date: Wed, 18 Feb 2009 07:40:38 -0600 Subject: [Bioperl-l] wiki HOWTO:Graphics - history gap? In-Reply-To: <25B4B916-765B-4D50-B34A-CE68D42B4F2A@illinois.edu> References: <2B47AC44-4974-4BF6-9009-0E177D3664B4@illinois.edu> <427B865F-6F89-442A-AE35-3647D7D12DDE@jays.net> <6A4B5D640862475D947F8F33A750E0F7@NewLife> <1E47BB99-836F-4779-AB1E-0BCBA2D27D6E@jays.net> <25B4B916-765B-4D50-B34A-CE68D42B4F2A@illinois.edu> Message-ID: On Feb 18, 2009, at 7:32 AM, Chris Fields wrote: > http://www.bioperl.org/wiki/Adding_a_DNA_track woot! There it is! Thanks! I have to run right now, but I'll get caught up on the Scrapbook concept and other suggestions tonight or tomorrow. :) j From maj at fortinbras.us Wed Feb 18 08:45:16 2009 From: maj at fortinbras.us (Mark A. Jensen) Date: Wed, 18 Feb 2009 08:45:16 -0500 Subject: [Bioperl-l] wiki HOWTO:Graphics - history gap? In-Reply-To: <1E47BB99-836F-4779-AB1E-0BCBA2D27D6E@jays.net> References: <2B47AC44-4974-4BF6-9009-0E177D3664B4@illinois.edu><427B865F-6F89-442A-AE35-3647D7D12DDE@jays.net><6A4B5D640862475D947F8F33A750E0F7@NewLife> <1E47BB99-836F-4779-AB1E-0BCBA2D27D6E@jays.net> Message-ID: or this? http://www.bioperl.org/wiki/Adding_a_DNA_track ----- Original Message ----- From: "Jay Hannah" To: Sent: Wednesday, February 18, 2009 8:08 AM Subject: Re: [Bioperl-l] wiki HOWTO:Graphics - history gap? > On Feb 17, 2009, at 9:39 AM, Mark A. 
Jensen wrote: >> I found the following on www.archive.org 's "Wayback Machine", >> dated Feb 08, 2008: >> >> http://web.archive.org/web/20080208210835/http://www.bioperl.org/ >> wiki/HOWTO_Discussion:Graphics >> >> Also the following archived versions of HOWTO:Graphics are available: >> >> http://web.archive.org/web/*/http://www.bioperl.org/wiki/ >> HOWTO:Graphics > > Thanks for that! It doesn't quite fit this need (May 2008), but it's > a cool trick! :) > > > > Can a wiki admin please glance at this page and see if they see "6 > deleted revisions" or something? > > http://www.bioperl.org/w/index.php? > title=HOWTO_Discussion:Graphics&limit=500&action=history > > For me history lists only the current page. Perhaps a Mediawiki admin > will see more? > > Thanks, > > Jay Hannah > http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > From maj at fortinbras.us Wed Feb 18 08:46:28 2009 From: maj at fortinbras.us (Mark A. Jensen) Date: Wed, 18 Feb 2009 08:46:28 -0500 Subject: [Bioperl-l] wiki HOWTO:Graphics - history gap? In-Reply-To: References: <2B47AC44-4974-4BF6-9009-0E177D3664B4@illinois.edu><427B865F-6F89-442A-AE35-3647D7D12DDE@jays.net><6A4B5D640862475D947F8F33A750E0F7@NewLife><1E47BB99-836F-4779-AB1E-0BCBA2D27D6E@jays.net><25B4B916-765B-4D50-B34A-CE68D42B4F2A@illinois.edu> Message-ID: <80F91F17664640A8847A18F2A7BA8ACB@NewLife> Hey dudes-- Shortly you'll see I found it too-- Allow me to Scrapbook it (I love scrapbooking; it make feel domestic.) MAJ ----- Original Message ----- From: "Jay Hannah" To: Sent: Wednesday, February 18, 2009 8:40 AM Subject: Re: [Bioperl-l] wiki HOWTO:Graphics - history gap? > On Feb 18, 2009, at 7:32 AM, Chris Fields wrote: >> http://www.bioperl.org/wiki/Adding_a_DNA_track > > woot! There it is! Thanks! 
>
> I have to run right now, but I'll get caught up on the Scrapbook
> concept and other suggestions tonight or tomorrow. :)
>
> j

From markus.liebscher at gmx.de Wed Feb 18 10:55:31 2009
From: markus.liebscher at gmx.de (manni122)
Date: Wed, 18 Feb 2009 07:55:31 -0800 (PST)
Subject: [Bioperl-l] compare sequences codon by codon
Message-ID: <22081753.post@talk.nabble.com>

Hi there, I hope one of you can help me...
I am looking for a way to compare two DNA sequences codon by codon that have been previously pairwise aligned. So I want to know at which codon positions the pair is identical and at which it is not.
Is there something like this implemented in Bioperl?
Thanks for any help,
manni122.
--
View this message in context: http://www.nabble.com/compare-sequences-codon-by-codon-tp22081753p22081753.html
Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.

From cjfields at illinois.edu Wed Feb 18 10:02:51 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 18 Feb 2009 09:02:51 -0600
Subject: [Bioperl-l] Working on final release of bioperl-run, db, network
Message-ID: <4535FE93-E099-43E3-A6E3-8BD7384CF708@illinois.edu>

All,

I will be working on the final (non-alpha 1.6.0) release of BioPerl-run, db, and network this week, with a final release scheduled for Monday. These will require a minimum core installation of v1.6.0 (or 1.006000, non-vstring long-form). If anything needs updating in the various packages, now would be the time to get it committed so I can migrate changes over to the various branches and get them packaged up.

So far CPAN testers has been showing passing/unknown with the exception of Bio::Tools::Run::Vista tests.
I recently fixed these locally for me but they appear to be failing on various linux systems (I'm hoping there isn't another Vista.jar in CLASSPATH causing problems): http://bbbike.radzeit.de/~slaven/cpantestersmatrix.cgi?dist=BioPerl-run+1.5.9_2 As always let me know of any test issues. Cheers! chris From heikki.lehvaslaiho at gmail.com Wed Feb 18 11:15:31 2009 From: heikki.lehvaslaiho at gmail.com (Heikki Lehvaslaiho) Date: Wed, 18 Feb 2009 18:15:31 +0200 Subject: [Bioperl-l] novice project: automatically update expired Swiss-Prot IDs Message-ID: http://bugzilla.open-bio.org/show_bug.cgi?id=2764 Here is a blog entry describing how to use Expacy IDTracker to find a new id based on old one: http://nsaunders.wordpress.com/2008/03/07/missing-links-using-swissprot-idtracker-in-your-code/ Would be nice to add this into BioPerl Bio::DB::SwissProt->get_Seq_by_id(). (Although I still do not understand why anyone would insist on using IDs rather than accession numbers where this is done automatically.) This would be a good novice project. Any takers? -- -Heikki Heikki Lehvaslaiho - heikki lehvaslaiho gmail com From jason at bioperl.org Wed Feb 18 11:36:15 2009 From: jason at bioperl.org (Jason Stajich) Date: Wed, 18 Feb 2009 08:36:15 -0800 Subject: [Bioperl-l] compare sequences codon by codon In-Reply-To: <22081753.post@talk.nabble.com> References: <22081753.post@talk.nabble.com> Message-ID: <9504075F-2C9C-4529-9276-FC436B58D8E6@bioperl.org> Have you tried any of the code on aligning sequences at the protein level - mapping back to codons. To identify identical codons easiest is probably just walk back through the alignment columns in sets of. the pairwise_kaks script and the http://bioperl.org/wiki/HOWTO:PAML shows bits and pieces of this - the key routines is aa_to_dna_aln in the Bio::Align::Utilities module. Show us some code and I'm sure we can help better. On Feb 18, 2009, at 7:55 AM, manni122 wrote: > > Hi there, I hope one of you can help me... 
> I am looking for a way to compare two DNA sequences codon by codon > that have > been previously pairwise aligned. So I want to have the information > on which > codon position the pair is identical and on which not. > Is there something like this implemented in Bioperl? > Thanks for every help, > manni122. > -- > View this message in context: http://www.nabble.com/compare-sequences-codon-by-codon-tp22081753p22081753.html > Sent from the Perl - Bioperl-L mailing list archive at Nabble.com. > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Jason Stajich jason at bioperl.org From markus.liebscher at gmx.de Wed Feb 18 11:52:51 2009 From: markus.liebscher at gmx.de (manni122) Date: Wed, 18 Feb 2009 08:52:51 -0800 (PST) Subject: [Bioperl-l] compare sequences codon by codon In-Reply-To: <9504075F-2C9C-4529-9276-FC436B58D8E6@bioperl.org> References: <22081753.post@talk.nabble.com> <9504075F-2C9C-4529-9276-FC436B58D8E6@bioperl.org> Message-ID: <22083044.post@talk.nabble.com> thanks for this suggestion. I will read through and hopefully present some code. best wishes, manni122 Jason Stajich-3 wrote: > > Have you tried any of the code on aligning sequences at the protein > level - mapping back to codons. To identify identical codons easiest > is probably just walk back through the alignment columns in sets of. > the pairwise_kaks script and the http://bioperl.org/wiki/HOWTO:PAML > shows bits and pieces of this - the key routines is aa_to_dna_aln in > the Bio::Align::Utilities module. > > Show us some code and I'm sure we can help better. > > On Feb 18, 2009, at 7:55 AM, manni122 wrote: > >> >> Hi there, I hope one of you can help me... >> I am looking for a way to compare two DNA sequences codon by codon >> that have >> been previously pairwise aligned. So I want to have the information >> on which >> codon position the pair is identical and on which not. 
>> Is there something like this implemented in Bioperl? >> Thanks for every help, >> manni122. >> -- >> View this message in context: >> http://www.nabble.com/compare-sequences-codon-by-codon-tp22081753p22081753.html >> Sent from the Perl - Bioperl-L mailing list archive at Nabble.com. >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > Jason Stajich > jason at bioperl.org > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- View this message in context: http://www.nabble.com/compare-sequences-codon-by-codon-tp22081753p22083044.html Sent from the Perl - Bioperl-L mailing list archive at Nabble.com. From jason at bioperl.org Wed Feb 18 12:26:01 2009 From: jason at bioperl.org (Jason Stajich) Date: Wed, 18 Feb 2009 09:26:01 -0800 Subject: [Bioperl-l] compare sequences codon by codon In-Reply-To: <9504075F-2C9C-4529-9276-FC436B58D8E6@bioperl.org> References: <22081753.post@talk.nabble.com> <9504075F-2C9C-4529-9276-FC436B58D8E6@bioperl.org> Message-ID: <4F0233F5-0293-47D4-9C07-5282BAF3C494@bioperl.org> wow - my incomplete typing was amazing... trying again :-> To identify identical codons, the easiest way is probably to just walk through the alignment columns in sets of 3. Have a look at the pairwise_kaks script and the HOWTO at http://bioperl.org/wiki/HOWTO:PAML. These show bits and pieces which may be of interest. The key routine is aa_to_dna_aln in the Bio::Align::Utilities module. On Feb 18, 2009, at 8:36 AM, Jason Stajich wrote: > Have you tried any of the code on aligning sequences at the protein > level - mapping back to codons. To identify identical codons > easiest is probably just walk back through the alignment columns in > sets of.
the pairwise_kaks script and the http://bioperl.org/wiki/HOWTO:PAML > shows bits and pieces of this - the key routines is aa_to_dna_aln > in the Bio::Align::Utilities module. > > Show us some code and I'm sure we can help better. > > On Feb 18, 2009, at 7:55 AM, manni122 wrote: > >> >> Hi there, I hope one of you can help me... >> I am looking for a way to compare two DNA sequences codon by codon >> that have >> been previously pairwise aligned. So I want to have the information >> on which >> codon position the pair is identical and on which not. >> Is there something like this implemented in Bioperl? >> Thanks for every help, >> manni122. >> -- >> View this message in context: http://www.nabble.com/compare-sequences-codon-by-codon-tp22081753p22081753.html >> Sent from the Perl - Bioperl-L mailing list archive at Nabble.com. >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > Jason Stajich > jason at bioperl.org > > > Jason Stajich jason at bioperl.org From maj at fortinbras.us Wed Feb 18 12:41:35 2009 From: maj at fortinbras.us (Mark A. Jensen) Date: Wed, 18 Feb 2009 12:41:35 -0500 Subject: [Bioperl-l] compare sequences codon by codon In-Reply-To: <4F0233F5-0293-47D4-9C07-5282BAF3C494@bioperl.org> References: <22081753.post@talk.nabble.com><9504075F-2C9C-4529-9276-FC436B58D8E6@bioperl.org> <4F0233F5-0293-47D4-9C07-5282BAF3C494@bioperl.org> Message-ID: <5A415CB7B70F4E78A3ECD059E7B2D12F@NewLife> Ah! Sets of 3! ----- Original Message ----- From: "Jason Stajich" To: "manni122 Liebscher" Cc: "bioperl list" Sent: Wednesday, February 18, 2009 12:26 PM Subject: Re: [Bioperl-l] compare sequences codon by codon > wow - my incomplete typing was amazing... trying again :-> > > To identify identical codons easiest is probably just walk back through the > alignment columns in sets of 3. 
Have a look at the pairwise_kaks script and > the HOWTO at http://bioperl.org/wiki/ HOWTO:PAML. These shows bits and pieces > which may be of interest. The key routine is aa_to_dna_aln in the > Bio::Align::Utilities module. > > On Feb 18, 2009, at 8:36 AM, Jason Stajich wrote: > >> Have you tried any of the code on aligning sequences at the protein level - >> mapping back to codons. To identify identical codons easiest is probably >> just walk back through the alignment columns in sets of. the pairwise_kaks >> script and the http://bioperl.org/wiki/HOWTO:PAML shows bits and pieces of >> this - the key routines is aa_to_dna_aln in the Bio::Align::Utilities >> module. >> >> Show us some code and I'm sure we can help better. >> >> On Feb 18, 2009, at 7:55 AM, manni122 wrote: >> >>> >>> Hi there, I hope one of you can help me... >>> I am looking for a way to compare two DNA sequences codon by codon that >>> have >>> been previously pairwise aligned. So I want to have the information on >>> which >>> codon position the pair is identical and on which not. >>> Is there something like this implemented in Bioperl? >>> Thanks for every help, >>> manni122. >>> -- >>> View this message in context: >>> http://www.nabble.com/compare-sequences-codon-by-codon-tp22081753p22081753.html >>> Sent from the Perl - Bioperl-L mailing list archive at Nabble.com. 
>>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> Jason Stajich >> jason at bioperl.org >> >> >> > > Jason Stajich > jason at bioperl.org > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > From cjfields at illinois.edu Wed Feb 18 14:16:06 2009 From: cjfields at illinois.edu (Chris Fields) Date: Wed, 18 Feb 2009 13:16:06 -0600 Subject: [Bioperl-l] Possible core refactors for release 1.7 Message-ID: <2309A49E-C4D3-4EEF-8657-35F9113E1C5A@illinois.edu> I have put up a few pages on the wiki related to possible refactors related to GFF and Align-related stuff: http://www.bioperl.org/wiki/GFF_Refactor http://www.bioperl.org/wiki/Align_Refactor Regarding both pages: these are just my thoughts on the two issues; some spots definitely need clarification, so feel free to gripe/rant/cheer to your heart's content! chris From litd99 at gmail.com Wed Feb 18 14:22:07 2009 From: litd99 at gmail.com (Tiandao Li) Date: Wed, 18 Feb 2009 12:22:07 -0700 Subject: [Bioperl-l] AceView Message-ID: <1ee6623e0902181122u79e5019bu9d394df6c0f7fbcc@mail.gmail.com> Hello, I have a list of human genes that I want to submit to AceView to look at expression, function, pathway, and GO annotation. The following link is for the kras2 gene: http://www.ncbi.nlm.nih.gov/IEB/Research/Acembly/av.cgi?exdb=AceView&db=36a&term=kras2&submit=Go I searched through the list archive and couldn't find any code to query AceView in batch and save the output for further parsing, such as comparing expression levels, associated diseases, or tissues of different genes.
Cheers, Tiandao From florent.angly at gmail.com Wed Feb 18 15:26:47 2009 From: florent.angly at gmail.com (Florent Angly) Date: Wed, 18 Feb 2009 12:26:47 -0800 Subject: [Bioperl-l] Possible core refactors for release 1.7 In-Reply-To: <2309A49E-C4D3-4EEF-8657-35F9113E1C5A@illinois.edu> References: <2309A49E-C4D3-4EEF-8657-35F9113E1C5A@illinois.edu> Message-ID: <499C6F07.3060505@gmail.com> Added my 2 cents on the Bio::Assembly refactoring discussion based on the discussions, bugs and problems I remember: http://www.bioperl.org/w/index.php?title=Align_Refactor Cheers, Florent Chris Fields wrote: > I have put up a few pages on the wiki related to possible refactors > related to GFF and Align-related stuff: > > http://www.bioperl.org/wiki/GFF_Refactor > http://www.bioperl.org/wiki/Align_Refactor > > Regarding both pages: these are just my thoughts on the two issues; > some spots definitely need clarification, so feel free to > gripe/rant/cheer to your hearts content! > > chris > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From jason at bioperl.org Wed Feb 18 15:52:25 2009 From: jason at bioperl.org (Jason Stajich) Date: Wed, 18 Feb 2009 12:52:25 -0800 Subject: [Bioperl-l] Fwd: RemoteBlast has an upper limit fo 100 returned hits ? References: <8697393B889CD74C99B2E77056523781031AB9C1@rpbmsem01.nala.roche.com> Message-ID: <3A82EDA2-CC14-4CC6-AE0D-BCA8B549C291@bioperl.org> Philip - best to ask on the mailing list. fwding it on. -jason Begin forwarded message: > From: "Xiang, Philip" > Date: February 18, 2009 12:37:23 PM PST > To: > Subject: RemoteBlast has an upper limit fo 100 returned hits ? > > Hi Jason, > > > > I don't seem to be able to get more than 100 sequences returned with > Bio::Tools::Run RemoteBlast. > > I get 10 hits returned with the following piece of code. 
> > my @params = ( '-prog' => 'blastn', > > '-data' => 'nr', > > '-expect' => '1e-10', > > '-descriptions' => 10, > > '-alignments' => 10, > > '-readmethod' => 'SearchIO' ); > > my $factory = Bio::Tools::Run::RemoteBlast->new(@params); > > my $r = $factory->submit_blast($input); > > > > But I get only 100 hits returned if I change the code to: > '-descriptions' => 10000, '-alignments' => 10000, even though the > query > fetches a lot more hits on NCBI BLAST web interface. > > What did I do wrong? > > Thank you in advance for your help! > > Phil Xiang > > Phone: 925.730.8709 > > > Jason Stajich jason at bioperl.org From SMarkel at accelrys.com Wed Feb 18 20:32:36 2009 From: SMarkel at accelrys.com (Scott Markel) Date: Wed, 18 Feb 2009 20:32:36 -0500 Subject: [Bioperl-l] Fwd: RemoteBlast has an upper limit fo 100 returned hits ? In-Reply-To: <3A82EDA2-CC14-4CC6-AE0D-BCA8B549C291@bioperl.org> References: <8697393B889CD74C99B2E77056523781031AB9C1@rpbmsem01.nala.roche.com> <3A82EDA2-CC14-4CC6-AE0D-BCA8B549C291@bioperl.org> Message-ID: <1F1240778FB0AF46B4E5A72C44D2C74722CB512D@exch1-hi.accelrys.net> Philip, I think you want to look at setting the CGI headers for ALIGNMENTS and DESCRIPTIONS. The NCBI URL is http://www.ncbi.nlm.nih.gov/BLAST/Doc/urlapi.html. ALIGNMENTS: blastall's -b option (http://www.ncbi.nlm.nih.gov/BLAST/Doc/node10.html) DESCRIPTIONS: blastall's -v option (http://www.ncbi.nlm.nih.gov/BLAST/Doc/node17.html) HITLIST_SIZE: max of "-b" and "-v" options (http://www.ncbi.nlm.nih.gov/BLAST/Doc/node30.html) NCBI sets defaults, but note that Bio::Tools::Run::RemoteBlast sets its own defaults. # Default values go in here for GET %RETRIEVALHEADER = ( 'CMD' => 'Get', 'ALIGNMENTS' => '50', 'ALIGNMENT_VIEW' => 'Pairwise', 'DESCRIPTIONS' => '100', 'FORMAT_TYPE' => 'Text', ); So you're seeing 100 = max(50, 100) hits. Change the header values for ALIGNMENTS or DESCRIPTIONS to get a larger number of hits. 
For example, $Bio::Tools::Run::RemoteBlast::HEADER{"ALIGNMENTS"} = 200; $Bio::Tools::Run::RemoteBlast::RETRIEVALHEADER{"ALIGNMENTS"} = 200; In our code I note that I set both HEADER and RETRIEVALHEADER. I had problems a few years ago if I only set the RETRIEVALHEADER value. Hope this helps. Scott Scott Markel, Ph.D. Principal Bioinformatics Architect email: smarkel at accelrys.com Accelrys (SciTegic R&D) mobile: +1 858 205 3653 10188 Telesis Court, Suite 100 voice: +1 858 799 5603 San Diego, CA 92121 fax: +1 858 799 5222 USA web: http://www.accelrys.com http://www.linkedin.com/in/smarkel Vice President, Board of Directors: International Society for Computational Biology Co-chair: ISCB Publications Committee Associate Editor: PLoS Computational Biology Editorial Board: Briefings in Bioinformatics > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > bounces at lists.open-bio.org] On Behalf Of Jason Stajich > Sent: Wednesday, 18 February 2009 12:52 PM > To: BioPerl list > Cc: philip.xiang at roche.com > Subject: [Bioperl-l] Fwd: RemoteBlast has an upper limit fo 100 returned > hits ? > > Philip - best to ask on the mailing list. fwding it on. > -jason > Begin forwarded message: > > > From: "Xiang, Philip" > > Date: February 18, 2009 12:37:23 PM PST > > To: > > Subject: RemoteBlast has an upper limit fo 100 returned hits ? > > > > Hi Jason, > > > > > > > > I don't seem to be able to get more than 100 sequences returned with > > Bio::Tools::Run RemoteBlast. > > > > I get 10 hits returned with the following piece of code. 
> > > > my @params = ( '-prog' => 'blastn', > > > > '-data' => 'nr', > > > > '-expect' => '1e-10', > > > > '-descriptions' => 10, > > > > '-alignments' => 10, > > > > '-readmethod' => 'SearchIO' ); > > > > my $factory = Bio::Tools::Run::RemoteBlast->new(@params); > > > > my $r = $factory->submit_blast($input); > > > > > > > > But I get only 100 hits returned if I change the code to: > > '-descriptions' => 10000, '-alignments' => 10000, even though the > > query > > fetches a lot more hits on NCBI BLAST web interface. > > > > What did I do wrong? > > > > Thank you in advance for your help! > > > > Phil Xiang > > > > Phone: 925.730.8709 > > > > > > > > Jason Stajich > jason at bioperl.org > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From chrysain at gmail.com Thu Feb 19 03:37:11 2009 From: chrysain at gmail.com (Chrysanthi A.) Date: Thu, 19 Feb 2009 08:37:11 +0000 Subject: [Bioperl-l] function for duplication events? Message-ID: <66b602900902190037u76626243r955c8d56e44e975@mail.gmail.com> Hi, I was wondering if there is a method to visualize a tree by parsing a nexus-format file. I would also like to identify the duplication events. Are there any functions that will give me the duplication events? thanks a lot, Chrysanthi From David.Messina at sbc.su.se Thu Feb 19 05:12:42 2009 From: David.Messina at sbc.su.se (Dave Messina) Date: Thu, 19 Feb 2009 11:12:42 +0100 Subject: [Bioperl-l] BioPerl on Wikipedia Message-ID: <628aabb70902190212s286fa8aeg3f530699f7b7d7a0@mail.gmail.com> Hey everybody, I just noticed that the Wikipedia page on BioPerl was woefully out of date, at least with respect to the current version number. I updated it to reflect the new 1.6.0 release, but it'd be great if there were some automated way to keep the version info on there up-to-date. The BioPerl website is also running MediaWiki -- maybe that's a possible route?
Does anyone know if there's an easy way to do something like this? (Mark, Jay, Mauricio, I'm looking in your direction.) In general, though, the info there is still pretty sparse so if anyone has the time or inclination to improve on it, please do so. Dave From shameer at ncbs.res.in Thu Feb 19 08:08:23 2009 From: shameer at ncbs.res.in (K. Shameer) Date: Thu, 19 Feb 2009 18:38:23 +0530 (IST) Subject: [Bioperl-l] function for duplication events? In-Reply-To: <66b602900902190037u76626243r955c8d56e44e975@mail.gmail.com> References: <66b602900902190037u76626243r955c8d56e44e975@mail.gmail.com> Message-ID: <50381.192.168.1.1.1235048903.squirrel@mail.ncbs.res.in> Hi Chrysanthi, You can create/visualize a tree by parsing a file (for example newick format). Hope the HOWTO:Trees section will help you to kick start http://www.bioperl.org/wiki/HOWTO:Trees If you are looking for feature-rich visualization, you can try Phyloxml http://www.bioperl.org/wiki/Phyloxml_Project_Demo I am not sure about the function to identify duplication events. Can you elaborate a bit on what you are looking for ? Cheers, Khader Shameer > Hi, > > I was wondering if there is a method in order to visualize a tree by > parsing > a format and also, I would like to identify the duplication events. > Are there any functions that will give me the duplication events? > > thanks a lot, > > Chrysanthi > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From markus.liebscher at gmx.de Thu Feb 19 12:15:37 2009 From: markus.liebscher at gmx.de (manni122) Date: Thu, 19 Feb 2009 09:15:37 -0800 (PST) Subject: [Bioperl-l] I don't get the plain sequence Message-ID: <22104800.post@talk.nabble.com> Hi, maybe I am somehow unable to display the plain sequence from a Bio::AlignIO object. 
I put 2 aligned sequences in an array while (my $aln = $input->next_aln()) { push @nuc_seqs, $aln; $seq1 = $aln->get_seq_by_pos(1); $seq2 = $aln->get_seq_by_pos(2); } How do I get to know the sequences stored in the variables. I couldn't find a hint. Thanks for help... -- View this message in context: http://www.nabble.com/I-don%27t-get-the-plain-sequence-tp22104800p22104800.html Sent from the Perl - Bioperl-L mailing list archive at Nabble.com. From Kevin.M.Brown at asu.edu Thu Feb 19 12:23:00 2009 From: Kevin.M.Brown at asu.edu (Kevin Brown) Date: Thu, 19 Feb 2009 10:23:00 -0700 Subject: [Bioperl-l] I don't get the plain sequence In-Reply-To: <22104800.post@talk.nabble.com> References: <22104800.post@talk.nabble.com> Message-ID: <1A4207F8295607498283FE9E93B775B405C80C69@EX02.asurite.ad.asu.edu> Bio::AlignIO returns a Bio::Align::AlignI object from next_aln. get_seq_by_pos returns a Bio::LocatableSeq object which has a seq method that sets or returns the sequence of the LocatableSeq http://bioperl.org/cgi-bin/deob_interface.cgi So, to get the actual sequence you would do: while (my $aln = $input->next_aln()) { push @nuc_seqs, $aln; $seq1 = $aln->get_seq_by_pos(1)->seq(); $seq2 = $aln->get_seq_by_pos(2)->seq(); } > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org > [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of manni122 > Sent: Thursday, February 19, 2009 10:16 AM > To: Bioperl-l at lists.open-bio.org > Subject: [Bioperl-l] I don't get the plain sequence > > > Hi, maybe I am somehow unable to display the plain sequence from a > Bio::AlignIO object. > I put 2 aligned sequences in an array > > while (my $aln = $input->next_aln()) { > push @nuc_seqs, $aln; > $seq1 = $aln->get_seq_by_pos(1); > $seq2 = $aln->get_seq_by_pos(2); > } > > How do I get to know the sequences stored in the variables. I > couldn't find > a hint. Thanks for help... 
> -- > View this message in context: > http://www.nabble.com/I-don%27t-get-the-plain-sequence-tp22104 > 800p22104800.html > Sent from the Perl - Bioperl-L mailing list archive at Nabble.com. > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From torsten.seemann at gmail.com Thu Feb 19 20:41:07 2009 From: torsten.seemann at gmail.com (Torsten Seemann) Date: Fri, 20 Feb 2009 12:41:07 +1100 Subject: [Bioperl-l] Question about HOWTO:StandAloneBlast In-Reply-To: <499C160A020000AA000B2E87@med-gwia-02a.med.umich.edu> References: <499C160A020000AA000B2E87@med-gwia-02a.med.umich.edu> Message-ID: Dongliang, Please submit questions to the bioperl-l at lists.open-bio.org mailing list. > I am writing to ask about a question regarding to locating the blastall.exe in windows. > Does following mean creating an autoexec.bat file containing the path? What does [fixme] mean? I am a neophyte of bioperl. > Windows: C:\AUTOEXEC.BAT [FIXME?] "FIXME" means that the author of the web page (myself) has left a message to himself that this web page is incomplete and needs fixing. As I don't use Windows, I wasn't 100% sure if those instructions were correct. > PATH=$PATH;C:\BLAST\BIN > BLASTDIR=C:\BLAST > > After running the autoexe.bat, I can execute the blastall.exe, but windows doesn't recognize perl as a command any more. How can I fix this path related problem? It seems doing the above has removed the Perl directory from your $PATH. To be honest, I do not know how to solve this Windows problem. Hopefully someone on this mailing list can enlighten me, and possibly update this page: http://www.bioperl.org/wiki/HOWTO:StandAloneBlast Thank you -- --Torsten Seemann --Victorian Bioinformatics Consortium, Dept. Microbiology, Monash University, AUSTRALIA From maj at fortinbras.us Thu Feb 19 22:06:43 2009 From: maj at fortinbras.us (Mark A. 
Jensen) Date: Thu, 19 Feb 2009 22:06:43 -0500 Subject: [Bioperl-l] Question about HOWTO:StandAloneBlast In-Reply-To: References: <499C160A020000AA000B2E87@med-gwia-02a.med.umich.edu> Message-ID: <2D145BB8CCCD4CAA8EC533A8B6D8D7E5@NewLife> try PATH=%PATH%;C:\BLAST\BIN -MAJ ----- Original Message ----- From: "Torsten Seemann" To: "bioperl-l" Cc: "Dongliang Wu" Sent: Thursday, February 19, 2009 8:41 PM Subject: Re: [Bioperl-l] Question about HOWTO:StandAloneBlast > Dongliang, > > Please submit questions to the bioperl-l at lists.open-bio.org mailing list. > >> I am writing to ask about a question regarding to locating the blastall.exe >> in windows. >> Does following mean creating an autoexec.bat file containing the path? What >> does [fixme] mean? I am a neophyte of bioperl. >> Windows: C:\AUTOEXEC.BAT [FIXME?] > > "FIXME" means that the author of the web page (myself) has left a > message to himself that this web page is incomplete and needs fixing. > As I don't use Windows, I wasn't 100% sure if those instructions were > correct. > >> PATH=$PATH;C:\BLAST\BIN >> BLASTDIR=C:\BLAST >> >> After running the autoexe.bat, I can execute the blastall.exe, but windows >> doesn't recognize perl as a command any more. How can I fix this path related >> problem? > > It seems doing the above has removed the Perl directory from your > $PATH. To be honest, I do not know how to solve this Windows problem. > Hopefully someone on this mailing list can enlighten me, and possibly > update this page: > > http://www.bioperl.org/wiki/HOWTO:StandAloneBlast > > Thank you > > -- > --Torsten Seemann > --Victorian Bioinformatics Consortium, Dept. Microbiology, Monash > University, AUSTRALIA > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > From maj at fortinbras.us Thu Feb 19 22:34:57 2009 From: maj at fortinbras.us (Mark A. 
Jensen) Date: Thu, 19 Feb 2009 22:34:57 -0500 Subject: [Bioperl-l] directing questions to the list via the module template Message-ID: Core - It seems like the off-list questions come (not unreasonably) from the line # Cared for by P. Hacker which is right at the top of most modules. In the FEEDBACK pod section, bioperl-l is mentioned in the context of "comments", "suggestions", and "General discussion", but the issue of support isn't explicitly mentioned. Should the template look more like # Please direct questions and support issues to bioperl-l at lists.open-bio.org # Maintained by P. Hacker ph at bugfree.com with the support issue reiterated in the FEEDBACK section? I would be willing to work on an update of the trunk if the idea sounds good to all. MAJ From hlapp at gmx.net Thu Feb 19 22:45:51 2009 From: hlapp at gmx.net (Hilmar Lapp) Date: Thu, 19 Feb 2009 22:45:51 -0500 Subject: [Bioperl-l] directing questions to the list via the module template In-Reply-To: References: Message-ID: Sounds certainly good to me. -hilmar On Feb 19, 2009, at 10:34 PM, Mark A. Jensen wrote: > Core - > It seems like the off-list questions come (not unreasonably) from > the the line > > # Cared for by P. Hacker > > which is right at the top of most modules. In the FEEDBACK pod > section, > bioperl-l is mentioned in the context of "comments", "suggestions", > and > "General discussion", but the issue of support isn't explicitly > mentioned. > > Should the template look more like > > # Please direct questions and support issues to bioperl-l at lists.open-bio.org > # Maintained by P. Hacker ph at bugfree.com > > with the support issue reiterated in the FEEDBACK section? > > I would be willing to work on an update of the trunk if the idea > sounds good to all.
> MAJ > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From maj at fortinbras.us Thu Feb 19 23:56:33 2009 From: maj at fortinbras.us (Mark A. Jensen) Date: Thu, 19 Feb 2009 23:56:33 -0500 Subject: [Bioperl-l] function for duplication events? In-Reply-To: <66b602900902190037u76626243r955c8d56e44e975@mail.gmail.com> References: <66b602900902190037u76626243r955c8d56e44e975@mail.gmail.com> Message-ID: <7DCAC5E1120E4BF1B6959D9CF6981FF9@NewLife> Chrysanthi- I believe the positive identification of duplicated genes is tricky. The question is: do you have a list of known paralogous gene pairs, or do you need to sift the sequences for likely candidates? I think it should be "relatively straightforward" (haha!) to look down trees for the most recent common ancestor of known pairs, and use that node as an approximation for the duplication event. If you don't know which of your genes are duplicates of each other, then you might have a look at the algorithm in http://www.ncbi.nlm.nih.gov/pubmed/12836682 by Wen-Hsiung Li, Zhenglong Gu, et al. This paper is relatively old (2003)--it's not my specialty; I'm sure many developments have ensued since. Gu has done nice computational work in this area and you might want to track him down. (If you want this paper and can't access it, contact me off-list.) Mark ----- Original Message ----- From: "Chrysanthi A." To: Sent: Thursday, February 19, 2009 3:37 AM Subject: [Bioperl-l] function for duplication events? > Hi, > > I was wondering if there is a method in order to visualize a tree by parsing > a nexus format and also, I would like to identify the duplication events. > Are there any functions that will give me the duplication events?
> > thanks a lot, > > Chrysanthi > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > From cjfields at illinois.edu Fri Feb 20 00:19:56 2009 From: cjfields at illinois.edu (Chris Fields) Date: Thu, 19 Feb 2009 23:19:56 -0600 Subject: [Bioperl-l] directing questions to the list via the module template In-Reply-To: References: Message-ID: <2061458E-AE2C-4902-89D8-AC01AD78A228@illinois.edu> Same here. Shouldn't be too hard to do. -chris On Feb 19, 2009, at 9:45 PM, Hilmar Lapp wrote: > Sounds certainly good to me. -hilmar > > On Feb 19, 2009, at 10:34 PM, Mark A. Jensen wrote: > >> Core - >> It seems like the off-list questions come (not unreasonably) from >> the the line >> >> # Cared for by P. Hacker >> >> which is right at the top of most modules. In the FEEDBACK pod >> section, >> bioperl-l is mentioned in the context of "comments", "suggestions", >> and >> "General discussion", but the issue of support isn't explicitly >> mentioned. >> >> Should the template look more like >> >> # Please direct questions and support issues to bioperl-l at lists.open-bio.org >> # Maintained by P. Hacker ph at bugfree.com >> >> with the support issue reiterated in the FEEDBACK section? >> >> I would be willing to work on an update of the trunk if the idea >> sounds good to all. 
>> MAJ >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at illinois.edu Fri Feb 20 00:28:03 2009 From: cjfields at illinois.edu (Chris Fields) Date: Thu, 19 Feb 2009 23:28:03 -0600 Subject: [Bioperl-l] function for duplication events? In-Reply-To: <7DCAC5E1120E4BF1B6959D9CF6981FF9@NewLife> References: <66b602900902190037u76626243r955c8d56e44e975@mail.gmail.com> <7DCAC5E1120E4BF1B6959D9CF6981FF9@NewLife> Message-ID: <41492960-AD1B-4699-9CDF-0E9E896A1567@illinois.edu> There is also clustering using various algorithms such as MCL. I have used it for a project here, and Jason has directly donated code to the cause. See here: http://www.micans.org/mcl/ Note this requires an all-v-all BLASTP run prior to the clustering. chris On Feb 19, 2009, at 10:56 PM, Mark A. Jensen wrote: > Chrysanthi- > I believe the positive identification of duplicated genes is tricky- > The question is, do you a have a list of known paralogous gene pairs, > or do you need to sift the sequence for likely candidates. I think > it should be "relatively straightforward" (haha!) to look down trees > for the most recent common ancestor of known pairs, and use that node > as an approximation for the duplication event. > If you don't know what genes you have are duplicates of each other, > then you might have a look at the algorithm in http://www.ncbi.nlm.nih.gov/pubmed/12836682 > by Wen-Hsiung Li, Zhenglong Gu, et al. 
This paper is relatively > old (2003)--it's not my specialty; I'm sure many developments have > ensued since. Gu has done nice computational work in this area and > you might want to track him down. (If you want this paper and can't > access it, contact me off-list.) > Mark > ----- Original Message ----- From: "Chrysanthi A." > > To: > Sent: Thursday, February 19, 2009 3:37 AM > Subject: [Bioperl-l] function for duplication events? > > >> Hi, >> I was wondering if there is a method in order to visualize a tree >> by parsing >> a nexus format and also, I would like to identify the duplication >> events. >> Are there any functions that will give me the duplication events? >> thanks a lot, >> Chrysanthi >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From jason at bioperl.org Fri Feb 20 00:51:46 2009 From: jason at bioperl.org (Jason Stajich) Date: Thu, 19 Feb 2009 21:51:46 -0800 Subject: [Bioperl-l] function for duplication events? In-Reply-To: <66b602900902190037u76626243r955c8d56e44e975@mail.gmail.com> References: <66b602900902190037u76626243r955c8d56e44e975@mail.gmail.com> Message-ID: If you mean inferring duplication vs. speciation events from a tree, I would suggest taking a look at these tools: RIO, njtree/treebest, softparsmap, or Notung. http://rio.janelia.org/ http://treesoft.sourceforge.net/treebest.shtml http://www.cbu.uib.no/~steffpar/softparsmap/ http://www.cs.cmu.edu/~durand/Notung/ Probably njtree/treebest is the easiest to get started with. We talk about some of these a little in our tutorial from '07 and from the '06 hackathon. http://jason.open-bio.org/Bioperl_Tutorials/ISMB2007/ https://www.nescent.org/wg_phyloinformatics/Reconcile_Trees_Documentation .
Also see Dannie Durand et al review http://dx.doi.org/10.1016/j.tig.2006.01.002 -jason On Feb 19, 2009, at 12:37 AM, Chrysanthi A. wrote: > Hi, > > I was wondering if there is a method in order to visualize a tree by > parsing > a nexus format and also, I would like to identify the duplication > events. > Are there any functions that will give me the duplication events? > > thanks a lot, > > Chrysanthi > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Jason Stajich jason at bioperl.org From jay at jays.net Fri Feb 20 09:24:16 2009 From: jay at jays.net (Jay Hannah) Date: Fri, 20 Feb 2009 08:24:16 -0600 Subject: [Bioperl-l] The BioPerl Scrapbook Message-ID: <223334F4-C6E8-4A25-8EB0-77855C10DC5A@jays.net> Wow. Just stumbled into this: http://www.bioperl.org/wiki/Category:Scrapbook Unless I'm mistaken, some *hard* problems I've overheard people working on at UNO recently are sitting here, already solved. I even have a couple author credits. :) Some deep, deep magic is sitting here, free for the taking. I love it! j http://www.bioperl.org/wiki/User:Jhannah From maj at fortinbras.us Fri Feb 20 09:45:35 2009 From: maj at fortinbras.us (Mark A. Jensen) Date: Fri, 20 Feb 2009 09:45:35 -0500 Subject: [Bioperl-l] directing questions to the list via the module template In-Reply-To: <2061458E-AE2C-4902-89D8-AC01AD78A228@illinois.edu> References: <2061458E-AE2C-4902-89D8-AC01AD78A228@illinois.edu> Message-ID: <606FD701B0E34A9992D217A45FE49E83@NewLife> Ok- Here's the text I'll add to relevant modules: <<<<< # Please direct questions and support issues to # ===== # Cared for by... and a new =head2 under FEEDBACK: <<<<< =head2 Support Please direct usage questions or support issues to the mailing list:\ L rather than to the module maintainer directly. Many experienced and reponsive experts will be able look at the problem and quickly address it. 
Please include a thorough description of the problem
with code and data examples if at all possible.
=====

This goes between "Mailing Lists" and "Reporting Bugs".

I will also update the templates in bioperl.lisp, if that's cool with Heikki-

cheers,
MAJ

----- Original Message -----
From: "Chris Fields"
To: "Hilmar Lapp"
Cc: "Mark A. Jensen" ; "bioperl list"
Sent: Friday, February 20, 2009 12:19 AM
Subject: Re: [Bioperl-l] directing questions to the list via the module template

> Same here. Shouldn't be too hard to do.
>
> -chris
>
> On Feb 19, 2009, at 9:45 PM, Hilmar Lapp wrote:
>
>> Sounds certainly good to me. -hilmar
>>
>> On Feb 19, 2009, at 10:34 PM, Mark A. Jensen wrote:
>>
>>> Core -
>>> It seems like the off-list questions come (not unreasonably) from the
>>> line
>>>
>>> # Cared for by P. Hacker
>>>
>>> which is right at the top of most modules. In the FEEDBACK pod section,
>>> bioperl-l is mentioned in the context of "comments", "suggestions", and
>>> "General discussion", but the issue of support isn't explicitly mentioned.
>>>
>>> Should the template look more like
>>>
>>> # Please direct questions and support issues to bioperl-l at lists.open-bio.org
>>> # Maintained by P. Hacker ph at bugfree.com
>>>
>>> with the support issue reiterated in the FEEDBACK section?
>>>
>>> I would be willing to work on an update of the trunk if the idea
>>> sounds good to all.
>>> MAJ

From mauricio at open-bio.org Fri Feb 20 10:27:43 2009
From: mauricio at open-bio.org (Mauricio Herrera Cuadra)
Date: Fri, 20 Feb 2009 09:27:43 -0600
Subject: [Bioperl-l] directing questions to the list via the module template
In-Reply-To: <606FD701B0E34A9992D217A45FE49E83@NewLife>
References: <2061458E-AE2C-4902-89D8-AC01AD78A228@illinois.edu>
 <606FD701B0E34A9992D217A45FE49E83@NewLife>
Message-ID: <499ECBEF.3040608@open-bio.org>

It'll be fun to see if someone in the bioperl-guts@ list calls you
'spammer' after the massive commit. I got that once :P

From maj at fortinbras.us Fri Feb 20 10:30:20 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Fri, 20 Feb 2009 10:30:20 -0500
Subject: [Bioperl-l] directing questions to the list via the module template
In-Reply-To: <499ECBEF.3040608@open-bio.org>
References: <2061458E-AE2C-4902-89D8-AC01AD78A228@illinois.edu>
 <606FD701B0E34A9992D217A45FE49E83@NewLife>
 <499ECBEF.3040608@open-bio.org>
Message-ID: <8AA920942FE74F8B8DB52601076C1E72@NewLife>

It's a personal goal of mine to be listed in every 'blame'.
(How art mirrors life....)

From maj at fortinbras.us Fri Feb 20 10:39:10 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Fri, 20 Feb 2009 10:39:10 -0500
Subject: [Bioperl-l] directing questions to the list via the module template
In-Reply-To: <606FD701B0E34A9992D217A45FE49E83@NewLife>
References: <2061458E-AE2C-4902-89D8-AC01AD78A228@illinois.edu>
 <606FD701B0E34A9992D217A45FE49E83@NewLife>
Message-ID: <6DD3C1A5E67848B7B33F0B0ADEF750F7@NewLife>

All-
The commit ("spammit"?) is ready to go. I'll wait till this evening (EST)
to do it, to allow more time for vetoes-
cheers,
MAJ

From cjfields at illinois.edu Fri Feb 20 14:22:19 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 20 Feb 2009 13:22:19 -0600
Subject: [Bioperl-l] directing questions to the list via the module template
In-Reply-To: <499ECBEF.3040608@open-bio.org>
References: <2061458E-AE2C-4902-89D8-AC01AD78A228@illinois.edu>
 <606FD701B0E34A9992D217A45FE49E83@NewLife>
 <499ECBEF.3040608@open-bio.org>
Message-ID:

Spammer! ;>

Actually, I think I blasted that person for saying that (and
not-so-gently pointed out the unsubscribe link). Just can't remember
if I did it on or off-list. It's a dev list, what was s/he thinking?

chris

From maj at fortinbras.us Fri Feb 20 19:57:38 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Fri, 20 Feb 2009 19:57:38 -0500
Subject: [Bioperl-l] directing questions to the list via the module template
In-Reply-To: <6DD3C1A5E67848B7B33F0B0ADEF750F7@NewLife>
References: <2061458E-AE2C-4902-89D8-AC01AD78A228@illinois.edu>
 <606FD701B0E34A9992D217A45FE49E83@NewLife>
 <6DD3C1A5E67848B7B33F0B0ADEF750F7@NewLife>
Message-ID:

Committed the big kahuna. Sheepishly wishing you "happy updates"-
MAJ

From mmokrejs at ribosome.natur.cuni.cz Sat Feb 21 16:31:46 2009
From: mmokrejs at ribosome.natur.cuni.cz (Martin MOKREJŠ)
Date: Sat, 21 Feb 2009 22:31:46 +0100
Subject: [Bioperl-l] [ANNOUCEMENT] BioPerl 1.6 RC3
In-Reply-To: <4977A226.9010805@purdue.edu>
References: <176C5E7F-4E8C-442A-82A1-8B56B70F7F3F@illinois.edu>
 <06477E94-3F95-48E0-9EA6-F6B1307325CC@verizon.net>
 <497718DE.6070002@purdue.edu>
 <03A17B48-CAF3-4491-ADE1-08C826BE1BA9@illinois.edu>
 <4977A226.9010805@purdue.edu>
Message-ID: <49A072C2.8070304@ribosome.natur.cuni.cz>

For the completeness of the archives, it seems there used to
be a Gelminder tool to read gel files.
I can no longer find any sources other than a backup of RCS files,
but somebody more skilled can probably recreate the
sources from the RCS-generated diffs.

ftp://ftp.sanger.ac.uk/pub/badger/gelminder
ftp://ftp.sanger.ac.uk/pub/PRODUCTION_SOFTWARE/src/gelminder/
ftp://ftp.sanger.ac.uk/pub/PRODUCTION_SOFTWARE/src/gelmover/

That should be helpful to somebody who has the time
to work out from the sources how to interpret the file format.
Not my case, though. ;)
Martin

Phillip San Miguel wrote:
> Okay, here are a bunch of them:
>
> http://www.genomics.purdue.edu/~pmiguel/technical/alx/
>
> (Had them on a zip disk...)
>
> phred no longer appears to be able to read them.
>
> Chris Fields wrote:
>> Might be worth a try if you can dig any files up. Frankly if it
>> doesn't work we can probably deprecate that module, unless someone out
>> there managed to get it working.
>>
>> chris
>>
>> On Jan 21, 2009, at 6:45 AM, Phillip San Miguel wrote:
>>
>>> And the late 90's!
>>> The situation is a little more complex though. Pharmacia had an
>>> older instrument or two called the "Alf" and/or "Alf-red". I never
>>> saw one of those. But the Alfx -- that instrument rocked my world!
>>> 700+ base reads were common and there was a cycle sequencing kit
>>> available so I could sequence off 25+ kb subclones and lambda DNA.
>>> Anyway, I can probably dig up some .alx files. But I think I tried
>>> to read one with SeqIO once and it failed. So it may be that
>>> Bio::SeqIO::alf really only reads the older .alf files, not the more
>>> modern .alx trace file format.
>>> Phred could read them--poorly. It used the raw, rather than the
>>> processed traces, evidently.
>>>
>>> Phillip
>>>
>>> Brian Osborne wrote:
>>>> Chris,
>>>>
>>>> This is my doing. Way back when I made an individual test file for
>>>> each SeqIO module, then did my best to find example files for each
>>>> format. I never did find an ALF output file, these machines were
>>>> used in the early '90's.
>>>>
>>>> Brian O.
>>>>
>>>> On Jan 18, 2009, at 12:07 AM, Chris Fields wrote:
>>>>
>>>>> For some reason we have a test suite for Bio::SeqIO::alf but
>>>>> apparently no test data!
From cjfields at illinois.edu Sat Feb 21 17:05:44 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Sat, 21 Feb 2009 16:05:44 -0600
Subject: [Bioperl-l] [ANNOUCEMENT] BioPerl 1.6 RC3
In-Reply-To: <49A072C2.8070304@ribosome.natur.cuni.cz>
References: <176C5E7F-4E8C-442A-82A1-8B56B70F7F3F@illinois.edu>
 <06477E94-3F95-48E0-9EA6-F6B1307325CC@verizon.net>
 <497718DE.6070002@purdue.edu>
 <03A17B48-CAF3-4491-ADE1-08C826BE1BA9@illinois.edu>
 <4977A226.9010805@purdue.edu>
 <49A072C2.8070304@ribosome.natur.cuni.cz>
Message-ID: <55D48A1F-9C13-421A-A573-4454E65E1D8A@illinois.edu>

Patches are always welcome. ;>

chris

From maj at fortinbras.us Sat Feb 21 17:18:10 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Sat, 21 Feb 2009 17:18:10 -0500
Subject: [Bioperl-l] directing questions to the list via the module template
In-Reply-To: <499ECBEF.3040608@open-bio.org>
References: <2061458E-AE2C-4902-89D8-AC01AD78A228@illinois.edu>
 <606FD701B0E34A9992D217A45FE49E83@NewLife>
 <499ECBEF.3040608@open-bio.org>
Message-ID: <1C2A110491EE42EE9C0545ABB546B420@NewLife>

I just signed on to bioperl-guts, then did commits to the auxiliary
distributions. I now see exactly the motivation behind the accusation
'spammer'...

From wgallin at ualberta.ca Sat Feb 21 17:59:00 2009
From: wgallin at ualberta.ca (Warren Gallin)
Date: Sat, 21 Feb 2009 15:59:00 -0700
Subject: [Bioperl-l] Difference between Bio::SimpleAlign Object and Bio::Align::AlignI Object
Message-ID:

The following code snippet returns a Bio::SimpleAlign Object

    my $stream = Bio::AlignIO->new(-file => $infile, -format => "fasta");

    my $prot_align = $stream -> next_aln;

However, later on I want to use $prot_align to align the codons from the
nucleic acid sequence

    my $dna_align = Bio::Align::Utilities -> aa_to_dna_aln($prot_align, \%nucseq_hash);

gives me the following error message:

Must provide a valid Bio::Align::AlignI object as the first argument to
aa_to_dna_aln, see the documentation for proper usage and the method
signature at 090220Make_NA_Align_from_AA_Align.pl line 66

So my question is, how do I
create a Bio::Align::AlignI object from the Bio::SimpleAlign object?

I suspect that I am missing some subtlety here, but I haven't been able to find any documentation that matches these two up.

Warren Gallin

From maj at fortinbras.us Sat Feb 21 18:55:49 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Sat, 21 Feb 2009 18:55:49 -0500
Subject: [Bioperl-l] Difference between Bio::SimpleAlign Object and Bio::Align::AlignI Object
In-Reply-To:
References:
Message-ID:

Hi Warren-

The problem here is that aa_to_dna_aln() is not an object method; it is a "plain function" in Bio::Align::Utilities. You have a couple of options. Either do

my $dna_align = Bio::Align::Utilities::aa_to_dna_aln($prot_align, \%nucseq_hash);

or

use Bio::Align::Utilities qw(aa_to_dna_aln);
#...
my $dna_align = aa_to_dna_aln($prot_align, \%nucseq_hash);

cheers,
Mark

[The problem with using the method-call form (Bio::Align::Utilities->aa_to_dna_aln) is that the class name ("Bio::Align::Utilities") is then made the first argument, with your args 2nd and 3rd. So when the function does the checking, ref($aln) returns the empty string, since $aln is then set to "Bio::Align::Utilities" on entry.]
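The argument shift described in the bracketed note can be reproduced without BioPerl installed. The following is only a minimal sketch: the package My::Utils, the function describe_first_arg and the class My::Align are hypothetical stand-ins made up for illustration, not BioPerl names. It shows what a plain function receives as its first argument under each calling style.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a package of plain functions; these names
# are illustrative only and are not part of BioPerl.
package My::Utils;

sub describe_first_arg {
    my ($first) = @_;
    # ref() returns the class name for a blessed object and the empty
    # string for a plain string such as a package name.
    return ref($first) || "plain string '$first'";
}

package main;

my $aln = bless({}, 'My::Align');    # stand-in for an alignment object

# Fully qualified function call: $aln arrives as the first argument.
print My::Utils::describe_first_arg($aln), "\n";

# Class-method syntax: Perl prepends the string 'My::Utils' to the
# argument list, so $aln is pushed to second position.
print My::Utils->describe_first_arg($aln), "\n";
```

The first call reports the object's class; the second reports a plain string, which is the situation in which a check on ref() of the first argument fails.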
From maj at fortinbras.us Sat Feb 21 18:58:20 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Sat, 21 Feb 2009 18:58:20 -0500
Subject: [Bioperl-l] Difference between Bio::SimpleAlign Object and Bio::Align::AlignI Object
In-Reply-To:
References:
Message-ID:

(BTW, the Bio::SimpleAlign object is-a Bio::Align::AlignI object through inheritance, which suggested to me that that wasn't the real issue.
MAJ)

From mauricio at open-bio.org Sun Feb 22 01:08:21 2009
From: mauricio at open-bio.org (Mauricio Herrera Cuadra)
Date: Sun, 22 Feb 2009 00:08:21 -0600
Subject: [Bioperl-l] directing questions to the list via the module template
In-Reply-To: <1C2A110491EE42EE9C0545ABB546B420@NewLife>
References: <2061458E-AE2C-4902-89D8-AC01AD78A228@illinois.edu> <606FD701B0E34A9992D217A45FE49E83@NewLife> <499ECBEF.3040608@open-bio.org> <1C2A110491EE42EE9C0545ABB546B420@NewLife>
Message-ID: <49A0EBD5.3080704@open-bio.org>

It actually went really smoothly this time, apparently due to how SVN handles emails for commits :) Take a look at the July 2006 archives to get an idea of what I meant by spamming:

http://lists.open-bio.org/pipermail/bioperl-guts-l/2006-July/author.html

I earned an email with my name as the subject line... :P

Mark A. Jensen wrote:
> I just signed on to bioperl-guts, then did commits to the auxiliary
> distributions. I now see exactly
> the motivation behind the accusation 'spammer'...

From heikki.lehvaslaiho at gmail.com Mon Feb 23 01:33:07 2009
From: heikki.lehvaslaiho at gmail.com (Heikki Lehvaslaiho)
Date: Mon, 23 Feb 2009 08:33:07 +0200
Subject: [Bioperl-l] directing questions to the list via the module template
In-Reply-To: <606FD701B0E34A9992D217A45FE49E83@NewLife>
References: <2061458E-AE2C-4902-89D8-AC01AD78A228@illinois.edu> <606FD701B0E34A9992D217A45FE49E83@NewLife>
Message-ID:

2009/2/20 Mark A. Jensen :
> I will also update the templates in bioperl.lisp, if that's cool with
> Heikki-

Of course. (Why me? I think Hilmar has been active maintaining this file. We are both "HL") :)

-Heikki

From bosborne11 at verizon.net Sun Feb 22 23:02:24 2009
From: bosborne11 at verizon.net (Brian Osborne)
Date: Sun, 22 Feb 2009 23:02:24 -0500
Subject: [Bioperl-l] Difference between Bio::SimpleAlign Object and Bio::Align::AlignI Object
In-Reply-To:
References:
Message-ID:

Warren,

We can't tell what %nucseq_hash is from your message, so this is hard to diagnose.
If you want to see functional code take a look at t/Align/AlignUtil.t:

my $in = Bio::AlignIO->new(-format => 'clustalw',
                           -file   => test_input_file('pep-266.aln'));
my $aln = $in->next_aln();
isa_ok($aln, 'Bio::Align::AlignI');
$in->close();

my $seqin = Bio::SeqIO->new(-format => 'fasta',
                            -file   => test_input_file('cds-266.fas'));
# get the cds sequences
my %cds_seq;
while( my $seq = $seqin->next_seq ) {
    $cds_seq{$seq->display_id} = $seq;
}

my $cds_aln = &aa_to_dna_aln($aln,\%cds_seq);

Brian O.

On Feb 21, 2009, at 5:59 PM, Warren Gallin wrote:

> aa_to_dna_aln

From maj at fortinbras.us Mon Feb 23 07:08:18 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Mon, 23 Feb 2009 07:08:18 -0500
Subject: [Bioperl-l] directing questions to the list via the module template
In-Reply-To:
References: <2061458E-AE2C-4902-89D8-AC01AD78A228@illinois.edu> <606FD701B0E34A9992D217A45FE49E83@NewLife>
Message-ID:

(I think it's because you're in the $Id$ line; but it's a CVS version from '07....hmmm....)

----- Original Message -----
From: "Heikki Lehvaslaiho"
To: "Mark A. Jensen"
Cc: "Chris Fields" ; "Hilmar Lapp" ; "bioperl list"
Sent: Monday, February 23, 2009 1:33 AM
Subject: Re: [Bioperl-l] directing questions to the list via the module template

> 2009/2/20 Mark A. Jensen :
>
>> I will also update the templates in bioperl.lisp, if that's cool with
>> Heikki-
>
>
> Of course. (Why me. I think Hilmar has been active maintaining this
> file. We are both "HL") :)
>
>
> -Heikki
>
>

From avilella at gmail.com Mon Feb 23 11:06:21 2009
From: avilella at gmail.com (Albert Vilella)
Date: Mon, 23 Feb 2009 16:06:21 +0000
Subject: [Bioperl-l] Call to users/developers -- user cases that bring Bioperl to its knees
Message-ID: <358f4d650902230806k196c7163qe5e0d31ec439eb51@mail.gmail.com>

Hi all,

I've discovered the profiling wonders of "perl -d:NYTProf -S" and I would like to play with it and Bioperl.
Can interested users/developers provide a URL with a dataset that brings bioperl to its knees in terms of CPU usage for say, about 1h?

Preferably no net access, no calling external programs or other complications, just data churning within Bioperl. And the data must be public, of course.

The idea would be to try to identify optimizations in the code that could benefit us all,

Cheers,

Albert.

From vecchi.b at gmail.com Mon Feb 23 11:52:57 2009
From: vecchi.b at gmail.com (Bruno Vecchi)
Date: Mon, 23 Feb 2009 14:52:57 -0200
Subject: [Bioperl-l] Call to users/developers -- user cases that bring Bioperl to its knees
In-Reply-To: <358f4d650902230806k196c7163qe5e0d31ec439eb51@mail.gmail.com>
References: <358f4d650902230806k196c7163qe5e0d31ec439eb51@mail.gmail.com>
Message-ID: <1a0c1b750902230852p41ee2ae0s74d44acf7a2ca77b@mail.gmail.com>

This trivial example, applied to a large input sequence, could help optimize what I think is one of the most important BioPerl modules: Bio::SeqIO.

#!/usr/bin/perl
use strict;
use warnings;

use Bio::SeqIO;

my $infile = 'sequences.gp';

my $seqI = Bio::SeqIO->new(
    -file   => '<' . $infile,
    -format => 'genbank',
    -flush  => 0,    # This makes it go faster
);

my $seqO = Bio::SeqIO->new(
    -fh     => \*STDOUT,
    -format => 'fasta',
);

while (my $seq = $seqI->next_seq) {
    $seqO->write_seq($seq);
}

Since I don't know what the policy is on file attachments on the mailing list, I'll refrain from sending you the >4MB file that I had prepared for profiling. I could send it to you directly if you ask me to, although any sequence file will do.

Please note that scripts running under NYTProf's eye are several times slower; you won't need to code a lot before you can have some scripts whose profiles will be useful.

Cheers,

Bruno.

From Kevin.M.Brown at asu.edu Mon Feb 23 13:10:32 2009
From: Kevin.M.Brown at asu.edu (Kevin Brown)
Date: Mon, 23 Feb 2009 11:10:32 -0700
Subject: [Bioperl-l] Call to users/developers -- user cases that bring Bioperl to its knees
In-Reply-To: <358f4d650902230806k196c7163qe5e0d31ec439eb51@mail.gmail.com>
References: <358f4d650902230806k196c7163qe5e0d31ec439eb51@mail.gmail.com>
Message-ID: <1A4207F8295607498283FE9E93B775B405C8104D@EX02.asurite.ad.asu.edu>

Try loading the Genbank file or files for the human genome.

ftp://ftp.ncbi.nlm.nih.gov/genomes/H_sapiens/

Last time I tried to load them, each chromosome took 30+ minutes to load and chewed up lots of memory to hold onto them.

From charles-listes+bioperl at plessy.org Wed Feb 25 01:11:57 2009
From: charles-listes+bioperl at plessy.org (Charles Plessy)
Date: Wed, 25 Feb 2009 15:11:57 +0900
Subject: [Bioperl-l] Call to users/developers -- user cases that bring Bioperl to its knees
In-Reply-To: <358f4d650902230806k196c7163qe5e0d31ec439eb51@mail.gmail.com>
References: <358f4d650902230806k196c7163qe5e0d31ec439eb51@mail.gmail.com>
Message-ID: <20090225061157.GA26826@kunpuu.plessy.org>

On Mon, Feb 23, 2009 at 04:06:21PM +0000, Albert Vilella wrote:
>
> Can interested users/developers provide a URL with a dataset that
> brings bioperl to its knees in
> terms of CPU usage for say, about 1h?

Dear Albert,

I do not know if it fits your requirements, but I found that bp_seqconvert or bp_sreformat are not fast enough to be used efficiently with millions of sequences in fastq format.

You can download an example file here (that I have chosen randomly):
ftp://ftp.era.ebi.ac.uk/vol1/fastq/ERR002/ERR002479/ERR002479_2.fastq.gz

This file will give an error as the sequence name is not duplicated in the quality header, but I compared with a local file that does not have this problem, and confirmed that the error is not the slowing factor. (Unfortunately, I could not find public fastq files in which the sequence name is given in both the sequence and quality header, probably because it makes the file heavier).
Have a nice day,

--
Charles Plessy
Tsurumi, Kanagawa, Japan

From avilella at gmail.com Wed Feb 25 09:00:07 2009
From: avilella at gmail.com (Albert Vilella)
Date: Wed, 25 Feb 2009 14:00:07 +0000
Subject: [Bioperl-l] Call to users/developers -- user cases that bring Bioperl to its knees
In-Reply-To: <20090225061157.GA26826@kunpuu.plessy.org>
References: <358f4d650902230806k196c7163qe5e0d31ec439eb51@mail.gmail.com> <20090225061157.GA26826@kunpuu.plessy.org>
Message-ID: <358f4d650902250600g6a1bf94nc628728049a33a38@mail.gmail.com>

Hi,

Any parameters in particular for the script? Can you give me yours?

From charles-listes+bioperl at plessy.org Wed Feb 25 09:06:14 2009
From: charles-listes+bioperl at plessy.org (Charles Plessy)
Date: Wed, 25 Feb 2009 23:06:14 +0900
Subject: [Bioperl-l] Call to users/developers -- user cases that bring Bioperl to its knees
In-Reply-To: <358f4d650902250600g6a1bf94nc628728049a33a38@mail.gmail.com>
References: <358f4d650902230806k196c7163qe5e0d31ec439eb51@mail.gmail.com> <20090225061157.GA26826@kunpuu.plessy.org> <358f4d650902250600g6a1bf94nc628728049a33a38@mail.gmail.com>
Message-ID: <20090225140614.GB7999@kunpuu.plessy.org>

On Wed, Feb 25, 2009 at 02:00:07PM +0000, Albert Vilella wrote:
>
> Any parameters in particular for the script?

Hi,

I used either --from fastq --to fasta, or --from fastq --to raw.

Have a nice day,

--
Charles

From kanzure at gmail.com Wed Feb 25 09:13:29 2009
From: kanzure at gmail.com (Bryan Bishop)
Date: Wed, 25 Feb 2009 08:13:29 -0600
Subject: [Bioperl-l] Is Bio::KEGG:Enzyme implemented?
Message-ID: <55ad6af70902250613g98e63cejd85c68c964e95ea5@mail.gmail.com>

Hey all,

I began writing a program to extract a reaction pathway from KEGG/reactome and "transplant" it into another genome, along with all of the required genetic 'dependencies' (in the sense of deb/rpm, apt/yum dependencies). To do this, I started writing a program that would parse KEGG's database format-- here's a reference discussion:

http://groups.google.com/group/diybio/browse_frm/thread/d6ec92a5df6b4e74/3b22b31a504f29ca?#3b22b31a504f29ca

Anyway, at first I thought bioperl hadn't implemented KEGG::Enzyme, but now I've noticed that biopython does: http://www.bioinform